
The recent McKinsey report “Clearing the air on cloud computing” has caused quite a stir within the cloud community, and I can see why. While it definitely brings a good deal of analysis to the table, I feel it is somewhat generalized, makes assumptions and overlooks some key points.

First and foremost, this article is NOT going to be an analytical discussion of the cost of running or setting up a datacenter vs. an Amazon EC2 Windows instance. I’m not a financial analyst. Honestly, calculating the Total Cost of Assets (TCA) or Total Cost of Ownership (TCO) causes my eyes to roll back into my head, leaving me gasping for air. Don’t get me wrong, it seems like some good effort was made analyzing data and formulating conclusions. The problem is, I feel that they were on a jetliner, shooting through the clouds with the shades half down.

Before I start with my own analysis and commentary, I would like to reference a few responses I have read that somewhat chastise McKinsey.

Three “Rebuttal” Articles to Read

The first comes from CIO IT Drilldown’s Virtualization site. In his article “McKinsey Cloud Computing Report Conclusions Don’t Add Up,” Bernard Golden does the heavy lifting for me in terms of analysis. I have highlighted some key points from the article that I viewed to be particularly important (my highlighted version of the article is here). I particularly enjoyed Golden’s rebuttal to the analysis of cost calculations, namely the use of EC2 Windows instances, headcounts that don’t add up and other “less visible” capital expenses for facilities and other assets. Also, as Golden points out, McKinsey proposes that better efficiencies and savings can be realized through virtualization within the organization. To me, the McKinsey recommendation seems a bit counter-intuitive: “Don’t go with a vendor whose expertise IS virtualization, hardware, infrastructure, et al. Instead, DO try to do it yourself, with tremendous CapEx & OpEx expense.” Hmmm, makes sense to me, NOT! Lastly, I particularly liked Golden’s 3 recommendations (quoted from article):

  1. Review your portfolio of applications to understand what cloud computing means to you.
  2. Create a viable financial model for assessing the true costs of internal hosting.
  3. Evaluate the potential for an internal cloud even if the numbers don’t work with an external cloud provider.

Another good article comes from the “Official Google Enterprise Blog”, posted by Rajen Sheth, Sr. Prod. Mgr for Google Apps. Titled “What we talk about when we talk about cloud computing“, this article gives great insight into Google’s vision of the Cloud. Again, I provide a highlighted version of the article which has items that particularly got my interest, specifically:

  • Google’s approach at virtualization using “stripped down” servers and tying the drones to a brain
  • Reliable software enables the creation of a more robust platform
  • “The reality is that most businesses don’t gain a competitive advantage from maintaining their own data centers.” (I couldn’t have said it better myself!)
  • Running a self-installed virtualization model still requires licensing, maintenance and implementation costs: human, hardware and on paper
  • Cloud offers faster innovation than DIY models

Sheth concludes with a pointed question: “As companies weigh private data centers vs. scalable clouds, they should ask a simple question: can I find the same economics, ease of maintenance, and pace of innovation that is inherent in the cloud?”

Lastly, Lew Moorman posted a rebuttal of sorts on the Mosso Blog titled “McKinsey Misses The Bigger Point On Cloud Computing”. As with the previous articles, I have a highlighted version available. Moorman flat out states that McKinsey missed the bigger point and “underestimated the benefits of cloud computing.” Moorman presents 3 characteristics of the cloud that differ from McKinsey’s, and frankly, my own definition leans more toward his than toward McKinsey’s. Both have merit, however. Moorman correctly points out how McKinsey blurs IaaS and PaaS into one. This is something that I will discuss here shortly. I definitely recommend reading the non-“back-of-the-envelope” analysis. I do know who pays $14,000 for a Windows server over 3 years: the Military. All joking aside, there are the OpEx costs of keeping said server up and running, something McKinsey does touch upon, but not thoroughly enough. I will (as I said) leave the TCO/TCA analysis to the experts, though. Lastly, Moorman appropriately says that time (and money) is better spent leaving the infrastructure to experts, letting companies focus on their core competencies.

While I cannot divulge specifics, it is important to note here that at GoGrid, our own utilization rate far exceeds the “best possible” ones listed by McKinsey.

My Shots from the Peanut Gallery

McKinsey is a well thought-of firm and I have full respect for their research, analysis and findings, but it’s always important to get in other perspectives when confronted with potentially deceiving findings. So let’s dive right in.

“Irrational Exuberance” and Unrealistic Expectations

Granted, McKinsey tips their hat to the cloud having “great potential” but goes on to attempt to stifle what many consider to be one of the most important technology shifts in recent years. True, clouds came out of nowhere, creating what many believe to be a marketing storm. Everyone jumped on the bandwagon, trying to define Cloud Computing (and to be the source of the ultimate definition). We dashed under the rain of buzzwords last year, trying to make sense of the madness. Page 10 of their study shows a list of 22 cloud computing definitions (mine is listed there). As with anything new, people need to be able to wrap their heads around it, smell it, taste it, touch it. When you talk about something virtual, this is increasingly difficult to do. Thus, the buzzword and definition madness ensued. (We, at GoGrid, even created a home-grown video to help people understand it and make it tangible. It has over 43,000 views as of this writing.)

My issue with the statement of over-exuberance and unrealistic expectations is: what other way could it have happened? Anything that is hot now moves much more quickly than we had been used to. Take Twitter as a clear-cut example. It’s mainstream. There are strategies and businesses being built around it. The Social Networking movement could be equated to that of the Cloud if you think about it. In my mind, this is no different. Both are exciting and new, full of promise, but fraught with growing pains, naysayers and disbelievers. So, would McKinsey say the same thing about Social Media?

Gartner's Hype Cycle

And where on the Gartner “Hype Cycle” are Twitter or Cloud Computing? I dare say that with Twitter, we are on the Slope of Enlightenment. I believe the Trough of Disillusionment was passed a while ago and mainstream adoption is taking place. But the curves are somewhat distorted due to the speed with which Social Media has spread. For the Cloud, I’m not sure. I almost feel the Trough has been leapt over to some extent due to the rapidity of the movement as well, and we are on the way to “seeing the light.” But I’m a forward thinker. (These two topics are great subjects for later discussion.)

Their Definition of a Cloud is Blurred

What I find fascinating is that despite the “hundreds” of definitions of Cloud Computing in the blogosphere, McKinsey still couldn’t nail it down and, in the process, took a shortcut in their definition. For starters, there is no mention of “self-service” within McKinsey’s definitions. This is fundamental to the cloud and a key differentiator from traditional infrastructure deployments or, for that matter, virtualization. As Randy Bias pointed out to me, “more important than embracing virtualization in the enterprise is building ‘private clouds’ that have self-service models.” This is an important topic to consider. If an enterprise pursues a virtualization strategy, will it be done in a way where it is entirely self-service in nature? Private clouds (using virtualization) will take time to get to where public clouds already are.

I actually somewhat agree with their three cloud characteristics of: 1) “hardware management is highly abstracted from the buyer,” 2) “buyers incur infrastructure costs as variable OPEX”, and 3) “infrastructure capacity is highly elastic (up or down)”. One should be careful, however, with point #3, especially about mixing up capacity and elasticity with scalability. I believe it would have been important for McKinsey to discuss the elastic capacity component a bit further than “capacity can be scaled up or down dynamically, and immediately, which differentiates from traditional hosting service providers.” This is a somewhat limiting statement, and hosting vendors like ServePath/GoGrid are already proving it wrong (see “Cloud Connect“).

Generally, there are two types of scalability: vertical and horizontal. Briefly, vertical scaling means adding more to the “boxes” you have (e.g., RAM or Storage): build “up” by putting more horsepower into what you already manage. Horizontal scaling means that you add more infrastructure (e.g., servers) side by side, essentially building “out.” This is much the way that Google sets up their shop, by adding more boxes horizontally to increase compute capacity. For a good primer on scalability, I encourage you to read Randy Bias’ Whitepaper “Scaling Your Internet Business.”
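To make the “up” vs. “out” distinction concrete, here is a minimal sketch in Python. The `Server` type, the function names and the numbers are all hypothetical and only for illustration; they are not taken from any vendor’s API.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Server:
    """A hypothetical 'box' described only by its RAM and CPU cores."""
    ram_gb: int
    cores: int


def scale_up(server: Server, extra_ram_gb: int = 0, extra_cores: int = 0) -> Server:
    """Vertical scaling: put more horsepower into a box you already manage."""
    return replace(
        server,
        ram_gb=server.ram_gb + extra_ram_gb,
        cores=server.cores + extra_cores,
    )


def scale_out(fleet: List[Server], template: Server, count: int) -> List[Server]:
    """Horizontal scaling: add more boxes side by side."""
    return fleet + [template] * count


def total_cores(fleet: List[Server]) -> int:
    """Total compute capacity of a fleet, measured in cores."""
    return sum(server.cores for server in fleet)


# Start with one small box (made-up sizes).
fleet = [Server(ram_gb=4, cores=2)]

# Build "up": same number of boxes, each one bigger.
bigger_box = [scale_up(fleet[0], extra_ram_gb=4, extra_cores=2)]

# Build "out": same size of box, more of them.
more_boxes = scale_out(fleet, template=fleet[0], count=1)
```

Both `bigger_box` and `more_boxes` end up with the same total number of cores, but the first is still a single server while the second is two. That is the whole difference in a nutshell: capacity can match while the architecture (and the failure modes) do not.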

Categorization of Clouds is Blurred as well

With a broad stroke of the sword (or pen), McKinsey chopped the Cloud in half: “Clouds” and “Cloud Services”. This is the satellite view of things, not even a 10,000 foot view. By focusing the lens a bit more (as well as reading what others had written), one would know that there are 3 distinct “layers” to Cloud Computing, even more if you fine-tune the focus. I’ve articulated this before (and it is even in their presentation under the multiple definitions) in the form of the “Cloud Pyramid.”

[Image: the Cloud Pyramid]

You simply cannot lump IaaS and PaaS together. They are different beasts. Sure, they share some commonalities in that they fit the broad definition of the cloud. (At GoGrid, we currently define Cloud Computing as: “on-demand self-service Internet infrastructure where you pay-as-you-go and use-only what you need, managed by your browser or application.”)

But the differences between IaaS and PaaS are important and cannot be simply brushed aside. As I wrote previously (“Navigating the Layers of the Cloud Computing Pyramid“) and I simplify here, Platform Clouds (like Google App Engine and Microsoft Azure) have the OS and Frameworks managed by the provider whereas Infrastructure Clouds (like GoGrid and Amazon Web Services) boil things down further to the hardware and networking protocols, for example. Even within IaaS there are clear distinctions (Cloudcenter vs Infrastructure Web Services – read about how we differentiate these terms here and here). So why does McKinsey overlook this? I’m not entirely sure. I won’t attempt to draw any conclusions either.

The fundamental point here is, you cannot simply split the cloud in half (Clouds vs. Cloud Services); at a minimum it must be in thirds (Application, Platform & Infrastructure). Finally, I don’t quite understand why Cloud Services (loosely defined by McKinsey as “SaaS”) shouldn’t have the same characteristics as Cloud Computing in general. True, they don’t incur “infrastructure costs”, but they are subject to recurring costs through licensing, monthly usage or seats. And McKinsey is correct that SaaS can be built on top of other layers of the Cloud (Pyramid). In fact, the same could be said for Platforms being built on top of Infrastructure Clouds. It may not be the most cost-effective method to build a Cloud Platform, but it is definitely viable.

Service Level Agreements are NOT a Hindrance nor a Crutch

As Gartner’s Lydia Leong points out, there is a difference between engineering for reliability and actually attaining it. According to Leong, “most enterprise data centers have mathematical uptimes below 99.99% (i.e., calculated mean time between failure)”. To take this a bit further, SLAs definitely should be viewed as a requirement for any company doing due diligence when choosing a hosting provider, cloud or not.

The cloud is obviously under an enhanced level of scrutiny due to its newness and promises. When any Cloud hiccups, whether it’s SalesForce, Google App Engine or GoGrid, people hear about it and instantly say “See? The Cloud can’t maintain five nines of uptime!” and suddenly the cloud is evil. But failures happen at all levels of infrastructure and technology; they are not unique to the cloud. Having a provider that offers a solid SLA is critical (e.g., GoGrid offers one of the most robust SLAs in the hosting industry). But, if one were to follow McKinsey’s recommendation and “virtualize everything internally,” how will SLAs be honored there? There are no clear standards in the hosting industry that dictate how uptime is calculated. It is frequently left up to the vendor to interpret, and subsequently back up through an SLA. This is even more obscure when it comes to private datacenters. And what are the consequences of not attaining five nines in a private datacenter? No more “Casual Fridays”?

The cost to attain an incredible uptime record is high. Uptime is inherently tied to a vendor’s reputation and can dramatically affect sales and customer stickiness. But any smart IT person knows that failures happen, and an even smarter IT person strategically plans for them by better architecting their solution. SLAs help (if using a hosting vendor), but they will NOT solve architectural mistakes; doing it all in-house under a vaguely defined SLA or uptime standard can be cost and time prohibitive.
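For a sense of what “four nines” and “five nines” actually allow, here is a rough back-of-the-envelope sketch in Python. Since there is no clear standard for how uptime is calculated, the method below is just one assumption: a 365-day year in which every minute of downtime counts. The function name is my own.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored (assumption)


def allowed_downtime_minutes(availability_percent: float,
                             period_minutes: int = MINUTES_PER_YEAR) -> float:
    """Downtime budget implied by an availability percentage.

    Assumes downtime is measured against the whole period with no
    exclusions (e.g., for scheduled maintenance). Real SLAs often
    define this differently.
    """
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return period_minutes * (100 - availability_percent) / 100


for label, pct in [("four nines", 99.99), ("five nines", 99.999)]:
    minutes = allowed_downtime_minutes(pct)
    print(f"{label} ({pct}%): about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, 99.99% leaves roughly 52.6 minutes of downtime per year and 99.999% roughly 5.3 minutes. Change the measurement period or the exclusions and the numbers move, which is exactly why the fine print of an SLA matters.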

Disputed Cost Savings

I personally would love to have $20 million sitting around to set up my own datacenter. I’m sure that there are many SMEs who would love to have that liquidity and cash availability as well. In this day and age, being able to plop down even a fraction of that amount to create or re-architect a datacenter is practically a pipe dream. Again, if you virtualize your offering, you may only realize partial success. Also, as Leong points out (see yellow highlights), there is much more value to be gained through using the cloud than attempting to DIY (do it yourself), AND it depends on the size of your organization. Larger corporations have efficiencies of scale (hopefully) as compared to an SMB. Obviously, when re-engineering an IT strategy, one has to pay careful attention to what is gained and lost through keeping the status quo, re-architecting through internal virtualization or moving to the cloud.

There are some definite uses of public clouds for businesses to capitalize on now, namely:

  • Small Business – when hiring a full-fledged Ops team or hand-building your infrastructure is cost prohibitive (convert CapEx to OpEx)
  • Medium Business – when you need to dynamically control costs as well as compute capacity. Elastic clouds can do this for you.
  • Large/Huge Business – outsourcing is important for non-critical infrastructure like QA/Dev/Skunk works or as a failover strategy

My stance on this is, the most effective strategy is one that is hybrid in nature. Corporations should look to see what must be moved to the cloud, what must stay in-house and what falls into the grey area. Even within the cloud, there are scenarios that simply are better handled through integrating dedicated or colocated infrastructure (and the benefits therein) with a cloud front-end (see GoGrid’s Cloud Connect). Planning for “up” and “out” scalability using the elasticity of the cloud, coupled with the performance of dedicated hardware is a must when considering strategies.

And what about “innovation” as Golden pointed out? Cloud providers can clearly innovate much more rapidly than an in-house IT team can, especially given the virtualization, budget, time and monetary constraints of doing it internally.

Lastly, John Keagy, CEO/Co-Founder of GoGrid, poses this question: “What is the opportunity cost of spending $20MM on a non-core activity like a datacenter?” Turning this capital investment into a profitable business unit is attainable by few and impossible for most.

The F.U.D. Factor

Honestly, I thought we were past this phase but I guess not. As with anything new, there are those who are quick to point out that the cloud is not the panacea of all things. I agree with that somewhat. However, when McKinsey draws a comparison to the 2000 dot-com bubble, I get a bit aggravated.

McKinsey truly brings the Fear, Uncertainty and Doubt factor into play with this analogy. The three bullet points (paraphrased to: huge investments made based on hype, inability to generate profits, and NASDAQ losing 80% of value) simply do not come into play here. During the dot-com bubble, money was thrown around loosely and given to anyone who had even a half-baked business plan. Money was not wisely spent and businesses had little to show during and thereafter. In this day and age, before any funding is given, companies not only must have a fully vetted business, they also have to have an installed user base and definite signs of profitability. Let’s not forget about the “transparency” buzzword du jour. And this time around, the stock market drop has little or nothing to do with the technology sector.

To equate the cloud movement to the dot-com bubble is naive at best. Try to tell Amazon that what they have been doing for a couple of years now is just a fad and will not last. Need I say more?

</end commentary>

Perhaps I had too much coffee while I wrote this. Or maybe I woke up on the wrong side of the bed. One part of me is truly happy that McKinsey came out drawing some clear lines in the sand. Their “non-partisan” analysis is definitely required to help put things in perspective for companies evaluating their IT strategies. The other part of me, however, wishes that they hadn’t been so pessimistic and potentially misleading.

If you haven’t read their study, I encourage you to do so. Read and analyze it and give me your interpretation.


Over the past year, I have written about the various primal layers of Cloud Computing. Typically, my role is to “over simplify” in order to make the Cloud a bit more palatable to “the masses.” My colleague, Randy Bias, is the resident über-tech, so I usually leave the more complicated developer and sys-admin posts to him. As we all know, the Cloud is hot and becoming increasingly complicated as new products, services and vendors throw their hats into the ring. But is this over-complication confusing and saturating the market? I think not, in terms of the latter, but it is truly becoming more confusing.

[Image: the Cloud Pyramid]

First, we at GoGrid broadly define Cloud Computing as such (latest definition):

On-demand self-service Internet infrastructure where you pay-as-you-go and use-only what you need, all managed by a browser, application or API.

Even that definition I feel is a bit skewed toward Infrastructure. Probably more aptly defined, it would be:

On-demand, self-service Applications, Platforms, Services or Infrastructure dynamically consumed on a pay-as-you-go basis using a browser, application or API.

Definitions evolve and morph over time. This is probably the 30th iteration of our definition over the past year.

So I will circle back to the Cloud Pyramid (as seen above):

To briefly recap the different layers:

  • Cloud Applications – many view this layer as containing SaaS (Software as a Service). It’s important to remember that not all SaaS offerings fall into this category. SaaS existed well before the term “Cloud” came into play. Essentially, the idea is that Application functionality is served via the internet and this application typically does one thing. This could be email (e.g., Gmail) or CRM (e.g., SalesForce).
    • Advantages – available via a web browser, rich interfaces, frequently free or paid either by monthly usage or seat licenses
    • Disadvantages – little or no customization available, limited to the feature-set provided
  • Cloud Platforms – otherwise known as PaaS (Platform as a Service). Typically a development language or framework (e.g., Ruby on Rails, Python, .NET, Java) is contained within this environment. What this means is that users consume the hosted framework. Examples are EngineYard (a RoR stack hosting environment), Google App Engine (supporting Python), Microsoft Azure (running the .NET framework) and Force.com (a proprietary SalesForce.com framework).
    • Advantages – the frameworks are hosted by vendors. This means that the underlying infrastructure is controlled, updated and managed by Cloud Platform vendors.
    • Disadvantages – while this layer offers significantly more control over the development environment, the underlying infrastructure is not available to the end-developer, so these developers are “at the mercy” of the hosting provider to ensure that the various framework stacks are fully functional, updated and accurate.
  • Cloud Infrastructure – this is called IaaS (Infrastructure as a Service). At the lowest layer of the Cloud Pyramid, infrastructure is delivered and consumed on-demand utilizing some sort of paravirtualization and/or hardware integration. This layer includes servers, networks and other hardware appliances (e.g., load balancers) delivered as either Infrastructure Web Services (e.g., Amazon Web Services) or as “cloudcenters” (e.g., GoGrid). More information about the differentiation we make between Infrastructure Web Services and Cloudcenters is discussed in the posts here.
    • Advantages – full control over the various components of infrastructure means that you can work with the infrastructure in just about any way you desire. It also lays a fundamental groundwork for building other Clouds on top of it (especially Cloud Applications)
    • Disadvantages – sometimes more expensive compared to the other layers; if you aren’t familiar with full access to infrastructure, controlling and managing could be daunting.
  • Other Cloud Services – there are many types of ancillary Clouds that are showing up including: Cloud Services, Cloud Storage, Cloud DB, Cloud Aggregators, Cloud Extenders, Cloud Management, etc.
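The recap above boils down to a question of who manages what at each layer. Here is a minimal sketch of that idea in Python. The four-part stack and the cut-off points are my own simplification of the descriptions above (all names are hypothetical); real offerings blur these lines.

```python
from enum import Enum
from typing import Dict, List


class Layer(Enum):
    """The three main layers of the Cloud Pyramid, top to bottom."""
    APPLICATION = "Cloud Application (SaaS)"
    PLATFORM = "Cloud Platform (PaaS)"
    INFRASTRUCTURE = "Cloud Infrastructure (IaaS)"


# A simplified technology stack, listed bottom to top (assumption).
STACK: List[str] = ["physical hardware", "operating system", "framework", "application"]

# How many stack components (counting from the bottom) the provider manages
# at each layer. These cut-offs are an illustrative simplification.
PROVIDER_MANAGES_UP_TO: Dict[Layer, int] = {
    Layer.INFRASTRUCTURE: 1,  # provider delivers the hardware; you control the rest
    Layer.PLATFORM: 3,        # provider also manages the OS and frameworks
    Layer.APPLICATION: 4,     # provider manages everything; you just use it
}


def provider_managed(layer: Layer) -> List[str]:
    """Stack components the cloud vendor takes care of at this layer."""
    return STACK[:PROVIDER_MANAGES_UP_TO[layer]]


def customer_controlled(layer: Layer) -> List[str]:
    """Stack components left in the customer's hands at this layer."""
    return STACK[PROVIDER_MANAGES_UP_TO[layer]:]


for layer in Layer:
    print(f"{layer.value}: you control {customer_controlled(layer) or 'nothing below the UI'}")
```

Moving down the pyramid, the list returned by `customer_controlled` grows: more control and flexibility, but also more to manage yourself.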

Obviously, this just scraped the surface of the Cloud. But let’s take a quick look at one particular “application” which can span all layers of the Cloud.


Microsoft Exchange is probably something that many of you are familiar with. For the uninitiated, Microsoft Exchange is a messaging and collaboration application developed by Microsoft and is contained within its line of server products. Functionality includes email, calendaring, contact management and tasks. Why have I chosen Exchange as a good example to use for traversing the various layers of the Cloud? Simply because some form of Exchange can conceivably exist at each layer. Let’s explore Exchange on the…

  • Cloud Application layer – you want all of the functionality of an Exchange account but don’t want to worry about the management thereof. At this level, you get just that, the ability to have a hosted Exchange mailbox with many of the bells and whistles of a standard corporate Exchange account but without the fuss. Billing for this is typically by user by month, typical of many SaaS or Cloud Applications.
  • Cloud Platform layer – suppose your corporation outgrows simply leasing individual Exchange mailboxes as offered within the Application layer. You can then opt for a dedicated Exchange server, hosted by a provider. While more of a dedicated play, conceptually, the idea is the same. You choose a hosting provider who manages the infrastructure (the experts) and “frameworks” and you simply administer the usage and functionality therein. Providers integrate other functionality (e.g., web access) into the product offering while still protecting the underlying infrastructure. Companies have more control at this layer.
  • Cloud Infrastructure layer – assuming that your company has grown to the point where you need a more robust corporate infrastructure, you probably are looking at setting up a clustered Exchange environment. At this layer, you would need to have full access and control over your infrastructure in order to set up Active Directory and other protocols. Hosting on dedicated servers, in the cloud or with a hybrid solution (e.g., using Cloud Connect) is the best implementation here.

With the Cloud, you can grow your infrastructure based on the demand and needs of your company. The Microsoft Exchange example can just as easily be applied to moving your Web Application through the different layers of the cloud as well, for example a CRM application.

While this is not exactly a true “Cloud play,” I hope that it helps to explain how the layers of the Cloud Pyramid are differentiated in terms of control, scalability and functionality. You could also use an example of Ruby on Rails or Python (at least for the bottom two layers). With Google App Engine or EngineYard, you work within the languages and frameworks available. If you move down to the Infrastructure layer, you can do that as well, but you also have the ability to control the underlying infrastructure and customize the framework environments to your liking. Unfortunately, I’m a bit hard pressed to explain how frameworks can be utilized at the Cloud Application level, but I’m open to other comparisons or examples.

What other applications or environments can traverse the Cloud Pyramid? I’m sure there are many!


This morning we announced that Appistry EAF Community Edition has been released within the GoGrid cloudcenter infrastructure. The press release can be viewed here. Full contents of the release are below.


Appistry and GoGrid Announce Commercial Availability of Joint Cloud Computing Solution for Delivering Highly Scalable and Reliable Server Applications

Cloud Computing Infrastructure provider GoGrid and Cloud Application Platform provider Appistry announce the release of Appistry EAF Community Edition within the GoGrid cloudcenter.

San Francisco, CA February 26, 2009 — GoGrid, the Cloud Computing division of ServePath, LLC and Appistry today released new tools for developers, architects and administrators designed to ease the pain associated with developing, deploying and managing applications in the Cloud. Appistry’s Cloud application platform, named Appistry EAF, helps businesses and enterprises efficiently manage and scale their applications within the GoGrid infrastructure. With this joint solution, larger companies are able to take full advantage of the Cloud’s unique value proposition of elastic scalability, solid reliability, automated management and CapEx economies.

Appistry EAF Community Edition 3.9 is now available for Red Hat Enterprise Linux 5.1 users. Additional EAF-enabled GoGrid images will be rolling out in the near future. Appistry EAF Community Edition allows developers, system architects and administrators to take advantage of Appistry’s Cloud application platform for free on up to five GoGrid Cloud Server instances. Appistry EAF functionality and benefits include:

  • Transparent and instant linear scalability
  • Application-level fault tolerance
  • Broad support for Cloud-enabling software components
  • Adaptive, software-based load balancing
  • Fully-distributed, fault tolerant memory cache for objects and data
  • Fine-grained, hierarchical security model
  • Efficiencies in CapEx and administrator time
  • Ease of use

More information on Appistry EAF can be found at: http://www.appistry.com/products/eaf/index.html

“The GoGrid partnership is part of Appistry’s strategy to address the complex challenges enterprises face developing, deploying and managing applications in both public and private Clouds,” said Sam Charrington, Appistry vice president of product management and marketing. “End-users demand a platform which sits above the infrastructure and allows enterprises to more easily realize its full promise — elastic scalability, solid reliability and automated management.”

The combination of GoGrid’s robust and flexible Cloud Computing infrastructure and Appistry’s Cloud application platform enables enterprises to capitalize on the inherent advantages of both technologies. GoGrid leads the Cloud infrastructure space with a full assortment of infrastructure capabilities available in the Cloud, including industry standard and best practice implementations of Windows Server 2003 and 2008, Microsoft SQL Server, Red Hat Enterprise Linux and CentOS instances among others, as well as free hardware-based f5 load balancing and hybrid hosting capabilities with Cloud Connect which is particularly efficient for complex Microsoft SQL Server databases.

“The GoGrid and Appistry partnership clearly demonstrates our commitment to helping businesses optimize their infrastructure to gain the advantages of Cloud Computing,” said GoGrid CEO, John Keagy, adding “Companies would be foolish to not optimize their business and technology strategies using the power of Appistry EAF and GoGrid’s Cloud infrastructure.”

About GoGrid (http://www.gogrid.com)

GoGrid is the leading Cloud Computing, hosted, Internet provider that delivers true “Control in the Cloud™” in the form of cloudcenters. GoGrid enables system administrators, developers, IT professionals and SaaS (Software as a Service) vendors to create, deploy, and control load balanced cloud servers and complex hosted virtual server networks with full root access and administrative server control. GoGrid server instances maintain the industry standard specifications with no requirement to learn and adapt to proprietary standards. Bringing up servers and server networks takes minutes via a unique web control panel or GoGrid’s award winning API. GoGrid delivers portal controlled servers for Windows Server 2003, Windows Server 2008, SQL Server, ASP.NET, multiple Linux operating systems (Red Hat Enterprise and CentOS) and supports application environments like Ruby on Rails. Free f5 hardware load balancing and other features are included to give users the control of a familiar datacenter environment with the flexibility and immediate scalability of the cloud, a “cloudcenter.” GoGrid won the coveted 2008 LinuxWorld Expo’s Best of Show award.

About ServePath (http://www.servepath.com)

ServePath, a Microsoft Gold Certified Partner, is the leading managed and dedicated hosted server provider, delivering custom solutions and managed services to businesses that require powerful Internet hosting platforms for their production environments. Thousands of companies worldwide look to ServePath for its reliability, customization, and speed. ServePath has a Keynote-rated A+ network and guarantees uptime with a 10,000% guaranteed™ Service Level Agreement. The employee-owned company has been in business for nine years and operates its own San Francisco data center and is SAS70 Type II certified.

About Appistry (http://www.appistry.com)

Appistry simplifies cloud computing for the enterprise, opening the door to more agile and scalable IT environments. Appistry’s application platform delivers solutions for the complex challenges of building, deploying and managing a wide variety of applications and services for both public and private clouds. Appistry’s products are designed specifically for cloud environments, delivering transparent scalability, application-level fault tolerance, and automated management to new and existing applications. Appistry customers include FedEx, GeoEye, Lockheed Martin and Northrop Grumman. For more information about Appistry, please visit www.appistry.com.

Information on Appistry and other GoGrid Partners can be found here. For Appistry support-related questions, visit the Appistry site or see the Appistry partner page.


By now, many in the Cloud Computing space have heard about (or even read) the University of California Electrical Engineering & Computer Science’s (EECS) study on Cloud Computing titled: “Above the Clouds: A Berkeley View of Cloud Computing.” Published on February 10th, 2009, the EECS’s paper provides a seemingly academic study of the Cloud Computing movement, attempts to explain what Cloud Computing is all about, and identifies potential opportunities as well as challenges present within the market.

The 20+ page study is authored by Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph, Randy H. Katz, Andrew Konwinski, Gunho Lee, David A. Patterson, Ariel Rabkin, Ion Stoica and Matei Zaharia, who all work in the RAD Lab. (Interestingly, several of the companies mentioned within the study are also Founding Sponsors and/or affiliate members: Sun, Google, Microsoft, Amazon Web Services, etc.)

There has already been plenty of discussion and analysis of this study (by James Urquhart and Krishna Sankar, and it has even appeared on Slashdot.org). Needless to say, I felt compelled to get my two cents in, especially from the perspective of a Cloud Computing Infrastructure vendor.

From an academic standpoint, this document definitely has some legs. It is complete with carefully thought out scenarios, examples and even formulae, as well as graphs and tables. Some of the points that are brought up even got me scratching my head (e.g., using flash memory to help by “adding another relatively fast layer to the classic memory hierarchy”). Even the case analysis of a DDoS attack, comparing the cost to those initiating an attack with the cost to those warding off an attack on a Cloud, was interesting to ponder. I commend this group of authors on undertaking such a grand task of not only writing by committee but also overlaying a very business school vs. mathematics and computer sciences approach to the writing and analysis.

Unfortunately, however, as I read through the document, I started scrawling madly in the margins with commentary that is somewhat contrary to what was written within the study.

A Few Comments from the “Peanut Gallery”

I don’t want my article to come off as a complete rebuttal to what is written in this study. Quite the contrary. I’m encouraged that one group within the academic community has taken considerable time and effort analyzing and writing about the Cloud. What appears below is a small “laundry list” of things that need to be called out and is a mixture of positive and negative comments:

  • EECS’s Cloud Computing definition – “Cloud Computing refers to both the applications delivered as services over the Internet and the hardware and systems software in the datacenters that provide those services. The services themselves have long been referred to as Software as a Service (SaaS), so we use that term. The datacenter hardware and software is what we will call a Cloud.”[1]
    My comments: I personally found this definition to be incomplete and potentially misleading. While the EECS is correct in including SaaS (Cloud Applications) as a subset of Cloud Computing, they have (consciously?) lumped everything else into a catch-all phrase of “hardware and system software.” For people to truly understand Cloud Computing, I feel that it is important to become much more granular in defining the layers of the Cloud (Cloud Applications, Cloud Platforms and Cloud Infrastructure – the “Cloud Pyramid”, a term I coined last year). I actually found it interesting that the group of authors couldn’t agree what the precise differences between the “X as a Service” were.[2] In order for all of the assumptions and conclusions to take place, I would have thought that clearly defining what the “Cloud” is would be paramount to the success of the findings.
  • 3 Important Technical Aspects of the Cloud – the group outlines three items of the Cloud: 1) “infinite computing resources” 2) “elimination of an up-front commitment” and 3) “pay for use of computing resources on a short-term basis as needed.”[3]
    My comments: For the most part, I agree with these statements. However, #3 is a bit skewed towards an Amazon EC2 model. At GoGrid, we are pioneering the idea of a “cloudcenter” (a datacenter in the Cloud) which presents a different paradigm. EC2 has long been touted as being a way for quick batch processing where instances are spun up, consumed and then discarded. This falls within the third aspect that is defined above. However, when you take the view of creating a “datacenter in the cloud,” there is less of a “quick use function” and more of a scalable infrastructure notion designed to replace traditional datacenters and associated infrastructures.
  • New Application Opportunities – several new or emerging opportunities designed to capitalize on the benefits of the Cloud are outlined: “mobile interactive applications,” “parallel batch processing,” “the rise of analytics,” “extension of compute-intensive desktop applications,” and “‘earthbound’ applications.”[4]
    My comments: I’m actually glad to see these so carefully explained as they do cover many aspects that are potentially “unique” to the Cloud: dynamic storage, dynamic availability, scalable processing and compute power, and cost-effectiveness to name a few.
  • Classes of Utility Computing – Amazon’s EC2 is at one end of the spectrum and Google AppEngine and Force.com are at “the other extreme” with Microsoft Azure falling somewhere in the middle. Also, “virtualized resources” are broken up into 3 classes: Computation, Storage and Networking.[5]
    My comments: For starters, since the group was unable to fully define the Cloud “spectrum,” it’s difficult to understand how they place EC2 at one end and have the spectrum “end” at Cloud Platforms (e.g., Force.com or AppEngine). The “full” spectrum must include SaaS as well as PaaS and IaaS in order to fully encompass the definition. Gmail and SalesForce exemplify SaaS and definitely should be contained within the Cloud mantra. Microsoft Azure, Force.com and Google AppEngine are truly Cloud Platforms. Within the Platform layer, Azure and AppEngine may be far apart, but they occupy the same Cloud space of “here is a development environment, you must work within it” (e.g., Python, .NET). Cloud Applications are simply “here is a web-based software application that is available for consumption and you have minimal flexibility in terms of controlling it.” Lastly, Cloud Infrastructure works as “enjoy full control over your infrastructure despite the fact that it is a bit more challenging to control.” For the most part, the 3 virtualized resources do fall within what is outlined. Storage can be expanded to include “Cloud Storage” (dynamic), “Persistent Storage” (traditional) and “Volatile or Temporary Storage” (typically associated with EC2 instances where storage disappears when the EC2 instance is destroyed or goes down).

I could probably nitpick through some other items, but I will leave that up to you.

The Cloud Pyramid

Comments from a Cloud Vendor perspective

In Section 7 of the study, the EECS group presents “10 Obstacles and Opportunities for Cloud Computing” which definitely should be addressed. For this section, I’m putting on my “GoGrid Green” colored glasses and presenting points and counter-points to each of the 10 items outlined. Again, this is not intended to come off as a ping-pong match, but rather a commentary and opportunity for dialog. I encourage you to read this section prior to reviewing my responses. I have tried to briefly paraphrase each item (but that probably doesn’t do it justice).

  1. Availability of a Service – “will Utility Computing services have adequate availability”[6]
    My Response: The study outlines outages specific to the Cloud, citing S3, AppEngine and Gmail in particular. I have said this before: outages happen and they are not unique to the Cloud. Natural and human-caused disasters occur. Hurricanes and cable cuts can affect all sorts of infrastructure. Whether your datacenter is in-house or outsourced, traditional or in the Cloud, a disaster failover and redundancy strategy should be part of an IT department’s general strategy for success or just survival. One thing to consider is mirroring or creating redundancy on different types of infrastructures: if your primary is in the Cloud, have a dedicated failover; if your colo is on the East Coast, think about something on the West. Also look beyond simply the service and review the Support organization, the Service Level Agreement (SLA) and the provider’s expertise within the field. GoGrid, for example, has 24×7 Free support, the most robust SLA of any Cloud provider and over 9 years of hosting experience and expertise.
  2. Data Lock-in – “the API’s for Cloud Computing itself are still essentially proprietary”[7]
    My Response: Unfortunately, it seems that GoGrid’s announcement back in January of this year, where we discussed how our GoGrid cloudcenter API has been put under a Creative Commons Sharealike license, was somehow overlooked when compiling facts for this study. Our idea behind this move is to start working on standards from the ground up. GoGrid is also an active participant in many of the interoperability meetings around the country. Part of the reason why we released our API to the community at large is to demonstrate our commitment to open standards. We have also modeled the GoGrid cloudcenter extremely closely on a traditional datacenter where all of your hardware, protocols and connectivity are familiar. This helps lessen the “lock-in” scenario and avoids the use of proprietary API’s and other components. Also mentioned is “surge computing” which is another term for “cloud bursting” or “hybrid” clouds. Our Cloud Connect offering works exactly in this way, where users can opt to have high-end, large I/O databases, for example, reside within a traditional, managed hosting environment (through ServePath, our parent company). Cloud Connect allows for scalable and dynamic web front-ends, hosted in the GoGrid Cloud, to connect via a dedicated private network to higher-end servers in a managed hosting back-end.
  3. Data Confidentiality and Auditability – “current cloud offerings are essentially public (rather than private) networks, exposing the system to more attacks”[8]
    My Response: The statement above is rather alarmist in nature. I agree that many efforts should be made to ensure the resiliency and security of the Cloud, and these efforts are well underway at GoGrid as well as other Cloud providers. Again, however, this is not something completely unique to the Cloud. Any hosting provider or datacenter (or cloudcenter for that matter) must ensure that the security and integrity of the network and infrastructure are maintained at a high standard. GoGrid, for example, is SAS70 Type II audited and certified. The EECS’s statement, however, is not a completely honest assessment. Public vs. Private datacenters, dedicated hosting or clouds are very different. The concerns of publicly hosted infrastructures are really no different whether in the cloud or in a datacenter; they will both be inherently a bit more vulnerable. However, I would say that companies whose sole business is hosting will potentially have more robust security protection and attack prevention measures in place than a self-hosted or even private cloud would. In terms of HIPAA compliance or Sarbanes-Oxley, there are stringent requirements of data protection, privacy and isolation. While it may be difficult to pass accreditation for these types of compliances “in the cloud”, using a feature like Cloud Connect, for example, allows for compliance to take place on a dedicated, warehoused set of servers within a traditional datacenter, something much more palatable and acceptable.
  4. Data Transfer Bottlenecks – “applications continue to become more data-intensive”[9]
    My Response: It’s all about the data, I agree. The Cloud is an ideal environment for statistical analysis and number crunching. I personally know of one GoGrid user who would spin up multiple instances of GoGrid servers, upload a huge amount of data, run some analysis programs and then export the resulting summaries, all in a matter of hours and only costing a few dollars. The arguments presented by the EECS group are true; until we get the ability to transfer large amounts of data through very big pipes at an extremely low cost, this could be a barrier for those customers who may be considering the Cloud as a data eating machine. However, when we at GoGrid designed our business model, we kept scenarios like this in mind and came up with an easy solution: make all inbound data transfers free. This way, GoGrid users can upload large amounts of data to their cloudcenter, move that data around within the private network therein, put some on Cloud Storage should they desire, analyze to their heart’s content and then download the summary or result sets (typically much smaller in file size than the data going in). GoGrid does charge for outbound transfer, but you can see how the pricing model works to the user’s advantage in analysis scenarios.
  5. Performance Unpredictability – “multiple Virtual Machines can share CPUs and main memory surprisingly well in Cloud Computing, but that I/O sharing is more problematic”[10]
    My Response: This is a very good point and difficult to fully refute. It’s true that CPU and RAM can be virtualized, managed and isolated extremely well. Disk I/O performance can suffer at times. Again, this is part of the reason we offer a solution for this with Cloud Connect (see previous statements). It is frequently better to offload extremely intensive I/O processes to a dedicated environment, at least until virtualization technology gets more aligned with bare-metal performance. We even released a “custom patch” for 64-bit Linux users on GoGrid that helps increase disk drive performance. While some may say that this is a bit non-standard, it does show our understanding of this concern and marks an effort to resolve or minimize the impact.
  6. Scalable Storage – short-term usage, no up-front cost and infinite capacity on-demand doesn’t apply to persistent storage[11]
    My Response: I have to agree somewhat with this idea; however, it is a bit of an oxymoron. Persistent storage requires that it is dedicated in some way, available at all times and easily usable. On EC2, for example, if your instance dies, you lose any persistence of data, which is part of the reason why they recommend using S3 (their Cloud Storage offering). This is logical from so many standpoints: redundancy & share-ability are two that immediately jump to mind. Again, at GoGrid we took a slightly different approach by making all GoGrid Cloud servers have persistent storage available from the beginning. The amount of persistent storage is directly tied to the amount of RAM you have allocated: if you choose a higher RAM instance, you get more persistent storage. However, I don’t see scalable storage as an obstacle entirely. Amazon offers S3 and GoGrid has a similar Cloud Storage offering. Both are scalable on demand, billed by usage and usable by Cloud Servers. GoGrid’s Cloud Storage is mountable as a drive and shareable among a user’s GoGrid servers within the GoGrid infrastructure using industry standard protocols (e.g., SAMBA, CIFS, RSYNC & SCP). To that end, in my mind it does meet the 3 properties outlined with the omission of the “persistent” adjective.
  7. Bugs in Large-Scale Distributed Systems – “one of the difficult challenges in Cloud Computing is removing errors in these very large scale distributed systems”[12]
    My Response: This is actually one obstacle that I fully agree with. Often it is difficult to “mirror” physical, large scale computing environments within the Cloud. Unfortunately, it is not an apples-to-apples comparison. One simply cannot just “port” a physical, complex infrastructure over to the Cloud. If you do, you will fail. You need to architect your Cloud environment capitalizing on the efficiencies and features of the Cloud. Otherwise, you simply carry over (and potentially compound) the issues that already existed. Another thing to consider is that all Virtualization or Hypervisor technologies have bugs, as with any software for that matter. The complexity of a Cloud environment is multi-fold: at the hypervisor and management layer, the hardware layer of the grid or utility architecture, as well as within the VM’s themselves. This is a complicated and delicate environment. The good news is that, because this technology is here to stay and is consistently being built upon, refined and improved, the end results are only improvements. Important to this again is interoperability and standards, similar to the Wild West becoming civilized and engineered. Bugs will be squashed and efficiencies gained through increased R&D efforts as well as customer adoption and validation.
  8. Scaling Quickly – “automatically scale quickly up and down in response to load in order to save money, but without violating SLAs”[13]
    My Response: This is one of the key value propositions of Cloud Computing. You must be able to scale up and down based on demand (or even based on a budget). Much of this can be done using API’s or companies like RightScale. As I mentioned previously: design for the Cloud. Traditionally, companies over-bought their infrastructure, saving it all for a rainy day. At ServePath, we know for a fact that CPU, RAM and Storage on our dedicated machines are only hitting about a 5% utilization on average. Many companies have built up their infrastructure for the “what if” scenarios. These inefficiencies are part of the reason why Cloud Computing has become so popular, a panacea of sorts. When you design for the cloud, you must ensure that your strategy capitalizes on scalability, both up and down, but also on redundancy and persistence. Of course, it all depends on the type of system you are architecting (persistent – a store-front or content driven marketplace, or temporary – data analysis, bulk processing).
  9. Reputation Fate Sharing – “reputations do not virtualize well”[14]
    My Response: I feel that this fully depends on how a Cloud provider crafts their offering. The example given in the EECS study is that of blacklisted EC2 IP addresses due to spamming. This is a valid concern but is due to how AWS releases their public IP addresses back “into the pool” once an instance is removed or destroyed. At GoGrid, we took a different approach. For starters, all users are assigned a contiguous block of static public IP addresses. When a GoGrid user deletes a server, that public IP address is released back into THEIR pool and not a general pool. Thus, if an IP address gets flagged by a spam-prevention service as being “bad,” the “bad reputation” is contained within a particular GoGrid user’s environment and not the entire GoGrid user base. Similarly, we block all outbound SMTP traffic by default. Users who wish to use this protocol must request that this block be lifted. While somewhat inconvenient, this one-time action does help to maintain a positive reputation for a vendor as a whole. Be sure to carefully review a vendor’s SLA, Terms of Service (TOS), Privacy Policy and Acceptable Use Policy (AUP).
  10. Software Licensing – “licensing models for commercial software is not a good match to Utility Computing” & “pay-as-you-go seems incompatible with the quarterly sales tracking”[15]
    My Response: Software licensing models are being forced to evolve to be able to handle the on-demand nature of the Cloud. While Amazon took the approach of increasing the hourly charge to handle licensing of Windows Server vs. an open-source alternative, GoGrid, in order to maintain simplicity, rolled it all into one (no difference between Red Hat, CentOS or Windows). Licensing of Microsoft SQL Server on GoGrid, for example, is handled through a monthly (not hourly) charge. This helps with both a customer’s budget projections and our own sales projections. Simplicity in explanation and execution is critical. If your user is confused as to how the billing works or how to project what charges they will incur, they will not execute. Token billing, tied to hourly charges, will also become increasingly prevalent.
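
To make the IP reputation point in item 9 a little more concrete, here is a minimal Python sketch contrasting a provider-wide address pool with per-user address blocks. It is purely an illustration of the idea described above and not GoGrid’s or Amazon’s actual implementation: the class names, the `inherits_flagged_address` helper, the first-in-first-out reuse order and the example addresses (taken from a range reserved for documentation) are all assumptions.

```python
class SharedPool:
    """Hypothetical provider-wide pool: a released address can go to any user."""

    def __init__(self, addresses):
        self.free = list(addresses)

    def allocate(self, user):
        # The requesting user is ignored: everyone draws from the same list.
        return self.free.pop(0)

    def release(self, user, address):
        self.free.append(address)


class PerUserPool:
    """Hypothetical per-user blocks: a released address returns to its owner."""

    def __init__(self, blocks):
        # blocks maps each user to the static addresses assigned to that user.
        self.free = {user: list(addrs) for user, addrs in blocks.items()}

    def allocate(self, user):
        return self.free[user].pop(0)

    def release(self, user, address):
        self.free[user].append(address)


def inherits_flagged_address(pool, spammer, bystander, rounds):
    """Return True if the bystander is handed an address the spammer used."""
    flagged = pool.allocate(spammer)   # spammer gets an address blacklisted
    pool.release(spammer, flagged)     # ...then deletes the server
    handed_out = [pool.allocate(bystander) for _ in range(rounds)]
    return flagged in handed_out


if __name__ == "__main__":
    shared = SharedPool(["203.0.113.10", "203.0.113.11"])
    per_user = PerUserPool({
        "alice": ["203.0.113.10", "203.0.113.11"],
        "bob": ["203.0.113.20", "203.0.113.21"],
    })
    print(inherits_flagged_address(shared, "alice", "bob", rounds=2))
    print(inherits_flagged_address(per_user, "alice", "bob", rounds=2))
```

In the shared pool the flagged address eventually reaches another customer, while with per-user blocks it can only ever be handed back to the user who earned the bad reputation.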

Summing it all up

If you made it through both the EECS group’s study and this blog post, I truly commend you, and you hopefully have a better understanding of the Cloud Computing term and properties therein, especially from the standpoint of an academic institution and a Cloud Computing vendor. While I have challenged a few of the statements made within the study, there are others that stand up just fine. The important overall idea here is that serious brainpower and resources are being thrown at the Cloud, from understanding and analysis to development and execution.

A special message to the EECS group: I would personally like to invite you all to cross the Bay (from Berkeley to San Francisco) to come and visit a Cloud Computing provider who is already overcoming the obstacles you have outlined. We would love to have a round-table discussion about the Cloud and help you with the next version of this study.

  1. M. Armbrust, A. Fox, R. Griffith, A. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica, and M. Zaharia. Feb 10, 2009. “Above the Clouds: A Berkeley View of Cloud Computing.” Electrical Engineering and Computer Sciences. University of California at Berkeley. http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.html p. 4
  2. ibid. p. 4
  3. ibid. p. 4
  4. ibid. pp. 7-8
  5. ibid. pp. 8-9
  6. ibid. pp. 14-15
  7. ibid. p. 15
  8. ibid. pp. 15-16
  9. ibid. pp. 16-17
  10. ibid. pp. 17-18
  11. ibid. p. 18
  12. ibid. p. 18
  13. ibid. p. 18
  14. ibid. p. 18
  15. ibid. p. 19

2008 was an action-packed year for us here at GoGrid and ServePath and we have many accomplishments to be proud of. I thought it would make sense to reflect back on what major things we did over the year as well as a few other notables that happened within the industry. The easiest way for me to do this is through a blog post Chronology (not every post is highlighted):

1st Quarter 2008

  • 01.03.08 – GoGrid Blog was launched
  • 01.29.08 – “Sneak Peak” at GoGrid
  • 02.01.08 – Twitter and Joyent go different ways
  • 02.05.08 – Understanding “Clouded” Computer Terms – a post that made a 1st attempt to explain Cloud, Utility, Grid and other Computing terms.
  • 02.13.08 – Dilbert does a series on Virtualization (here, here and here)
  • 02.15.08 – Amazon’s S3 has major outage (my comments)
  • 02.21.08 – GoGrid launches a new public website in anticipation of the product launch
  • 03.11.08 – GoGrid Public Beta LAUNCH! After over 2 years of development, GoGrid hits the streets with many Cloud Computing firsts:
    • 1st Cloud Infrastructure provider with a Web GUI
    • 1st to offer Windows Server 2003 in the Cloud
    • 1st to offer Microsoft SQL Server in the Cloud
    • 1st with free Inbound Transfer
    • 1st with free f5 Load Balancing
    • 1st with free 24×7 Support
    • 1st with Persistent Storage
    • 1st with free managed DNS
    • 1st with 100% Uptime SLA
    • 1st with public and private VLANs
  • 03.17.08 – Drilling down on the details of new GoGrid accounts
  • 03.18.08 – Even I wasn’t initially on board with the whole “Cloud Computing” term. My thoughts have changed obviously.
  • 03.28.08 – The initial GoGrid FAQ’s start rolling out.

2nd Quarter 2008

3rd Quarter 2008

  • 07.07.08 – GoGrid hits 1000th user and coverage by TechCrunchIT
  • 07.17.08 – GoGrid launches API
  • 07.18.08 – NetworkWorld, C|net & TechCrunchIT cover GoGrid’s new API
  • 07.21.08 – InfoWorld does a side-by-side comparison of GoGrid, Amazon’s EC2 and Google App Engine
  • 07.22.08 – Teens-in-Tech founder, Daniel Brusilovsky, interview of GoGrid
  • 07.31.08 – Google Web Toolkit (GWT) showcases GoGrid
  • 08.06.08 – GoGrid WINS LinuxWorld 2008 Best of Show in Product Excellence
  • 08.19.08 – GoGrid is the FIRST to launch Windows Server 2008 in the Cloud
  • 09.09.08 – the first NoHardware.com video is released
  • 09.16.08 – Financial Markets start getting very shaky. Cloud Computing can help stabilize.
  • 09.17.08 – GoGrid and RightScale partnership announced
  • 09.22.08 – Feature preview of GoGrid’s Cloud Storage (now live)
  • 09.23.08 – the second NoHardware.com video is released
  • 09.29.08 – The “Original” Cloud Computing in Plain English produced in-house by GoGrid launches
  • 09.30.08 – GoGrid and Appistry partnership announced

4th Quarter 2008

Happy New Year to all of you from us at GoGrid. May 2009 be happy, healthy and prosperous!