The recent McKinsey report “Clearing the air on cloud computing” has caused quite a stir within the cloud community, and I can see why. While it definitely brings a good deal of analysis to the table, I feel it is somewhat generalized, makes assumptions and overlooks some key points.
First and foremost, this article is NOT going to be an analytical discussion of the cost of running or setting up a datacenter vs. an Amazon EC2 Windows instance. I’m not a financial analyst. Honestly, calculating the Total Cost of Assets (TCA) or Total Cost of Ownership (TCO) causes my eyes to roll back into my head, leaving me gasping for air. Don’t get me wrong, it seems like some good effort was made analyzing data and formulating conclusions. The problem is, I feel that they were on a jetliner, shooting through the clouds with the shades half down.
Before I start with my own analysis and commentary, I would like to reference a few responses I have read that somewhat chastise McKinsey.
Three “Rebuttal” Articles to Read
The first comes from CIO IT Drilldown’s Virtualization site. In his article “McKinsey Cloud Computing Report Conclusions Don’t Add Up,” Bernard Golden does the major lifting for me in terms of analysis. I have highlighted some key points from the article that I viewed to be particularly important (my highlighted version of the article is here). I particularly enjoyed Golden’s rebuttal to the analysis of cost calculations, namely use of EC2 Windows instances, headcounts that don’t add up and other “less visible” capital expenses for facilities and other assets. Also as Golden points out, McKinsey proposes that better efficiencies and savings can be realized through virtualization within the organization. To me, the McKinsey recommendation seems a bit counter-intuitive: “Don’t go with a vendor whose expertise IS virtualization, hardware, infrastructure, et al. Instead, DO try to do it yourself, with tremendous CapEx & OpEx expense.” Hmmm, makes sense to me, NOT! Lastly, I particularly liked Golden’s 3 recommendations (quoted from article):
- Review your portfolio of applications to understand what cloud computing means to you.
- Create a viable financial model for assessing the true costs of internal hosting.
- Evaluate the potential for an internal cloud even if the numbers don’t work with an external cloud provider.
Another good article comes from the “Official Google Enterprise Blog”, posted by Rajen Sheth, Sr. Prod. Mgr for Google Apps. Titled “What we talk about when we talk about cloud computing”, this article gives great insight into Google’s vision of the Cloud. Again, I provide a highlighted version of the article which has items that particularly got my interest, specifically:
- Google’s approach to virtualization using “stripped down” servers and tying the drones to a brain
- Reliable software enables the creation of a more robust platform
- “The reality is that most businesses don’t gain a competitive advantage from maintaining their own data centers.” (I couldn’t have said it better myself!)
- Running a self-installed virtualization model still requires licensing, maintenance and implementation costs: human, hardware and on paper
- Cloud offers faster innovation than DIY models
Sheth concludes with a pointed question: “As companies weigh private data centers vs. scalable clouds, they should ask a simple question: can I find the same economics, ease of maintenance, and pace of innovation that is inherent in the cloud?”
Lastly, Lew Moorman posted a rebuttal of sorts on the Mosso Blog titled “McKinsey Misses The Bigger Point On Cloud Computing”. As with the previous articles, I have a highlighted version available. Moorman flat out states that McKinsey missed the bigger point and “underestimated the benefits of cloud computing.” Moorman presents 3 characteristics of the cloud that differ from McKinsey’s, and frankly, my own definition leans more toward his than toward McKinsey’s. Both have merit, however. Moorman correctly points out how McKinsey blurs IaaS and PaaS into one. This is something that I will discuss shortly. I definitely recommend reading the non-“back-of-the-envelope” analysis. I do know who pays $14,000 for a Windows server over 3 years: the Military. All joking aside, there are the OpEx costs of keeping said server up and running, something McKinsey does touch upon, but not thoroughly enough. I will (as I said) leave the TCO/TCA analysis to the experts, though. Lastly, Moorman appropriately says that time (and money) is better spent leaving the infrastructure to experts, letting companies focus on their core competencies.
While I cannot divulge specifics, it is important to note here that at GoGrid, our own utilization rate far exceeds the “best possible” ones listed by McKinsey.
My Shots from the Peanut Gallery
McKinsey is a well-thought-of firm and I have full respect for their research, analysis and findings, but it’s always important to get other perspectives when confronted with potentially misleading findings. So let’s dive right in.
“Irrational Exuberance” and Unrealistic Expectations
Granted, McKinsey tips their hat to the cloud having “great potential” but goes on to attempt to stifle what many consider to be one of the most important technology shifts in recent years. True, clouds came out of nowhere, creating what many believe to be a marketing storm. Everyone jumped on the bandwagon, trying to define Cloud Computing (and be the source of the ultimate definition). We dashed under the rain of buzzwords last year, trying to make sense of the madness. Page 10 of their study shows a list of 22 cloud computing definitions (mine is listed there). As with anything new, people need to be able to wrap their heads around it, smell it, taste it, touch it. When you talk about something virtual, this is increasingly difficult to do. Thus, the buzzword and definition madness ensued. (We, at GoGrid, even created a home-grown video to help people understand it and make it tangible. It has over 43,000 views as of this writing.)
My issue with the statement of overexuberance and unrealistic expectations is, what other way could it have happened? Anything that is hot now moves much more quickly than we had been used to. Take Twitter as a clear-cut example. It’s mainstream. There are strategies and businesses being built around it. The Social Networking movement could be equated to that of the Cloud if you think about it. In my mind, this is no different. Both are exciting and new, full of promise, but fraught with growing pains, naysayers and disbelievers. So, would McKinsey say the same thing about Social Media?
And where on the Gartner “Hype Cycle” are Twitter or Cloud Computing? I dare say that with Twitter, we are on the Slope of Enlightenment. I believe the Trough of Disillusionment was passed a while ago and mainstream adoption is taking place. But the curves are somewhat distorted due to the speed at which Social Media has spread. For the Cloud, I’m not sure. I almost feel the Trough has been leapt to some extent due to the rapidity of the movement as well, and we are on the way to “seeing the light.” But I’m a forward thinker. (These two topics are great subjects for later discussion.)
Their Definition of a Cloud is Blurred
What I find fascinating is that despite the “hundreds” of definitions of Cloud Computing in the blogosphere, McKinsey still couldn’t nail it down and, in the process, took a shortcut in their definition. For starters, there is no mention of “self-service” within McKinsey’s definitions. This is fundamental to the cloud and a key differentiator from traditional infrastructure deployments or, for that matter, virtualization. As Randy Bias pointed out to me, “more important than embracing virtualization in the enterprise is building ‘private clouds’ that have self-service models.” This is an important topic to consider. If an enterprise works a virtualization strategy, will it be done in a way where it is entirely self-service in nature? Private clouds (using virtualization) will take time to get to where public clouds already are.
I actually somewhat agree with their three cloud characteristics of: 1) “hardware management is highly abstracted from the buyer,” 2) “buyers incur infrastructure costs as variable OPEX”, and 3) “infrastructure capacity is highly elastic (up or down)”. One should be careful, however, with point #3, especially with mixing capacity and elasticity with scalability. I believe it would have been important for McKinsey to discuss the elastic capacity component a bit further than “capacity can be scaled up or down dynamically, and immediately, which differentiates from traditional hosting service providers.” This is a somewhat limiting statement, and hosting vendors like ServePath/GoGrid are already proving it wrong (see “Cloud Connect”).
Generally, there are two types of scalability: vertical and horizontal. Briefly, vertical scaling represents adding more to the “boxes” you have (e.g., RAM or Storage): build “up” by putting more horsepower into what you already manage. Horizontal scaling means that you add more infrastructure (e.g., servers) side by side, essentially building “out.” This is much the way that Google sets up their shop, by adding more boxes horizontally to increase compute capacity. For a good primer on scalability, I encourage you to read Randy Bias’ Whitepaper “Scaling Your Internet Business.”
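To make the “up” vs. “out” distinction concrete, here is a minimal sketch in Python. The Server class, the function names and the RAM figures are all hypothetical, chosen only for illustration; they do not describe any real provider’s API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Server:
    """Hypothetical stand-in for one "box"; only RAM is modeled."""
    ram_gb: int


def scale_up(fleet, extra_ram_gb):
    """Vertical scaling: same number of boxes, more RAM in each."""
    return [replace(s, ram_gb=s.ram_gb + extra_ram_gb) for s in fleet]


def scale_out(fleet, extra_servers, ram_gb):
    """Horizontal scaling: add more boxes side by side."""
    return fleet + [Server(ram_gb) for _ in range(extra_servers)]


def total_ram(fleet):
    return sum(s.ram_gb for s in fleet)


# Assumed starting point: two servers with 4 GB of RAM each.
fleet = [Server(ram_gb=4), Server(ram_gb=4)]
up = scale_up(fleet, 4)       # still 2 servers, now 8 GB each
out = scale_out(fleet, 2, 4)  # 4 servers, 4 GB each

print(len(up), total_ram(up))
print(len(out), total_ram(out))
```

Both paths end with the same total RAM in this toy case, but they get there differently: building up leaves you managing the same two boxes, while building out doubles the number of boxes you manage.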
Categorization of Clouds is Blurred as well
With a broad stroke of the sword (or pen), McKinsey chopped the Cloud in half: “Clouds” and “Cloud Services”. This is the satellite view of things, not even a 10,000 foot view. By focusing the lens a bit more (as well as reading what others had written), one would know that there are 3 distinct “layers” to Cloud Computing, even more if you fine-tune the focus. I’ve articulated this before (and it is even in their presentation under the multiple definitions) in the form of the “Cloud Pyramid.”
You simply cannot lump IaaS and PaaS together. They are different beasts. Sure, they share some commonalities in that they fit the broad definition of the cloud. (At GoGrid, we currently define Cloud Computing as: “on-demand self-service Internet infrastructure where you pay-as-you-go and use-only what you need, managed by your browser or application.”)
But the differences between IaaS and PaaS are important and cannot be simply brushed aside. As I wrote previously (“Navigating the Layers of the Cloud Computing Pyramid”) and I simplify here, Platform Clouds (like Google App Engine and Microsoft Azure) have the OS and Frameworks managed by the provider whereas Infrastructure Clouds (like GoGrid and Amazon Web Services) boil things down further to the hardware and networking protocols, for example. Even within IaaS there are clear distinctions (Cloudcenter vs Infrastructure Web Services – read about how we differentiate these terms here and here). So why does McKinsey overlook this? I’m not entirely sure. I won’t attempt to draw any conclusions either.
The fundamental point here is, you cannot simply split the cloud in half (Clouds vs. Cloud Services); at a minimum it must be in thirds (Application, Platform & Infrastructure). Finally, I don’t quite understand why Cloud Services (loosely defined by McKinsey as “SaaS”) shouldn’t have the same characteristics as Cloud Computing in general. True, they don’t incur “infrastructure costs”, but they are subject to recurring costs through licensing, monthly usage or seats. And McKinsey is correct that SaaS can be built on top of other layers of the Cloud (Pyramid). In fact, the same could be said for Platforms being built on top of Infrastructure Clouds. It may not be the most cost-effective method to build a Cloud Platform, but it is definitely viable.
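The three-way split can be sketched as a simple lookup of who manages what at each layer. This is a rough illustration in Python, not a formal taxonomy: the component list and the exact boundary for each layer are my simplifying assumptions, and real providers draw these lines differently.

```python
# Stack components from bottom to top (a simplification for illustration).
STACK = ["hardware", "networking", "operating system", "frameworks", "application"]

# Topmost component the provider manages at each layer of the pyramid.
# These boundaries are assumptions for the sketch, not vendor definitions.
PROVIDER_MANAGES_THROUGH = {
    "Infrastructure": "networking",
    "Platform": "frameworks",
    "Application": "application",
}


def split_responsibility(layer):
    """Return (provider-managed, customer-managed) components for a layer."""
    cut = STACK.index(PROVIDER_MANAGES_THROUGH[layer]) + 1
    return STACK[:cut], STACK[cut:]


for layer in PROVIDER_MANAGES_THROUGH:
    provider, customer = split_responsibility(layer)
    print(f"{layer}: provider manages {provider}, customer manages {customer}")
```

Seen this way, lumping Infrastructure and Platform together hides the fact that the customer’s share of the work is quite different in each.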
Service Level Agreements are NOT a Hindrance nor a Crutch
As Gartner’s Lydia Leong points out, there is a difference between engineering for reliability and actually attaining it. According to Leong, “most enterprise data centers have mathematical uptimes below 99.99% (i.e., calculated mean time between failure)”. To take this a bit further, SLAs definitely should be viewed as a requirement for any company doing due diligence when choosing a hosting provider, cloud or not.
The cloud is obviously under an enhanced level of scrutiny due to its newness and promises. When any Cloud hiccups, whether it’s SalesForce, Google App Engine or GoGrid, people hear about it and instantly say “See? The Cloud can’t maintain five nines of uptime!” and suddenly the cloud is evil. But failures happen at all levels of infrastructure and technology, and they are not unique to the cloud. Having a provider that offers a solid SLA is critical (e.g., GoGrid offers one of the most robust SLAs in the hosting industry). But, if one were to follow McKinsey’s recommendation and “virtualize everything internally,” how will SLAs be honored there? There are no clear standards in the hosting industry that dictate how uptime is calculated. It is frequently left up to the vendor to interpret, and subsequently back up through an SLA. This is even more obscure when it comes to private datacenters. And what are the consequences of not attaining five nines in a private datacenter? No more “Casual Fridays”?
The cost to attain an incredible uptime record is high. It is inherently tied to a vendor’s reputation and can dramatically affect sales and customer stickiness. But any smart IT person knows that failures happen, and an even smarter IT person strategically plans for them through better architecting of their solution. SLAs help (if using a hosting vendor), but they will NOT solve architectural mistakes; doing it all in-house under a vaguely defined SLA or uptime standard can be cost and time prohibitive.
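To put the “nines” in perspective, here is a quick back-of-the-envelope sketch in Python. Since there is no standard way to calculate uptime, the method here is my own assumption: a 365-day year, with every minute of downtime counted against the total. A vendor’s SLA may well measure it differently.

```python
# Assumption: a non-leap year, with all downtime counted (no maintenance windows excluded).
MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes(uptime_percent):
    """Minutes of downtime per year permitted by a given uptime percentage."""
    return MINUTES_PER_YEAR * (100 - uptime_percent) / 100


for label, pct in [("four nines", 99.99), ("five nines", 99.999)]:
    minutes = allowed_downtime_minutes(pct)
    print(f"{label} ({pct}%): about {minutes:.2f} minutes of downtime per year")
```

Under these assumptions, 99.99% leaves a little under an hour of downtime per year and five nines leaves only a few minutes, which shows how little room for error either target allows, in the cloud or in-house.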
Disputed Cost Savings
I personally would love to have $20 million sitting around to set up my own datacenter. I’m sure that there are many SMEs who would love to have that liquidity and cash availability as well. In this day and age, being able to plop down even a fraction of that amount to create or re-architect a datacenter is practically a pipe-dream. Again, if you virtualize your offering, you may only realize partial success. Also, as Leong points out (see yellow highlights), there is much more value to be gained through using the cloud than attempting to DIY (do it yourself), AND, it depends on the size of your organization. Larger corporations have efficiencies of scale (hopefully) as compared to an SMB. Obviously, when re-engineering an IT strategy, one has to pay careful attention to what is gained and lost through keeping the status quo, re-architecting through internal virtualization or moving to the cloud.
There are some definite uses of public clouds for businesses to capitalize on now, namely:
- Small Business – when hiring a full-fledged Ops team or hand-building your infrastructure is cost prohibitive (convert CapEx to OpEx)
- Medium Business – when you need to dynamically control costs as well as compute capacity. Elastic clouds can do this for you.
- Large/Huge Business – outsourcing is important to non-critical infrastructure like QA/Dev/Skunk works or as a failover strategy
My stance on this is, the most effective strategy is one that is hybrid in nature. Corporations should look to see what must be moved to the cloud, what must stay in-house and what falls into the grey area. Even within the cloud, there are scenarios that simply are better handled through integrating dedicated or colocated infrastructure (and the benefits therein) with a cloud front-end (see GoGrid’s Cloud Connect). Planning for “up” and “out” scalability using the elasticity of the cloud, coupled with the performance of dedicated hardware, is a must when considering strategies.
And what about “innovation” as Golden pointed out? Cloud providers can clearly innovate much more rapidly than an in-house IT team can, especially given the virtualization, time and budget constraints of doing it internally.
Lastly, John Keagy, CEO/Co-Founder of GoGrid, poses this question: “What is the opportunity cost of spending $20MM on a non-core activity like a datacenter?” Turning this capital investment into a profitable business unit is attainable for a few, impossible for most.
The F.U.D. Factor
Honestly, I thought we were past this phase but I guess not. As with anything new, there are those who are quick to point out that the cloud is not the panacea of all things. I agree with that somewhat. However, when McKinsey draws a comparison to the 2000 dot-com bubble, I get a bit aggravated.
McKinsey truly brings the Fear, Uncertainty and Doubt factor into play with this analogy. The three bullet points (paraphrased to: huge investments made based on hype, inability to generate profits, and NASDAQ losing 80% of value) simply do not come into play here. During the dot-com bubble, money was thrown around loosely and given to anyone who had even a half-baked business plan. Money was not wisely spent and businesses had little to show during and thereafter. In this day and age, before any funding is given, companies not only must have a fully vetted business, they also have to have an installed user base and definite signs of profitability. Let’s not forget about the “transparency” buzzword du jour. And this time around, the stock market drop has little or nothing to do with the technology sector.
To equate the cloud movement to the dot-com bubble is naive at best. Try to tell Amazon that what they have been doing for a couple of years now is just a fad and will not last. Need I say more?
Perhaps I had too much coffee while I wrote this. Or maybe I woke up on the wrong side of the bed. One part of me is truly happy that McKinsey came out drawing some clear lines in the sand. Their “non-partisan” analysis is definitely required to help put things in perspective for companies evaluating their IT strategies. The other part of me, however, wishes that they hadn’t been so pessimistic and potentially misleading.
If you haven’t read their study, I encourage you to do so. Read and analyze it and give me your interpretation.