Archive for the ‘Storage’ Category

Web applications like WordPress, Drupal, Joomla, SugarCRM and others are all the rage and have been for quite a while. The huge availability of Open Source applications, typically based on Linux, Apache, MySQL and PHP (LAMP stacks), that you can find on SourceForge or in other repositories makes the implementation of powerful web-based solutions a snap. Once you find the web application of your dreams, the next step is finding a hosting provider. There are many VPS (Virtual Private Server) hosting providers that offer shared hosting for pennies on the dollar. But with those VPS solutions, you are left with exactly that: a “shared” environment. So, if someone else on your shared server is running bad scripts or code that sucks up resources, you are affected, with little recourse other than to complain, moan or move to a different provider.

So, as you grow (or as your service deteriorates due to the resource-sucking of others on your shared box), you are left with a decision of what to do next. Many people choose the most obvious upgrade path of leasing a dedicated server (e.g., at ServePath, we offer dedicated, managed hosting) or colocating (where you bring your own hardware and a hosting provider like Coloserve leases space, power, cooling, security and bandwidth). But now you have another option that truly fits the model of delivering scalable web hosting: put it in the Cloud, with GoGrid, for example.

Recently I helped map out the implementation of a secure, redundant, load-balanced web application in the Cloud using GoGrid.

Original Setup

A client originally set up the following implementation of a WordPress blog on GoGrid:

  • One Web/Application Server
    • CentOS 32-bit
    • 2 GB RAM
    • LAMP stack
    • Web Application – WordPress 2.7
  • One DB Server
    • CentOS 64-bit
    • 2 GB RAM
    • MySQL 5.0

[Figure: original web application setup on GoGrid]

There were some immediate concerns that we noticed after evaluating the original design:

  • No ability to gracefully scale with traffic demands
  • There was a Public IP connection to DB Server
  • No “root” password set on DB Server
  • No DB backups
  • No Web Server redundancy

While this setup is not all that bad (at least in comparison to a VPS solution), it is not ideal. After we put our heads together, we came up with a much more elegant solution that uses many of the advantages of the Cloud while also capitalizing on the standard high-availability network infrastructure available on GoGrid.

Optimized Setup

Ideally, when you set up a redundant, high-availability network, you want to eliminate points of failure and build an infrastructure that handles current demand yet remains scalable. With traditional or Do-It-Yourself hosting, people frequently either over-buy or under-buy their infrastructure, which either wastes time, money and/or resources or cripples you when you are successful and can’t keep up with demand. I recommend that you read Randy Bias’ whitepaper called “Scaling Your Internet Business” for some scalability insight.

So, what we wanted to achieve with our re-design was something much more secure and reliable. What we came up with is not necessarily a de facto standard, but rather a recommendation and a “how-to-implement” guide. You can obviously do more (e.g., more servers) or take shortcuts. It’s really your call. Here is what we did as a “better” solution.

  • Two Web/Application Servers
    • CentOS 64-bit
    • 1 GB RAM
    • LAMP Stack
    • Web Application – WordPress 2.7
  • One DB Server
    • CentOS 64-bit
    • 2 GB RAM
    • MySQL 5.0
  • GoGrid Cloud Storage
  • F5 Load Balancing

[Figure: optimized web application setup on GoGrid]

That was the list of GoGrid infrastructure components we ended up using. But HOW and WHY we used these is the important thing to document here. Once again, a quick laundry list:

  • Why Load-Balance? GoGrid offers free F5 Load Balancing. It is very important for a high-availability, redundant web application infrastructure to use a load balancer. It can, based on the configuration, equally spread traffic between all of your web-servers, as well as automatically fail over traffic to a different server should one go down, without downtime or interruption to service.
  • Why use Cloud Storage? There are many reasons to use Cloud Storage in this type of environment:
    • GoGrid provides 10 GB of free Cloud Storage which can be mounted by your Web, Application and/or DB servers. Free is always nice!
    • Cloud Storage has high throughput via gigabit connections, redundancy from daily backups and is infinitely scalable. Cloud Storage was GREAT for our solution, because we configured the WordPress environment to actually look at a “symlink” (a symbolic link, like an alias or pointer to another file, directory or location) on the Web Server that connects to the physical files residing on Cloud Storage.
    • Web Application Clustering – one of the biggest issues with running hosted PHP Web applications (for example) is that you have to keep the PHP files locally on a server. This makes setting up a “clustered” environment much more difficult, as you need to update multiple files on multiple servers should you have to change or update code. While you could set up some sort of complex “rsync” session, this gets complicated and potentially confusing. By pointing all of the Web/App Servers to Cloud Storage, the source files (PHP files) are kept in a single place and are much easier to manage.
    • It can act as a repository for back-ups from DB servers and/or Web servers
  • Why reduce the amount of RAM on the Web/App Servers? Most of the processing power is needed at the database level and not necessarily at the Web/Application level. So, by splitting the RAM out from one 2GB server into two servers with 1GB of RAM each, there is no direct cost impact, yet you gain redundancy. Also, since 10GB of data on Cloud Storage is free, you do not need to use the persistent storage available on the Web/App servers (note: more RAM = larger persistent file storage).
  • Why multiple Servers? First, please see the earlier point about the load balancer. GoGrid uses an algorithm to determine on which node new Servers are instantiated, ensuring that servers are distributed over different nodes. If a GoGrid node encounters an issue that renders a server inaccessible, the load balancer will automatically route traffic to other available servers on different nodes. Since new server instantiations are automatically created on different nodes, the built-in GoGrid redundancy enables high availability.
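The symlink arrangement described in the Cloud Storage points above can be sketched as follows. This is an illustration under stated assumptions, not the steps from the wiki article: it assumes Cloud Storage is already mounted, and the mount point `/mnt/cloudstorage`, the document root `/var/www/html/blog` and the function name `link_shared_dir` are all hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: replace a local directory with a symlink to shared storage.
# Usage: link_shared_dir <shared_dir> <local_path>
link_shared_dir() {
    shared_dir=$1
    local_path=$2

    # Refuse to continue if the shared directory is missing (e.g. storage not mounted).
    if [ ! -d "$shared_dir" ]; then
        echo "shared directory not found: $shared_dir" >&2
        return 1
    fi

    # Keep any existing local directory under a new name rather than deleting it.
    if [ -d "$local_path" ] && [ ! -L "$local_path" ]; then
        mv "$local_path" "$local_path.local-backup"
    fi

    # -s: symbolic link, -f: replace an existing link, -n: do not follow an existing link.
    ln -sfn "$shared_dir" "$local_path"
}

# Example (paths are assumptions); run the same command on each Web/App server:
# link_shared_dir /mnt/cloudstorage/wordpress /var/www/html/blog
```

Because every Web/App server links to the same directory, a code update made once on Cloud Storage is visible to all of them.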

So, HOW did we do it all? We have published a very detailed Wiki article titled “How to Set Up a Load Balanced and Redundant Web Application on GoGrid” which goes through the following items (note: the steps listed in the wiki article assume that you have fairly good familiarity with Linux and system administration therein):

There are obvious permutations that can be implemented on top of this design that we have outlined. You could add more Web/App Servers, connect via Cloud Connect to a dedicated or colocated environment, or change the backup timing and strategies, for example.

Also, with the design we wrote, there still remains one important single point-of-failure, that of the MySQL database. Within the next few weeks, we will be compiling a “How To” on setting up a MySQL Replication environment within GoGrid in a Master-Master configuration, using many of the same methodologies outlined within the GoGrid Wiki article. However, assuming that you have fairly strong knowledge of restoring MySQL databases from MySQL dump backups, you can quickly create a new MySQL server and restore data from a previous backup, should you happen across a server failure.
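The dump-and-restore fallback mentioned above might look like the sketch below. The function name `backup_database`, the database name `wordpress` and the backup directory on Cloud Storage are assumptions for illustration, and the sketch expects MySQL credentials to be supplied through `~/.my.cnf` rather than on the command line.

```shell
#!/bin/sh
# Hypothetical backup helper: dump one database and store it compressed.
# Usage: backup_database <db_name> <backup_dir>; prints the path of the file written.
backup_database() {
    db_name=$1
    backup_dir=$2
    stamp=$(date +%Y%m%d-%H%M%S)
    sql_file="$backup_dir/$db_name-$stamp.sql"

    mkdir -p "$backup_dir" || return 1

    # Dump to a plain file first so that a failed dump is detected
    # (in a pipeline, only the exit status of the last command is reported).
    if mysqldump "$db_name" > "$sql_file"; then
        gzip -f "$sql_file" && echo "$sql_file.gz"
    else
        rm -f "$sql_file"
        return 1
    fi
}

# Example (paths are assumptions), e.g. from a nightly cron job on the DB server:
# backup_database wordpress /mnt/cloudstorage/backups
#
# Restore onto a freshly built server (the empty database must already exist):
# gunzip -c /mnt/cloudstorage/backups/wordpress-YYYYMMDD-HHMMSS.sql.gz | mysql wordpress
```

Writing the dumps to Cloud Storage keeps them off the DB server itself, so they survive the failure they are meant to protect against.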

How are you using GoGrid for your Web Applications? Do you have a particular infrastructure implementation that you are proud of? I want to know!


Measuring the Performance of Clouds – GoGrid

Written by on Mar 17th, 2009 | Filed under: Cloud Computing, GoGrid, Storage

Raditha Dissanayake posted a blog entry comparing Amazon EC2 and GoGrid performance. Unfortunately, we think Raditha did not use the most rigorous methodology possible for doing his comparison. It would be inappropriate for GoGrid to performance test Amazon’s EC2. In fact, their Customer Agreement may actually make such activity questionable, but IANAL (I Am Not A Lawyer).

Let’s take a more rigorous look at GoGrid disk subsystem performance.

Framing the Issue

To start, the entire issue is a LOT more complex than can be covered here. Today’s disks, hard drive controllers, and operating systems have many different kinds of caching mechanisms. In addition, virtualization systems like Xen can impact results in unexpected ways. For example, did you know that Xen can be deployed in two major ways, either ‘paravirtualized’ or ‘hardware virtualized’?

The two different models almost certainly impact any testing methodology. And yes, you guessed it, Amazon and GoGrid don’t configure Xen in the same way. Amazon uses paravirtualization and GoGrid uses hardware virtualization. Beyond this public information, neither Amazon nor GoGrid provides significant details about its infrastructure, considering it, rightfully so, proprietary intellectual property.

Without a deep understanding of all of the issues, it’s difficult to do a test, much less a proper comparison.

But we are certain of a few very important things.

Clouds Are Multi-Tenant

First off, it’s hard to do a serious comparison like this using one server on each system. Clouds are inherently multi-tenant systems and since end users have no visibility into who else is using or sharing their disk resources at any given time there is no real way to verify that the results aren’t tainted by other activity.

Use the Right Tool

Secondly, hdparm -t isn’t a very good way to measure disk speed. It’s susceptible to noise from background activity. In fact, the man page says:

-t Perform timings of device reads for benchmark and comparison purposes. For meaningful results, this operation should be repeated 2-3 times on an otherwise inactive system (no other active processes) with at least a couple of megabytes of free memory. [...]

As you can see in Raditha’s test, hdparm doesn’t really do enough I/O to get consistent results in a multi-tenant environment. In the tests, hdparm is only active for a very short period of time, allowing tenancy to have a dramatic effect on the results. hdparm requires an inactive system, and since that can’t be guaranteed in the cloud, it fails the sniff test as a robust tool for cloud performance testing.

Another factor here that is unaccounted for is that hdparm is a utility tuned for real physical disks, not virtual disks.

Better Measurements

Ideally if you want to measure the streaming performance of a block device in a more reliable way in a multi-tenant environment, then use a larger amount of I/O. When doing this I/O you want to try to eliminate:

  • Hard disk controller layer cache effects
  • Hard disk layer cache effects
  • OS level cache effects
  • Effects of disk activity from other VMs

All current GoGrid nodes have caches in the storage layer. These are designed to be robust and to absorb bursts of write activity. These caches are sufficiently large, though, that if you do repetitive small I/Os, what you end up measuring is the performance of pulling this data out of the storage layer’s caches, not from the storage itself.

To avoid OS level cache effects use ‘direct I/O’. High performance applications and databases tend to use this internally for similar reasons (because they want to avoid OS level cache pollution and do their own caching). Oracle is probably the most obvious example here.

Testing Performance

On a ‘small VM’ located on a fairly busy node:

[root@foo ~]# dd if=/dev/hda bs=10M of=/dev/null iflag=direct count=100
100+0 records in
100+0 records out
1048576000 bytes (1.0 GB) copied, 3.50983 seconds, 299 MB/s
[root@foo ~]# dd if=/dev/hda bs=10M of=/dev/null iflag=direct count=100
100+0 records in
100+0 records out
1048576000 bytes (1.0 GB) copied, 3.06811 seconds, 342 MB/s
[root@foo ~]# dd if=/dev/hda bs=10M of=/dev/null iflag=direct count=100
100+0 records in
100+0 records out
1048576000 bytes (1.0 GB) copied, 2.14147 seconds, 490 MB/s

That’s using enough I/O to minimize noise from other VM activity and large enough to avoid hitting cache effects.

If the I/O load is small enough you can hit storage layer cache effects:

[root@foo ~]# dd if=/dev/hda bs=10M of=/dev/null iflag=direct count=10
10+0 records in
10+0 records out
104857600 bytes (105 MB) copied, 0.116491 seconds, 900 MB/s
[root@foo ~]# dd if=/dev/hda bs=10M of=/dev/null iflag=direct count=10
10+0 records in
10+0 records out
104857600 bytes (105 MB) copied, 0.16058 seconds, 653 MB/s
[root@foo ~]# dd if=/dev/hda bs=10M of=/dev/null iflag=direct count=10
10+0 records in
10+0 records out
104857600 bytes (105 MB) copied, 0.115701 seconds, 906 MB/s

While this is a fairly contrived example, it’s useful in other ways because it shows you can get very good burst throughput (consider a database updating a few thousand pages).

On a larger memory instance (where average performance should be a lot better):

Sustained (large) IO:

[root@ubdev1 ~]# dd if=/dev/hda bs=10M count=100 of=/dev/null iflag=direct
100+0 records in
100+0 records out
1048576000 bytes (1.0 GB) copied, 1.80415 seconds, 581 MB/s
[root@ubdev1 ~]# dd if=/dev/hda bs=10M count=100 of=/dev/null iflag=direct
100+0 records in
100+0 records out
1048576000 bytes (1.0 GB) copied, 1.70448 seconds, 615 MB/s
[root@ubdev1 ~]# dd if=/dev/hda bs=10M count=100 of=/dev/null iflag=direct
100+0 records in
100+0 records out
1048576000 bytes (1.0 GB) copied, 1.6799 seconds, 624 MB/s

Burst (small) IO:

[root@ubdev1 ~]# dd if=/dev/hda bs=10M count=10 of=/dev/null iflag=direct
10+0 records in
10+0 records out
104857600 bytes (105 MB) copied, 0.105183 seconds, 997 MB/s
[root@ubdev1 ~]# dd if=/dev/hda bs=10M count=10 of=/dev/null iflag=direct
10+0 records in
10+0 records out
104857600 bytes (105 MB) copied, 0.089827 seconds, 1.2 GB/s
[root@ubdev1 ~]# dd if=/dev/hda bs=10M count=10 of=/dev/null iflag=direct
10+0 records in
10+0 records out
104857600 bytes (105 MB) copied, 0.090264 seconds, 1.2 GB/s

Don’t take my word for any of this. Try it out. If you’re really bored, graph I/O performance vs. I/O size and you’ll likely see a step function with a soft edge that will give you some idea of what the storage system is capable of and the degree of I/O variation.
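One way to collect the data for such a graph is to repeat the dd runs above over a range of sizes. This is a minimal sketch, not a benchmarking tool: `sweep_read_sizes` and the `DD_FLAGS` variable are hypothetical names, the list of counts is arbitrary, and the exact wording of dd's summary line differs between dd versions.

```shell
#!/bin/sh
# Hypothetical sweep: read increasing amounts from a device and print
# dd's summary line for each size.
# Usage: sweep_read_sizes <device> <block_size> <count> [<count> ...]
# DD_FLAGS defaults to iflag=direct (bypass the OS cache, as in the runs above);
# set it to an empty string for targets that do not support direct I/O.
sweep_read_sizes() {
    device=$1
    block_size=$2
    shift 2
    for count in "$@"; do
        # dd writes its statistics to stderr; the last line is the summary.
        summary=$(dd if="$device" of=/dev/null bs="$block_size" count="$count" \
                     ${DD_FLAGS-iflag=direct} 2>&1 | tail -n 1)
        printf '%s x %s: %s\n' "$count" "$block_size" "$summary"
    done
}

# Example (needs root; /dev/hda is the device used in the runs above):
# sweep_read_sizes /dev/hda 10M 1 5 10 25 50 100 200
```

Running the sweep several times at different hours gives a feel for the variation caused by other tenants, and the point where throughput drops marks the size at which reads stop being served from the storage layer's cache.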

Bottom Line

It’s great that people are kicking the tires of various clouds, but let’s be careful to make sure our testing is rigorous and makes sense for the environment. If you have questions about how to measure performance on clouds, please send them to us. Or if you’re a performance and virtualization system guru and have some knowledge to share, please do so.

We always want to improve our cloud and take seriously any feedback that shows a real problem, but in this case the test needs tweaking, not GoGrid.


By now, many in the Cloud Computing space have heard about (or even read) the University of California, Berkeley Electrical Engineering & Computer Sciences (EECS) department’s study on Cloud Computing titled “Above the Clouds: A Berkeley View of Cloud Computing.” Published on February 10th, 2009, the EECS paper provides a seemingly academic study of the Cloud Computing movement, attempts to explain what Cloud Computing is all about, and identifies potential opportunities as well as challenges present within the market.

The 20+ page study is authored by Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph, Randy H. Katz, Andrew Konwinski, Gunho Lee, David A. Patterson, Ariel Rabkin, Ion Stoica and Matei Zaharia, who all work in the RAD Lab. (Interestingly, several of the companies mentioned within the study are also Founding Sponsors and/or affiliate members of the lab: Sun, Google, Microsoft, Amazon Web Services, etc.)

There has already been plenty of discussion and analysis of this study (by James Urquhart and Krishna Sankar, and it has even appeared on Slashdot.org). Needless to say, I felt compelled to get my two cents in, especially from the perspective of a Cloud Computing Infrastructure vendor.

From an academic standpoint, this document definitely has some legs. It is complete with carefully thought out scenarios, examples and even formulae, as well as graphs and tables. Some of the points that are brought up even had me scratching my head (e.g., using flash memory to help by “adding another relatively fast layer to the classic memory hierarchy”). Even the case analysis of a DDoS attack, comparing the cost to those initiating an attack with the cost to those warding it off on a Cloud, was interesting to ponder. I commend this group of authors on undertaking such a grand task of not only writing by committee but also overlaying a very business school vs. mathematics and computer sciences approach to the writing and analysis.

Unfortunately, however, as I read through the document, I started scrawling madly in the margins with commentary that is somewhat contrary to what was written within the study.

A Few Comments from the “Peanut Gallery”

I don’t want my article to come off as a complete rebuttal to what is written in this study. Quite the contrary. I’m encouraged that one group within the academic community has taken considerable time and effort analyzing and writing about the Cloud. What appears below is a small “laundry list” of things that need to be called out and is a mixture of positive and negative comments:

  • EECS’s Cloud Computing definition – “Cloud Computing refers to both the applications delivered as services over the Internet and the hardware and systems software in the datacenters that provide those services. The services themselves have long been referred to as Software as a Service (SaaS), so we use that term. The datacenter hardware and software is what we will call a Cloud.”[1]
    My comments: I personally found this definition to be incomplete and potentially misleading. While the EECS is correct in including SaaS (Cloud Applications) as a subset of Cloud Computing, they have (consciously?) lumped everything else into a catch-all phrase of “hardware and systems software.” For people to truly understand Cloud Computing, I feel that it is important to become much more granular in defining the layers of the Cloud (Cloud Applications, Cloud Platforms and Cloud Infrastructure – the “Cloud Pyramid”, a term I coined last year). I actually found it interesting that the group of authors couldn’t agree on what the precise differences between the “X as a Service” terms were.[2] For all of the assumptions and conclusions to hold, I would have thought that clearly defining what the “Cloud” is would be paramount to the success of the findings.
  • 3 Important Technical Aspects of the Cloud – the group outlines three items of the Cloud: 1) “infinite computing resources” 2) “elimination of an up-front commitment” and 3) “pay for use of computing resources on a short-term basis as needed.”[3]
    My comments: For the most part, I agree with these statements. However, #3 is a bit skewed towards an Amazon EC2 model. At GoGrid, we are pioneering the idea of a “cloudcenter” (a datacenter in the Cloud) which presents a different paradigm. EC2 has long been touted as being a way for quick batch processing where instances are spun up, consumed and then discarded. This falls within the third aspect that is defined above. However, when you take the view of creating a “datacenter in the cloud,” there is less of a “quick use function” and more of a scalable infrastructure notion designed to replace traditional datacenters and associated infrastructures.
  • New Application Opportunities – several new or emerging opportunities designed to capitalize on the benefits of the Cloud are outlined: “mobile interactive applications,” “parallel batch processing,” “the rise of analytics,” “extension of compute-intensive desktop applications,” and “‘earthbound’ applications.”[4]
    My comments: I’m actually glad to see these so carefully explained as they do cover many aspects that are potentially “unique” to the Cloud: dynamic storage, dynamic availability, scalable processing and compute power, and cost-effectiveness to name a few.
  • Classes of Utility Computing – Amazon’s EC2 is at one end of the spectrum and Google AppEngine and Force.com are at “the other extreme” with Microsoft Azure falling somewhere in the middle. Also, “virtualized resources” are broken up into 3 classes: Computation, Storage and Networking.[5]
    My comments: For starters, since the group was unable to fully define the Cloud “spectrum,” it’s difficult to understand how they place EC2 at one end and have the spectrum “end” at Cloud Platforms (e.g., Force.com or AppEngine). The “full” spectrum must include SaaS as well as PaaS and IaaS in order to fully encompass the definition. Gmail and SalesForce exemplify SaaS and definitely should be contained within the Cloud mantra. Microsoft Azure, Force.com and Google AppEngine are truly Cloud Platforms. Perhaps within the Platform layer Azure and AppEngine are far apart; they do, however, occupy the same Cloud space of “here is a development environment, you must work within it” (e.g., Python, .NET). Cloud Applications are simply “here is a web-based software application that is available for consumption and you have minimal flexibility in terms of controlling it.” Lastly, Cloud Infrastructure works as “enjoy full control over your infrastructure despite the fact that it is a bit more challenging to control.” For the most part, the 3 virtualized resources do fall within what is outlined. Storage can be expanded to include “Cloud Storage” (dynamic), “Persistent Storage” (traditional) and “Volatile or Temporary Storage” (typically associated with EC2 instances where storage disappears when the EC2 instance is destroyed or goes down).

I could probably nitpick through some other items, but I will leave that up to you.

The Cloud Pyramid

Comments from a Cloud Vendor perspective

In Section 7 of the study, the EECS group presents “10 Obstacles and Opportunities for Cloud Computing” which definitely should be addressed. For this section, I’m putting on my “GoGrid Green” colored glasses and presenting points and counter-points to each of the 10 items outlined. Again, this is not intended to come off as a ping-pong match, but rather a commentary and opportunity for dialog. I encourage you to read this section prior to reviewing my responses. I have tried to briefly paraphrase each item (but that probably doesn’t do it justice).

  1. Availability of a Service – “will Utility Computing services have adequate availability”[6]
    My Response: The study outlines outages specific to the Cloud, citing S3, AppEngine and Gmail in particular. I have said this before, outages happen and they are not unique to the Cloud. Natural and human-caused disasters occur. Hurricanes and cable cuts can affect all sorts of infrastructure. As with a traditional datacenter, in-house or outsourced, traditional or in the Cloud, a disaster failover and redundancy strategy should be part of an IT department’s general strategy for success or just survival. One thing to consider is mirroring or creating redundancy on different types of infrastructures: if your primary is in the Cloud, have a dedicated failover; if your colo is on the East Coast, think about something on the West. Also look beyond simply the service and review the Support organization, the Service Level Agreement (SLA) and the provider’s expertise within the field. GoGrid, for example, has 24×7 Free support, the most robust SLA of any Cloud provider and over 9 years of hosting experience and expertise.
  2. Data Lock-in – “the API’s for Cloud Computing itself are still essentially proprietary”[7]
    My Response: Unfortunately it seems that GoGrid’s announcement back in January of this year, where we discussed how our GoGrid cloudcenter API has been put under a Creative Commons Sharealike license, was somehow overlooked when compiling facts for this study. Our idea behind this move is to start working on standards from the ground up. GoGrid is also an active participant in many of the interoperability meetings around the country. Part of the reason why we released our API to the community at large is to demonstrate our commitment to open standards. We also have modeled the GoGrid cloudcenter extremely closely on a traditional datacenter where all of your hardware, protocols and connectivity are familiar. This helps lessen the “lock-in” scenario and avoids the use of proprietary API’s and other components. Also mentioned is “surge computing” which is another term for “cloud bursting” or “hybrid” clouds. Our Cloud Connect offering works exactly in this way, where users can opt to have high-end, large I/O databases, for example, reside within a traditional, managed hosting environment (through ServePath, our parent company). Cloud Connect allows for scalable and dynamic web front-ends, hosted in the GoGrid Cloud, to connect via a dedicated private network to higher-end servers in a managed hosting back-end.
  3. Data Confidentiality and Auditability – “current cloud offerings are essentially public (rather than private) networks, exposing the system to more attacks”[8]
    My Response: The statement above is rather alarmist in nature. I agree that many efforts should be made to ensure the resiliency and security of the Cloud, and these efforts are well underway at GoGrid as well as other Cloud providers. Again, however, this is not something completely unique to the Cloud. Any hosting provider or datacenter (or cloudcenter for that matter) must ensure that security and the integrity of the network and infrastructure are maintained at a high standard. GoGrid, for example, is SAS70 Type II audited and certified. The EECS’s statement, however, is not a completely honest assessment. Public vs. Private datacenters, dedicated hosting or clouds are very different. The concerns of publicly hosted infrastructures are really no different whether in the cloud or in a datacenter; they will both be inherently a bit more vulnerable. However, I would say that companies whose business it is to solely do hosting will potentially have more robust security protection and attack prevention measures in place than a self-hosted or even private cloud would. In terms of HIPAA compliance or Sarbanes-Oxley, there are stringent requirements of data protection, privacy and isolation. While it may be difficult to pass accreditation for these types of compliance “in the cloud”, using a feature like Cloud Connect, for example, allows for compliance to take place on a dedicated, warehoused set of servers within a traditional datacenter, something much more palatable and acceptable.
  4. Data Transfer Bottlenecks – “applications continue to become more data-intensive”[9]
    My Response: It’s all about the data, I agree. The Cloud is an ideal environment for statistical analysis and number crunching. I personally know of one GoGrid user who would spin up multiple instances of GoGrid servers, upload a huge amount of data, run some analysis programs and then export the resulting summaries, all in a matter of hours and only costing a few dollars. The arguments presented by the EECS group are true; until we get the ability to transfer large amounts of data through very big pipes at an extremely low cost, this could be a barrier for those customers who may be considering the Cloud as a data eating machine. However, when we at GoGrid designed our business model, we kept scenarios like this in mind and came up with an easy solution: make all inbound data transfers free. This way, GoGrid users can upload large amounts of data to their cloudcenter, move that data around within the private network therein, put some on Cloud Storage should they desire, analyze to their heart’s content and then download the summary or result sets (typically much smaller in file size than the data going in). GoGrid does charge for outbound transfers, but you can see how the pricing model works to the user’s advantage in analysis scenarios.
  5. Performance Unpredictability – “multiple Virtual Machines can share CPUs and main memory surprisingly well in Cloud Computing, but that I/O sharing is more problematic”[10]
    My Response: This is a very good point and difficult to fully refute. It’s true that CPU and RAM can be virtualized, managed and isolated extremely well. Disk I/O performance can suffer at times. Again, this is part of the reason we offer a solution for this with Cloud Connect (see previous statements). It is frequently better to offload extremely intensive I/O processes to a dedicated environment, at least until virtualization technology gets more aligned with bare-metal performance. We even released a “custom patch” for 64-bit Linux users on GoGrid that helps increase disk drive performance. While some may say that this is a bit non-standard, it does show our understanding of this concern and marks an effort to resolve or minimize the impact.
  6. Scalable Storage – short-term usage, no up-front cost and infinite capacity on-demand doesn’t apply to persistent storage[11]
    My Response: I have to agree somewhat with this idea; however, it is a bit of an oxymoron. Persistent storage requires that it is dedicated in some way, available at all times and easily usable. On EC2, for example, if your instance dies, you lose any persistence of data, which is part of the reason why they recommend using S3 (their Cloud Storage offering). This is logical from so many standpoints: redundancy and share-ability are two that immediately jump to mind. Again, at GoGrid we took a slightly different approach by making all GoGrid Cloud servers have persistent storage available from the beginning. The amount of persistent storage is directly tied to the amount of RAM you have allocated: if you choose a higher RAM instance, you get more persistent storage. However, I don’t see scalable storage as an obstacle entirely. Amazon offers S3 and GoGrid has a similar Cloud Storage offering. Both are scalable on demand, billed by usage and usable by Cloud Servers. GoGrid’s Cloud Storage is mountable as a drive and shareable among a user’s GoGrid servers within the GoGrid infrastructure using industry standard protocols (e.g., SAMBA, CIFS, RSYNC & SCP). To that end, in my mind it does meet the 3 properties outlined with the omission of the “persistent” adjective.
  7. Bugs in Large-Scale Distributed Systems – “one of the difficult challenges in Cloud Computing is removing errors in these very large scale distributed systems”[12]
    My Response: This is actually one obstacle that I fully agree with. Often it is difficult to “mirror” physical, large scale computing environments within the Cloud. Unfortunately, it is not an apples-to-apples comparison. One simply cannot just “port” a physical, complex infrastructure over to the Cloud. If you do, you will fail. You need to architect your Cloud environment to capitalize on the efficiencies and features of the Cloud. Otherwise, you simply carry over (and potentially compound) previously existing issues. Another thing to consider is that all Virtualization or Hypervisor technologies have bugs, as with any software for that matter. The complexity of a Cloud environment is multi-fold: at the hypervisor and management layer, the hardware layer of the grid or utility architecture, as well as within the VM’s themselves. This is a complicated and delicate environment. The good news is that, because this technology is here to stay and is consistently being built upon, refined and improved, the end results are only improvements. Important to this again is interoperability and standards, similar to the Wild West becoming civilized and engineered. Bugs will be squashed and efficiencies gained through increased R&D efforts as well as customer adoption and validation.
  8. Scaling Quickly – “automatically scale quickly up and down in response to load in order to save money, but without violating SLAs”[13]
    My Response: This is one of the key value propositions of Cloud Computing. You must be able to scale up and down based on demand (or even based on a budget). Much of this can be done using APIs or companies like RightScale. As I mentioned previously, design for the Cloud. Traditionally, companies over-bought their infrastructure, saving it all for a rainy day. At ServePath, we know for a fact that CPU, RAM and Storage on our dedicated machines are only hitting about a 5% utilization on average. Many companies have built up their infrastructure for the “what if” scenarios. These inefficiencies are part of the reason why Cloud Computing has become so popular, a panacea of sorts. When you design for the cloud, you must ensure that your strategy capitalizes on scalability, both up and down, but also on redundancy and persistence. Of course, it all depends on the type of system you are architecting (persistent – a store-front or content driven marketplace, or temporary – data analysis, bulk processing).
  9. Reputation Fate Sharing – “reputations do not virtualize well”[14]
    My Response: I feel that this fully depends on how a Cloud provider crafts their offering. The example given in the EECS study is that of blacklisted EC2 IP addresses due to spamming. This is a valid concern but is due to how AWS releases their public IP addresses back “into the pool” once an instance is removed or destroyed. At GoGrid, we took a different approach. For starters, all users are assigned a contiguous block of static public IP addresses. When a GoGrid user deletes a server, that public IP address is released back into THEIR pool and not a general pool. Thus, if an IP address gets flagged by a spam-prevention service as being “bad,” the “bad reputation” is contained within a particular GoGrid user’s environment and not the entire GoGrid user base. Similarly, we block all outbound SMTP traffic by default. Users who wish to use this protocol must request that this block be lifted. While somewhat inconvenient, this one-time action does help to maintain a positive reputation for a vendor as a whole. Be sure to carefully review a vendor’s SLA, Terms of Service (TOS), Privacy Policy and Acceptable Use Policy (AUP).
  10. Software Licensing – “licensing models for commercial software is not a good match to Utility Computing” & “pay-as-you-go seems incompatible with the quarterly sales tracking”[15]
    My Response: Software licensing models are being forced to evolve to be able to handle the on-demand nature of the Cloud. While Amazon took the approach of increasing the hourly charge to handle licensing of Windows Server vs. an open-source alternative, GoGrid, in order to maintain simplicity, rolled it all into one (no difference between Red Hat, CentOS or Windows). Licensing of Microsoft SQL Server on GoGrid, for example, is handled through a monthly (not hourly) charge. This helps with both a customer’s budget projections and our own sales projections. Simplicity in explanation and execution is critical. If your user is confused as to how the billing works or how to project what charges they will incur, they will not execute. Token billing, tied to hourly charges, will also become increasingly prevalent.
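To make the "scale up and down based on demand (or even based on a budget)" point from obstacle 8 a little more concrete, here is a minimal Python sketch of the decision logic only. Everything in it is an assumption for illustration: the function name, the per-server capacity figure, and the hourly price are hypothetical and are not GoGrid or RightScale values, and the sketch does not call any provider's API.

```python
import math


def target_server_count(requests_per_sec, capacity_per_server,
                        hourly_cost_per_server, hourly_budget,
                        min_servers=1):
    """Return how many servers to run for the current load, capped by budget.

    All parameters are hypothetical inputs; a real deployment would feed
    this from its own monitoring and billing data.
    """
    # Servers needed to carry the load, rounded up so we never under-provision.
    needed = math.ceil(requests_per_sec / capacity_per_server)
    # The most servers the hourly budget allows.
    affordable = int(hourly_budget // hourly_cost_per_server)
    # Scale to demand, cap at the budget, but never go below the minimum.
    return max(min_servers, min(needed, affordable))


# Quiet period: one server is enough.
print(target_server_count(50, 100, 0.25, 2.0))
# Busy period: demand asks for five servers and the budget allows them.
print(target_server_count(450, 100, 0.25, 2.0))
# Traffic spike: demand asks for fifty servers, the budget caps it at eight.
print(target_server_count(5000, 100, 0.25, 2.0))
```

The same function covers both directions: when load falls, the returned count falls with it, which is where the savings over a fixed, over-bought infrastructure would come from.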

Summing it all up

If you made it through both the EECS group’s study and this blog post, I truly commend you, and you hopefully have a better understanding of the Cloud Computing term and properties therein, especially from the standpoint of an academic institution and a Cloud Computing vendor. While I have challenged a few of the statements made within the study, there are others that stand up just fine. The important overall idea here is that serious brainpower and resources are being thrown at the Cloud, from understanding and analysis to development and execution.

A special message to the EECS group: I would personally like to invite you all to cross the Bay (from Berkeley to San Francisco) to come and visit a Cloud Computing provider who is already overcoming the obstacles you have outlined. We would love to have a round-table discussion about the Cloud and help you with the next version of this study.

  1. M. Armbrust, A. Fox, R. Griffith, A. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica, and M. Zaharia. Feb 10, 2009. “Above the Clouds: A Berkeley View of Cloud Computing.” Electrical Engineering and Computer Sciences. University of California at Berkeley. http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.html p. 4
  2. ibid. p. 4
  3. ibid. p. 4
  4. ibid. pp. 7-8
  5. ibid. pp. 8-9
  6. ibid. pp. 14-15
  7. ibid. p. 15
  8. ibid. pp. 15-16
  9. ibid. pp. 16-17
  10. ibid. pp. 17-18
  11. ibid. p. 18
  12. ibid. p. 18
  13. ibid. p. 18
  14. ibid. p. 18
  15. ibid. p. 19

Last week, we quietly released some new larger GoGrid Cloud server instances. Today we are making that announcement a bit louder. What does this mean to you? Well, your GoGrid cloudcenter just got a bit broader and more powerful. For a year now, we have been offering 0.5, 1 and 2 GB RAM options in both Windows and Linux; now we have 4 and 8 GB RAM instances available. These larger instances, available on all 64-bit operating systems, allow for new types of higher-end environments to be spun up using all of the characteristics of Cloud Computing.

The lower size RAM instances (0.5, 1 & 2 GB) are perfect for a web front-end, where either Apache or IIS are running. For extremely high-performance and high I/O instances, we have been offering Cloud Connect as a way to create a dedicated hybrid infrastructure where Cloud Web Servers running on GoGrid can be linked via private dedicated network connections to dedicated and managed servers within the ServePath network.

With the new 4 and 8 GB RAM options, you can now set up an infrastructure with a robust set of high-performance application servers within the Cloud. These types of high RAM instances are perfect for users who want to take advantage of the increased RAM, CPU cores and persistent storage, especially when used in conjunction with specific applications (e.g., Microsoft SQL Server or other Enterprise applications) that require larger amounts of resources like RAM or CPU.

The 4 GB RAM server images can be deployed via the GoGrid web portal and API. The 8 GB RAM server images currently may only be deployed via the GoGrid API. I recommend reading the API section of the GoGrid wiki in order to fully understand how to deploy 8 GB RAM instances.

The 4 and 8 GB RAM images, available for the Red Hat Enterprise Linux 5.1, CentOS 5.1, Windows Server 2003 and Windows Server 2008 64-bit operating systems, bring a new level of performance to the GoGrid line. 4 GB Cloud Servers have 3 CPU Cores and 8 GB Cloud Servers have 6 CPU Cores, ensuring dedicated CPU allocations and high performance.

All GoGrid Cloud Servers come with persistent storage. The new larger RAM allocations announced today are delivered with increased persistent storage: 4 GB Cloud Servers have 240 GB of hard drive space and 8 GB Cloud Servers have 480 GB of storage allocated at boot time. Additional storage can be added using GoGrid’s dynamically scalable Cloud Storage offering, which includes a 10 GB free allotment to start with. Each additional GB thereafter costs $0.15 per month.
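As a quick worked example of the Cloud Storage pricing just described (10 GB free, then $0.15 per GB per month), the arithmetic can be sketched in Python. This assumes, for illustration only, that the charge is applied to a single flat usage figure for the month; how usage is actually metered is not stated here, and the function name is hypothetical.

```python
FREE_GB = 10          # free allotment stated in the post
RATE_PER_GB = 0.15    # USD per GB per month beyond the free allotment


def monthly_storage_cost(used_gb):
    """Estimate the monthly Cloud Storage charge in USD.

    Assumes one flat usage figure for the month (an illustrative
    simplification, not a statement of how billing is metered).
    """
    # Only usage above the free allotment is billable.
    billable_gb = max(0, used_gb - FREE_GB)
    return round(billable_gb * RATE_PER_GB, 2)


print(monthly_storage_cost(5))     # within the free allotment
print(monthly_storage_cost(110))   # 100 billable GB
```

Under that assumption, 110 GB of Cloud Storage works out to 100 billable GB, or $15 for the month.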

Our current breakdown of GoGrid Cloud Servers and associated RAM/CPU/Persistent Storage is as follows:

Server RAM | CPU Cores | Core Burst | Persistent Storage
512 MB     | 1/8       | 1          | 30 GB
1 GB       | 1/4       | 1          | 60 GB
2 GB       | 1/2       | 1          | 120 GB
4 GB       | 3         | 3          | 240 GB
8 GB       | 6         | 6          | 480 GB
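For readers who want to pick a size programmatically, the breakdown above can be captured as a small lookup table. The following Python sketch is illustrative only: the `SERVER_SIZES` structure and the `smallest_fit` helper are hypothetical names, not part of the GoGrid API, and the figures are simply copied from the table.

```python
from fractions import Fraction

# (RAM in GB, CPU cores, core burst, persistent storage in GB),
# copied from the table in the post and ordered smallest first.
SERVER_SIZES = [
    (0.5, Fraction(1, 8), 1, 30),
    (1, Fraction(1, 4), 1, 60),
    (2, Fraction(1, 2), 1, 120),
    (4, 3, 3, 240),
    (8, 6, 6, 480),
]


def smallest_fit(ram_gb_needed, storage_gb_needed=0):
    """Return the smallest listed size meeting both needs, or None."""
    for ram, cores, burst, storage in SERVER_SIZES:
        if ram >= ram_gb_needed and storage >= storage_gb_needed:
            return ram, cores, burst, storage
    return None


# An application needing 3 GB of RAM lands on the 4 GB size.
print(smallest_fit(3))
# Needing only 1 GB of RAM but 100 GB of disk pushes the choice to 2 GB.
print(smallest_fit(1, 100))
# In every row, persistent storage works out to 60 GB per GB of RAM.
print(all(storage == 60 * ram for ram, _, _, storage in SERVER_SIZES))
```

When the built-in persistent storage is the limiting factor, adding Cloud Storage may be cheaper than stepping up a RAM size; the sketch does not model that trade-off.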

Further information on the new 4 and 8 GB RAM GoGrid Cloud Servers can be found on the GoGrid site. Server Release Information on these new images can be found on the GoGrid wiki. We have also posted a Server Compatibility Matrix that graphically shows what server instances are available with the associated RAM allocations.

If you are a Windows user, we ask that you please review our Release and Errata pages, as there are some known issues specific to 8 GB Windows Servers that may require a workaround and that you should be aware of before using 8 GB GoGrid Servers with Windows.

Our full Press Release on this information can be viewed here as well as on the GoGrid site.

As always, please leave any questions or comments here on this blog post, or open a Support case via the GoGrid portal should you need technical assistance.


Last week, Randy Bias, VP of Technology Strategy, and I participated in a podcast on Cloud Computing called “Overcast: Conversations on Cloud Computing”, hosted by James Urquhart and Geva Perry. The Overcast podcast series discusses various aspects of the Cloud Computing Industry and related technologies. Previous guests included Lew Tucker (Sun Microsystems), Greg Ness (Infoblox) and John Willis (a leading cloud computing blogger), among others. The podcast, “Overcast Show#6: Feb 5, 2009 – with Randy Bias and Michael Sheehan, GoGrid”, is a little less than an hour in length and covers many of the following topics:

  • Distinction and clarifications around the terms “Cloudcenter” and “Infrastructure Web Services” as they exist within the Cloud Infrastructure layer. (More reading on cloudcenters can be found here and here.)
  • Understanding GoGrid’s approach to standards and interoperability, especially as they relate to datacenter and infrastructure standards
  • Platform-as-a-Service (PaaS) providers such as Google App Engine and how Cloud Infrastructure (Infrastructure-as-a-Service) and GoGrid fit in
  • Discussion around how we recently put our GoGrid API under a Creative Commons license as well as our efforts to involve other cloud providers and vendors, such as Flexiscale, RightScale and Eucalyptus, in building open standards from the ground up (more info here)
  • How GoGrid is working with Puppet and Chef technologies to automate system administration and configuration management
  • Using GoGrid’s Cloud Connect offering to “cloudburst” and create hybrid infrastructure topologies using the dynamic scalability of Cloud Web Servers and the robust, high I/O throughput of dedicated backend servers
  • …and much more…

We encourage you to listen to this podcast to gain some insight on our thought leadership, concepts and ideas around Cloud Computing, GoGrid and the hosting industry in general. This (and all) podcasts are available in a variety of formats:

  • Download Overcast Podcast #6 as an MP3 File
  • Subscribe to Overcast in iTunes (Note: this link will attempt to launch iTunes.)
  • Play from this site (click on the graphic below)


As always, feel free to leave any questions or comments you may have on this blog or on the Overcast post.