
Archive for March, 2009

Measuring the Performance of Clouds – GoGrid

Written on Mar 17th, 2009 | Filed under: Cloud Computing, GoGrid, Storage

Raditha Dissanayake posted a blog entry comparing Amazon EC2 and GoGrid performance. Unfortunately, we think Raditha did not use the most rigorous methodology possible for doing his comparison. It would be inappropriate for GoGrid to performance test Amazon’s EC2. In fact, their Customer Agreement may actually make such activity questionable, but IANAL (I Am Not A Lawyer).

Let’s take a more rigorous look at GoGrid disk subsystem performance.

Framing the Issue

To start, the issue is a lot more complex than can be covered here. Today’s disks, hard drive controllers, and operating systems have many different kinds of caching mechanisms. In addition, virtualization systems like Xen can impact results in unexpected ways. For example, did you know that Xen can be deployed in two major ways: ‘paravirtualized’ or ‘hardware virtualized’?

The two models almost certainly affect any testing methodology. And yes, you guessed it, Amazon and GoGrid don’t configure Xen in the same way. Amazon uses paravirtualization and GoGrid uses hardware virtualization. Beyond this public information, neither Amazon nor GoGrid provides significant details about its infrastructure, considering it, rightfully so, proprietary intellectual property.

Without a deep understanding of all of these issues, it’s difficult to do a single test, much less a proper comparison.

But we are certain of a few very important things.

Clouds Are Multi-Tenant

First off, it’s hard to do a serious comparison like this using one server on each system. Clouds are inherently multi-tenant systems and since end users have no visibility into who else is using or sharing their disk resources at any given time there is no real way to verify that the results aren’t tainted by other activity.

Use the Right Tool

Secondly, hdparm -t isn’t a very good way to measure disk speed. It’s susceptible to noise from background activity. In fact, the man page says:

-t Perform timings of device reads for benchmark and comparison purposes. For meaningful results, this operation should be repeated 2-3 times on an otherwise inactive system (no other active processes) with at least a couple of megabytes of free memory. [...]

As you can see in Raditha’s test, hdparm doesn’t really do enough I/O to get consistent results in a multi-tenant environment. In the tests, hdparm is only active for a very short period of time allowing tenancy to have a dramatic effect on the results.  hdparm requires an inactive system and since that can’t be guaranteed in the cloud it fails the sniff test for a robust tool for cloud performance testing.

Another factor here that is unaccounted for is that hdparm is a utility tuned for real physical disks, not virtual disks.

Better Measurements

Ideally if you want to measure the streaming performance of a block device in a more reliable way in a multi-tenant environment, then use a larger amount of I/O. When doing this I/O you want to try to eliminate:

  • Hard disk controller layer cache effects
  • Hard disk layer cache effects
  • OS level cache effects
  • Effects of disk activity from other VMs

All current GoGrid nodes have caches in the storage layer. These are designed to be robust and to absorb bursts of write activity. These caches are large enough, though, that if you do repetitive small I/Os, what you end up measuring is the performance of pulling this data out of the storage layer’s caches, not from the storage itself.

To avoid OS level cache effects use ‘direct I/O’. High performance applications and databases tend to use this internally for similar reasons (because they want to avoid OS level cache pollution and do their own caching). Oracle is probably the most obvious example here.

Testing Performance

On a ‘small VM’ located on a fairly busy node:

[root@foo ~]# dd if=/dev/hda bs=10M of=/dev/null iflag=direct count=100
100+0 records in
100+0 records out
1048576000 bytes (1.0 GB) copied, 3.50983 seconds, 299 MB/s
[root@foo ~]# dd if=/dev/hda bs=10M of=/dev/null iflag=direct count=100
100+0 records in
100+0 records out
1048576000 bytes (1.0 GB) copied, 3.06811 seconds, 342 MB/s
[root@foo ~]# dd if=/dev/hda bs=10M of=/dev/null iflag=direct count=100
100+0 records in
100+0 records out
1048576000 bytes (1.0 GB) copied, 2.14147 seconds, 490 MB/s

That’s using enough I/O to minimize noise from other VM activity and large enough to avoid hitting cache effects.
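The MB/s figure that dd prints is simply bytes copied divided by elapsed seconds, in decimal megabytes (1 MB = 1,000,000 bytes). As a rough sketch of that arithmetic, the helper below recomputes it from a dd summary line like the ones above. The name dd_mbps is hypothetical (it is not part of dd), and it assumes the summary line contains the word "copied," followed by the elapsed time:

```shell
# Hypothetical helper: recompute throughput from a dd summary line on stdin.
# Assumes the line looks like "<bytes> bytes (...) copied, <seconds> ..., <rate>".
dd_mbps() {
    awk '{
        # The elapsed time is the field right after "copied,".
        for (i = 1; i < NF; i++)
            if ($i == "copied," && $(i + 1) > 0)
                printf "%.0f\n", $1 / $(i + 1) / 1000000
    }'
}

# First run from the transcript above: 1048576000 bytes in 3.50983 seconds.
echo '1048576000 bytes (1.0 GB) copied, 3.50983 seconds, 299 MB/s' | dd_mbps
# prints 299
```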

If the I/O load is small enough you can hit storage layer cache effects:

[root@foo ~]# dd if=/dev/hda bs=10M of=/dev/null iflag=direct count=10
10+0 records in
10+0 records out
104857600 bytes (105 MB) copied, 0.116491 seconds, 900 MB/s
[root@foo ~]# dd if=/dev/hda bs=10M of=/dev/null iflag=direct count=10
10+0 records in
10+0 records out
104857600 bytes (105 MB) copied, 0.16058 seconds, 653 MB/s
[root@foo ~]# dd if=/dev/hda bs=10M of=/dev/null iflag=direct count=10
10+0 records in
10+0 records out
104857600 bytes (105 MB) copied, 0.115701 seconds, 906 MB/s

While this is a fairly contrived example, it’s useful in other ways because it shows you can get very good burst throughput (consider a database updating a few thousand pages).

Now the same tests on a larger memory instance (where average performance should be a lot better).

Sustained (large) IO:

[root@ubdev1 ~]# dd if=/dev/hda bs=10M count=100 of=/dev/null iflag=direct
100+0 records in
100+0 records out
1048576000 bytes (1.0 GB) copied, 1.80415 seconds, 581 MB/s
[root@ubdev1 ~]# dd if=/dev/hda bs=10M count=100 of=/dev/null iflag=direct
100+0 records in
100+0 records out
1048576000 bytes (1.0 GB) copied, 1.70448 seconds, 615 MB/s
[root@ubdev1 ~]# dd if=/dev/hda bs=10M count=100 of=/dev/null iflag=direct
100+0 records in
100+0 records out
1048576000 bytes (1.0 GB) copied, 1.6799 seconds, 624 MB/s

Burst (small) IO:

[root@ubdev1 ~]# dd if=/dev/hda bs=10M count=10 of=/dev/null iflag=direct
10+0 records in
10+0 records out
104857600 bytes (105 MB) copied, 0.105183 seconds, 997 MB/s
[root@ubdev1 ~]# dd if=/dev/hda bs=10M count=10 of=/dev/null iflag=direct
10+0 records in
10+0 records out
104857600 bytes (105 MB) copied, 0.089827 seconds, 1.2 GB/s
[root@ubdev1 ~]# dd if=/dev/hda bs=10M count=10 of=/dev/null iflag=direct
10+0 records in
10+0 records out
104857600 bytes (105 MB) copied, 0.090264 seconds, 1.2 GB/s
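Because tenancy adds noise, it helps to look at the spread across repeated runs rather than at any single number. The following is only an illustrative sketch: the function name spread is hypothetical, and it simply reads one MB/s value per line and reports the minimum, maximum, and mean. The inputs are the sustained-read figures from the transcripts above:

```shell
# Hypothetical helper: summarize repeated throughput measurements (one per line).
spread() {
    awk '
        NR == 1 { min = max = $1 + 0 }
        {
            v = $1 + 0
            sum += v
            if (v < min) min = v
            if (v > max) max = v
        }
        END { if (NR > 0) printf "min=%g max=%g mean=%.1f\n", min, max, sum / NR }
    '
}

# Sustained reads on the small VM (MB/s, from the runs above).
printf '299\n342\n490\n' | spread
# prints min=299 max=490 mean=377.0

# Sustained reads on the larger memory instance.
printf '581\n615\n624\n' | spread
# prints min=581 max=624 mean=606.7
```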

Don’t take my word for any of this. Try it out. If you’re really bored, graph I/O performance vs. I/O size and you’ll likely see a step function with a soft edge that will give you some idea of what the storage system is capable of and the degree of I/O variation.
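One way to collect the data for such a graph is to repeat the same dd read at increasing sizes and record bytes against MB/s. This is a minimal sketch under stated assumptions, not a polished benchmark: the function name sweep_reads and the DD_EXTRA variable are hypothetical, and the parsing assumes a dd whose summary line has the elapsed time right after the word "copied,".

```shell
# Hypothetical sweep: read COUNT blocks of size BS from DEV for each COUNT
# given, and print one "bytes MB/s" line per run for plotting.
# DD_EXTRA (an assumption of this sketch) defaults to iflag=direct, as in the
# tests above; set it to an empty value to fall back to buffered reads.
sweep_reads() {
    dev=$1; bs=$2; shift 2
    for count in "$@"; do
        LC_ALL=C dd if="$dev" of=/dev/null bs="$bs" count="$count" \
            ${DD_EXTRA-iflag=direct} 2>&1 |
        awk '{
            # The elapsed time is the field right after "copied,".
            for (i = 1; i < NF; i++)
                if ($i == "copied," && $(i + 1) > 0)
                    printf "%s %.0f\n", $1, $1 / $(i + 1) / 1000000
        }'
    done
}

# Example (run as root on the VM under test):
#   sweep_reads /dev/hda 10M 1 2 5 10 20 50 100 > sweep.dat
```

Each run still sees noise from other tenants, so repeating the sweep a few times and plotting all points gives a better picture of both the step and the variation.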

Bottom Line

It’s great that people are kicking the tires of various clouds, but let’s be careful to make sure our testing is rigorous and makes sense for the environment.  If you have questions about how to measure performance on clouds, please send them to us.  Or if you’re a performance and virtualization system guru and have some knowledge to share, please do so.

We always want to improve our cloud and take seriously any feedback that shows a real problem, but in this case the test needs tweaking, not GoGrid.


Last week I had the pleasure of joining Peter Silva (Technical Marketing Manager at f5) and Telemachus Luu (Director of Business Strategy at GoGrid/ServePath) in a podcast hosted by f5 on their DevCentral community site. The topic of the podcast was “Cloud Computing” (of course), but specifically how, using f5 technology, ServePath and GoGrid were able to create a full spectrum of hosting solutions ranging from Dedicated and Managed Hosting (ServePath) and Colocation hosting (ColoServe), up and into the Clouds with GoGrid.

The podcast titled “Hosting in the Cloud with ServePath and F5” covers a variety of topics including:

  • ServePath’s product extension from managed hosting to cloud hosting with GoGrid
  • The “Cloud Pyramid” and distinctions within the various Cloud layers
  • Understanding the nuances within the Cloud Infrastructure layer: “Infrastructure Web Services” & “Cloudcenters”
  • How f5 was paramount in creating a Cloud Computing Infrastructure offering

I encourage you to listen to this 30 minute podcast (forgive the audio quality, we were in an empty conference room) which is available at the following locations:

  • On f5’s DevCentral site
  • As a downloadable MP3 file
  • Play from this site (click on the graphic below)


If you have any questions about the items discussed within this podcast, please feel free to leave a comment to this post.


Today we released a new whitepaper written by Randy Bias, GoGrid’s VP of Technology Strategy, titled “Scaling Your Internet Business.” If you are a Web Application Developer or are interested in learning about scalability, specifically how it relates to Web Applications in or outside of the Cloud, I encourage you to give this whitepaper a read.

The whitepaper can be downloaded here from the GoGrid site.

Scalability is critical to the success of many organizations currently involved in doing business on the Web or who are providing information that may suddenly become heavily demanded. While there are many strategies that IT organizations can undertake, the way they are designed and implemented can make or break these businesses.

The GoGrid whitepaper discusses the following topics:

  • How web applications scale
  • Cloud Computing and scalability therein
  • Thinking through and choosing a scaling strategy
  • GoGrid & ServePath scalability options

Scalability can come in all shapes, sizes and flavors. You can scale “up” (vertically) or “out” (horizontally). Choosing the right option can be tricky, if not daunting. Depending on what you want your strategy to be, you can choose “cloud-only”, “dedicated/colocated-only” or a “hybrid” approach.

[Figure: A Cloud-only environment.]

[Figure: A “Hybrid” environment using Cloud Connect.]

[Figure: A “Hybrid” environment using Dedicated and Colocated servers in conjunction with a Cloud front-end using Cloud Connect.]

“Businesses need more than just cloud computing to solve their scalability problems,” says the whitepaper author, Randy Bias. “Web operators and developers want to use the best tool for the job and, right now, cloud computing is one tool in their arsenal. GoGrid has pioneered the concept of cloudcenters, datacenters-in-the-cloud, which provide the full range of scalability tools needed for a growing business including cloud servers, managed dedicated database servers, private VLANs, VPNs, and even co-location for those who need their own hardware. This whitepaper describes how a growing business can use vertical and horizontal scaling techniques to the most advantage to save money and never miss a prospect, customer, reader or interaction.”

Companies interested in learning about Web Application Scalability, Cloud Infrastructure, hybrid hosting and scaling solutions available from GoGrid or ServePath are encouraged to download this whitepaper from either the GoGrid site or ServePath site.