Archive for March, 2012


Speeding Things Up in the Cloud with NGINX

Monday, March 26th, 2012 by


It’s been no secret to us in the high-performance web server in-crowd that NGINX (pronounced “engine-x”) has been taking the webhosting world by storm for the last several years; sites like WordPress, Facebook, Hulu, GitHub, SourceForge and more have offloaded some or many functions onto NGINX. I was originally exposed to NGINX while researching a higher-performance web server that was friendlier to 64-bit systems than Apache and that didn’t dedicate a process or thread to every connection. That per-connection model gives Apache an enormous memory footprint on 64-bit systems, whereas NGINX serves many connections from a single event-driven worker.

NGINX is a very flexible HTTP server that can also serve as a reverse proxy, load balancer, caching server, and an IMAP/POP3 proxy. Its configuration, however, is a bit more involved than Apache’s and can be a big change for Apache loyalists.
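To give a flavor of how compact the reverse-proxy and load-balancing roles are, here is a minimal, illustrative sketch (the backend addresses are placeholders, not part of any real deployment):

```nginx
# Define a pool of backend application servers; NGINX balances
# requests across them (round-robin by default).
upstream app_servers {
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}

server {
    listen 80;

    location / {
        # Forward each request to one of the upstream servers,
        # preserving the original Host header for the backend.
        proxy_pass http://app_servers;
        proxy_set_header Host $host;
    }
}
```

Two directives, `upstream` and `proxy_pass`, do most of the work; everything else is tuning.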

In this example, NGINX will be configured as a full webserver with PHP support. My goal when conjuring up this project was to make a pre-configured Community GSI on the GoGrid Exchange with as little modification as possible to ensure a “pure” environment. If you’re anything like me, you might tremble at the thought of even using a typical, pre-configured server with a LAMP stack; I personally like setting things up from scratch, but there’ve been plenty of situations where I would’ve preferred a pre-configured solution. Hopefully I can capture the essence of my intentions.

One thing I should note before I get started is that NGINX does not have a module for PHP the way Apache does; PHP must be run using the FastCGI methodology. Much as you would pass requests through to a Java container behind a reverse proxy, so must we pass PHP requests through to a FastCGI process.
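In configuration terms, that means matching `.php` requests and handing them to a PHP-FPM listener. A minimal sketch along these lines (the document root and the PHP-FPM address are assumptions and will vary by distribution and by how PHP-FPM is configured):

```nginx
server {
    listen 80;
    root /var/www/html;
    index index.php index.html;

    # Hand every .php request off to the PHP-FPM FastCGI process.
    location ~ \.php$ {
        include fastcgi_params;
        # PHP-FPM commonly listens on 127.0.0.1:9000 or a unix socket,
        # e.g. unix:/var/run/php-fpm/php-fpm.sock; adjust to your setup.
        fastcgi_pass 127.0.0.1:9000;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    }
}
```

The `SCRIPT_FILENAME` parameter is the piece people most often miss; without it, PHP-FPM doesn’t know which file to execute.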

The first thing I should mention is that I’m using the EPEL and IUS repositories for the latest versions of NGINX and PHP-FPM. IUS provides current, upstream-stable packages for RHEL/CentOS. Using these 2 repositories will not alter any existing packages on your system.


Industry Recognition: 2012 OnDemand 100 Top Private Company & CRN Coolest Cloud Infrastructure Vendor

Friday, March 23rd, 2012 by

It’s always difficult to be self-congratulatory, but when two distinct organizations post articles putting GoGrid in the limelight, I feel that they have to be mentioned. The two sites, CRN and OnDemand, both posted articles that we, as a company, are proud of. If you are a current customer of GoGrid, you should be proud as well, as it is a recognition of your selection process in making GoGrid your cloud partner.


The first article appeared on CRN, a website providing news, analysis and perspectives for VARs and technology integrators. In his article, author Andrew Hickey lists “The 20 Coolest Cloud Infrastructure Vendors,” of which GoGrid is one.


As Hickey describes:

If you want to be in the cloud business, these are some of the cloud infrastructure companies that will help you get there. These are the providers that will host your customers’ business applications and provision them on-demand as Software-as-a-Service. They will store customers’ data in the cloud and secure it there as well. Whether customers want to use a private cloud, a public cloud or a hybrid mixture of both, these companies can help make it happen. And they can even help your customers exchange their expensive legacy hardware for a simple monthly payment plan.


The Big Data Revolution – Part 2 – Enter the Cloud

Wednesday, March 21st, 2012 by

In Part 1 of this Big Data series, I provided a background on the origins of Big Data.

But What is Big Data?

Port Vell Barcelona

The problem with using the term “Big Data” is that it’s used in a lot of different ways. One definition is that Big Data is any data set that is too large for on-hand data management tools. According to Martin Wattenberg, a scientist at IBM, “The real yardstick … is how it [Big Data] compares with a natural human limit, like the sum total of all the words that you’ll hear in your lifetime.” Collecting that data is a solvable problem, but making sense of it (particularly in real time) is the challenge that technology tries to solve. This new type of technology is often listed under the title of “NoSQL” and includes distributed databases that are a departure from relational databases like Oracle and MySQL. These are systems specifically designed to parallelize compute, distribute data, and provide fault tolerance on a large cluster of servers. Some examples of NoSQL projects and software are: Hadoop, Cassandra, MongoDB, Riak and Membase.

The techniques vary, but there is a definite distinction between SQL relational databases and their NoSQL brethren. Most notably, NoSQL systems share the following characteristics:

  • Do not use SQL as their primary query language
  • May not require fixed table schemas
  • May not give full ACID guarantees (Atomicity, Consistency, Isolation, Durability)
  • Scale horizontally

Because of the relaxed ACID guarantees, NoSQL is used when performance and real-time results are more important than strict consistency. For example, if a company wants to update its website in real time based on an analysis of a particular user’s interactions with the site, it will most likely turn to NoSQL to solve this use case.
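The “scale horizontally” characteristic above usually means spreading data across servers by hashing keys. This toy Python sketch (node names and the modulo scheme are illustrative only, not how any particular NoSQL product shards) shows the core idea:

```python
# Toy sketch of horizontal scaling via hash-based sharding.
# Each key deterministically maps to one node, so both data and
# request load spread across the cluster as nodes are added.
import hashlib

NODES = ["node-a", "node-b", "node-c"]

def shard_for(key: str) -> str:
    """Pick the node responsible for a key by hashing it."""
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return NODES[digest % len(NODES)]

# Three user records land on (up to) three different machines.
placement = {k: shard_for(k) for k in ["user:1", "user:2", "user:3"]}
```

Real systems refine this with consistent hashing and replication so that adding or removing a node doesn’t reshuffle every key, but the placement-by-hash idea is the same.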

However, this does not mean that relational databases are going away. In fact, it is likely that in larger implementations, NoSQL and SQL will function together. Just as NoSQL was designed to solve a particular use case, so do relational databases solve theirs. Relational databases excel at organizing structured data and are the standard for serving up ad-hoc analytics and business intelligence reporting. In fact, Apache Hadoop even has a separate project called Sqoop that is designed to link Hadoop with structured data stores. Most likely, those who implement NoSQL will maintain their relational databases for legacy systems and for reporting off of their NoSQL clusters.


The Big Data Revolution – Part 1 – The Origins

Tuesday, March 20th, 2012 by


For many years, companies collected data from various sources that often found its way into relational databases like Oracle and MySQL. However, the rise of the Internet, Web 2.0, and more recently social media brought not only an enormous increase in the amount of data created, but also in the types of data. No longer was data relegated to types that easily fit into standard data fields – it now came in the form of photos, geographic information, chats, Twitter feeds and emails. The age of Big Data is upon us.

A study by IDC titled “The Digital Universe Decade” projects a 45-fold increase in annual data by 2020. In 2010, the amount of digital information was 1.2 zettabytes. 1 zettabyte equals 1 trillion gigabytes. To put that in perspective, the equivalent of 1.2 zettabytes is a full-length episode of “24” running continuously for 125 million years, according to IDC. That’s a lot of data. More importantly, this data has to go somewhere, and this report projects that by 2020, more than 1/3 of all digital information created annually will either live in or pass through the cloud. With all this data being created, the challenge will be to collect, store, and analyze what it all means.
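The unit conversion above is easy to sanity-check. A quick, illustrative Python sketch (applying the 45-fold multiplier directly to the 2010 figure is my own back-of-the-envelope arithmetic, not IDC’s exact methodology):

```python
# Sanity-check the units in the IDC figures.
GIGABYTE = 10**9     # bytes
ZETTABYTE = 10**21   # bytes

# 1 zettabyte is indeed 1 trillion (10**12) gigabytes.
gigabytes_per_zettabyte = ZETTABYTE // GIGABYTE

# 2010 digital universe: 1.2 ZB; a 45-fold increase lands around 54 ZB.
digital_universe_2010 = 1.2 * ZETTABYTE
projected_annual_data = 45 * digital_universe_2010
```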

Business intelligence (BI) systems have always had to deal with large data sets. Typically the strategy was to pull in “atomic”-level data at the lowest level of granularity, then aggregate the information into a consumable format for end users. In fact, it was preferable to have a lot of data, since you could also “drill down” from the aggregation layer to get at the more detailed information as needed.

Large Data Sets and Sampling

Coming from a data background, I find that dealing with large data sets is both a blessing and a curse. One product that I managed analyzed share of wireless numbers. The number of wireless subscribers in 2011 according to CTIA was 322.9 million and growing. While that doesn’t seem like a lot of data at first, if each wireless number was a unique identifier, there could be any number of activities associated with each number. Therefore the amount of information generated from each number could be extensive, especially as the key element was seeing changes over time. For example, after 2003, mobile subscribers in the United States were able to port their numbers from one carrier to another. This is of great importance to market research since a shift from one carrier to another would indicate churn and also impact the market share of carriers in that Metropolitan Statistical Area (MSA).

Given that it would take a significant amount of resources to poll every household in the United States, market researchers often employ a technique called sampling: a statistical method in which a panel is chosen to represent the activity of the overall population you want to measure. This is a sound scientific technique if done correctly, but it’s not without its perils. For example, it’s often possible to get a +/- 1% margin of error at 95% confidence for a large population, but what happens once you start drilling down into more specific demographics and geographies? The risk is not only having enough sample (you can’t have one subscriber represent the activity of a large group, for example) but also ensuring that it is representative (is the subscriber you are measuring representative of the population you want to measure?). Sampling errors are a classic problem of panel-based research. It’s fairly difficult to be completely certain that your sample is representative unless you’ve actually measured the entire population already (using it as a baseline), but if you’ve already done that, why bother sampling?
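The “+/- 1% at 95% confidence” figure comes straight out of the standard sample-size formula, n = z² · p(1 − p) / e², using the worst-case proportion p = 0.5. A quick Python illustration:

```python
# Sample size needed for a +/-1% margin of error at 95% confidence,
# assuming a very large population and worst-case variance (p = 0.5).
z = 1.96   # z-score for 95% confidence
p = 0.5    # worst-case proportion (maximizes p * (1 - p))
e = 0.01   # +/-1% margin of error

n = (z**2 * p * (1 - p)) / e**2   # about 9,604 panelists
```

Roughly 9,600 panelists suffice no matter how large the population is, which is exactly why sampling is attractive; the hard part, as noted above, is whether those panelists are actually representative.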


Welcome the Newest Inductee to the “Old School” Club – Life at GoGrid

Friday, March 16th, 2012 by

I recently celebrated my 5-year anniversary with GoGrid! That’s me holding the service award we hand out to employees who celebrate 3 and 5-year anniversaries. It was 5 years ago that I implemented the idea of handing out these gems, gems that GoGrid employees proudly display at their desks.


I remember my first interview with the company, then called ServePath, in a very small office down the street from our current corporate office. The office was so small that my interview was conducted in the break area/lobby, surrounded by cubes. 5 years later, GoGrid’s corporate office occupies the second floor of the Hills Plaza complex overlooking the Bay Bridge. Talk about an upgrade!



My first tour of our San Francisco Data Center was amazing. Back then the data center shared the floor with cubes and offices, housing our small engineering and support teams. Now the data center is so big, with only servers and more servers lining pretty much every inch of our space, that I get lost on the occasional tours I give to my vendors and employees.

Although I have witnessed many changes over the last 5 years, what hasn’t changed is the people who work for GoGrid. We are still that group of people I saw in the Craigslist posting for my job 5 years ago, but better. We still work hard and like to have fun. This is what makes GoGrid a great place to work and why quite a few of us have been here for a very long time.

So now that I have been inducted into the unofficial (still working on internal marketing) “Old School” club, I join the 14% of the company who have been here for 5 years or more, plus another 12% who will hit their 5 years sometime this year! As an Old School member I get one week to Take a Break this year – that’s GoGrid’s way of saying thanks for your continued years of service, and it’s in addition to the 25 other days I get for PTO. Thoughts on where I should vacation would be appreciated!
