Archive for the ‘General’ Category

When you purchase a car, you obviously think a lot about its performance before you buy. How much horsepower does it have? Is the car safe? How does it handle? Is the gas mileage going to break the bank or will you be saving the environment? Is the vehicle flexible enough to meet all your needs or just suitable for one activity like off-roading?

When you think about cloud computing, specifically cloud infrastructure, performance matters as well. And there are many factors to consider when shopping for a cloud provider or partner. How’s their VM performance? Does their network provide multiple high-bandwidth pipes to support your network-hungry application or service? Are there any I/O bottlenecks? If your website comes under heavy load, can you burst to support it and then scale back when demand subsides?

These are important considerations. Would you want a car that has no acceleration when getting on a highway? Probably not. That’s the same reason you wouldn’t want a cloud that is over-subscribed or doesn’t have the architecture to support your business needs.

Speaking of “performance,” this is our third year sponsoring the Under the Radar (UTR) conference, and it marks the second year that our CMO, Jeffrey Samuels, will be a judge there. This year, Jeff will be on the panel for the Performance Monitoring session. Here are the companies presenting in this session:

  • Fabric Engine – lets developers write high-performance applications using dynamic languages
  • Iron.io – provides elastic products for cloud messaging and background processing
  • Sumo Logic – provides real-time Big Data and IT insights through log intelligence and analytics
  • Tracelytics – provides insights into the performance of web applications

Tuning your application or infrastructure for the best performance possible is a critical check-box when moving into production. And monitoring how everything performs once it’s released is another. One quick and easy way to get a jump-start on these two items is to choose a cloud provider with a reputation as a solid performer. As the Technology Evangelist of GoGrid, I’m particularly proud of our performance over the past years. We’ve been independently benchmarked as providing market-leading I/O performance and also lead the pack in uptime from an SLA standpoint. Our network availability and performance remain unparalleled. And we craft unique infrastructure solutions to ensure performance is there when you need it, whether it’s with a completely public cloud solution, a hybrid infrastructure (mixture of cloud and physical servers), or a private cloud instance. And our recently announced Big Data solution couples performance with the scalability that developers, analytics firms, advertisers, and social media companies are demanding.

I look forward to seeing the innovations companies are presenting at UTR this year, and can’t wait to meet many of you personally. Best of luck to all the presenters. May you truly “perform!”


Earlier this month, we had the distinguished pleasure of celebrating GoGrid’s international expansion to EMEA by hosting a party with Equinix in Amsterdam. Europe has been more than hospitable to our new European team, and this event was no exception.

This was my fourth visit to our Amsterdam HQ and data center, and I’m truly excited to see the traction within the European community and the adoption by European companies that see value in multiple global points of presence.

The event was held at a converted auto shop that is now a cool new restaurant and lounge. It still had some stylish remnants of its past – including an actual Ferrari in the center of the room! This venue turned out to be the ideal location for our welcoming party.

The turnout for the event was great. GoGrid received a warm welcome from over 100 different partners and customers, as well as some of the top international companies that have also expanded to Europe. Having a casual atmosphere in which to discuss technology services, the importance and adoption of cloud computing within Europe, and the direction it is going globally definitely set the stage for some powerful conversations.

But it wouldn’t have been a party if we didn’t have a guest speaker. Comedian Greg Shapiro was perfect for the event. Shapiro is from the US but now makes a living in the Netherlands. His antics and insights on transitioning between cultures made for a fun and enjoyable evening. Greg Shapiro is top notch, and we highly recommend him to anybody looking for a sharp speaker who understands both US and European audiences.

With two large TV screens displaying pictures and words to illustrate the differences in cultures and languages, Shapiro kept the audience engaged while balancing cultural and language nuances.

It’s important for me to see the results of our hard work over the past year developing not only an EMEA HQ but also firmly establishing a cloud data center presence. As our global reach continues to expand, I hope to meet more of our overseas customers and get to know how they are utilizing our cloud services.


It’s been no secret to us in the high-performance web server in-crowd that NGINX (pronounced “engine-x”) has been taking the web hosting world by storm for the last several years; sites like WordPress, Facebook, Hulu, GitHub, SourceForge and more have been offloading some or many functions onto NGINX. I was originally exposed to NGINX while researching a higher-performance web server that was friendlier to 64-bit systems than Apache and that did not dedicate a process or thread to every connection. Apache can have a very large memory footprint on 64-bit systems, and in its common prefork configuration it ties up a full process for each connection, whereas NGINX uses an event-driven model in which a few worker processes each handle many connections.

NGINX is a very flexible HTTP server that can also serve as a reverse proxy, load balancer, caching server, and an IMAP/POP3 proxy. Unlike Apache, however, the configuration is a little bit more involved and can be a big change for Apache loyalists.

In this example, NGINX will be configured as a full web server with PHP support. My goal when conjuring this project was to make a pre-configured Community GSI on the GoGrid Exchange with as little modification as possible to ensure a “pure” environment. If you’re anything like me, you might tremble at the thought of even using a typical, pre-configured server with a LAMP stack; I personally like setting things up from scratch, but there’ve been plenty of situations where I would’ve preferred a pre-configured solution. Hopefully I can capture the essence of my intentions.

One thing I should note before I get started is that NGINX does not have a module for PHP the way Apache does; PHP must be run using the FastCGI methodology. Much like the way you would pass requests to a Java container or reverse proxy, so must we for PHP.

The first thing I should mention is that I’m using the EPEL and IUS repositories to get the latest versions of NGINX and PHP-FPM. IUS is the official repository for RHEL/CentOS as referenced by PHP.net. Using these two repositories will not alter any existing packages on your system.

I added the repositories by installing their RPMs.

[root@00581-1-1042302 ~] # rpm -Uvh http://download.fedoraproject.org/pub/epel/6/i386/epel-release-6-5.noarch.rpm
Retrieving http://download.fedoraproject.org/pub/epel/6/i386/epel-release-6-5.noarch.rpm
Preparing... ########################################### [100%]
[root@00581-1-1042302 ~] # rpm -Uvh http://dl.iuscommunity.org/pub/ius/stable/Redhat/6/x86_64/epel-release-6-5.noarch.rpm
Retrieving http://dl.iuscommunity.org/pub/ius/stable/Redhat/6/x86_64/epel-release-6-5.noarch.rpm
Preparing... ########################################### [100%]
# ls /etc/yum.repos.d/
CentOS-Base.repo epel.repo ius-dev.repo
CentOS-Debuginfo.repo epel-testing.repo ius.repo
CentOS-Media.repo ius-archive.repo ius-testing.repo

(Note: of the newly added repository files, “epel.repo” and “ius.repo” are enabled by default; “epel-testing.repo”, “ius-dev.repo”, “ius-archive.repo”, and “ius-testing.repo” are disabled by default.)

Now let’s examine the /etc/nginx/nginx.conf configuration file. I’ve purposefully only included the changes that I made which are mostly based around commonsense performance and a Debian-like folder structure for easily enabling and disabling active sites.

Any change that I’ve made in PHP-FPM or NGINX has “GoGrid” appended to the line, as indicated by my grep below.

[root@00581-1-1042302 ~]# grep GoGrid /etc/nginx/nginx.conf
worker_processes 1; #GoGrid - Change to number of CPU threads/cores.
keepalive_timeout 5; #GoGrid
gzip on; #GoGrid
gzip_proxied any; #GoGrid
gzip_comp_level 1; #GoGrid
gzip_buffers 16 8k; #GoGrid
gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript application/javascript; #GoGrid
include /etc/nginx/sites-enabled/*; #GoGrid

worker_processes – The number of nginx processes that will run at one time. This should be equal to the number of CPU cores your server has. Note that each process can potentially handle thousands of connections, as set by worker_connections.

worker_connections – Each worker is set up to handle 1024 connections (this setting was not changed, so it does not appear in the grep output above).
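As a rough sketch of how these two settings relate, the script below detects the number of online cores and prints a matching worker_processes line along with the theoretical connection ceiling. It assumes a system where `getconf _NPROCESSORS_ONLN` is available (Linux and most BSDs), and the variable names are only illustrative. The real ceiling is usually lower because of file-descriptor limits.

```shell
#!/bin/sh
# Number of CPU cores currently online (assumes getconf supports this name).
cores=$(getconf _NPROCESSORS_ONLN)

# Per-worker connection limit quoted in the text above.
worker_connections=1024

# Upper bound on simultaneous connections: workers x connections per worker.
max_conns=$(( cores * worker_connections ))

echo "worker_processes $cores;"
echo "theoretical maximum connections: $max_conns"
```

On a single-core server this suggests `worker_processes 1;`, which matches the value in the image.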

gzip on – gzip compression is commonly turned on to reduce the amount of data sent over the network and speed up transaction times.

include /etc/nginx/sites-enabled/*; – Debian-style folder structure where one creates a symlink from sites-enabled to the real config file in sites-available.

[root@00581-1-1042302 ~]# ls -l /etc/nginx/sites-enabled/example.com

lrwxrwxrwx 1 root root 30 Mar 16 03:44 /etc/nginx/sites-enabled/example.com -> ../sites-available/example.com
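The enable/disable workflow behind that listing can be sketched as follows. To keep it safe to run anywhere, the script works in a scratch directory that stands in for /etc/nginx; the placeholder site file and the variable names are assumptions for illustration only.

```shell
#!/bin/sh
set -eu

# Scratch directory standing in for /etc/nginx (an assumption for this demo).
root=$(mktemp -d)
mkdir -p "$root/sites-available" "$root/sites-enabled"

# Placeholder site definition; a real one would hold the full server block.
printf 'server { server_name www.example.com; }\n' > "$root/sites-available/example.com"

# Enable the site: a relative symlink, as in the listing above.
ln -s ../sites-available/example.com "$root/sites-enabled/example.com"
enabled_target=$(readlink "$root/sites-enabled/example.com")
ls -l "$root/sites-enabled/"

# Disable the site: remove only the symlink; the real file is untouched.
rm "$root/sites-enabled/example.com"
```

On a real server you would typically follow either step with `nginx -t` to check the configuration and then reload NGINX so the change takes effect.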

Now that we’ve covered the NGINX configuration file, let’s take a look at the /etc/nginx/fastcgi_params file. This file is supposed to be shared between all website configuration files to provide a common environment. Rarely have I needed to modify anything in this file besides what is mentioned below. Again, I’m only including what I’ve changed or added.

#GoGrid - The following line is commented out to avoid duplicate filenames in the URI with most common configurations.
#fastcgi_param SCRIPT_FILENAME $request_filename; #GoGrid

#GoGrid - Commonly Added Options
#fastcgi_connect_timeout 30;
#fastcgi_send_timeout 30;
#fastcgi_read_timeout 30;
#fastcgi_buffer_size 128k;
#fastcgi_buffers 4 256k;
#fastcgi_busy_buffers_size 256k;
#fastcgi_temp_file_write_size 256k;
#fastcgi_intercept_errors on;

fastcgi_param SCRIPT_FILENAME – Since this is already defined in most individual website configurations on nginx (with the corresponding full path to the document root), I comment it out here, as defining it twice can lead to duplicated PHP URIs, like www.example.com/phpinfo.php/phpinfo.php

GoGrid – Commonly Added Options – These are performance options that many examples are modeled after. You can find these settings in blogs and nginx sample config files alike. In particular, setting the timeouts is important so that connections that have effectively timed out are actually severed. I would recommend uncommenting these options and using them, but I left them disabled in order to stick to my simplicity theme.
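If you do decide to enable the timeouts, one way to uncomment just those three lines is a small sed edit. The sketch below runs against a scratch copy with a few sample lines rather than the real /etc/nginx/fastcgi_params, and the pattern is my own assumption about how the lines are written in the file.

```shell
#!/bin/sh
set -eu

# Scratch file standing in for /etc/nginx/fastcgi_params.
conf=$(mktemp)
cat > "$conf" <<'EOF'
#fastcgi_connect_timeout 30;
#fastcgi_send_timeout 30;
#fastcgi_read_timeout 30;
#fastcgi_intercept_errors on;
EOF

# Strip the leading '#' from the three *_timeout directives only,
# keeping a .bak copy of the original file.
sed -i.bak 's/^#\(fastcgi_[a-z]*_timeout \)/\1/' "$conf"

cat "$conf"
```

The other commented options are left alone. On a live server, run `nginx -t` before reloading to confirm the edited file still parses.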

The individual website configuration in a vhost-style file/directory is incredibly standard. We’ve got logging, server_name, document root, directory index, etc. Anyone who’s worked with webservers should be relatively comfortable.

server {
    listen XXX.XXX.XXX.XXX:80 backlog=1024;
    server_name www.example.com;
    autoindex off;
    access_log /var/www/vhosts/example.com/log/nginx_access.log;
    error_log /var/www/vhosts/example.com/log/nginx_error.log;

    location / {
        root /var/www/vhosts/example.com/html;
        index index.php index.html;
    }

    location ~ \.php$ {
        fastcgi_pass unix:/tmp/php-fpm.socket;
        fastcgi_index index.php;
        fastcgi_param SCRIPT_FILENAME /var/www/vhosts/example.com/html$fastcgi_script_name; # same path as above
        include fastcgi_params;
    }
}

#server {
#    listen XXX.XXX.XXX.XXX:443 ssl;
#    ssl_certificate /usr/local/certs/www.example.com.nginx.pem;
#    ssl_certificate_key /usr/local/certs/www.example.com.nginx.key;
#    server_name www.example.com;
#    autoindex off;
#    access_log off;
#    error_log /var/www/vhosts/example.com/log/nginx_ssl_error.log;
#
#    location / {
#        root /var/www/vhosts/example.com/html;
#        index index.php index.html;
#    }
#
#    location ~ \.php$ {
#        fastcgi_pass unix:/tmp/php-fpm.socket;
#        fastcgi_index index.php;
#        fastcgi_param SCRIPT_FILENAME /var/www/vhosts/example.com/html$fastcgi_script_name; # same path as above
#        include fastcgi_params;
#    }
#}

Last, but not least, is the www.conf pool file in the /etc/php-fpm.d/ folder. NGINX and PHP-FPM have their own performance settings, which can be a bit of a change of pace from straight Apache and mod_php.

As usual, only the changes I’ve made are shown below.

[root@00581-1-1042302 ~]# grep GoGrid www.conf
listen = /tmp/php-fpm.socket; GoGrid
user = www-data; GoGrid
group = www-data; GoGrid
pm.max_children = 500; GoGrid
pm.start_servers = 50; GoGrid
pm.min_spare_servers = 50; GoGrid
pm.max_spare_servers = 350; GoGrid
pm.max_requests = 500; GoGrid
; slowlog = /var/log/php-fpm/www-slow.log; GoGrid

listen – I always change the listen address from a TCP socket to a Unix socket. For me this is just about having a clean netstat reading. Most sysadmins likely keep this on TCP for a myriad of reasons. To switch back to a TCP socket, go into the config file, uncomment the line directly above this setting, and comment this setting out. Don’t forget to update the fastcgi_pass option in the website configuration file.

user/group – User www-data created with a homedir of /var/www to segregate nginx and Apache users.

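pm.* settings – Process-manager settings similar to Apache’s (for example, pm.max_children plays a role comparable to Apache’s MaxClients). See the PHP-FPM documentation for more information.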

slowlog – Turned off, as this is more of a debugging feature.
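A common rule of thumb for sanity-checking pm.max_children is to divide the memory you can spare for PHP by the average size of one PHP-FPM child. The sketch below shows that arithmetic. Both input figures are hypothetical placeholders, not measurements from this image, so substitute values observed on your own server.

```shell
#!/bin/sh
# Hypothetical inputs; measure these on your own server.
ram_for_php_mb=4096   # memory you are willing to dedicate to PHP-FPM
avg_child_mb=40       # average resident memory of one PHP-FPM child

# Integer division gives a conservative ceiling for pm.max_children.
max_children=$(( ram_for_php_mb / avg_child_mb ))

echo "pm.max_children = $max_children"   # prints: pm.max_children = 102
```

If the result is far below the 500 configured in the image, it is worth lowering the setting on smaller servers so that a traffic spike cannot push the machine into swap.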

Although we’ve gone over a lot of details about the NGINX and PHP-FPM setup, the nice thing about this image is that it’s immediately ready to use without any modification!

Simply add a server through the GoGrid portal and search for nginx in the search bar.

After you’ve deployed this MyGSI image, simply edit your website configuration file with your server’s IP address, then enter that IP address in your browser; you should be greeted with a PHP info page. The firewall and update-rc.d profiles are already set up.

Now get out there, experiment, and let me know how it goes!


It’s always difficult to be self-congratulatory, but when two distinct organizations post articles putting GoGrid in the limelight, I feel that they have to be mentioned. The two sites, CRN and OnDemand, both posted articles that we, as a company, are proud of. If you are a current customer of GoGrid, you should be proud as well, as it is a recognition of your selection process in making GoGrid your cloud partner.

The first to appear was on CRN, a website providing news, analysis and perspectives for VARs and technology integrators. In his article, author Andrew Hickey lists “The 20 Coolest Cloud Infrastructure Vendors,” of which GoGrid is one.

As Hickey describes:

If you want to be in the cloud business, these are some of the cloud infrastructure companies that will help you get there. These are the providers that will host your customers’ business applications and provision them on-demand as Software-as-a-Service. They will store customers’ data in the cloud and secure it there as well. Whether customers want to use a private cloud, a public cloud or a hybrid mixture of both, these companies can help make it happen. And they can even help your customers exchange their expensive legacy hardware for a simple monthly payment plan.

(The full list, alphabetically: Amazon Web Services, AT&T, Bluelock, Cisco, Dell, Eucalyptus Systems, Gale Technologies, GoGrid, HP, Nebula, NephoScale, OpenStack, Opscode, OpSource, Rackspace, Savvis, SoftLayer, SunGard Availability Services, Terremark, and Verizon. Congrats to the other companies in the list!)

The second article of recognition comes from AlwaysOn (“Networking the Global Silicon Valley”) OnDemand. OnDemand is an event “where top Internet companies disrupt the enterprise and square off with the incumbent players pioneering cloud computing and SaaS.” This month, they posted their “2012 OnDemand 100 Top Private Companies.”

The editors describe this list as such:

This year’s 100 top, private on-demand and SaaS companies-plus 20 to watch-are creating a complex world of interconnected business intelligence, merging valuable legacy data and systems with new, vital streams of information. AlwaysOn is proud introduce the third annual OnDemand 100-the top emerging Internet companies disrupting the established enterprise, reinventing legacy data streams, and pioneering cloud computing and SaaS.

This year’s OnDemand 100 companies are true leaders, developing game-changing approaches and technologies that are pushing outside the bounds of existing markets and away from entrenched institutions. Companies were selected based on a set of five criteria: innovation, market potential, commercialization, stakeholder value, and media buzz.

(Companies in the “Cloud-Infrastructure” space include: Actifio, AlienVault, Box.net, Centrify, CloudShare, Coraid, Delphix, Dropbox, Evolve IP, GoGrid, IntelePeer, LiveOps, Mu Dynamics, Opscode, Palantir Technologies, RainStor, RingCentral, Skytap, Sonian, Spiceworks, Syncplicity, Veracode and WhiteHat Security. Great job to all those companies!)

Again, thanks for being a GoGrid customer and making us even stronger and better as a company. We look forward to providing you with industry-leading technologies and services! For those who are not yet GoGrid customers, please be sure to contact us to find out why we are a recognized leader in the cloud industry.


In Part 1 of this Big Data series, I provided a background on the origins of Big Data.

But What is Big Data?

The problem with using the term “Big Data” is that it’s used in a lot of different ways. One definition is that Big Data is any data set that is too large for on-hand data management tools. According to Martin Wattenberg, a scientist at IBM, “The real yardstick … is how it [Big Data] compares with a natural human limit, like the sum total of all the words that you’ll hear in your lifetime.” Collecting that data is a solvable problem, but making sense of it (particularly in real time) is the challenge that technology tries to solve. This new type of technology is often listed under the title of “NoSQL” and includes distributed databases that are a departure from relational databases like Oracle and MySQL. These systems are specifically designed to parallelize compute, distribute data, and provide fault tolerance across a large cluster of servers. Some examples of NoSQL projects and software are: Hadoop, Cassandra, MongoDB, Riak and Membase.

The techniques vary, but there is a definite distinction between SQL relational databases and their NoSQL brethren. Most notably, NoSQL systems share the following characteristics:

  • Do not use SQL as their primary query language
  • May not require fixed table schemas
  • May not give full ACID guarantees (Atomicity, Consistency, Isolation, Durability)
  • Scale horizontally

Because NoSQL systems may relax ACID guarantees, they are used when performance and real-time results are more important than strict consistency. For example, if a company wants to update its website in real time based on an analysis of a particular user’s interactions with the site, it will most likely turn to NoSQL to solve this use case.
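To make “scale horizontally” a little more concrete, here is a minimal sketch of hash partitioning, one simple way a distributed store can decide which node holds a given record. The node count, key names, and the `node_for_key` helper are all hypothetical. Real NoSQL systems generally use more elaborate schemes (such as consistent hashing) so that adding a node does not reshuffle most of the keys.

```shell
#!/bin/sh
# Hypothetical cluster size; scaling horizontally means raising this number.
nodes=4

# Map a record key to a node index in 0..nodes-1 using the POSIX cksum CRC.
node_for_key() {
    crc=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    echo $(( crc % nodes ))
}

# The same key always lands on the same node, so reads know where to look.
for key in user:1001 user:1002 user:1003; do
    echo "$key -> node $(node_for_key "$key")"
done
```

Because each key maps to exactly one node, reads and writes for different keys can be served by different machines in parallel.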

However, this does not mean that relational databases are going away. In fact, it is likely that in larger implementations, NoSQL and SQL will function together. Just as NoSQL was designed to solve a particular use case, so do relational databases solve theirs. Relational databases excel at organizing structured data and are the standard for serving up ad-hoc analytics and business intelligence reporting. In fact, Apache Hadoop even has a separate project called Sqoop that is designed to link Hadoop with structured data stores. Most likely, those who implement NoSQL will maintain their relational databases for legacy systems and for reporting off of their NoSQL clusters.

Big Data and the Cloud

The early adopters of Big Data were small web companies that grew into much larger companies with capital budgets that could be invested in dedicated data centers. However, with the incredible increase in the amount of data generated, collected, and analyzed, smaller companies can take advantage of the cloud and off-load the hardware management to cloud vendors. Two traits that many of these NoSQL solutions have in common make them a seemingly natural fit for the cloud: one is that the nodes are distributed, and the second is that they run on commodity hardware. The cloud is designed for horizontal scaling and often built on low-cost, commodity hardware, especially at the infrastructure-as-a-service (IaaS) layer, where customers simply need infrastructure and have the application expertise to build and configure their own Big Data application (whether it is with Hadoop, Cassandra, or any number of products).

Given what most users are trying to achieve with Big Data applications – large-scale data sets, large-scale analysis, often in real-time – performance is a key factor. Ideally, users will want a hybrid implementation that combines both virtual and dedicated servers. This gives maximum flexibility that balances the elastic, scalable nature of virtual machines with the single-tenancy of dedicated servers. Big Data projects don’t happen in a vacuum: while a NoSQL database can leverage dedicated servers, the app or web servers that present the results of the analysis to end users can easily be added to as many virtual machines as needed to meet demand. In addition, using the cloud means that users won’t need to invest in expensive equipment, pay for power and connectivity, or hire additional resources to maintain hardware. Users simply need to pay for the infrastructure that they need and have the ability to scale over time. The ability to scale up or down to match demand (and to only pay for the infrastructure that you use) is one of the values of using the cloud for Big Data.

Whatever solution you select, you should also take into account the nature of the application and where you will want to house the processing and the output. The amount of data you collect, analyze and present will only increase over time. The advantage will go to companies that can collect and analyze this data quickly and efficiently, allowing them to react instantly to customer sentiment and to changing trends in the ever-quickening pace of business. Make sure to select an infrastructure vendor that can match your performance criteria and has the capacity to grow with you as your data and application needs increase to match the demands of your business.