When I think about the phrase “auto-scaling,” for some reason it conjures up the word “Transformers.” For those not familiar with the Transformers franchise of cartoons, toys, games, and movies, it is essentially about cars that turn into robots, or vice versa, depending on how you look at it. When they need to fight or confront a challenge, Transformers will scale up from a vehicle (a car, truck, airplane, etc.) into a much larger robot. Then, when the challenge subsides, they scale back down to a vehicle.
Scaling, in terms of infrastructure, is a similar concept, applied to servers either horizontally or vertically. Horizontal scaling means adding (or removing) servers within an infrastructure environment. Vertical scaling means adding resources (such as RAM or CPU) to an existing server.
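To make the distinction concrete, here is a minimal sketch in Python. The `Server` class and the capacity numbers are purely illustrative, not part of any real provider's API: horizontal scaling grows the pool, while vertical scaling grows one machine.

```python
from dataclasses import dataclass

@dataclass
class Server:
    """A simplified server model: capacity measured in requests per minute."""
    ram_gb: int
    requests_per_minute: int

def total_capacity(pool):
    """Aggregate capacity of a load-balanced pool."""
    return sum(s.requests_per_minute for s in pool)

# Start with one modest server.
pool = [Server(ram_gb=4, requests_per_minute=500)]

# Horizontal scaling: add another server to the pool.
pool.append(Server(ram_gb=4, requests_per_minute=500))

# Vertical scaling: give an existing server more resources.
pool[0].ram_gb = 8
pool[0].requests_per_minute = 1000
```

Either route raises total capacity; the difference is whether you change the number of servers or the size of one.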
Let’s look at an example. An author on a content creation website may write an article that attracts the attention of the social media community. What starts as a few views of the article per minute may, once shared widely on social media, turn into hundreds or thousands of requests per minute. When this spike in demand occurs, the server or servers handling the website’s content may come under extreme load, affecting their ability to respond in a timely manner. The results can range from long page-load times to the server actually crashing under the additional peak load. In the past, this scenario was known as the “Digg effect” or “Slashdot effect.”
Although this type of success is great publicity for the author, it’s bad for the brand hosting the content. And, if users encounter slow or inaccessible websites, they’re less likely to return for other content at a later point, which can eventually result in a loss of revenue.
However, you can often manage web load simply by throwing more hardware at the problem. The more servers you have (assuming your site is load balanced), the better it can handle traffic spikes. (Of course, you could also scale up your individual servers by giving them more processing power and RAM.) You can scale both physical and virtual (cloud) servers, and you can do this process manually.
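The load-balancing assumption above is what lets extra servers absorb a spike. A rough sketch of the idea, using a simple round-robin strategy and made-up server names:

```python
import itertools

class RoundRobinBalancer:
    """Distributes incoming requests evenly across a pool of servers
    by cycling through them in order (one common balancing strategy)."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def route(self, request):
        """Pick the next server in rotation for this request."""
        server = next(self._cycle)
        return server, request

# Three servers in the pool; six requests arrive during a spike.
balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
assignments = [balancer.route(f"req-{i}")[0] for i in range(6)]
# Each server ends up handling an equal share: two requests apiece.
```

With three servers instead of one, each machine sees a third of the traffic, which is why adding servers to a balanced pool raises the spike you can survive.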
If you manually add servers to handle demand, however, you also run the risk of having those same servers sit idle when peak load subsides. This situation is clearly not cost-effective, especially if you purchased physical servers.
If you’re using cloud servers, you can always delete them easily (try doing that with a physical server…and justifying the expense).
With the advent of cloud computing, this scaling process is much easier, and, even better, you can programmatically control these server additions (and deletions) with an API. This is where the term “auto-scaling” truly comes into play. Before I go into the details, take a look at the short video below (also available directly on YouTube), which provides a high-level overview of how auto-scaling works:
Obviously, the video is an over-simplification of auto-scaling. The essentials are these:
- Have a load-balanced infrastructure
- Have “clones” of your application servers ready to be deployed
- Set up an “agent” to monitor load to your live infrastructure
- When load increases, automatically (via the API) deploy new server “clones” into the load-balanced environment
- When load subsides, automatically (via the API) remove the servers from the load-balanced pool
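The steps above can be sketched as a simple control loop. Everything here is hypothetical: the `CloudAPI` class, thresholds, and server names stand in for whatever calls your provider (GoGrid or otherwise) actually exposes for creating and deleting servers.

```python
SCALE_UP_LOAD = 0.75    # deploy a clone when average load exceeds this
SCALE_DOWN_LOAD = 0.25  # remove a clone when average load falls below this
MIN_SERVERS = 1         # always keep at least one server in the pool

class CloudAPI:
    """Hypothetical stand-in for a provider API; tracks a pool of
    load-balanced server 'clones'."""

    def __init__(self):
        self.pool = ["web-1"]

    def deploy_clone(self):
        """Bring a pre-built application-server clone into the pool."""
        self.pool.append(f"web-{len(self.pool) + 1}")

    def remove_clone(self):
        """Take a clone out of the pool, but never drop below the minimum."""
        if len(self.pool) > MIN_SERVERS:
            self.pool.pop()

def autoscale(api, average_load):
    """One tick of the monitoring agent: react to current average load."""
    if average_load > SCALE_UP_LOAD:
        api.deploy_clone()
    elif average_load < SCALE_DOWN_LOAD:
        api.remove_clone()

api = CloudAPI()
autoscale(api, 0.90)  # spike: a clone is deployed into the pool
autoscale(api, 0.10)  # load subsides: the clone is removed
```

A real agent would run this check on a schedule and usually average load over a window (so a momentary blip doesn’t trigger a scaling event), but the add-on-high, remove-on-low loop is the core of it.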
An API is a powerful tool for enabling automation within your cloud infrastructure. It not only helps you save money by eliminating infrastructure resources when capacity or demand doesn’t require them, but it can also keep your customers satisfied by ensuring that your infrastructure performs in line with demand.
Soon we’ll be publishing an article on how you can create an auto-scaling web application using tools available to you in the GoGrid cloud. With this or a similarly architected, API-driven tool, you’ll not only be able to have a highly available website, but also one that’s fine-tuned for high performance.
Latest posts by Michael Sheehan (see all)