What is Auto-Scaling, How Does it Work, & Why Should I Use it?

March 11th, 2013

When I think about the phrase “auto-scaling,” for some reason it conjures up the word “Transformers.” For those not familiar with the Transformers franchise of cartoons, toys, games, and movies, it is essentially about cars that turn into robots or vice versa, depending on how you look at it. When they need to fight or confront a challenge, Transformers scale up from a vehicle (a car, truck, airplane, etc.) into a much larger robot. Then, when the challenge subsides, they scale back down to a vehicle.

Scaling Explained

Scaling, in terms of infrastructure, is a similar concept applied to servers, and it comes in two forms. Horizontal scaling means adding (or removing) servers within an infrastructure environment. Vertical scaling means adding resources (like RAM) to an existing server.

Let’s look at an example. An author on a content website may write an article that attracts the attention of the social media community. What starts as a few views of the article per minute may, once the article is widely shared on social media, turn into hundreds or thousands of requests per minute. When this spike in demand occurs, the server or servers handling the website’s content may come under extreme load, which affects their ability to respond in a timely manner. The results can range from long page loads to the server actually crashing under the additional peak load. In the past, this scenario was known as the “Digg effect” or “Slashdot effect.”

Although this type of success is great publicity for the author, the resulting slowdown is bad for the brand hosting the content. If users encounter slow or inaccessible websites, they’re less likely to return for other content later, which can eventually result in a loss of revenue.

However, you can manage web load by simply throwing more hardware at the problem. The more servers you have (assuming your site is load balanced), the better your site is able to handle traffic spikes. (Of course, you could also scale up your individual servers by giving them more processing power and RAM.) You can scale both physical and virtual (cloud) servers. And you can do this process manually.
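To make the "more hardware" idea concrete, here is a minimal sketch of the arithmetic behind horizontal scaling. The function name and the figure of 600 requests per minute per server are assumptions chosen for illustration, not measurements from any real site.

```python
import math


def servers_needed(requests_per_minute: float, capacity_per_server: float) -> int:
    """Estimate how many load-balanced servers are needed for a given request rate.

    capacity_per_server is a hypothetical figure: the requests per minute
    one server can handle while still responding in a timely manner.
    """
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    # Keep at least one server so the site stays reachable when traffic is low.
    return max(1, math.ceil(requests_per_minute / capacity_per_server))


# Assumed numbers: a few views per minute normally, thousands during a spike.
normal = servers_needed(5, 600)
spike = servers_needed(3000, 600)
print(normal, spike)  # prints "1 5"
```

Under these assumed numbers, a site that runs comfortably on one server would need five to ride out the spike.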

Manually add to infrastructure

If you manually add servers to handle demand, however, you also run the risk of having those same servers sit idle when peak load subsides. This situation is clearly not cost-effective, especially if you purchased physical servers.

If you’re using cloud servers, you can always delete them easily (try doing that with a physical server…and explain the cost).

Manually remove servers

Auto-Scaling Explained

With the advent of cloud computing, this scaling process is much easier. Even better, you can programmatically control these server additions (or deletions) with an API. This is where the term “auto-scaling” truly comes into play. Before I go into the details, take a look at the short video below (also available directly on YouTube), which provides a high-level overview of how auto-scaling works:

Obviously, the video is an over-simplification of auto-scaling. The essentials are these:

  • Have a load-balanced infrastructure
  • Have “clones” of your application servers ready to be deployed
  • Set up an “agent” to monitor load to your live infrastructure
  • When load increases, automatically (via the API) deploy new server “clones” into the load-balanced environment
  • When load subsides, automatically (via the API) remove the servers from the load-balanced pool
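
The essentials above can be sketched as a small simulation. Everything here is hypothetical: `LoadBalancedPool` is an in-memory stand-in for a real load-balanced environment, its methods stand in for provider API calls, and the thresholds and per-server capacity are assumed values. It does not use the GoGrid API or any other real cloud API.

```python
class LoadBalancedPool:
    """In-memory stand-in for a load-balanced pool of application servers."""

    def __init__(self, min_servers=1, max_servers=10):
        self.min_servers = min_servers
        self.max_servers = max_servers
        self.servers = [f"web-{i}" for i in range(1, min_servers + 1)]
        self._next_id = min_servers + 1

    def add_clone(self):
        # A real implementation would deploy a server "clone" via the provider's API.
        if len(self.servers) < self.max_servers:
            self.servers.append(f"web-{self._next_id}")
            self._next_id += 1

    def remove_server(self):
        # A real implementation would remove a server from the pool via the API.
        if len(self.servers) > self.min_servers:
            self.servers.pop()


def autoscale_step(pool, total_load, capacity_per_server=600,
                   scale_up_at=0.75, scale_down_at=0.30):
    """One pass of the monitoring "agent": compare utilization to thresholds."""
    utilization = total_load / (len(pool.servers) * capacity_per_server)
    if utilization > scale_up_at:
        pool.add_clone()
    elif utilization < scale_down_at:
        pool.remove_server()
    return len(pool.servers)


# Assumed load samples (requests per minute): quiet, spike, then quiet again.
pool = LoadBalancedPool()
samples = [100, 900, 2500, 2500, 2500, 400, 100, 100]
history = [autoscale_step(pool, load) for load in samples]
print(history)  # prints [1, 2, 3, 4, 5, 4, 3, 2]
```

This sketch adds or removes one server per check, so the pool grows gradually during the spike and shrinks gradually afterward. The gap between the two thresholds is a design choice that keeps the pool from flapping when load hovers near a single cutoff.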

Automate auto-scaling infrastructure with GoGrid API

An API is a powerful tool for automating your cloud infrastructure. It not only helps you save money by removing infrastructure resources when demand doesn’t require them, it also helps keep your customers satisfied by making sure your infrastructure has enough capacity when demand rises.
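As a rough illustration of the cost side, the sketch below compares server-hours for a pool that is permanently sized for peak load against one that is resized every hour. The traffic profile and the capacity of 600 requests per minute per server are assumptions, and the model ignores server start-up time and billing granularity.

```python
import math


def server_hours(hourly_load, capacity_per_server, fixed_servers=None):
    """Total server-hours over the period, for a fixed or an auto-scaled pool."""
    total = 0
    for load in hourly_load:
        if fixed_servers is not None:
            # Static provisioning: every server runs every hour.
            total += fixed_servers
        else:
            # Auto-scaled: run only what this hour's load requires.
            total += max(1, math.ceil(load / capacity_per_server))
    return total


# Hypothetical day: 20 quiet hours and 4 peak hours (requests per minute).
hourly_load = [200] * 20 + [3000] * 4
peak_servers = max(1, math.ceil(max(hourly_load) / 600))

static = server_hours(hourly_load, 600, fixed_servers=peak_servers)
scaled = server_hours(hourly_load, 600)
print(static, scaled)  # prints "120 40"
```

With these assumed numbers, the auto-scaled pool uses a third of the server-hours while still providing five servers during the peak.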

Automatically add infrastructure using GoGrid API based on demand

Soon we’ll be publishing an article on how you can create an auto-scaling web application using tools available to you in the GoGrid cloud. With this or a similarly architected, API-driven tool, you’ll not only be able to have a highly available website, but also one that’s fine-tuned for high performance.

Michael Sheehan

Michael Sheehan, formerly the Technology Evangelist for GoGrid, is a recognized technology, social media, and cloud computing pundit and blogger who writes regularly about technology news and trends.