You’ve heard about scaling up, but what about scaling down? Imagine you’ve got an application with thousands of users connected at once after a big marketing push. You’ve planned your infrastructure so that it can grow with the demand. However, a spike in traffic won’t last forever. If you don’t plan properly, you can be left with a ton of unnecessary infrastructure after the rush, and if you’re not using it, those spare servers are doing nothing but costing you money.
The solution to this problem is to make your architecture elastic as well as scalable – but what exactly is elasticity in cloud computing?
What is elasticity?
The focus for many applications is scalability – the ability to scale up. A scalable application can handle bursts of traffic or resource-heavy jobs by growing its architecture. As a rule of thumb, the more resources you provision, the more traffic you can handle.
There are two ways to scale:
- Vertical – Adding resources to existing infrastructure. With cloud providers like AWS, this usually means upgrading to higher plans with more computing resources.
- Horizontal – Provisioning more infrastructure and distributing workloads across multiple instances. This method is generally more efficient for large applications, but requires more planning upfront.
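The difference between the two approaches can be sketched with a toy capacity model. All numbers here are made-up illustrative figures, not real AWS instance specs:

```python
# Toy model contrasting vertical and horizontal scaling.
# Instance sizes, counts, and request rates are illustrative assumptions.

def vertical_capacity(base_rps: int, size_multiplier: int) -> int:
    """Vertical: one machine, upgraded to a larger size."""
    return base_rps * size_multiplier

def horizontal_capacity(base_rps: int, instances: int) -> int:
    """Horizontal: several standard machines sharing the load."""
    return base_rps * instances

base = 1_000  # requests/sec a single standard instance can serve (assumed)

# Same headline capacity either way...
print(vertical_capacity(base, 4))       # one 4x-larger instance: 4000
print(horizontal_capacity(base, 4))     # four standard instances: 4000

# ...but if one machine fails, the horizontal fleet keeps 3/4 of its
# capacity, while the single vertically scaled machine takes it all down.
print(horizontal_capacity(base, 4 - 1))  # 3000
```

This redundancy is one reason horizontal scaling is generally preferred for large applications, despite the extra planning it requires.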
An example of this situation is if your web application gets featured on a site like Hacker News or Product Hunt. When this happens, you’re likely to get a sudden rush of traffic. If you cannot scale up, your application is likely to buckle under the load. The results can be incredibly damaging to your reputation – if people can’t use your site, they can’t see what you have to offer.
Scalability versus elasticity
Elasticity covers the ability to scale up but also the ability to scale down. The idea is that you can quickly provision new infrastructure to handle a high load of traffic, like the example above. But what happens after that rush? If you leave all of these new instances running, your bill will skyrocket as you will be paying for unused resources. In the worst-case scenario, these resources can even cancel out revenue from the sudden rush. An elastic system prevents this from happening. After a scaled-up period, your infrastructure can scale back down, meaning you will only be paying for your usual resource usage plus some extra for the high-traffic period.
The key is that this all happens automatically. When resource needs meet a certain threshold (usually measured by traffic), the system “knows” that it needs to de-provision a certain amount of infrastructure, and does so. With a couple hours of training, anyone can use the AWS web console to manually add or subtract instances. But it takes a true Solutions Architect to set up monitoring, account for provisioning time, and configure a system for maximum elasticity.
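The decision logic behind that automation can be sketched as a simple threshold rule. This is a minimal illustration of the concept, not AWS’s actual policy engine – the thresholds, limits, and function names are assumptions:

```python
# Minimal sketch of a threshold-based scaling decision, the kind of rule
# an auto scaling policy encodes. All thresholds are illustrative.

def desired_instances(current: int, cpu_utilization: float,
                      scale_up_at: float = 70.0,
                      scale_down_at: float = 30.0,
                      minimum: int = 2, maximum: int = 20) -> int:
    """Return how many instances the fleet should run next."""
    if cpu_utilization > scale_up_at:
        current += 1          # demand is high: provision one more instance
    elif cpu_utilization < scale_down_at:
        current -= 1          # demand is low: de-provision an idle instance
    # Clamp to the fleet's configured floor and ceiling.
    return max(minimum, min(maximum, current))

print(desired_instances(4, 85.0))  # high load  -> 5
print(desired_instances(4, 20.0))  # low load   -> 3
print(desired_instances(2, 10.0))  # floor holds -> 2
```

A real policy would also account for provisioning time and cooldown periods so the fleet doesn’t thrash between sizes – exactly the kind of tuning a Solutions Architect is responsible for.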
What can elasticity do for you?
Elasticity offers a few key benefits:
- Ability to scale up and handle high volumes of traffic
- Ability to scale down and use fewer resources when demand drops
- Keeps your users happy and your reputation good (scaling up)
- Saves you money (scaling down)
All of these principles are related to a central problem – avoiding both over-provisioning and under-provisioning. There is a fine line between not having the resources to run your application and wasting money on infrastructure you don’t need. Elasticity is all about smart, efficient architecture that finds a balance between “not enough” and “too much.”
Big deal! How much could a couple extra servers cost?
If you run a small e-commerce store or personal side project, your application is probably not big enough to require more than a few servers. Part of what makes cloud services like AWS so great is the low cost – you can run an EC2 instance for just a few dollars per month. Having an extra server or two might not seem like a big deal, and at that scale, it really isn’t.
The million server example
But what about giant companies like Microsoft, which runs well over a million servers? It’s a bit of an apples-to-oranges comparison since Microsoft owns its own datacenters, but let’s ignore that for the sake of example.
In a one-million-server environment, even a 1% margin of error (10,000 servers) could be incredibly costly. Large companies often run dozens or even hundreds of different applications. When servers are dedicated to a single application, other applications can only scale up by provisioning entirely new instances.
Let’s imagine a large company has application A, which runs on 1,000 servers, and application B, which runs on another 1,000. So that’s a total of 2,000 instances. Now suppose application A only needs 500 servers to perform its function efficiently, but application B needs to scale up to 1,500 servers. In an elastic system, application A will scale down, and the 500 under-utilized servers will be available for application B to use when scaling up. Overall, the number of servers remains the same.
If the system was not designed with elasticity in mind, application B would simply provision 500 new servers. Overall, this would increase the number of servers to 2,500. The company is now paying for 500 servers they don’t need – even if each one only costs a few dollars a month, that’s several thousand dollars wasted.
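The arithmetic of that waste is easy to make concrete. The $5/month price below is an assumed round figure for illustration:

```python
# Back-of-the-envelope cost of the example above: inelastic scaling leaves
# 500 idle servers running. The per-server price is an assumption.

cost_per_server = 5          # USD per month (assumed round figure)
elastic_total   = 2_000      # A scales down to 500, B scales up to 1,500
inelastic_total = 2_500      # B provisions 500 brand-new servers instead

wasted = (inelastic_total - elastic_total) * cost_per_server
print(f"${wasted:,} wasted per month")  # $2,500 wasted per month
```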
Now imagine this example at the scale of tens of thousands of servers. Elasticity is absolutely critical to not only performance, but managing business costs. As more and more companies move to the cloud, costs will continue to grow. It’s easy to see the role of elasticity in making this transition as smooth as possible.
How can you make your application more elastic?
Apart from the ability to quickly scale up to handle bursts of traffic, and to quickly scale down and save money on resources, it’s important to understand elasticity as a concept when evaluating providers. Not many services offer the flexibility that AWS does with their products. When choosing a cloud provider, check whether they offer some sort of ‘elasticity’ service. Can you imagine spending your entire budget on a cloud provider, and weeks configuring your application and infrastructure to work with their services, only to find out that you are stuck with the hardware you chose at the beginning of your contract?
Elasticity on Amazon Web Services
Elasticity is at the core of many AWS products – several services even have the word in their name.
AWS offers a feature called Auto Scaling, which is used with the Elastic Compute Cloud (EC2) service. Auto Scaling allows your EC2 instances to easily scale up or down depending on your requirements. Here are just a few Auto Scaling features:
- Scale up automatically when demand increases
- Scale down automatically when demand subsides
- Replace unreachable or stalled EC2 instances to maintain high availability
- Receive SNS notifications when auto scaling initiates or completes an action
AWS offers many ways to help make your application elastic. The Elastic Load Balancer scales automatically with the traffic your application receives. It can also integrate with Auto Scaling on your back-end services to offer an end-to-end scaling layer that handles different levels of traffic.
As cloud usage grows, companies will face bigger and bigger challenges in managing their infrastructure. Elasticity is not the only way to manage costs and ensure that sites and applications can meet business needs – but it is an extremely important concept for Solutions Architects.
Think you have what it takes to implement elastic solutions in a real-world environment? Challenge yourself and prove you’re ready to lead the way into the future of cloud computing.