Selecting a Provider and Infrastructure for Running an In-Memory Database

August 19th, 2014

The need for database speed is always a given. Recently, fast application response time has been shown to not only provide customers with a better experience, but also directly impact the bottom line. Think about companies running mobile advertising networks that are paid for delivering an advertising impression to users swiping away at their mobile phones to flip to the next screen. If the ad doesn’t load, well, that equals lost revenue. For these customers, response time is mission-critical. A common solution for applications that require fast response times is to run the database in memory, an approach known as an in-memory database (IMDB). You can easily do so in the cloud; however, selecting the appropriate infrastructure and even the appropriate provider can be tricky. Depending on the provider, for example, there may be hidden charges, less-than-ideal network topologies, and in many cases, a poor selection of virtual machines.


So how do you choose a reliable provider? And do you know what you’re looking for in terms of infrastructure? There are 3 key requirements that will help you get started:

1. Know your database memory requirements

2. Identify your cloud provider requirements

3. Understand your infrastructure requirements

Know your database memory requirements

This requirement seems obvious, but you’d be surprised how many people start by looking at the available servers from a particular provider rather than understanding their application profile. To fill out that profile, you’ll want to answer questions like these:

- How much memory do you need at the start?

- How fast will you need to add resources?

- How fast will your storage needs grow?

- What options are available for scaling elastically, if necessary?

Many databases that run in memory can be much smaller because they eliminate things like caching processes, file I/O, or software modules that manage data routing. In short, the database is optimized to conserve memory so that memory can be used as storage space.

At GoGrid, we find that customers who want to run in memory are typically looking for a minimum of 16 GB RAM per server, with a minimum of 3 to 5 servers running, depending on the application profile. Riak, for example, has an in-memory option, so during development it’s fine to run 3 servers with 8 GB RAM each; for production use, however, you’d want a minimum of five 16-GB servers to get the best performance. With the appropriate cloud provider, you can then simply scale out as demand increases.
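As a rough illustration of how the profile questions above turn into a server count, here is a minimal Python sketch. The replication factor, the 70% usable-RAM fraction, the growth rate, and the function names are all assumptions made for the example; they are not GoGrid or Riak recommendations, and real sizing should be based on measurements of your own workload.

```python
import math

def nodes_needed(dataset_gb, replication_factor, ram_per_node_gb,
                 usable_fraction=0.7, min_nodes=3):
    """Estimate how many servers are needed to hold a dataset fully in RAM.

    usable_fraction is an ASSUMED share of each server's RAM left for data
    after the OS and database overhead; it is not a measured value.
    """
    total_gb = dataset_gb * replication_factor
    usable_per_node_gb = ram_per_node_gb * usable_fraction
    return max(min_nodes, math.ceil(total_gb / usable_per_node_gb))

def months_until_full(dataset_gb, monthly_growth, replication_factor,
                      ram_per_node_gb, node_count, usable_fraction=0.7):
    """Count whole months of compound growth that still fit in the cluster's RAM."""
    if monthly_growth <= 0:
        raise ValueError("monthly_growth must be positive")
    capacity_gb = ram_per_node_gb * usable_fraction * node_count
    size_gb = dataset_gb * replication_factor
    months = 0
    # Keep growing while next month's size would still fit in memory.
    while size_gb * (1 + monthly_growth) <= capacity_gb:
        size_gb *= 1 + monthly_growth
        months += 1
    return months

# Hypothetical profile: 15 GB of data, 3 replicas, 16 GB servers.
print(nodes_needed(15, 3, 16))                  # 5
# Same profile growing 10% per month on those 5 servers.
print(months_until_full(15, 0.10, 3, 16, 5))    # 2
```

With these assumed numbers, a 15 GB dataset replicated three times lands on five 16-GB servers, and 10% monthly growth leaves only about two months before it is time to scale out, which is why the speed of adding resources belongs in the profile.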

If you know you’re going to be working with massive amounts of data right out of the gate, you’d want larger options. GoGrid’s High RAM servers, for example, go all the way up to 256 GB RAM. With 5 of those servers running in a ring, you’re sure to get your project off to a blazing-fast start. DataStax Enterprise is another common NoSQL application that has in-memory options and would typically be run in a ring. As you can imagine, having access to resources that scale up as demand goes up becomes critical here.

Identify your cloud provider requirements

Not all cloud providers offer the network performance or the server selection needed to run effectively in memory. Of course, you can try to shoehorn your application into a less-than-optimal provider’s network and make it work, but honestly, why bother? You have better things to do and there are a few good options for running in memory. Here’s a short checklist that will help you select the right provider and get started with the appropriate infrastructure:

1. Are the in-memory options sitting on a high-performance network?

Look for a provider that offers a 10-Gbps network backbone. If the provider has a block storage option, ask yourself if you’ll want access to it at any point with your architecture. In other words, you don’t want to have 10 Gbps between servers that are accessing throttled block storage performing well below 1 Gbps. GoGrid’s backbone provides an optimized 40-Gbps dedicated private network for block storage so we can ensure it doesn’t become a bottleneck.
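To make the bottleneck point concrete, here is a small Python sketch that treats a data path as only as fast as its slowest link. The function names and the 16 GB reload scenario are assumptions for illustration, and the calculation ignores protocol overhead, latency, and contention, so real transfers will be slower.

```python
def effective_gbps(*link_gbps):
    """A data path is only as fast as its slowest link."""
    return min(link_gbps)

def transfer_seconds(data_gb, *link_gbps):
    """Ideal time to move data_gb gigabytes over the path (no overhead)."""
    gigabits = data_gb * 8  # 8 bits per byte
    return gigabits / effective_gbps(*link_gbps)

# Hypothetical case: reloading 16 GB into memory after a restart.
fast_path = transfer_seconds(16, 10)        # 10-Gbps network end to end
throttled = transfer_seconds(16, 10, 1)     # same network, block storage capped at 1 Gbps
print(f"{fast_path:.1f} s vs {throttled:.1f} s")
```

Under these assumptions the throttled path takes ten times longer, no matter how fast the network between the servers is.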

2. Does the provider meet your PCI, HIPAA, and other regulatory requirements?

Meeting these requirements is difficult for some providers. In many cases, providers won’t help customers maintain their PCI or HIPAA compliance, so you’ll want to think about this requirement up-front. What options does the provider offer? How do they protect you? Do they take security seriously?

3. What’s the provider’s uptime and does it have an SLA?

You’d be surprised how many providers don’t provide an SLA and don’t publish or share their uptime statistics. GoGrid consistently achieves 99.999% uptime, and we back it up with our 100% uptime guarantee. That means we pay you when our services go down.
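When comparing uptime figures, it can help to convert a percentage into the downtime it allows. The Python sketch below does that arithmetic for a 365-day year; the function name is made up for the example, and the comparison percentages other than 99.999% are illustrative rather than taken from any provider’s SLA.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; ignores leap years

def allowed_downtime_minutes(uptime_percent, period_minutes=MINUTES_PER_YEAR):
    """Minutes of downtime per period that an uptime percentage permits."""
    return period_minutes * (1 - uptime_percent / 100)

for pct in (99.9, 99.99, 99.999):
    print(f"{pct}% uptime allows about "
          f"{allowed_downtime_minutes(pct):.2f} minutes of downtime per year")
```

At 99.999%, the allowance works out to a little over 5 minutes per year, compared with nearly 9 hours at 99.9%.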

4. Are there any hidden charges?

This is where things can get complicated because some providers are very good at hiding the actual costs of their services. The most common hidden charges are related to transactions and IOPS across the network. If you need to access block storage, for example, there are providers that will charge for those transactions—and those transactions can add up to a lot of money, depending on your architecture. Another common charge is for Support. Many providers ask you to pay extra to reach a representative via chat, a ticketing system, or phone. And the last big charge to look out for is account fees. Many providers charge a monthly fee just for opening an account and then make it difficult to close that account. That’s right: they’re actually making you pay while they continue to market to you and leverage your personal information.
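One way to surface these charges is to put them in the same forecast as the server price. The Python sketch below is only an illustration: every price, fee, and request volume in it is hypothetical, and the function and field names are invented for the example rather than taken from any provider’s billing model.

```python
def monthly_cost(server_count, server_monthly, io_requests_millions=0.0,
                 price_per_million_io=0.0, support_fee=0.0, account_fee=0.0):
    """Break a monthly bill into the advertised part and the easy-to-miss parts."""
    servers = server_count * server_monthly
    io = io_requests_millions * price_per_million_io
    hidden = io + support_fee + account_fee
    return {
        "servers": servers,
        "io": io,
        "support": support_fee,
        "account": account_fee,
        "hidden_share": hidden / (servers + hidden) if servers + hidden else 0.0,
        "total": servers + hidden,
    }

# All figures below are HYPOTHETICAL, chosen only to show the arithmetic.
bill = monthly_cost(server_count=5, server_monthly=200.0,
                    io_requests_millions=2000, price_per_million_io=0.25,
                    support_fee=100.0, account_fee=10.0)
print(bill["total"], f"{bill['hidden_share']:.0%} of the bill is not server cost")
```

Running the same forecast with each candidate provider’s published rates makes it easier to compare the total cost rather than just the headline server price.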

5. Does the provider have infrastructure tuned for your application?

Most providers offer generic infrastructure that’s fine for general workloads and even for dev and test, but what about when it comes to running high-performance, in-memory databases? What about running MongoDB, Riak, or DataStax in memory? Does the provider have a recommended infrastructure? Has that infrastructure been tuned to meet the specific needs of your application? Unfortunately, most providers don’t go to great lengths to understand this differentiation and provide the necessary infrastructure in each case.

If you can answer the preceding questions, you should be on the right path toward selecting a provider. And keep in mind the provider at a minimum should have:

- A robust network that doesn’t become a bottleneck for your application

- The ability to meet your security and compliance needs

- Straightforward pricing so you can accurately forecast your costs

- Transparency when it comes to Support, uptime, and SLAs

- A flexible architecture and selection of infrastructure tuned to meet your application needs

Understand your infrastructure requirements

There are 3 key things you’ll want to look for with regard to your infrastructure.

1. What are the server options for running in memory? Do they have a recommendation?

If all the provider’s virtual servers are less than 16 GB RAM and aren’t optimized to run on high-performance network fabric or if the provider offers network performance that’s not backed up with an SLA and uptime metrics, look elsewhere—fast! The provider you select should have virtual server options ranging from 16 GB RAM to at least 128 GB RAM, and ideally all the way up to 256 GB RAM.

2. Are those options available on-demand or not?

It’s a new world out there and running in the cloud has significant benefits, thanks to the ability to scale servers horizontally to meet demand. In short, the cloud saves you money. But even if you’re not concerned with price and are willing to over-provision “just in case,” you’ll still want to know about the provider’s selection of dedicated servers. Can the provider build to meet your needs? If so, does the provider place cloud and dedicated servers on the same VLAN to ensure you benefit from the elastic nature of the cloud and can meet any specific requirements that mandate use of a dedicated box? This is a common requirement for folks trying to meet the high availability (HA) and data isolation requirements associated with PCI and HIPAA compliance while maintaining performance.

3. Beyond the servers, does the provider offer a CDN, multiple data centers, and options for failover?

Hey, running a high-performance, HA platform with an IMDB can be a significant investment. If uptime equals money in your situation, then you should look for a provider that can ensure you’re always up and running. Make sure the provider has options for monitoring and securing your environment. Things like DDoS mitigation services become critical if you’re running an ad-serving, search, or social platform, for example. Most providers don’t offer services to protect you when a DDoS attack occurs, and even fewer can help you fail over to another environment if and when you need to do so.

Once you’ve answered all these questions, you should have no problem finding a provider with the appropriate infrastructure to meet your IMDB needs. Above all, keep one key thing in mind: Don’t get sucked into thinking it’s all about running commodity infrastructure on a cheap network at a bargain price. Performance, availability, and the speed of your application will mean more money in your pocket in the long run. Taking the time to ensure your provider can meet your requirements and help when needed will save you a lot in the way of headaches—and probably also deliver increased revenue from “sticky” (loyal) customers.

Kole Hicks

Senior Director of Product Management at GoGrid
Kole Hicks is the Senior Director of Product Management for GoGrid, the leader in Open Data Services (ODS). GoGrid is committed to delivering purpose-built, non-opinionated Big Data solutions and services for the management and integration of open source, commercial, and proprietary technologies across multiple platforms.
