Author Archive

 

How to Easily Deploy MongoDB in the Cloud

Monday, February 3rd, 2014 by

GoGrid has just released its 1-Button Deploy™ of MongoDB, available to all customers in the US-West-1 data center. This technology makes it easy to deploy either a development or production MongoDB replica set on GoGrid’s high-performance infrastructure. GoGrid’s 1-Button Deploy™ technology combines the capabilities of one of the leading NoSQL databases with our expertise in building high-performance Cloud Servers.

MongoDB is a scalable, high-performance, open source, structured storage system. MongoDB provides JSON-style document-oriented storage with full index support, sharding, sophisticated replication, and compatibility with the MapReduce paradigm. MongoDB focuses on flexibility, power, speed, and ease of use. GoGrid’s 1-Button Deploy™ of MongoDB takes advantage of our SSD Cloud Servers while making it easy to deploy a fully configured replica set.

Why GoGrid Cloud Servers?

SSD Cloud Servers have several high-performance characteristics. They all come with attached SSD storage and large available RAM for the high I/O uses common to MongoDB. MongoDB will attempt to place its working set in memory, so the ability to deploy servers with large available RAM is important. Plus, whenever MongoDB has to write to disk, SSDs provide for a more graceful transition from memory to disk. SSD Cloud Servers use a redundant 10-Gbps public and private network to ensure you have the maximum bandwidth to transfer your data. You can use GoGrid’s 1-Button Deploy™ to provision either a 3-server development replica set or a 5-server production replica set with Firewall Service enabled.

Development Environments

The smallest recommended size for a development replica set is 3 servers. Although it’s possible to run MongoDB on a single server, you won’t be able to test failover or how a replica set behaves in production. You’ll most likely have a small working set so you won’t need as much RAM, but will still benefit from SSD storage and a fast network.
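To make the case for 3 servers concrete, here is a minimal sketch of the majority arithmetic behind replica set elections. It assumes every member is a voting member; the function name is illustrative only and is not part of MongoDB or GoGrid tooling.

```python
def replica_set_fault_tolerance(members: int) -> dict:
    """Return the election majority and tolerated failures for a replica set.

    Assumption: every member votes, so a primary can only be elected
    while a strict majority of members is still reachable.
    """
    if members < 1:
        raise ValueError("a replica set needs at least one member")
    majority = members // 2 + 1
    return {
        "members": members,
        "majority": majority,
        "tolerated_failures": members - majority,
    }


if __name__ == "__main__":
    # Compare a single server with the 3-server and 5-server sets.
    for size in (1, 3, 5):
        print(replica_set_fault_tolerance(size))
```

Under that assumption, a single server tolerates no failures, which is why failover can't be tested on it, while the 3-server and 5-server sets tolerate one and two failures respectively.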

(more…) «How to Easily Deploy MongoDB in the Cloud»

Big Data Cloud Servers for Hadoop

Monday, January 13th, 2014 by

GoGrid just launched Raw Disk Cloud Servers, the perfect choice for your Hadoop data node. These purpose-built Cloud Servers run on a redundant 10-Gbps network fabric on the latest Intel Ivy Bridge processors. What sets these servers apart, however, is the massive amount of raw storage in JBOD (Just a Bunch of Disks) configuration. You can deploy up to 45 x 4 TB SAS disks on 1 Cloud Server.

These servers are designed to serve as Hadoop data nodes, which are typically deployed in a JBOD configuration. This setup maximizes available storage space on the server and also aids in performance. There are roughly 2 cores allocated per spindle, giving these servers additional MapReduce processing power. In addition, these disks aren’t a virtual allocation from a larger device. Each volume is actually a dedicated, physical 4 TB hard drive, so you get the full drive per volume with no initial write penalty.
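As a rough illustration of what that raw storage means for a cluster, the sketch below estimates raw and usable HDFS capacity. It assumes the HDFS default replication factor of three and ignores space reserved for temporary data and the operating system; the function and parameter names are hypothetical.

```python
def hdfs_capacity_tb(disks_per_node: int, disk_tb: float,
                     data_nodes: int, replication: int = 3) -> dict:
    """Estimate raw and usable capacity of a set of JBOD data nodes.

    Assumption: every block is stored `replication` times, so usable
    capacity is simply raw capacity divided by the replication factor.
    """
    if replication < 1:
        raise ValueError("replication factor must be at least 1")
    raw_tb = disks_per_node * disk_tb * data_nodes
    return {"raw_tb": raw_tb, "usable_tb": raw_tb / replication}


if __name__ == "__main__":
    # Three data nodes, each with the maximum 45 x 4 TB disks.
    print(hdfs_capacity_tb(disks_per_node=45, disk_tb=4, data_nodes=3))
```

With those inputs, three fully populated servers hold 540 TB raw, or about 180 TB of replicated data.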

Hadoop in the cloud

Most Hadoop distributions call for a name node supporting several data nodes. GoGrid offers a variety of SSD Cloud Servers that would be perfect for the Hadoop name node. Because they are also on the same 10-Gbps high-performance fabric as the Raw Disk Cloud Servers, SSD servers provide low-latency private connectivity to your data nodes. I recommend using at least the X-Large SSD Cloud Server (16 GB RAM), although you may need a larger server, depending on the size of your Hadoop cluster. Because Hadoop stores metadata in memory, you’ll want more RAM if you have a lot of files to process. You can use any size Raw Disk Cloud Server, but you’ll want to deploy at least 3. Also, each Raw Disk Cloud Server has a different allocation of raw disks, which is illustrated in the table below. The Cloud Server in the illustration is the smallest size that has multiple disks per Cloud Server. Hadoop defaults to a replication factor of three, so to protect your data from failure, you’ll want to have at least 3 data nodes to distribute data. Although Hadoop attempts to replicate data to different racks, there’s no guarantee that your Cloud Servers will be on different racks.

Note that the example below is for illustrative purposes only and is not representative of a typical Hadoop cluster; for example, most Cloudera and Hortonworks sizing guides start at 8 nodes. These configurations can differ greatly depending on whether you intend to use the cluster for development, production, or production with HBase added. This includes the RAM and disk sizes (less of both for development, most likely more for HBase). Plus, if you’re thinking of using these nodes for production, you should consider adding a second name node.

Hadoop-cluster (more…) «Big Data Cloud Servers for Hadoop»

Connect from Anywhere to the Cloud

Thursday, August 29th, 2013 by

The cloud is an important part of many companies’ IT strategies. However, there are many companies that have already made a large investment in infrastructure in their data centers. How can they take advantage of all the cloud has to offer without abandoning their investment? The answer is Cloud Bridge – private, dedicated access to the GoGrid cloud from anywhere.

Connecting to the Cloud

Cloud Bridge is your access point into the GoGrid cloud. It supports Layer 2 connections from cross-connects within a partner data center or with carrier connections from just about anywhere. Cloud Bridge is designed to be simple – just select the port speed you prefer: 100 Mbps, 1 Gbps, or 10 Gbps (only in US-East-1). There’s also no long-term commitment required to use Cloud Bridge – pay only for what you use and cancel anytime. Traffic across Cloud Bridge is unmetered, so you only pay for access to the port. You also have the option of selecting a redundant setup: Purchase two ports in a redundant configuration and you’ll get an aggregate link. Not only will your traffic have physical redundancy, but you’ll also get all the speed available to both ports (for example, 2 Gbps of bandwidth with redundant 1-Gbps ports selected). You can access Cloud Bridge from equipment that you have in GoGrid’s Co-Location Service, a partner data center (like Equinix via a cross-connect), or from your data center using one of your carriers or with one of our partner resellers.
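To get a feel for what the port choice means in practice, the following sketch works through the aggregate-link arithmetic and a rough transfer-time estimate. The 80% link efficiency is a hypothetical placeholder rather than a GoGrid figure, and the function names are illustrative only.

```python
def aggregate_gbps(port_gbps: float, redundant: bool) -> float:
    """Bandwidth of a Cloud Bridge setup: two aggregated ports if redundant."""
    return port_gbps * (2 if redundant else 1)


def transfer_hours(dataset_tb: float, link_gbps: float,
                   efficiency: float = 0.8) -> float:
    """Estimate the hours needed to move a dataset across a link.

    Assumption: `efficiency` is the fraction of the nominal line rate
    that is actually achieved (0.8 is a placeholder, not a measurement).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = dataset_tb * 1e12 * 8              # decimal terabytes to bits
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    link = aggregate_gbps(port_gbps=1, redundant=True)   # 2 Gbps aggregate
    print(f"{link} Gbps link, 10 TB takes {transfer_hours(10, link):.1f} hours")
```

Because traffic is unmetered, an estimate like this is mainly useful for picking a port speed, not for predicting cost.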

Why Cloud Bridge

Customers that want to use Cloud Bridge are typically looking to solve the following use cases: (more…) «Connect from Anywhere to the Cloud»

Geographic Load Balancing and Disaster Recovery Best Practices for Global Websites

Wednesday, August 21st, 2013 by

If you’re running a global website, you’ll want to reduce the latency for customers around the world. GoGrid offers the global infrastructure and robust network to support this setup. With Geographic Load Balancing, GoGrid can also improve performance to your website from around the world. Here are recommended best practices for building a reliable, high-performing global website.

Deploying the Correct Infrastructure Setup

Global websites still require local infrastructure to be truly effective in reducing latency. GoGrid has data centers around the world where you can deploy infrastructure to better serve your customers. Deploy infrastructure to the Western United States (in our US-West-1 data center), Eastern United States (in US-East-1) and Europe (EU-West-1). Although your specific configuration is unique to your setup, you’ll most likely have database and web servers in each of these data centers.
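The idea of serving each customer from nearby infrastructure can be sketched as a simple selection rule: send a visitor to the healthy data center with the lowest measured latency. This is only an illustration of the concept, not GoGrid's actual load-balancing algorithm, and the latency figures are made up.

```python
from typing import Mapping, Optional


def pick_data_center(latency_ms: Mapping[str, float],
                     healthy: Mapping[str, bool]) -> Optional[str]:
    """Return the healthy data center with the lowest latency, if any.

    Data centers missing from `healthy` are treated as unavailable.
    """
    candidates = {dc: ms for dc, ms in latency_ms.items()
                  if healthy.get(dc, False)}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


if __name__ == "__main__":
    # Hypothetical latencies seen by a visitor on the US West Coast.
    latency = {"US-West-1": 20.0, "US-East-1": 80.0, "EU-West-1": 150.0}
    all_up = {dc: True for dc in latency}
    print(pick_data_center(latency, all_up))
    # If US-West-1 fails, traffic falls back to the next-closest site.
    print(pick_data_center(latency, {**all_up, "US-West-1": False}))
```

The same rule covers the disaster recovery case: when the closest site is marked unhealthy, the visitor is sent to the next-closest one instead of receiving an error.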

In addition, you’ll want to keep your servers in sync. One option between US-West-1 and US-East-1 is to use Cloud Link, a dedicated, private line between our data centers. This connectivity makes syncing your servers secure and easy. Once you have your back end in place, you’ll want to configure your front end.

Geographic Load Balancing

(more…) «Geographic Load Balancing and Disaster Recovery Best Practices for Global Websites»

The 2013 Hadoop Summit

Monday, July 29th, 2013 by

I recently attended the Hadoop Summit in San Jose. This is one of two major conferences organized around Hadoop, the other being Hadoop World. Nearly all the companies with Hadoop distributions were present along with several big users of Hadoop like Netflix, Twitter, and LinkedIn.

Crossing The Chasm

If you’re not deeply involved with Hadoop, attending one of these conferences a year apart can be shocking. The advancements made in just the span of a year are amazing. The conference seemed notably larger this year, and I noticed more non-tech companies in the audience. I think it’s safe to say that Hadoop has crossed the chasm, at least for enterprise IT users.

Other than the type of attendees at the event, the other signal to me was the emergence of Hadoop 2.0. This second version of Hadoop focused on features that are important for users who want to run production-grade software for mission-critical systems. High availability finally arrived for the name node (for the open source project, not the version Cloudera released for its distribution), along with a new version of Hive with more SQL-friendly features and YARN, which allows users to run just about anything on the Hadoop Distributed File System (HDFS). These types of stability and availability features tend to show up when there is a critical mass of users who want to use software for production.

Quite A YARN

(more…) «The 2013 Hadoop Summit»