Archive for the ‘Big Data’ Category


Comparing Cloud Infrastructure Options for Running NoSQL Workloads

Friday, April 11th, 2014 by

A walk through in-memory, general compute, and mass storage options for Cassandra, MongoDB, Riak, and HBase workloads

I recently had the pleasure of attending Cassandra Tech Day in San Jose, a developer-focused event where people were learning about various options for deploying Cassandra clusters. As it turns out, there was a lot of buzz surrounding the new in-memory option for Cassandra and the use cases for it. This interest got me thinking about how to map the options customers have for running Big Data across clouds.

For a specific workload, NoSQL customers may want to have the following:

1. Access to mass storage servers for files and objects (not to be confused with block storage). Here we’re talking about on-demand access to terabytes of raw spinning disk volumes for running a large storage array (think storage hub for Hadoop/HBase, Cassandra, or MongoDB).

2. Access to High RAM options for running in-memory with the fastest possible response times—the same times you’d need when running the in-memory version of Cassandra or even running Riak or Redis in-memory.

3. Access to high-performance SSDs to run balanced workloads. Think about what happens after you run a batch operation. If you’re relating information back to a product schema, you may want to push that data into something like PostgreSQL, SQL, or even MySQL and have access to block storage.

4. Access to general-purpose instances for dev and test or for workloads that don’t have specific performance SLAs. This ability is particularly important when you’re trialing and evaluating a variety of applications. GoGrid’s customers, for example, leverage our 1-Button Deploy™ technology to quickly spin up dev clusters of common NoSQL solutions from MongoDB to Cassandra, Riak, and HBase.
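
The four options above can be read as a rough decision rule. The sketch below is illustrative only: the `Workload` fields, the category labels, and the order of the checks are assumptions made for this example, not a GoGrid API or an official sizing guideline.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a NoSQL workload (all fields are assumptions)."""
    needs_bulk_storage: bool = False   # terabytes of raw disk, e.g. a Hadoop/HBase storage hub
    latency_critical: bool = False     # needs in-memory response times
    has_performance_sla: bool = False  # production workload with a performance SLA


def pick_infrastructure(w: Workload) -> str:
    """Map a workload to one of the four option categories listed above."""
    if w.needs_bulk_storage:
        return "mass storage"
    if w.latency_critical:
        return "high RAM"
    if w.has_performance_sla:
        return "high-performance SSD"
    # Dev/test, or anything without a specific performance SLA.
    return "general purpose"


if __name__ == "__main__":
    print(pick_infrastructure(Workload(latency_critical=True)))
    print(pick_infrastructure(Workload()))
```

In practice a single application often mixes these categories, so treat the function as a starting point for discussion rather than a sizing tool.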

(more…) «Comparing Cloud Infrastructure Options for Running NoSQL Workloads»

Be Prepared with a Solid Cloud Infrastructure

Thursday, April 10th, 2014 by

The more Big Data enterprises continue to amass, the more potential risk is involved. It would be one matter if it were simply raw material without any clearly defined meaning; however, data analytics tools, combined with the professionalism of tech-savvy employees, allow businesses to harvest profit-driving, actionable digital information.


Compared to on-premise data centers, cloud computing offers multiple disaster recovery models.

Whether the risk is from a cyber-criminal who gains access to a database or a storm that cuts power, it’s essential for enterprises to have a solid disaster recovery plan in place. Because on-premise data centers are prone to outages in the event of a catastrophic natural event, cloud servers provide a more stable option for companies requiring constant access to their data. Numerous deployment models exist for these systems, and most of them are constructed based on how users interact with them.

How the cloud can promote disaster recovery 
According to a report conducted by InformationWeek, only 41 percent of respondents to the magazine’s 2014 State of Enterprise Storage Survey stated they have a disaster recovery (DR) and business continuity protocol and regularly test it. Although this finding expresses a lack of preparedness by the remaining 59 percent, the study showed that business leaders were beginning to see the big picture and placing their confidence in cloud applications.

The source noted that cloud infrastructure and Software-as-a-Service (SaaS) automation software let organizations deploy optimal DR without the hassle associated with a conventional plan. Traditionally, companies backed up their data on physical disks and shipped them to storage facilities. This method is no longer workable because many enterprises are constantly amassing and refining new data points. For example, Netflix collects an incredible amount of specific digital information on its subscribers through its rating system and then uses it to recommend new viewing options.

The news source also acknowledged that the issue isn’t just about recovering data lost during the outage, but about being able to run the programs that process and interact with that information. In fact, due to the complexity of these infrastructures, many cloud hosts offer DR-as-a-Service.

(more…) «Be Prepared with a Solid Cloud Infrastructure»

Infographic: 2014 – The Year of Open Source?

Tuesday, April 8th, 2014 by

If you’re a software developer, you’ve probably already used open-source code in some of your projects. Until recently, however, people who aren’t software developers probably thought “open source” referred to a new type of bottled water. But all that’s beginning to change. Now you can find open-source versions of everything from Shakespeare to geospatial tools. In fact, the first laptop built almost entirely on open source hardware just hit the market. In the article announcing the new device, Wired noted that, “Open source hardware is beginning to find its own place in the world, not only among hobbyists but inside big companies such as Facebook.”


Why now?

Open source technology has moved from experiment to mainstream partly because the concept itself has matured. Companies that used to zealously guard their proprietary software or hardware may now be building some or all of it on open-source code and even giving back to the relevant communities. Plus repositories like GitHub, Bitbucket, and SourceForge make access to open-source code easy.

In its annual “Future of Open Source Survey,” North Bridge Venture Partners summarized 3 reasons support for open source is broadening:

1. Quality: Thanks to strong community support, the quality of open-source offerings has improved dramatically. They now compete with proprietary or commercial equivalents on features–and can usually be deployed more quickly. Goodbye vendor “lock-in.”

(more…) «Infographic: 2014 – The Year of Open Source?»

Deploying Cassandra with the Push of a Button on GoGrid

Thursday, April 3rd, 2014 by

Let’s say you’ve already done your due diligence and decided you want to run a NoSQL database. The only problem is that you’ve now got to figure out how to deploy the cluster in an environment that lets you scale within a single data center and also across multiple data centers. This is when many people, hoping to save money, trial Cassandra on cheap hardware with limited RAM across clusters that are simply inadequate for the job.

That’s a mistake, but luckily, there’s a better way. At GoGrid, we’ve made it possible to deploy a production-ready 5-node Cassandra cluster on robust, high-performance machines with the click of a button. Check out the specs of the orchestrated deployment we’re providing using our 1-Button Deploy™ technology:

  • SSD nodes: 16 GB RAM, 16 cores, and 640 GB storage per node
  • 10-Gbps redundant, high-performance network
  • 40-Gbps private network connectivity to additional Block Storage volumes (as needed)
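
To put those per-node figures in cluster terms, here is a small back-of-the-envelope calculation. The node count and per-node specs come from the list above; the replication factor of 3 is an assumption for illustration (the post does not state one), and the estimate ignores overhead such as compaction headroom.

```python
NODES = 5                  # cluster size from the post
RAM_GB_PER_NODE = 16       # per-node spec from the post
CORES_PER_NODE = 16        # per-node spec from the post
STORAGE_GB_PER_NODE = 640  # per-node spec from the post
REPLICATION_FACTOR = 3     # assumption: not stated in the post


def cluster_totals(nodes: int = NODES) -> dict:
    """Aggregate RAM, cores, and raw SSD storage across the cluster."""
    return {
        "ram_gb": nodes * RAM_GB_PER_NODE,
        "cores": nodes * CORES_PER_NODE,
        "raw_storage_gb": nodes * STORAGE_GB_PER_NODE,
    }


def unique_data_gb(raw_storage_gb: float,
                   replication_factor: int = REPLICATION_FACTOR) -> float:
    """Upper bound on unique data when every row is stored replication_factor times."""
    return raw_storage_gb / replication_factor


if __name__ == "__main__":
    totals = cluster_totals()
    print(totals)
    print(round(unique_data_gb(totals["raw_storage_gb"]), 1))
```

For the default 5-node deployment this works out to 80 GB of RAM, 80 cores, and 3,200 GB of raw SSD storage; under the assumed replication factor, roughly a third of that raw storage is available for unique data.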


Once you’ve deployed the first cluster, you can add more nodes as you need them via simple point-and-click. Consider for a moment what you can do with this technology: You can run a user/session store for your application, run a distributed priority job queue, manage sensor data, or do any number of other things with just a few clicks of the mouse. And you can do it all in 3 easy steps:

Step 1: 1-Button Deploy™

(more…) «Deploying Cassandra with the Push of a Button on GoGrid»

Healthcare Industry Balances Big Data Insights and Patient Care

Wednesday, April 2nd, 2014 by

The United States health care industry is undergoing a revolutionary change. Between the Affordable Care Act’s influence over the insurance landscape and the Centers for Medicare and Medicaid Services’ (CMS’s) push for electronic health record (EHR) adoption, medical organizations are under an enormous amount of pressure. Amid the chaos, cloud computing and associated technologies have offered these professionals a measure of solace as they transition into a more digital environment.

A doctor uses her tablet to obtain patient information.


The patient comes first 
In today’s fast-changing marketplace, it’s easy for those using Big Data to lose sight of what matters to individual care receivers. Analytics programs have given companies outside the industry, such as Netflix, accurate, near real-time insight into subscriber entertainment preferences. For healthcare, using Big Data to assemble actionable information about widespread diseases is critical, but it’s also important to use the same predictive tools to assist individual patients.

As is often the case, however, implementation may be easier said than done. According to CMO magazine, many companies lack the IT architecture necessary for an analytics program to operate to its full potential. Similarly, a large number of hospitals still use on-premise data centers as opposed to cloud infrastructure. And although CMS has instigated use of EHR, many hospitals have forced those programs to work on systems that don’t offer the same flexibility as cloud computing.

The whole premise of the EHR initiative was to create an environment that allowed doctors, nurses, and hospital administrators to easily access patient information. To date, facilities that have adopted the technology have been able to do so, but at a much slower pace than anticipated. Plus, the data within the individual records can’t be dissected by algorithms to figure out which treatment methods would best suit particular care receivers.

The Big Data advantage
Yet, professionals can’t deny the fact that Big Data has a place in the healthcare industry. Elena Malykhina, a contributor to InformationWeek, said that a report conducted by EMC, funded by the federal government, revealed 63 percent of public IT professionals believe that analytics tools will help monitor and manage population health more efficiently. An additional 60 percent reported that the technology will improve how preventive care is delivered.

(more…) «Healthcare Industry Balances Big Data Insights and Patient Care»