Posts Tagged ‘analytics’

 

The 2013 Hadoop Summit

Monday, July 29th, 2013 by


I recently attended the Hadoop Summit in San Jose. This is one of two major conferences organized around Hadoop, the other being Hadoop World. Nearly all the companies with Hadoop distributions were present, along with several big users of Hadoop like Netflix, Twitter, and LinkedIn.

Crossing The Chasm

If you’re not deeply involved with Hadoop, attending these conferences a year apart can be shocking. The advances made in the span of a single year are striking. The conference seemed notably larger this year, and I noticed more non-tech companies in the audience. I think it’s safe to say that Hadoop has crossed the chasm, at least for enterprise IT users.

Beyond the mix of attendees at the event, the other signal to me was the emergence of Hadoop 2.0. This second version of Hadoop focuses on features that matter to users who want to run production-grade software for mission-critical systems. It brings three notable changes: high availability has finally arrived for the NameNode (in the open-source project, as opposed to the version Cloudera released for its distribution), a new version of Hive adds more SQL-friendly features, and YARN allows users to run just about any workload on data in the Hadoop Distributed File System (HDFS). These kinds of stability and availability features tend to show up once there is a critical mass of users who want to run the software in production.


Quite A YARN

(more…) «The 2013 Hadoop Summit»

Create a Basho Riak Cluster on GoGrid

Monday, July 9th, 2012 by

Basho is a GoGrid partner and is responsible for the open-source Riak project. If you are not familiar with Riak, it is a well-regarded open-source distributed database. It is based on the Dynamo design, so it is often compared to Cassandra and Amazon DynamoDB.

Riak is used as a fast, fault-tolerant distributed database. Companies like Mozilla use it for storing and analyzing beta testing results. Mozilla needed a solution that would help improve the user experience and allow it to store large amounts of data very quickly. Another example is Bump, which uses Riak to scale and manage the massive amounts of data sent between its millions of users. There, Riak stores elements of past user conversations so that communication history is readily accessible to users.


Basho Riak version 1.1.4 is now available as a GoGrid Community Server Image (CGSI). You can find it by searching for “Riak” when you launch a virtual machine. This image is available in all our data centers. The CGSI contains the open-source version, so support is only available via the community site, and it does not include all the features of the Enterprise version. However, you can use this image either to run a proof of concept (POC) to see whether Riak meets your needs or to run a small cluster. Either will run on GoGrid’s high-performance VMs, which have been shown to have significant performance advantages over other cloud implementations.


Why is GoGrid faster?

(more…) «Create a Basho Riak Cluster on GoGrid»