Archive for the ‘Engineering’ Category


Is MapReduce Dead?

Tuesday, July 15th, 2014

With the recent announcement by Google of Cloud DataFlow (intended as the successor to MapReduce) and with Cloudera now focusing on Spark for many of its projects, it looks like the days of MapReduce may be numbered. Although the change may seem sudden, it’s been a long time coming. Google wrote the MapReduce white paper 10 years ago, and developers have been using at least one distribution of Hadoop for about 8 years. Users have had ample time to determine the strengths and weaknesses of MapReduce. However, the release of Hadoop 2.0 and YARN clearly indicated that users wanted to live in a more diverse Big Data world.


Earlier versions of Hadoop could be described as MapReduce + HDFS (Hadoop Distributed File System) because that was the paradigm everything in Hadoop revolved around. Because users clamored for easier, higher-level access to Hadoop data, the Hive and Pig projects were started. But even though you could write SQL queries with Hive and script in Pig Latin with Pig, under the covers Hadoop was still running MapReduce jobs. That all changed in Hadoop 2.0 with the introduction of YARN. YARN became the resource manager for the Hadoop cluster, breaking the tight coupling between MapReduce and HDFS. HDFS remained the file system, but MapReduce became just another application that interfaces with the cluster through YARN, which opened the door for other processing engines to run on Hadoop as well.
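To make the paradigm concrete, here is a minimal, hypothetical single-process sketch of the three MapReduce phases (map, shuffle, reduce) for the classic word-count job, written in plain Python. A real Hadoop cluster runs each phase distributed across many machines; the names and structure below are illustrative, not Hadoop's actual API.

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document.
    for doc in documents:
        for word in doc.split():
            yield (word.lower(), 1)

def shuffle_phase(pairs):
    # Shuffle: group all emitted values by key, as the framework
    # does between the map and reduce stages.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Reduce: sum the counts for each word.
    return {word: sum(counts) for word, counts in grouped.items()}

docs = ["the quick brown fox", "the lazy dog", "the fox"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
# counts["the"] == 3, counts["fox"] == 2
```

The point of the sketch is that the programming model is just these two user-supplied functions; everything else (distribution, fault tolerance, the shuffle) is the framework's job, which is exactly what YARN now treats as "just another application."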

Google is not a backer of the open-source Hadoop ecosystem in the mold of Hortonworks or Cloudera. After all, Google was running its own versions of MapReduce and HDFS (the Google File System), the technologies on which these open-source projects are based. Because they are integral parts of Google’s internal applications, Google has more experience with these technologies than anyone. And although Cloud DataFlow is specific to the Google cloud and looks more like a competitor to Amazon’s Kinesis, Google is very influential in Big Data circles, so I can see other developers following Google’s lead and adopting similar technology in place of MapReduce.

Although Google’s Cloud DataFlow may have a thought-leadership impact, Cloudera’s decision to make Spark the standard processing engine for its projects (in particular, Hive) will have a greater impact on open-source Big Data developers. Cloudera has one of the most popular Hadoop distributions on the market and has partnered with Databricks, Intel, MapR, and IBM to work on Spark integration with Hive. The move is surprising given Cloudera’s investment in Impala (its SQL query engine), but the company clearly feels that Spark is the future. As little as a year ago, Spark was seen mostly as fast in-memory computing for machine learning algorithms. However, with its promotion to an Apache Top-Level Project in February 2014 and its backing company Databricks receiving $33 million in Series B funding, Spark clearly has greater ambitions. The advent of YARN made it much easier to tie Spark into the growing Hadoop ecosystem, and Cloudera’s decision to leverage Spark in Hive and other projects makes it even more important to users of the CDH distribution.
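The in-memory advantage that made Spark attractive for machine learning comes from keeping a dataset cached across iterations instead of re-reading it from disk on every MapReduce pass. Here is a minimal, hypothetical Python sketch of that iterative pattern (real Spark would express it with a cached RDD; the numbers and variable names here are illustrative only):

```python
# An iterative job (1-D gradient descent toward the mean of a dataset)
# that scans the same data on every iteration. Spark keeps `data`
# cached in cluster memory across iterations; classic MapReduce would
# re-read it from HDFS on every pass, paying disk I/O each time.

data = [2.0, 4.0, 6.0, 8.0]   # imagine this cached in cluster memory
theta = 0.0                    # parameter being fit
learning_rate = 0.1

for _ in range(100):           # each iteration re-scans the cached data
    gradient = sum(theta - x for x in data) / len(data)
    theta -= learning_rate * gradient

# theta converges to the mean of the data (5.0)
```

For a single pass over the data the caching buys little, which is why Spark's edge shows up in iterative workloads like the machine learning algorithms mentioned above.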



When is “good enough” the right product decision?

Wednesday, April 23rd, 2014

“If you are not embarrassed by the first version of your product, you’ve launched too late.”
– Reid Hoffman, Founder, LinkedIn

Here’s the scenario: You started with a great idea, partnered with an excellent tech founder, and got $1M in funding so you could get the first release out the door. Part of your new-found funding went to hiring 3 Engineers. As the weeks of product development pass, you review the usability, demo it for prospects, and get feedback on how to make it better. The Engineers are working long hours to complete the first useful release for beta. When you review the usability with the Advisory Board or prospects, you get lots and lots of feedback about what works and what doesn’t, and you’re making changes—often daily.

One night you wake up and wonder, “Will we be tweaking this product forever? Will it ever get out the door so we can close some sales?” It’s time to have the conversation about what is “good enough” to ship. That means it’s time to revisit the original set of product requirements—the ones you and your team agreed needed to be implemented to ship the product. Go back to work with the team to scrub the list down to the bare minimum of what needs to be in the first version. Everyone will have opinions about what needs to be in the product when you ship. Justifications for including requirements may sound something like these:

“We won’t be able to reach one of our vertical market targets.”
“We’ll have a product that will only scale to 1M requests/timeframe and we need 10M.”
“Beta users hate the UI.”

During the scrub remember to ask, “What’s the cost of not implementing this functionality? Will we be able to add this functionality later without re-architecting the product?” Asking these questions lets you and the team make an informed business decision about minimal viable functionality. And at the end of your discussion, remember to reassure the team this sort of dialogue is healthy because it helps the company stay focused by prioritizing functionality into the releases on your road map—and ultimately drives your success.

A few years ago I had a great team that was working endless hours on a new workflow product. We started with requirements that were loosely defined and easily interpreted differently by each member of the team. Our usability expert seemed to re-interpret the same requirement each week, for example, but with the honest intent of making the product better. When it became clear we weren’t going to meet our functional complete date, I called the Engineers, PM, and QA together. As we scrubbed the requirements, we realized we were going to deliver 60% of what we originally thought was needed, but we still had a very useful product. We finalized our definition by drawing up an in-scope/out-of-scope list as a team for the rest of the company. And although it was a difficult conversation for the team to have, we delivered the first version—and got first-mover advantage. So in the end, our 60%-ready first release actually turned out to be “good enough.”

Cloud infrastructure supports agile IT endeavors

Monday, August 5th, 2013

Companies often seek to use cloud computing technologies in an effort to improve business agility at a lower cost than other technical endeavors. Although hosted environments have an inherent flexibility that lets organizations carry out tasks more efficiently, decision-makers can’t simply deploy one cloud service and expect to reap all the rewards. Instead, enterprises need to ensure the cloud architectures they use have the necessary qualities to support a more agile workplace.

In today’s fast-paced business world, application agility is one of the best characteristics for an organization to have because it ensures employees can access and use mission-critical solutions from virtually anywhere. A recent CIO report highlighted how leveraging an efficient cloud infrastructure service can dramatically improve efficiency as a result of its easy scalability and automated provisioning. When these characteristics are combined with other critical elements, companies can be sure they have the agile qualities they need to thrive.


Embrace agile development
In the past, there was one tried-and-true method for application development used by most of the business world. Today, the diversity of the corporate landscape has encouraged decision-makers to pursue strategies that set them apart from competitors and create room for possible advantages. This demand, coupled with the proliferation of cloud computing and mobile projects, has led to the emergence of the agile development movement.

CIO noted that this mentality is considered the norm in today’s enterprise, although many firms have yet to deploy these strategies effectively. By incorporating an agile development concept into the cloud infrastructure, employees can gain access to the automated tools they need to circumvent old processes that often resulted in unwanted, unused, or inefficient applications.

A separate First Line Software report echoed the importance of including the cloud in an agile development strategy because the hosted technology supports greater levels of service delivery and encourages users to take advantage of its scalable capacity. When enterprises leverage cloud and agile initiatives simultaneously, they can streamline the creation and deployment process to ensure employees can take full advantage of the tools in a timely manner.


How Software Defined Networking Delivers Next-Generation Success

Wednesday, June 5th, 2013

Software defined networking (SDN) is today where the cloud was a few years ago, and their paths are quite similar. As cloud providers innovate, they incorporate new, cutting-edge technology to let users do more with their architectures and enable solutions that were previously impossible. Just as the cloud moved people away from physical boxes and bare metal devices, SDN is allowing developers and architects to divorce themselves from proprietary hardware appliances like load balancers and firewalls.

So, what are the similarities between SDN and cloud? How about abstraction or the movement from physical to virtual?

To get a bit more scientific, I jumped over to Google Trends (which looks at search term volume over time) and did a search for “cloud,” “SDN,” “cloud computing,” and “software defined networking.”


The results shown here make it pretty obvious that “cloud” continues to grow and overshadow the other terms. Removing “cloud” shows “SDN” following the same upward trajectory that “cloud” traces in the graphic below. (Because people have been shortening the term “cloud computing” to simply “cloud,” it’s logical that the longer term’s search volume is decreasing.)



GoGrid Proactively Responds to Xen Vulnerability

Wednesday, June 20th, 2012

GoGrid regularly reviews, analyzes, and ranks recently published security vulnerabilities as part of its security program. We typically address security vulnerabilities that pose a risk to GoGrid’s digital ecosystem during our regular patch cycle. However, critical security vulnerabilities require immediate action. Such was the case with last week’s security advisory that impacted software such as Xen, FreeBSD, NetBSD, and some versions of Microsoft Windows. You can find specifics of the security advisory here:

The vulnerability meant a system admin running a 64-bit paravirtualized (PV) guest (such as Windows 2008 R2 or a Linux 64-bit distribution) on a 64-bit hypervisor could gain kernel-level access by successfully exploiting Intel’s SYSRET design implementation. This vulnerability isn’t unique to Xen or even to virtualized environments. In fact, any guest user—that is, someone with non-administrator privileges—with logical access to a stand-alone server running NetBSD, FreeBSD, Microsoft Windows 7, or Windows 2008 R2 can perform a similar exploit against the OS and gain unauthorized access.

GoGrid’s Security team determined that the vulnerability exposed our customers to an attacker potentially gaining access to their virtualized systems. Even more important, GoGrid’s Security team determined the vulnerability was a prime target for a “zero-day exploit”—one that could occur on the same day the vulnerability becomes generally known.

As a result, we took immediate action: We downloaded and tested the patch, engaged one of our outside security firm partners to gain intelligence on how the Black Hat community perceived the vulnerability, scheduled an emergency patch rollout over the weekend, and deployed the security patch across all impacted systems.

On June 18, 2012, GoGrid’s Security team confirmed that an exploit had been published and was circulating on the Internet.

We appreciate your understanding and support in allowing us to continue providing you with a safe, secure, and stable environment.