
Is MapReduce Dead?

July 15th, 2014

With the recent announcement by Google of Cloud DataFlow (intended as the successor to MapReduce) and with Cloudera now focusing on Spark for many of its projects, it looks like the days of MapReduce may be numbered. Although the change may seem sudden, it’s been a long time coming. Google wrote the MapReduce white paper 10 years ago, and developers have been using at least one distribution of Hadoop for about 8 years. Users have had ample time to determine the strengths and weaknesses of MapReduce. However, the release of Hadoop 2.0 and YARN clearly indicated that users wanted to live in a more diverse Big Data world.


Earlier versions of Hadoop could be described as MapReduce + HDFS (Hadoop Distributed File System) because everything in Hadoop revolved around that paradigm. Because users clamored for easier access to Hadoop data, the Hive and Pig projects were started. And even though you could write SQL queries with Hive and scripts in Pig Latin with Pig, under the covers Hadoop was still running MapReduce jobs. That all changed in Hadoop 2.0 with the introduction of YARN, which became the resource manager for a Hadoop cluster and broke the dependence between MapReduce and HDFS. HDFS remained the file system, but MapReduce became just another application that interfaces with the cluster through YARN, and that change opened the door for other processing engines to run on Hadoop as well.
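The map/shuffle/reduce pattern that Hadoop was running "under the covers" can be sketched in a few lines of plain Python. This is a single-machine illustration only; real Hadoop distributes each phase across a cluster and reads input splits from HDFS:

```python
from collections import defaultdict

def map_phase(document):
    # Map: emit a (word, 1) pair for every word in the input split
    return [(word, 1) for word in document.split()]

def shuffle_phase(pairs):
    # Shuffle: group all emitted values by key, as Hadoop does
    # between the map and reduce phases
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Reduce: sum the counts for each word
    return {word: sum(counts) for word, counts in grouped.items()}

docs = ["big data big ideas", "big data pipelines"]
pairs = [pair for doc in docs for pair in map_phase(doc)]
counts = reduce_phase(shuffle_phase(pairs))
print(counts["big"])  # 3
```

The same three-phase shape is what Hive and Pig compiled their queries down to before YARN made alternative engines possible.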

Google is not a backer of the open-source Hadoop ecosystem in the mold of Hortonworks or Cloudera. After all, Google was running its own internal versions of MapReduce and HDFS (the Google File System), the systems on which these open-source projects are based, and because they are integral parts of Google's internal applications, Google arguably has the most experience with these technologies. And although Cloud DataFlow is specific to the Google cloud and looks more like a competitor to Amazon's Kinesis, Google is very influential in Big Data circles, so I can see other developers following Google's lead and adopting similar technologies in place of MapReduce.

Although Google's Cloud DataFlow may have a thought-leadership impact, Cloudera's decision to make Spark the standard processing engine for its projects (in particular, Hive) will have a greater impact on open-source Big Data developers. Cloudera has one of the most popular Hadoop distributions on the market and has partnered with Databricks, Intel, MapR, and IBM to work on Spark integration with Hive. The decision is surprising given Cloudera's investment in Impala (its SQL query engine), but the company clearly feels that Spark is the future. As recently as a year ago, Spark was seen mostly as fast in-memory computing for machine learning algorithms. However, with its promotion to an Apache Top-Level Project in February 2014 and its backing company Databricks receiving $33 million in Series B funding, Spark clearly has greater ambitions. The advent of YARN made it much easier to tie Spark into the growing Hadoop ecosystem, and Cloudera's decision to leverage Spark in Hive and other projects makes it even more important to users of the CDH distribution.



Big Data Revolutionizes the Gaming Industry

July 10th, 2014

There are few technologies that promise to improve as many different industries as Big Data. Whether it's medicine or the weather in your backyard, the mass aggregation and analysis of information can yield marked improvement and insight on nearly anything. Cloud computing may even change the way we have fun: it has already had an impressive effect on the video game industry and will have a great deal of influence in determining the runaway hits of tomorrow. Here are a few small but important insights into the world of the gamer.


Big Data observes how users learn the game
The emerging technology offers video game marketers and developers more insight than ever into what makes players tick, what makes them happy, and what keeps them engaged. Any game's success is directly connected to the "addiction" factor: what is it about a certain game that makes users feel they can't stop playing, and, even more important, how can that feeling be monetized? To study this further, gameplay must be broken down into individual characters, levels, and other features to determine what works and what doesn't.

Qubole writer Gil Allouche wrote a piece on how Big Data can be used to decide how difficult individual levels should be in future incarnations of any given game. A cloud server can track how long it takes each player to finish a level, indicating whether levels are too simple and need to be made more difficult, or so challenging that they discourage new users. Mass amounts of data can help narrow down the right decision for an individual game.
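Allouche's idea can be illustrated with a toy sketch. The level names, timing data, and difficulty thresholds below are all hypothetical, chosen only to show the shape of the analysis:

```python
from statistics import median

# Hypothetical per-level completion times in seconds, aggregated
# from many players' play sessions
completion_times = {
    "level_1": [45, 50, 40, 55, 48],
    "level_2": [300, 280, 310, 295, 305],
    "level_3": [1200, 1500, 1100, 1350, 1400],
}

# Assumed tuning targets: a level should take roughly 1 to 15 minutes
TOO_EASY, TOO_HARD = 60, 900

def flag_levels(times_by_level):
    # Use the median so a few outlier sessions don't skew the verdict
    flags = {}
    for level, times in times_by_level.items():
        m = median(times)
        if m < TOO_EASY:
            flags[level] = "too easy"
        elif m > TOO_HARD:
            flags[level] = "too hard"
        else:
            flags[level] = "ok"
    return flags

print(flag_levels(completion_times))
```

At scale the same comparison would run over millions of sessions rather than five, but the decision rule is the same.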

Increasing sales on cloud-based consoles
For nearly all current gaming systems, the Internet and cloud hosting have integrated seamlessly to foster more sales as well as engagement between players on massively popular interactive games. Because the game store lives online and is accessible from the console itself, gamers are spared a trip to the store and can download a new experience right to their system in real time, leaving less time to second-guess a purchase. Big Data also allows companies to better "recommend" games and products similar to the ones a gamer is already enjoying, increasing the likelihood of sealing a sale.
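A minimal sketch of how such a "players also bought" recommendation might work, using simple co-occurrence counting over hypothetical game libraries (real recommender systems are far more sophisticated, but the intuition is the same):

```python
from collections import Counter

# Hypothetical purchase histories: the set of games each player owns
libraries = [
    {"Racer X", "Speed Rally", "Puzzle Quest"},
    {"Racer X", "Speed Rally"},
    {"Racer X", "Speed Rally", "Farm Life"},
    {"Speed Rally", "Farm Life"},
]

def recommend(owned_game, libraries, top_n=1):
    # Count how often other games co-occur with the owned game,
    # then suggest the most frequent companions
    co_counts = Counter()
    for library in libraries:
        if owned_game in library:
            co_counts.update(library - {owned_game})
    return [game for game, _ in co_counts.most_common(top_n)]

print(recommend("Racer X", libraries))  # ['Speed Rally']
```

With millions of libraries instead of four, the counts become a reliable signal of which titles to surface in the console store.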

Real-world examples
EA Games, one of the largest video game developers and distributors on the planet, announced a new commitment earlier this year to improving its business model and products with the help of Big Data. This should give the company a significant technological advantage, especially when it comes to targeting advertising and maximizing player-to-player interaction in major gaming successes like Activision's "Call of Duty" franchise and EA's own "Battlefield" franchise. Silicon Angle, a science and technology blog, reported on the gaming company's announcement.


3 Unusual Uses of Big Data

July 4th, 2014

When we think of Big Data, we tend to think big picture – massive amounts of information that is used to accomplish any goal a business or individual may have and that is quickly revolutionizing how we get things done. Although this impression may be true, it doesn’t place enough focus on those who are innovating within the field. Here are three organizations that demonstrate the fascinating range of Big Data technology and its results.


Three companies that are using Big Data to advance their industries.

Pricing outdoor marketing
Route, an outdoor media analytics company, has thrown itself fully into Big Data in an attempt to question and revolutionize how advertising on conventional media like billboards, bench ads, and the sides of transit vehicles is priced. For years, owners of these spaces have charged companies "per impression," that is, for every time a viewer sees the advertisement, even though there has never been a way to strictly quantify those impressions. Using cloud infrastructure, Route hopes to change that by gathering live analytics to measure what the impression rate actually is.

E-consultancy writer Ben Davis described how the company went about the study: “360,000 frames (bits of ad space) are analyzed, both their visibility, with eye tracking studies, and the audience size and demographic that come into contact with the ads,” he explained. “28,000 people were interviewed and then tracked across the U.K. by GPS. Part of this involves traffic studies, too.” This level of precision will allow Route to use Big Data to its advantage to justify pricing.
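The core idea of quantifying impressions from GPS traces can be shown with a toy example. The frame coordinates, pings, and visibility radius below are invented for illustration; Route's actual methodology combines eye-tracking studies and traffic data, as Davis describes:

```python
import math

# Hypothetical billboard ("frame") locations and tracked GPS pings,
# given as (x, y) coordinates in metres
frames = {"frame_A": (0.0, 0.0), "frame_B": (500.0, 0.0)}
pings = [(5.0, 3.0), (12.0, -4.0), (490.0, 8.0), (900.0, 900.0)]

# Assumed distance at which an ad is considered visible
VISIBILITY_RADIUS = 50.0

def count_impressions(frames, pings, radius):
    # Count a GPS ping as an impression if it falls within the
    # visibility radius of a frame
    impressions = {name: 0 for name in frames}
    for name, (fx, fy) in frames.items():
        for px, py in pings:
            if math.hypot(px - fx, py - fy) <= radius:
                impressions[name] += 1
    return impressions

print(count_impressions(frames, pings, VISIBILITY_RADIUS))
```

Scale the pings up to 28,000 tracked people and the frames up to 360,000 ad spaces and you have, in spirit, the data problem Route is solving.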

More accurate, collaborative weather forecasting
Collecting as much information as possible has always been the name of the game in weather forecasting, but only recently has cloud hosting technology made it possible to crowd-source that data. WeatherSignal, an Android application launched in 2013, gives its users the opportunity to collect atmospheric data with sensors already installed in their devices, according to an enthusiastic article in Scientific American. Though the readings come from phones with varying degrees of sensor quality, the application's relatively accurate forecasts are a great example of how the Big Data model operates: mass amounts of "dirty data" rather than fewer readings taken with more advanced equipment. With users offering their devices as a free forecasting tool, it's a difficult source of information to resist.
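The "dirty data" point can be made concrete with a tiny sketch: given enough readings, a robust aggregate such as the median shrugs off a few wildly miscalibrated sensors that would skew a simple average. The readings below are invented for the example:

```python
from statistics import mean, median

# Hypothetical crowd-sourced pressure readings (hPa) from phone
# barometers: most cluster near the true value, but two badly
# calibrated sensors report junk
readings = [1012.1, 1011.8, 1012.4, 1011.9, 998.0, 1030.5, 1012.2]

noisy_average = mean(readings)      # pulled around by the outliers
robust_estimate = median(readings)  # ignores them

print(robust_estimate)  # 1012.1
```

With thousands of contributing phones instead of seven, even very noisy individual sensors average out into a usable atmospheric picture.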

Optimizing personal data and lifestyles
Advocates of Big Data in the office will be happy to learn they can take it home with them by using any of a number of devices aimed at optimizing a person's daily routine. An excellent example is the UP wristband by Jawbone, which tracks daily activity to help users build a more structured, healthy lifestyle. The wristband collects data while its wearer walks, sleeps, and eats, then syncs with a companion application that synthesizes the information into a concise report on the actions taken throughout each day. Bernard Marr, a Big Data analytics specialist, highlighted some of the device's fascinating features in a piece on LinkedIn.


Big Data Changes How Diseases are Diagnosed

July 2nd, 2014

It's no secret that Big Data is in the process of revolutionizing how we view the world, or that it has already transformed a number of industries in the past handful of years. One of the most fascinating areas of development is the health sector, which uses the technology to diagnose patients and locate potential risks by analyzing their medical histories and symptoms, preventing problems before they manifest. According to recent studies, such research is now more critical than ever in pinning down medical complications early.


Cutting costs can be just as important as a patient’s health – luckily, Big Data can potentially improve both using analytic research processes.

Insurers benefit from locating health risks before they occur
A recent piece in InformationWeek by Alison Diana explained how Big Data research has helped medical professionals identify a very specific condition using past records and symptoms as their guide.

“While organizations have used a lot of Big Data projects to discern trends, a study conducted by Aetna and GNS Healthcare analyzed data from almost 37,000 members of an Aetna employer customer who opted in for screening of metabolic syndrome – which can lead to chronic heart disease, stroke, and diabetes,” Diana expanded. “GNS analyzed information such as medical claims records, demographics, pharmacy claims, lab tests, and biometric screening results from a two-year period.”

Achieving this impressive result required massive amounts of information from cloud computing servers to narrow results down to individual patients, something that wouldn't have been possible before the emergence of Big Data technology. Because the risk for metabolic syndrome can now be identified far more quickly than in the past, health insurance providers are taking advantage of Big Data to restructure their systems, saving money and reducing complications on behalf of their customers. Research indicated that the condition is 90 percent less likely to affect patients who have a primary health care provider and attend regular checkups; by extension, it behooves major providers to tailor their services to whichever processes keep expenses low. Diana spoke with Adam Scott, managing director of Aetna Innovation Labs, about how cloud computing will change the way health insurance is delivered.

“If we can use information that we have on hand to understand more about disease and risk and provide that information to both our membership and those providers that care for those members, we can drive toward better value, delivered toward better outcomes,” Scott said.


Focus on Big Data at Big Telecom Event in Chicago

June 27th, 2014

The Big Telecom Event is an annual summit held by industry publication Light Reading that gathers some of the most important figures in the industry together to discuss progress, problems, and what’s on the horizon as technology continues to develop at a rapid pace. This year’s conference, held at the Sheraton Towers in downtown Chicago, didn’t neglect the massive popularity of Big Data and its emerging uses, which was the main topic of discussion for the panel “The Customer-Driven Telco: Real-Time Analytics, Big Data & CEM.”

The talk was moderated by Heavy Reading Senior Analyst Ari Banerjee and included the following panelists: Adan Pope, the CTO of Business Unit Support Solutions at Ericsson; Amy Millard, the Vice President of Marketing for Support.com; Sid Harshavat, Security Architect for Symantec; and Kevin McGinnis, the Vice President of Development and Operations for Pinsight Media at Sprint.

Focus on cloud infrastructure and organization
Much of the panel discussion was about using Big Data to its fullest potential: organizing a cloud server and all its data to best serve the customer and maximize the speed at which information can be delivered. Pope suggested that horizontal organization could be a major solution for companies looking to make data more accessible to the employees who use it, because a system with fewer middle layers won't garble information unnecessarily.

Millard suggested that another effective way to use cloud computing technology to its fullest potential is to put greater emphasis on developing analytics that make sense of large amounts of research-based data much faster, in order to better serve clients.

“Organizationally, bringing analytics teams in earlier during development [would be useful],” she commented.

However, Big Data won't organize itself on a company's whim: those who manage the data must decide what type of organizational structure makes sense for the needs of their staff before it can be built out as cloud infrastructure. CloudTweaks writer Syed Raza commented on the importance of a logical structure in a recent article.
