Archive for the ‘Big Data’ Category

 

How Big Data Tells a Story

Thursday, July 24th, 2014 by

Associations with Big Data tend to be pretty clinical: it’s often considered a tool to make more accurate scientific statements, identify trends in social media and news, and develop products by gauging customer response. In other words, the cloud computing tool is largely viewed as a shortcut to making money and creating new offerings for the public, whether that is a breakthrough medication, a new way to communicate wirelessly, or something the world has never even heard of. A less common but equally fascinating use of the technology, however, is as a storytelling mechanism, a capability that can be the most powerful use of all.

As in previous generations, science and storytelling need to coexist to remain powerful, a fact that rings true when considering the developing uses of Big Data.

The value of storytelling
Storytelling, and the value placed on the storyteller, is a tradition that has been ingrained in human culture for thousands of years. In generations past, before the written word and the widespread publishing of books and magazines, storytellers would enthrall listeners with memorized speeches in the manner of Ovid’s “Metamorphoses” and Homer’s “Odyssey.” A recent piece on the Fast CoCreate blog detailed some of the finer points of this tradition.

“Results repeatedly show that our attitudes, fears, hopes, and values are strongly influenced by story,” the source stated. “In fact, fiction seems to be more effective at changing beliefs than writing that is specifically designed to persuade through argument and evidence.”

These statements have plenty of evidence to back them up: stories sell. The movie and publishing industries bring in billions every year, and even our most prevalent social media tools, especially Facebook, are designed to tell the “story” of a user’s life online by highlighting which events and posts have received the most attention. This is just one example of mass data being boiled down to a basic storyline, but it’s a valuable one. Even Snapchat, the ever-present application famous for showing a user an image that disappears after a few seconds, has introduced the “Snapchat Stories” feature that lets users create a narrative from their brief messages.

How Big Data tells a story with accuracy and impact
There’s no doubt that the science behind Big Data is inescapable, but some data scientists have struggled to transform this information into a palatable story for the everyday user to consume. Jeff Bladt and Bob Filbin, data scientists for the activist charity-driven website Dosomething.org, wrote about this process, with which they’re still constantly experimenting, in a recent issue of Harvard Business Review.

(more…) «How Big Data Tells a Story»

Infographic: Big Data or Big Confusion? The Key is Open Data Services

Tuesday, July 22nd, 2014 by

When folks refer to “Big Data” these days, what is everyone really talking about? For several years now, Big Data has been THE buzzword used in conjunction with just about every technology issue imaginable. The reality, however, is that Big Data isn’t an abstract concept. Whether you like it or not, you’re already inundated with Big Data. How you source it, what insights you derive from it, and how quickly you act on it will play a major role in determining the course—and success—of your company. To help you get started understanding the key Big Data trends, take a look at this infographic: “60-Second Guide to Big Data and the Cloud.”

[Infographic: “60-Second Guide to Big Data and the Cloud”]

Handling the increased volume, variety, and velocity of data (the “3 Vs,” shown in the center of the infographic) requires a fundamental shift in the makeup of the platform used to capture, store, and analyze that data. A platform capable of handling and capitalizing on Big Data requires a mix of relational databases for structured data, NoSQL databases for unstructured data, caching solutions, and Hadoop-style MapReduce tools.

As the need for new technologies to handle the “3 Vs” of Big Data has grown, open source solutions have become the catalysts for innovation, generating a steady stream of new, relevant products to tackle Big Data challenges. Thanks to the skyrocketing pace of innovation in specialized databases and applications, businesses can now choose from a variety of proprietary and open source solutions, depending on the database type and their specific database requirements.

Given the wide variety of new and complex solutions, however, it’s no surprise that a recent survey of IT professionals showed that more than 55% of Big Data projects fail to achieve their goals. The most significant challenge cited was a lack of understanding of, and ability to pilot, the range of technologies on the market. This challenge systematically pushes companies toward a limited set of proprietary platforms that often reduce the choice to a single technology. Seeking one cure-all technology solution is no longer a realistic strategy. No single technology, such as a database, can solve every problem, especially when it comes to Big Data. Even if one such solution could serve multiple needs, successful companies are always trialing new solutions in the quest to perpetually innovate and thereby achieve (or maintain) a competitive edge.

Open Data Services and Big Data go hand-in-hand

(more…) «Infographic: Big Data or Big Confusion? The Key is Open Data Services»

How Big Data can Help Reduce Pollution

Thursday, July 17th, 2014 by

As Big Data continues to become a part of our everyday lives, new uses for the technology emerge that stand to improve the quality of life for millions of people. Such is potentially the case for the citizens of Beijing as one of the major projects in the field starts to take shape: an initiative to eliminate some of the city’s dangerous smog to improve the health of residents. IBM has announced that this plan will roll out over the next 10 years, with an emphasis on transforming the way air quality is analyzed.

Pollution disrupts professional routines and overall health
The pollution in Beijing has not only reduced the life expectancy of those who live in the heart of the city, but its constant presence also prevents citizens from enjoying their daily lives. According to a recent piece from Quartz writer Gwynn Guilford, the Chinese government is tasked with shutting down many of the basic operations of the city when PM2.5 concentrations grow too high, including closing schools and factories and limiting the number of cars that can drive within city limits.

Here’s where the cloud infrastructure comes in. Because Big Data works best when mass amounts of information are collected and then boiled down to deliver a concise result, IBM intends to use the method to learn more about what pollutes the air around Beijing by monitoring changes in the atmosphere.

“Called ‘Green Horizon,’ the project will focus on air quality management, renewable energy management, and energy optimization among Chinese industries,” Guilford explained. “As part of the initiative, IBM has already signed a partnership with the Beijing government, which is hoping to tap into the company’s expertise to help tackle the city’s air pollution crisis.”

Cloud servers will be used to analyze current air quality in the city and identify potential solutions for alternative energy. Reuters writer David Stanway speculated that the biggest source of pollution is likely still smog from factories and cars, and that IBM can probably use the same Big Data tools that identified the problem to find effective solutions. Possible long-term projects might include solar- and wind-powered installations within the city to reduce energy consumption.

(more…) «How Big Data can Help Reduce Pollution»

Is MapReduce Dead?

Tuesday, July 15th, 2014 by

With the recent announcement by Google of Cloud DataFlow (intended as the successor to MapReduce) and with Cloudera now focusing on Spark for many of its projects, it looks like the days of MapReduce may be numbered. Although the change may seem sudden, it’s been a long time coming. Google wrote the MapReduce white paper 10 years ago, and developers have been using at least one distribution of Hadoop for about 8 years. Users have had ample time to determine the strengths and weaknesses of MapReduce. However, the release of Hadoop 2.0 and YARN clearly indicated that users wanted to live in a more diverse Big Data world.

Earlier versions of Hadoop could be described as MapReduce + HDFS (Hadoop Distributed File System) because that was the paradigm that everything in Hadoop revolved around. Because users clamored for interactive access to Hadoop data, the Hive and Pig projects were started. And even though you could write SQL queries with Hive and script in Pig Latin with Pig, under the covers Hadoop was still running MapReduce jobs. That all changed in Hadoop 2.0 with the introduction of YARN, which became the resource manager for a Hadoop cluster and broke the dependence between MapReduce and HDFS. Although HDFS remained the file system, MapReduce became just another application that interfaces with Hadoop through YARN. This change made it possible for other applications to run on Hadoop through YARN.
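For readers who haven’t written a MapReduce job, the paradigm itself can be sketched in a few lines of plain Python. This is only an illustration of the map, shuffle, and reduce steps on a single machine, not Hadoop’s actual API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample sentences are made up for the example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, the step a framework performs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence over a list of text records."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical input records, used only for illustration.
    docs = ["big data tells a story", "big data reduces pollution"]
    print(word_count(docs))
```

In a real cluster the map and reduce functions run in parallel on many machines, and the framework handles the shuffle and the movement of data; the sketch keeps everything in one process so that the shape of the computation is easy to see.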

Google is not known as a backer of the open source Hadoop ecosystem in the mold of Hortonworks or Cloudera. After all, Google was already running its own MapReduce and its own counterpart to HDFS (the Google File System), on which these open-source projects are based. Because they are integral parts of Google’s internal applications, Google has the most experience with using these technologies. And although Cloud DataFlow is specifically for use on the Google cloud and appears more like a competitor to Amazon’s Kinesis product, Google is very influential in Big Data circles, so I can see other developers following Google’s lead and leveraging a similar technology in place of MapReduce.

Although Google’s Cloud DataFlow may have a thought leadership-type impact, Cloudera’s decision to leverage Spark as the standard processing engine for its projects (in particular, Hive) will have a greater impact on open-source Big Data developers. Cloudera has one of the most popular Hadoop distributions on the market and has partnered with Databricks, Intel, MapR, and IBM to work on its Spark integration with Hive. This move is surprising given Cloudera’s investment in Impala (its SQL query engine), but the company clearly feels that Spark is the future. As little as a year ago, Spark was mostly seen as fast in-memory computing for machine learning algorithms. However, with its promotion to an Apache Top-Level Project in February 2014 and its backing company Databricks receiving $33 million in Series B funding, Spark clearly has greater ambitions. The advent of YARN made it much easier to tie Spark to the growing Hadoop ecosystem. Cloudera’s decision to leverage Spark in Hive and other projects makes it even more important to users of the CDH distribution.

(more…) «Is MapReduce Dead?»

Big Data Revolutionizes the Gaming Industry

Thursday, July 10th, 2014 by

There are few technologies that promise to improve as many different industries as Big Data. Whether it’s medicine or the weather in your backyard, the mass aggregation and analysis of information could result in marked improvement and insight on nearly anything. The cloud computing technology may even change the way we have fun. It has already had an impressive effect on the video gaming industry and will have a great deal of influence on determining what the runaway hits of tomorrow will be. Here are a few small but important insights into the world of the gamer.

Big Data observes the user learn the game
The emerging technology offers those marketing and developing video games more insight than ever into what makes players tick, what makes them happy, and what keeps them engaged. Any game’s success is directly connected to the “addiction” factor: what is it about a certain game that makes users feel they can’t stop playing, and even more important, how can that feeling be monetized? To answer that question, each game must be broken down into individual characters, levels, and other gameplay features to determine what works and what doesn’t.

Qubole writer Gil Allouche wrote a piece on how Big Data can be used to decide how difficult individual levels should be in future incarnations of any given game. A cloud server can track how long it takes each player to finish a level, indicating whether early levels are too simple and need to be beefed up in difficulty or are discouraging new users because they’re too challenging. Mass amounts of data can help narrow down the right decision for an individual game.
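As a rough sketch of the kind of analysis described above, the snippet below labels each level by the median time players needed to finish it. Everything here is an assumption made for illustration: the function name `flag_levels`, the 30-second and 600-second thresholds, and the sample timings are invented, and a real studio would tune such cutoffs per game and work from far larger data sets.

```python
from statistics import median


def flag_levels(completion_times, too_easy=30.0, too_hard=600.0):
    """Label each level by the median completion time (in seconds) of its players.

    completion_times maps a level name to a list of per-player times.
    The thresholds are hypothetical defaults, not values from any real game.
    """
    labels = {}
    for level, times in completion_times.items():
        if not times:
            labels[level] = "no data"
            continue
        typical = median(times)
        if typical < too_easy:
            labels[level] = "too easy"
        elif typical > too_hard:
            labels[level] = "too hard"
        else:
            labels[level] = "ok"
    return labels


if __name__ == "__main__":
    # Hypothetical timings, in seconds, for three levels.
    sample = {
        "level_1": [12, 18, 25],
        "level_2": [140, 200, 310],
        "level_3": [700, 900, 1200],
    }
    for level, label in flag_levels(sample).items():
        print(level, label)
```

The median is used rather than the mean so that a handful of players who leave the game idle for an hour don’t make a level look harder than it is.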

Increasing sales on cloud-based consoles
For nearly all current gaming systems, the Internet and cloud hosting have integrated seamlessly to foster more sales as well as engagement between players on massively popular interactive games. Because the gaming store is online and can be accessed from the console itself, gamers are saved a trip to the store and can download a new experience right to their system in real time, which gives them less time to question a decision and lets them dive right into a purchase. Big Data also allows companies to better “recommend” games and products similar to the ones a gamer is already enjoying, increasing the likelihood of sealing a sale.

Real-world examples
EA Games, one of the largest video game developers and distributors on the planet, announced earlier this year a new commitment to improving its business model and products with the help of Big Data. This will give the company a huge technological advantage, especially when it comes to targeting advertising and maximizing player-to-player interaction in major gaming successes like Activision’s “Call of Duty” franchise and EA’s own “Battlefield” franchise. Silicon Angle, a science and technology blog, reported on the gaming company’s major statement.

(more…) «Big Data Revolutionizes the Gaming Industry»