Charting the Course to Big Data Analytics

In a recent ComputerWorld article by Chris Kanaracus about the challenges of reaching a mature state of big data analytics, IDC analysts Dan Vesset and Michael Versace outlined how Progressive Insurance is using big data to gain insight into its customers’ driving behaviors.

The piece points out several challenges that companies face when implementing big data initiatives, including deciding which data to keep or discard and closing the IT skills gap that holds back big data efforts.

While the article offered some interesting insights about big data, it included some misconceptions about Hadoop. First, it referred to Hadoop as a batch-only technology. While plenty of enterprises use it for batch processing, real-time capabilities are now available for Hadoop. For example, the MapR Big Data platform integrates with NFS, which enables MapR to support real-time computation workflows.
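To make the NFS point concrete, here is a minimal sketch of what standard file access allows. It assumes the cluster's file system is mounted over NFS at a local path; the path in the comment is hypothetical, and the function names are ours for illustration, not part of any MapR or Hadoop API. An ordinary program can then poll a growing file with plain file I/O and update a running hashtag count as new lines arrive, rather than waiting for a batch job.

```python
from collections import Counter


def read_new_lines(path, offset):
    """Return (lines, new_offset) for complete lines appended after byte `offset`.

    `path` is assumed to be a file on an NFS-mounted cluster path, for example
    /mapr/my.cluster.com/data/tweets.txt (a hypothetical location).
    """
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Consume only up to the last newline so that a half-written record
    # is left in place for the next poll.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8").splitlines()
    return lines, offset + end


def count_hashtags(lines, counts):
    """Add the hashtags found in `lines` to the running Counter `counts`."""
    for line in lines:
        for word in line.split():
            if word.startswith("#"):
                counts[word.lower()] += 1
    return counts
```

A caller would keep the returned offset between polls, call read_new_lines on a timer, and pass each batch of new lines to count_hashtags. Polling is only one simple approach; it is shown here because it needs nothing beyond ordinary file operations.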

Second, the article did not distinguish between the kinds of enterprise features needed to accelerate the adoption and expansion of Hadoop. At MapR, we think that enterprise-grade capabilities, such as replacing the Hadoop file system with NFS access and improving Hadoop’s reliability and performance, are crucial to making it part of an enterprise infrastructure and not just a science project. As our Twitter analysis demonstration shows, Hadoop can play many different roles in real-time processing, especially when NFS support is brought into play.

Real-time analysis is becoming increasingly important in the enterprise. Our Twitter demonstration during Strata, which showed in a dynamic display who was tweeting and what topics were being covered, is just one example of how the MapR Big Data platform can handle enterprise challenges such as real-time data streams. It’s true enough that standard Hadoop distributions can’t process real-time data, but MapR certainly can.
