The piece points out some challenges that companies face when implementing big data initiatives, including deciding which data to keep or discard and the IT skills gap in implementing big data efforts.
While the article offered some interesting insights about big data, it included some misconceptions about Hadoop. First, it referred to Hadoop as a batch-only technology. While plenty of enterprises use it for batch processing, real-time capabilities are now available for Hadoop. For example, the MapR Big Data platform integrates with NFS, enabling MapR to support real-time computation workflows.
Second, the article did not distinguish among the kinds of enterprise capabilities needed to accelerate the adoption and expansion of Hadoop. At MapR, we think that enterprise-grade capabilities, such as replacing the Hadoop file system with NFS access and improving Hadoop's reliability and performance, are crucial to making it a part of an enterprise infrastructure, and not just a science project. As our demonstration with Twitter analysis shows, Hadoop can play many different roles in real-time processing, especially when NFS support is brought into play.
Real-time analysis is becoming increasingly important in the enterprise. Our Twitter demonstration during Strata (which showed in a dynamic display who was tweeting and what topics were being covered) is just one example of the way that the MapR Big Data platform can handle enterprise challenges such as real-time data streams. It’s true enough that standard Hadoop distributions can’t process real-time data, but MapR certainly can.