I recently joined MapR to lead the Product Management group. Since I used to consult with MapR four years ago (before the company was launched), it is natural that everyone wants to know, "What's different four years later?"
The big answer is that the promise of Hadoop has turned into a reality for a wide array of situations, and the impact is really meaningful. Here are some first impressions after working at MapR for a few weeks. Note: I have to say up front that this is not meant to be a dispassionate research piece; it is just a set of top-of-mind impressions.
The market has quickly come down to just three major players with Hadoop distributions. If you look at writeups about the Hadoop space from just this month, you will see that there are clearly very different strategies in play. The split is between a technology/product-focused play and a services-based strategy. MapR is the only company that has continued since day one to bet on furthering the technology of this space with real enterprise-grade products.
The scope and diversity of use cases have increased. More people are successfully deploying Hadoop in more industries and more situations. There is no one use case that is the obvious “first thing to go do.” The closest to that would be data warehouse offloading in order to: a) radically reduce the cost of storage, and b) make the data available for use with big data/Hadoop technologies.
Customers get faster payback, and then want to go big really fast! "Go big" in this case often means rapidly expanding to more use cases. What is great about this trend is that it creates a virtuous cycle as confidence builds around the technology.
Big data/Hadoop is truly transformational for many companies. For many customers, big data is not an experiment but is at the core of their business operations. They couldn't function well without it. Of course, these customers are also reluctant to share many details about their deployments because it is such a competitive advantage for them.
There is clearly a need to expand the base of expertise in big data techniques. Organizations are trying to figure out how to get more of their staff up to speed. This need is not just about “data scientists” or “IT skills.” Rather, it relates to a broad set of interconnected work that affects the business user, the BI-skilled staff, the data scientist, and the IT team. One of the most interesting innovations in this space is the Apache Drill project, which radically changes how data can be explored by applying the familiar SQL paradigm to enterprise-grade NoSQL databases. Check out this demo of an online retailer use case with Apache Drill. If you’d like to explore Drill further, try the MapR Sandbox with Drill, install Drill on your MapR cluster, or download Drill on your laptop. You’ll find all the relevant links on our Apache Drill page.
There is a growing need for real-time data. Batch processing won't go away, but the new data lakes and data hubs that are being created are enabling customers to ingest and process data, and to get insights from it, in real time. Terms like "operational analytics" are starting to emerge for this space. It turns out that the core technology innovations from MapR happen to be exceptionally well-suited to this type of in-the-moment processing.
What has not changed is the core MapR strategy from its founding days. The lesson is that sticking to your “game” over a sustained period produces great results. In the case of MapR, the results are a much better product that enables customers to drive more value faster from big data.
At MapR, we’ve seen Hadoop evolve on a straight trajectory. We’ve worked hard to add unique architectural enhancements so that you have a production-ready data platform that will meet the demands of your business. How has Hadoop evolved within your organization in the past four years? Tell us in the comments section below.