Architectural Advantages Widen Innovation Gap

MapR’s competitors have been presenting “futures” to prospects since the day MapR came out of stealth. For the most part, the “futures” that were presented in 2011 are still futures in 2013.

Merv Adrian writes about this in his blog post “That Exciting New Stuff? Yeah… Wait till it Ships,” where he stresses that comparisons should be based on what’s actually available and points out that the company in the lead won’t stand still. Adrian suggests that although the trailers will eventually catch up with the leader’s current position, the leader will have moved forward by then.

With that in mind, I believe an important question is whether the trailers have the architecture (i.e., the foundation) in place to implement the missing functionality. Only if they do can the difference “erode over time,” as Adrian suggests.

For example, supporting random writes and POSIX semantics in HDFS is practically impossible. It won’t happen in one year, or even in three years; it’s not even on the roadmap. That is not because companies don’t need it (POSIX is one of the most popular features in MapR), but because the architecture of HDFS rules it out. Similarly, the architecture of HDFS prevents consistent snapshots, because metadata and data are kept separately: metadata is on the NameNode and data is on the DataNodes, so applications have to be designed to be snapshot-aware.
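To make the random-write point concrete, here is a minimal Python sketch of a POSIX-style in-place update. It runs against a local temporary file. The assumption is that any file system exposing POSIX semantics (for instance, through an NFS mount) would accept the same calls, whereas the HDFS client API is built around sequential writes and appends rather than seeking to an offset and overwriting. The helper name overwrite_in_place is made up for this illustration and is not part of any product API.

```python
import os
import tempfile


def overwrite_in_place(path: str, offset: int, data: bytes) -> None:
    """Overwrite bytes at `offset` without rewriting the rest of the file.

    Hypothetical helper for illustration only. It relies on ordinary
    POSIX file semantics: open for update, seek, write.
    """
    with open(path, "r+b") as f:   # "r+b" = read/write, no truncation
        f.seek(offset)             # jump to an arbitrary position
        f.write(data)              # replace bytes at that position


# Create a small demo file (a local temp file stands in for a POSIX mount).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"AAAAAAAAAA")
    demo_path = tmp.name

# Random write: change two bytes in the middle of the file.
overwrite_in_place(demo_path, 4, b"xy")

with open(demo_path, "rb") as f:
    result = f.read()

os.remove(demo_path)
print(result)
```

An application written against a write-once, append-only store would instead have to rewrite the whole file to change those two bytes.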

In some cases, architectural advantages allow the leader to innovate at an increasing rate, so the gap is not only maintained but can widen dramatically. For example, MapR recently released its M7 edition, which enables 24×7 HBase applications by eliminating compactions and providing seamless splits and instant recovery. This was possible only because of MapR’s underlying architecture. In other words, MapR is able to deliver functionality that its competitors cannot even begin to work on.

It’s also important to keep in mind that no single company, whether it’s MapR, Cloudera or Hortonworks, can compete with the open source community, which spans hundreds of projects, libraries and applications. Any innovation by a Hadoop vendor must therefore be delivered in a way that still lets customers benefit from the innovation taking place in that community. For example, the MapR distribution includes over a dozen open source Apache-licensed projects (Hive, Stinger, Pig, Oozie, Flume, Sqoop, Cascading, ZooKeeper, HBase, etc.). The community continues to produce new and exciting projects, such as YARN, Apache Drill and Apache Tez, and customers obviously want the functionality these projects offer.
