In addition to asking whether an innovation is “distribution specific”, it’s important to consider whether that innovation introduces vendor lock-in. To the extent possible, Hadoop distribution providers should aim to deliver unique innovation below common, standard APIs. When that happens, the distribution provides more (i.e., unique) value to customers without introducing vendor lock-in.
For example, at MapR we were able to innovate below the HDFS and HBase APIs, providing more value to customers while preserving their ability to migrate easily between MapR and other distributions (with no code changes or even recompilation). When we started the Apache Drill project to provide next-generation SQL-in-Hadoop technology, it was clear that the only way to avoid vendor lock-in was to develop it as a community-driven Apache project that could be used with any Hadoop distribution.
It will be interesting to see if Merv’s idea catches on.