Get Real with Hadoop: Unbiased Open Source

In this blog series, we’re showcasing the top 10 reasons customers choose the MapR Distribution including Hadoop to optimize their data-driven strategies. Reason #8: MapR provides Unbiased Open Source supporting 20+ OSS projects.

MapR provides the most flexible application development platform for Hadoop users. This platform flexibility addresses critical developer needs, including technology choice and user-driven deployment and maintenance. Let’s look at these advantages in detail.

Flexible Application Development Platform
The Hadoop ecosystem is evolving rapidly, and the variety of applications that can be built around big data keeps growing, so you need a platform that can keep up. You do not want to be locked into a specific technology stack, whether you are in early-stage development or expanding your use cases.

To that end, much like Linux distributions that ship and support many standard and "competing" open source packages (e.g., both the PostgreSQL and MySQL databases), the MapR Distribution gives customers maximum freedom of choice among Apache and other OSS projects. MapR is the broadest Hadoop distribution, with over 20 open source projects that are tested, hardened, and packaged to ensure production success. Today, MapR supports multiple frameworks, including MapReduce v1, YARN, and Spark; several SQL-on-Hadoop technologies (Apache Hive, Apache Drill, Apache Spark SQL, and Impala); and multiple options for machine learning, streaming applications, and in-Hadoop NoSQL databases.

MapR provides monthly updates to these open source packages, so they stay current. We work closely with you to pick the right tool for the job, help you take advantage of the latest open source developments, and make sure that you implement only software that is truly production-ready for your use case.

User-driven Production Deployment and Maintenance
Another critical need for users, especially those who have deployed Hadoop applications in production, is the flexibility to keep using specific versions of an open source package while upgrading other parts of Hadoop. You should be able to upgrade to the latest versions of the packages only when you are ready, not when the community has moved on.

In this regard, MapR uniquely supports multiple versions of different packages, enabling backward compatibility. Other distributions lack this feature: they support only one version of a package at any given time, which forces users to upgrade and requires existing applications to be recompiled or even rewritten. With MapR, however, you can upgrade core Hadoop packages without upgrading ecosystem packages, or upgrade ecosystem packages without upgrading the core. This also means that you can perform rolling upgrades to your Hadoop platform without requiring applications to be recompiled.

Multiple version support also allows you to deploy efficient, cross-departmental Hadoop clusters. For example, MapR makes it possible to run Hive 0.11, 0.12, and 0.13 in the same cluster at the same time, so different departments in an enterprise can develop applications against the version they need and migrate to newer versions at their own pace. The same holds true for MapReduce v1 and YARN applications: a single cluster that has already upgraded to YARN can still run pre-YARN applications, so customers can move to YARN gradually. We have customers that run thousands of MapReduce jobs on their clusters every day, where this level of migration planning is required.
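
To make the idea concrete, here is a minimal Python sketch of side-by-side versions with per-department pinning. It is purely illustrative: the `Cluster` class, its methods, and the department names are hypothetical and are not MapR APIs or tooling. Only the Hive version numbers come from the example above.

```python
# Hypothetical model of multi-version support; none of these names are MapR APIs.
from dataclasses import dataclass, field


@dataclass
class Cluster:
    # Versions of each package installed side by side on the cluster.
    installed: dict = field(default_factory=dict)
    # Version each department has pinned, keyed by (department, package).
    pins: dict = field(default_factory=dict)

    def install(self, package: str, version: str) -> None:
        """Add a version without removing the ones already present."""
        self.installed.setdefault(package, set()).add(version)

    def pin(self, department: str, package: str, version: str) -> None:
        """Pin a department to one installed version of a package."""
        if version not in self.installed.get(package, set()):
            raise ValueError(f"{package} {version} is not installed")
        self.pins[(department, package)] = version

    def version_for(self, department: str, package: str) -> str:
        """Return the version a department is currently pinned to."""
        return self.pins[(department, package)]


cluster = Cluster()
cluster.install("hive", "0.11")
cluster.install("hive", "0.12")

# Two (hypothetical) departments start on different Hive versions.
cluster.pin("finance", "hive", "0.11")
cluster.pin("marketing", "hive", "0.12")

# A newer version arrives; existing pins are left untouched.
cluster.install("hive", "0.13")

# Marketing migrates when it is ready; finance stays where it is.
cluster.pin("marketing", "hive", "0.13")
```

The point of the sketch is that installing a new version and moving a department to it are separate steps, so one team's migration never forces another's.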

Best Option for Production Use
By combining the power of over 20 open source projects, providing monthly updates to the latest versions of the packages, and ensuring backward compatibility across packages, MapR provides you with the best options to deploy production-ready software.

The MapR model for a Hadoop distribution has delivered tremendous value for our customers. You can think of the MapR model as an open core distribution that comes with an enhanced foundation and companion products for administration. To learn more about the different open source models for Hadoop and the MapR advantage, read the CITO Research white paper: Putting Hadoop to Work the Right Way.

And get the complete top 10 list here.

