We at MapR want to congratulate the Apache Myriad team today on announcing version 0.1 of the Myriad project, the first community release. Myriad will soon be officially included in the MapR Distribution, a decision that reflects how we’ve been thinking about the future of the data center, and particularly where big data services fit in. In this blog post, I’ll share how we see Myriad delivering value to customers, and how it fits in with the MapR platform.
Myriad, combined with the MapR platform, addresses a key issue we’ve been hearing from our customers—the explosion of silos. Silos can take many forms, including:
- Application silos - Companies often dedicate servers for a particular application, like an Apache web server farm, a Jenkins continuous integration system, or a Hadoop cluster.
- Service silos - Each of these applications may need a variety of data services, from file storage to database to messaging, each potentially requiring a dedicated environment.
- Organizational silos - Distinct user groups, often defined by business units or departments, try to maintain tight controls over their own applications, services, and data.
- Temperature silos - Companies often deploy multiple storage tiers with different price points and performance capabilities to save costs when data starts going cold.
Why is this a problem? For starters, it isn’t very efficient. Each silo must be sized for peak load, not average load, meaning that if you’re a retailer, you need to allocate enough resources to your web server farm to handle Black Friday sales or any other sales spike throughout the year, even though most of that capacity sits idle the rest of the time. The next reason is less obvious but even more of a headache: silos trap data, which leads to fragmentation of data across the enterprise. This forces companies to build complex data movement processes to get data from the silos where it is generated, like the web server farm, to the silos where it is processed, like Hadoop.
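To make the sizing argument concrete, here is a minimal sketch in Python. The three workloads and their hourly load figures are invented purely for illustration; they are not measurements from any real deployment. The sketch compares sizing each silo for its own peak against sizing one shared pool for the peak of the combined load.

```python
# Hypothetical load per time slot, in servers needed.
# All numbers are made up for illustration only.
web_farm = [20, 20, 80, 30]
ci_system = [40, 10, 10, 10]
hadoop = [10, 60, 10, 50]


def siloed_capacity(workloads):
    """Each silo is sized for its own peak, so the capacities add up."""
    return sum(max(load) for load in workloads)


def shared_capacity(workloads):
    """A shared pool is sized for the peak of the combined load."""
    combined = [sum(slot) for slot in zip(*workloads)]
    return max(combined)


workloads = [web_farm, ci_system, hadoop]
siloed = siloed_capacity(workloads)
shared = shared_capacity(workloads)
print(f"siloed: {siloed} servers, shared: {shared} servers")
# prints "siloed: 180 servers, shared: 100 servers"
```

The shared figure can never exceed the siloed one, and the two are equal only when every workload peaks in the same time slot. How much a real data center would save depends entirely on how its actual workloads overlap.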
MapR, Myriad, and Mesos together avoid the silos in a few ways:
- Mesos provides a data center resource scheduler that allows a variety of applications to be deployed in a single environment, sharing compute resources. Myriad plugs into Hadoop, allowing it to negotiate and share resources with Mesos. Before Myriad, Hadoop applications required a separate cluster from the rest of your data center. With Myriad, Hadoop applications can run side by side with all other applications, eliminating the application silos.
- Myriad allows multiple logical Hadoop clusters to be provisioned in a single environment.
- MapR provides a universally available, secure, multi-tier data layer that underlies the entire application environment. Data can be created, shared, processed, and migrated between tiers in one namespace, eliminating data silos and complex data movement pipelines.
Moving forward, we’ll continue working with the Apache Myriad community on adding features and improving stability. Look for another post from us soon announcing Myriad’s inclusion in the MapR Distribution.