MapR: Delivering on the Promise of Enterprise-Grade Hadoop

Apache HBase is a NoSQL database solution for large key-value based data sets that provides scale and strong consistency, combined with MapReduce functionality over Hadoop. About half of Hadoop users today deploy Apache HBase for their NoSQL operations.

HBase,¹ however, has not reached its true adoption potential because of its complex, multilayered architecture. HBase stores its data in the Hadoop Distributed File System (HDFS), which runs on a Java virtual machine and in turn stores its data in the Linux file system (ext3), which finally writes the data to disk. HDFS is a write-once file system and is not well suited to the random I/O that database operations require.

HBase tries to overcome this limitation with an architecture built around RegionServers, which interact with several other overlapping software components. This architecture also involves manual operations such as region splits, and it is prone to intermittent I/O storms. The resulting complexity introduces several points of failure within the system, creating administrative challenges and inconsistent performance.

MapR attempts to address these issues by building on its technology that brings random read-write functionality to Hadoop. MapR M7, the new edition from MapR, has a simplified architecture written in C++ that unifies tables and files on one platform with no intermediate layers. There are no RegionServers, manual splits, or compactions in M7. This means that HBase applications can now recover in seconds, use enterprise-grade features such as snapshots and mirroring, and benefit from better performance and scalability.

Visit for more information on M7.

1. Apache HBase and HBase are trademarks of the Apache Software Foundation.

