In most clusters, a small number of nodes runs a set of control services devoted to cluster management and Hadoop infrastructure:
- CLDB
- JobTracker
- WebServer
- ZooKeeper
The remainder of the nodes are devoted to services related to data processing and storage:
- FileServer
- TaskTracker
Supplementary services can run on many or few nodes, depending on how the cluster is to be used. Examples:
- NFS
- HBase
The following table provides general guidelines for the number of instances of each service to run in a cluster:
| Service | Package | How Many |
|---|---|---|
| CLDB | mapr-cldb | 1-3 |
| FileServer | mapr-fileserver | Most or all nodes |
| HBase Master | mapr-hbase-master | 1-3 |
| HBase RegionServer | mapr-hbase-regionserver | Varies |
| JobTracker | mapr-jobtracker | 1-3 |
| NFS | mapr-nfs | Varies |
| TaskTracker | mapr-tasktracker | Most or all nodes |
| WebServer | mapr-webserver | One or more |
| ZooKeeper | mapr-zookeeper | 1, 3, 5, or a higher odd number |
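
The numeric guidelines in the table can be checked mechanically before installation. The following is a minimal sketch, not a MapR tool: the `RULES` dictionary and the `check_counts` function are hypothetical names introduced here, and only the rows with a concrete count are covered ("Most or all nodes" and "Varies" are left to judgment).

```python
# Count rules transcribed from the guideline table. The names RULES and
# check_counts are illustrative assumptions, not part of any MapR utility.
RULES = {
    "mapr-cldb": lambda n: 1 <= n <= 3,
    "mapr-hbase-master": lambda n: 1 <= n <= 3,
    "mapr-jobtracker": lambda n: 1 <= n <= 3,
    "mapr-webserver": lambda n: n >= 1,
    # 1, 3, 5, or a higher odd number
    "mapr-zookeeper": lambda n: n >= 1 and n % 2 == 1,
}


def check_counts(counts):
    """Return the packages whose planned instance count breaks a guideline.

    `counts` maps a package name to the number of nodes that will run it.
    Packages without a numeric guideline are ignored.
    """
    problems = []
    for package, count in counts.items():
        rule = RULES.get(package)
        if rule is not None and not rule(count):
            problems.append(f"{package}: {count} instance(s)")
    return problems


if __name__ == "__main__":
    plan = {"mapr-cldb": 3, "mapr-jobtracker": 2, "mapr-zookeeper": 4}
    for problem in check_counts(plan):
        print("Outside guideline:", problem)
```

In this example only the ZooKeeper count is reported, because four is an even number.
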
Sample Configurations
The following sections describe a few typical ways to deploy a MapR cluster.
Small M3 Cluster
A small M3 cluster runs its control services on a single node, except for ZooKeeper, which runs on three nodes; data services run on the remaining nodes. The M3 license does not permit failover or high availability, and allows only one running CLDB.

Small M5 Cluster
A small M5 cluster runs control services on three nodes and data services on the remaining nodes, providing failover and high availability for all critical services.
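
The split described above can be sketched as a node-to-package plan. This is only an illustration under stated assumptions: the function name `plan_small_m5` and the node names are hypothetical, every control node is assumed to run all four control packages, and the split between control and data nodes is taken literally. A real deployment may place additional packages on the control nodes.

```python
# Package groups taken from the service lists earlier in this section.
CONTROL = ["mapr-cldb", "mapr-jobtracker", "mapr-webserver", "mapr-zookeeper"]
DATA = ["mapr-fileserver", "mapr-tasktracker"]


def plan_small_m5(nodes, control_count=3):
    """Map each node name to the list of packages planned for it.

    The first `control_count` nodes get the control packages and the rest
    get the data packages. This mirrors the description literally and is
    an assumption, not an official layout.
    """
    if len(nodes) <= control_count:
        raise ValueError("need at least one node beyond the control nodes")
    layout = {}
    for index, node in enumerate(nodes):
        layout[node] = list(CONTROL) if index < control_count else list(DATA)
    return layout


if __name__ == "__main__":
    # Hypothetical host names for a six-node cluster.
    for node, packages in plan_small_m5([f"node{i}" for i in range(1, 7)]).items():
        print(node, ", ".join(packages))
```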

Larger M5 Cluster
A large cluster (over 100 nodes) should isolate CLDB nodes from the TaskTracker and NFS nodes.

Note: In large clusters, do not run TaskTracker and ZooKeeper together on any node.
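
The separation guidance for large clusters can be expressed as a list of package pairs that should not share a node. The sketch below is an assumption-laden illustration: `FORBIDDEN_PAIRS` and `colocation_violations` are hypothetical names, and "isolate" is read here as "do not install on the same node".

```python
# Pairs that the large-cluster guidance says to keep on separate nodes.
# Interpreting "isolate" as "never on the same node" is an assumption.
FORBIDDEN_PAIRS = [
    ("mapr-cldb", "mapr-tasktracker"),
    ("mapr-cldb", "mapr-nfs"),
    ("mapr-tasktracker", "mapr-zookeeper"),
]


def colocation_violations(layout):
    """Return (node, package_a, package_b) for each forbidden pairing.

    `layout` maps a node name to the packages planned for that node.
    Nodes are visited in sorted order so the result is deterministic.
    """
    violations = []
    for node in sorted(layout):
        present = set(layout[node])
        for first, second in FORBIDDEN_PAIRS:
            if first in present and second in present:
                violations.append((node, first, second))
    return violations


if __name__ == "__main__":
    # Hypothetical layout with one deliberate mistake on node "b".
    plan = {
        "a": ["mapr-cldb", "mapr-fileserver"],
        "b": ["mapr-tasktracker", "mapr-zookeeper"],
    }
    for node, first, second in colocation_violations(plan):
        print(f"{node}: {first} should not share a node with {second}")
```
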
Example
[Figure: RackWorksheetLarger.png (image not available)]