MapR Control System Part 2: Setting up Volumes, Snapshots and Mirrors

Overview

The MapR Control System (MCS) is a graphical, programmatic control panel for cluster administration that provides complete cluster monitoring functionality and most of the functionality of the command line. This tutorial is Part 2 of a three-part series on the MCS and covers setting up volumes, snapshots, and mirrors.

Use the tutorials to perform the following operations in the MCS:

Part 1

  • Explore the Dashboard view
  • Set up topology

Part 2 (This tutorial)

  • Create volumes
  • Take snapshots
  • Create mirror volumes

Part 3

  • Configure notifications and alarms
  • Review job metrics

Volumes

A volume is a logical unit that lets you organize data into groups, so that you can manage data and apply policy to a whole group at once instead of file by file. The volume structure defines how data is distributed across the nodes in your cluster.

You can create volumes for each user, department, or project. Volumes can enforce disk usage limits, set replication levels, establish ownership and accountability, and measure the cost generated by different projects or departments.

Configure volumes as soon as possible after getting your cluster up and running. Putting all your data in the cluster without organizing it into volumes makes the data harder to manage later. Create many volumes for data storage, and choose their layout strategically so that they are easy to manage. From the MCS, you can easily create and name volumes and designate their mount paths.

Volumes enable the following data management features that MapR provides:

  • Volume topology lets you specify a subset of cluster nodes that a volume is allowed to use for data placement (see Setting Volume Topology).
  • Snapshots let you preserve the state of a volume at a particular point in time (see Snapshots).
  • Mirrors let you create read-only copies of a volume for load-balancing, separation of development from production, or backup (see Mirror Volumes).

A MapR cluster comes with certain system volumes out of the box. The following diagram shows the system volumes (blue) along with recommended volumes that you should add to your new cluster.


The root volume (mapr.cluster.root, mounted at /) contains the mount points for the other volumes. MapR provides a volume for HBase (if installed) and a /var/mapr volume containing information about cluster configuration. There is also a local volume for each node, restricted by its topology to reside only on that node.

As shown in the example above, you should add a hierarchy of volumes for users, projects and departments, to enable you to manage data for these different entities separately.

Create a Volume

  1. Click Volumes in the Navigation pane.
  2. Click New Volume.

  3. Specify volume settings:
    • Volume Setup: Set the name and mount path of the volume. The mount path determines where the volume will be mounted. Following the volume layout diagram above, you might, for example, create a volume called johnsmith with a mount path of /users/jsmith. You can also set the volume topology here (the default is /data, which uses all racks) and choose whether to create a normal read/write volume or a mirror volume.
    • Permissions: Set the permissions, for each user, for volume operations such as backing up or deleting the volume.
    • Usage Tracking: Set a quota, if desired, to limit the maximum size of the volume. The hard quota is a limit above which writes to the volume are disabled; the advisory quota is a limit above which a warning is sent to the volume's owner.
    • Replication: Set the desired replication and the replication method for the volume.
For more information on what these settings mean, see Managing Data with Volumes.
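
Because the MCS covers most of what the command line offers, the settings above can also be pictured as command-line arguments. The Python sketch below only assembles an argument list for a volume-creation command and does not run anything. The helper name build_volume_create_args is hypothetical, the quota values are made up for illustration, and the flag names (-name, -path, -topology, -quota, -advisoryquota, -replication) are assumptions about the maprcli syntax that should be checked against the documentation for your MapR release.

```python
from typing import List, Optional


def build_volume_create_args(
    name: str,
    mount_path: str,
    topology: str = "/data",
    hard_quota: Optional[str] = None,
    advisory_quota: Optional[str] = None,
    replication: Optional[int] = None,
) -> List[str]:
    """Assemble (but do not run) a volume-creation argument list.

    All flag names are assumptions; verify them for your MapR release.
    """
    args = [
        "maprcli", "volume", "create",
        "-name", name,
        "-path", mount_path,
        "-topology", topology,
    ]
    # Optional settings are only added when the caller supplies them.
    if hard_quota is not None:
        args += ["-quota", hard_quota]
    if advisory_quota is not None:
        args += ["-advisoryquota", advisory_quota]
    if replication is not None:
        args += ["-replication", str(replication)]
    return args


# Name and mount path come from the Volume Setup example above;
# the quota values are hypothetical.
example_args = build_volume_create_args(
    "johnsmith", "/users/jsmith", hard_quota="10G", advisory_quota="8G"
)
print(" ".join(example_args))
```

Keeping the advisory quota below the hard quota, as in this example, means the volume's owner is warned before writes are disabled.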

Snapshots

A snapshot is a read-only image of a volume at a specific point in time. Snapshots are useful any time you need to roll back to a known good data set. You can create a snapshot manually or automate the process with a schedule. If you want to automate the snapshot with a schedule, configure schedule details first.

Create a snapshot manually:

  1. In the Navigation pane, expand the MapR-FS group and click the Volumes view.
  2. Select the checkbox beside the name of each volume for which you want a snapshot, then click the New Snapshot button to display the Snapshot Name dialog.
  3. Type a name for the new snapshot in the Name... field.
  4. Click OK to create the snapshot.
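
The manual steps above take one snapshot per selected volume, all sharing the name you type. As a rough sketch, the same idea can be written as one command per volume. The helper name build_snapshot_args, the volume names, and the snapshot name are hypothetical, and the -volume and -snapshotname flags are assumptions about the maprcli syntax to verify for your release. The code builds argument lists only and does not execute them.

```python
from typing import List


def build_snapshot_args(volume: str, snapshot_name: str) -> List[str]:
    """Assemble (but do not run) a snapshot-creation argument list.

    Flag names are assumptions; verify them for your MapR release.
    """
    return [
        "maprcli", "volume", "snapshot", "create",
        "-volume", volume,
        "-snapshotname", snapshot_name,
    ]


# Hypothetical volumes, standing in for the checkboxes selected in step 2.
selected_volumes = ["johnsmith", "projects"]
snapshot_commands = [
    build_snapshot_args(volume, "before-upgrade") for volume in selected_volumes
]
for command in snapshot_commands:
    print(" ".join(command))
```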

Create a snapshot schedule:

  1. In the Navigation pane, expand the MapR-FS group and click the Schedules view.
  2. Click New Schedule.
  3. Type a name for the new schedule in the Schedule Name field.
  4. Define one or more schedule rules in the Schedule Rules section:
    a. From the first dropdown menu, select a frequency (Once, Yearly, Monthly, etc.)
    b. From the next dropdown menu, select a time point within the specified frequency. For example: if you selected Monthly in the first dropdown menu, select the day of the month in the second dropdown menu.
    c. Continue with each dropdown menu, proceeding to the right, to specify the time at which the scheduled action is to occur.
  5. Use the Retain For field to specify how long the data is to be preserved. For example: if the schedule is attached to a volume for creating snapshots, the Retain For field specifies how far after creation the snapshot expiration date is set.
  6. Click [ + Add Rule ] to specify additional schedule rules, as desired.
  7. Click Save Schedule to create the schedule.
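
The Retain For setting in step 5 amounts to simple date arithmetic: the expiration date is the creation time plus the retention period. The following Python sketch illustrates that calculation only. The function name snapshot_expiration and the example dates are hypothetical, and this is not a description of how MapR computes expiration internally.

```python
from datetime import datetime, timedelta


def snapshot_expiration(created_at: datetime, retain_for: timedelta) -> datetime:
    """Return the expiration time: creation time plus the retention period."""
    return created_at + retain_for


# A snapshot taken on 31 January at 02:00 and retained for 7 days.
created = datetime(2024, 1, 31, 2, 0)
expires = snapshot_expiration(created, timedelta(days=7))
print(expires.isoformat())  # 2024-02-07T02:00:00
```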

Schedule a snapshot:

  1. In the Navigation pane, expand the MapR-FS group and click the Volumes view.
  2. Display the Volume Properties dialog by clicking the volume name, or by selecting the checkbox beside the name of the volume and then clicking the Properties button.
  3. In the Snapshot Scheduling section, choose a schedule from the Snapshot Schedule dropdown menu.
  4. Click OK to save changes to the volume.

Mirror Volumes

A mirror volume is a read-only physical copy of a source volume. You can use mirror volumes in the same cluster (local mirroring) to provide local load balancing. Local mirror volumes can serve read requests for the most frequently accessed data in the cluster. You can also mirror volumes on a separate cluster (remote mirroring) for backup and disaster readiness purposes.

Create a local mirror volume:

  1. In the navigation pane, select MapR-FS > Volumes.
  2. Click the New Volume button.
  3. In the New Volume dialog, specify the following values:
    • Select Local Mirror Volume.
    • Enter a name for the mirror volume in the Mirror Name field. If the mirror is on the same cluster as the source volume, the source and mirror volumes must have different names.
    • Enter the source volume name (not mount point) in the Source Volume Name field.
  4. (Optional) To automate mirroring, select a schedule corresponding to critical data, important data, normal data, or a user-defined schedule from the Mirror Schedule dropdown menu.

Create a remote mirror volume:

  1. In the navigation pane, select MapR-FS > Volumes.
  2. Click the New Volume button.
  3. In the New Volume dialog, specify the following values:
    • Select Remote Mirror Volume.
    • Enter a name for the mirror volume in the Volume Name field. If the mirror is on the same cluster as the source volume, the source and mirror volumes must have different names.
    • Enter the source volume name (not mount point) in the Source Volume field.
    • Enter the source cluster name in the Source Cluster field.
  4. To automate mirroring, select a schedule from the Mirror Update Schedule dropdown menu.
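
Both mirror procedures identify the source by volume name, plus a cluster name for remote mirrors, and require distinct names when the mirror and source share a cluster. The Python sketch below encodes that naming rule and assembles a command-line argument list without running it. The helper name build_mirror_create_args and the cluster and volume names are hypothetical. The "-type mirror" flag and the volume@cluster form of -source are assumptions about the maprcli syntax that may differ between MapR releases.

```python
from typing import List


def build_mirror_create_args(
    mirror_name: str,
    source_volume: str,
    source_cluster: str,
    same_cluster: bool,
) -> List[str]:
    """Assemble (but do not run) a mirror-volume creation argument list.

    Flag names and the volume@cluster source format are assumptions;
    verify them for your MapR release.
    """
    # On a single cluster, the source and mirror must have different names.
    if same_cluster and mirror_name == source_volume:
        raise ValueError("a local mirror must not share its source volume's name")
    return [
        "maprcli", "volume", "create",
        "-name", mirror_name,
        "-type", "mirror",
        "-source", f"{source_volume}@{source_cluster}",
    ]


# Hypothetical remote mirror of the johnsmith volume from another cluster.
remote_mirror_args = build_mirror_create_args(
    "johnsmith-mirror", "johnsmith", "prod.cluster.example", same_cluster=False
)
print(" ".join(remote_mirror_args))
```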