Understanding Memory Utilization on MapR Cluster Nodes

There is much more to memory calculations than just the “used” and “free” states. Here’s a quick primer on understanding memory utilization as reported by the MapR framework.
 
The MapR framework shows memory utilization on nodes based on two values reported by the memory manager. To view the values we use as the basis for memory utilization shown in our GUI/CLI, run the “free” command:
 
aeng@mynode ~$ free
             total       used       free     shared    buffers     cached
Mem:      65980752   39015488   26965264          0     336352   12910612
-/+ buffers/cache:   25768524   40212228
Swap:     64434300          0   64434300
aeng@mynode ~$
 
In this example, you can see that this machine has about 64GB of memory, and the memory manager reports that 39GB (59%) is currently being used. However, there are more states for memory than just "used" and "free." Multiple layers are built into memory management, so the amount of memory currently "free" as shown in the output above is not necessarily the amount of memory that a process can use if it needs it. The example above shows 26GB of free memory, yet the "-/+ buffers/cache" line reports 40GB free. The two figures overlap: the second is the 26GB of free memory plus roughly 13GB held in buffers and cache, which the kernel can reclaim when a process needs it. So clearly there is more to memory calculations than just the used/free states.
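The arithmetic behind these figures can be checked with a short script. What follows is a minimal sketch, not part of the MapR tooling: the function names parse_mem_line and summarize are hypothetical, the values are copied from the example output (in KiB), and the parsing assumes the older "free" layout shown above. Newer versions of "free" print different columns, so the script would need adjusting there.

```python
# Sketch: recompute the figures from the sample `free` output above.
# All names here are illustrative; values are in KiB.

SAMPLE = """\
             total       used       free     shared    buffers     cached
Mem:      65980752   39015488   26965264          0     336352   12910612
-/+ buffers/cache:   25768524   40212228
Swap:     64434300          0   64434300
"""


def parse_mem_line(text):
    """Return the fields of the 'Mem:' row as a dict of integers (KiB)."""
    names = ["total", "used", "free", "shared", "buffers", "cached"]
    for line in text.splitlines():
        if line.startswith("Mem:"):
            values = [int(v) for v in line.split()[1:]]
            return dict(zip(names, values))
    raise ValueError("no 'Mem:' row found")


def summarize(mem):
    """Derive the raw and cache-adjusted utilization figures."""
    reclaimable = mem["buffers"] + mem["cached"]
    used_minus_cache = mem["used"] - reclaimable
    return {
        # Raw figure: counts buffers and cache as "used".
        "used_pct": 100.0 * mem["used"] / mem["total"],
        # Adjusted figures: match the "-/+ buffers/cache" row.
        "used_minus_cache": used_minus_cache,
        "free_plus_cache": mem["free"] + reclaimable,
        "used_minus_cache_pct": 100.0 * used_minus_cache / mem["total"],
    }


if __name__ == "__main__":
    summary = summarize(parse_mem_line(SAMPLE))
    print("raw used:          %.1f%%" % summary["used_pct"])
    print("used minus cache:  %d KiB (%.1f%%)"
          % (summary["used_minus_cache"], summary["used_minus_cache_pct"]))
    print("free plus cache:   %d KiB" % summary["free_plus_cache"])
```

The adjusted values reproduce the "-/+ buffers/cache" row, which shows that the row is derived from the "Mem:" row rather than being a separate pool of memory.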
 
For example, say I have an application which requests 10GB from the memory manager and then fills it up. Later in the application’s execution, it frees 8GB of that memory. From the perspective of the application, it is only using 2GB of memory, but from the perspective of the memory manager, it may still be using 10GB. It’s important to understand that just because an application "frees" some memory, it doesn't mean that the memory manager immediately completes the steps necessary to make that memory available to other processes.
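The gap between the two perspectives can be illustrated with a toy model. This is only a sketch: the ToyAllocator class and its methods are hypothetical and do not correspond to any real allocator or kernel interface. It simply keeps two counters, one for what the application believes it is using and one for what is still reserved on its behalf.

```python
# Toy model only: real allocators and kernels are far more involved.

class ToyAllocator:
    """Tracks memory an application holds vs. memory still reserved for it."""

    def __init__(self):
        self.in_use_gb = 0     # what the application thinks it is using
        self.reserved_gb = 0   # what is still attributed to the application

    def allocate(self, gb):
        # Reuse previously freed (but still reserved) space before growing.
        spare = self.reserved_gb - self.in_use_gb
        if gb > spare:
            self.reserved_gb += gb - spare
        self.in_use_gb += gb

    def free(self, gb):
        # Freeing only marks the space reusable; nothing is handed back yet.
        if gb > self.in_use_gb:
            raise ValueError("cannot free more than is in use")
        self.in_use_gb -= gb

    def release_unused(self):
        # Separate, later step that returns spare space to the system.
        released = self.reserved_gb - self.in_use_gb
        self.reserved_gb = self.in_use_gb
        return released


if __name__ == "__main__":
    app = ToyAllocator()
    app.allocate(10)
    app.free(8)
    print("application view:", app.in_use_gb, "GB")
    print("memory manager view:", app.reserved_gb, "GB")
    app.release_unused()
    print("after release:", app.reserved_gb, "GB")
```

In this model the application reports 2GB in use after freeing, while 10GB stays reserved until the separate release step runs, which mirrors the scenario described above.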
 
A simple way to illustrate this situation is to think of the process of deleting a file. The first step in deleting a file may be to remove an entry from the parent directory so that users won’t see the file anymore. However, until the actual disk blocks are unallocated, that disk space won't be available for other files.
 
This illustration can be applied to cluster node memory management to a certain extent. If the memory utilization reported by the MapR framework looks odd, log into the node and run “free” to see how the memory is being used. The goal of the MapR framework is to aggregate useful information, but since the memory manager and its allocations are complex, we can't show all of the detail in a simple “% used” display in our GUI. You can always log into the node and check whether the memory is truly used or if it just hasn’t been fully freed. You can also check per-process memory utilization using the command "ps aux" to see how much memory each process is currently consuming.
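For the per-process check, the "ps aux" output can be sorted by resident memory with a few lines of Python. This is a sketch under stated assumptions: the helper names parse_ps_aux and top_by_rss are hypothetical, and the parsing assumes the usual column order (USER, PID, %CPU, %MEM, VSZ, RSS, TTY, STAT, START, TIME, COMMAND) with RSS reported in KiB, as on typical Linux systems.

```python
import subprocess


def parse_ps_aux(text):
    """Parse `ps aux` output into a list of per-process dicts."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        # COMMAND is the 11th column and may contain spaces,
        # so split on whitespace at most 10 times.
        parts = line.split(None, 10)
        if len(parts) < 11:
            continue  # ignore malformed or truncated rows
        rows.append({
            "user": parts[0],
            "pid": int(parts[1]),
            "mem_pct": float(parts[3]),
            "rss_kib": int(parts[5]),
            "command": parts[10],
        })
    return rows


def top_by_rss(rows, n=5):
    """Return the n processes with the largest resident set size."""
    return sorted(rows, key=lambda r: r["rss_kib"], reverse=True)[:n]


if __name__ == "__main__":
    output = subprocess.run(
        ["ps", "aux"], capture_output=True, text=True, check=True
    ).stdout
    for row in top_by_rss(parse_ps_aux(output)):
        print("%8d  %5.1f%%  %10d KiB  %s"
              % (row["pid"], row["mem_pct"], row["rss_kib"], row["command"][:60]))
```

Comparing the largest entries with the totals from "free" gives a rough idea of whether the reported usage comes from running processes or from buffers and cache.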
 
The MapR framework is designed to put basic memory utilization information right at your fingertips. However, to truly understand the details of how memory is being used on a particular node, you’ll need to dig a little deeper by 1) logging into the node to verify that the memory has been freed and 2) checking per-process memory utilization.
 
