In this week's Whiteboard Walkthrough, Santosh Marella, committer on the Apache Myriad project, explains how Apache Myriad enables fine-grained scaling in Mesos environments alongside YARN, the resource management framework for Apache Hadoop.
Here's the unedited transcription:
Hi, my name is Santosh Marella. I am an Apache Myriad committer. In this whiteboard walkthrough, we are going to discuss Apache Myriad and specifically a feature called fine-grained scaling.
Apache Myriad is a project which aims to have Hadoop jobs run inside the same physical infrastructure that's managed by Mesos. Typically in a data center, you want to run multiple workloads. Hadoop is one of the workloads that you would want to run inside a data center.
Today, if you have a data center that's managed by Mesos, you would want to run multiple frameworks like Marathon, Chronos, Jenkins, and tons of other frameworks. What Myriad is trying to do is to run YARN as a framework for Mesos so all of the Hadoop workloads can also run inside the same infrastructure.
Let's look at the various components in Myriad. To begin with, let's start with a simple Mesos cluster, and I'll walk you through the components of both YARN and the Myriad components here. Let's start with the Mesos cluster: let’s say you have a Mesos master and you have two Mesos slaves running. We start with the resource manager process, which is the master in the YARN cluster. The resource manager process has a component called Myriad scheduler that runs inside this as a plug-in. Myriad scheduler is the second level scheduler that's part of any Mesos framework.
The first level of scheduling happens in the Mesos mastering layer and the second level of scheduling happens here. The Myriad schedule also exposes the REST API. This is for the admins to control the number of node managers to be launched, the sizes of those node managers, or even to shut down those node managers.
Once you have a resource manager launched in one of the Mesos slave nodes, it can register with the Mesos master as a framework. The second level of scheduling can start happening from that part on. Once the Myriad scheduler is registered with Mesos master, it keeps receiving the Mesos offers. An offer means a certain slave in the cluster has a certain amount of resources available.
Once we receive these offers, let's say the admin says he wants to launch a node manager. Those offers can be utilized to launch node managers. In this whiteboard walkthrough, we are going to focus on one special node manager. We start with the launching a node manager of a very small size. We consume enough resources to just start the node manager process.
We started the node manager and made an executor component. Both of them together would take something like 2 gig and 1 CPU of resources. Once the node manager is launched, it registers back with the resource manager. It doesn't have capacity to run any of the YARN containers.
Let's say you have a job that is submitted to the resource manager, a YARN job that is submitted. The resource manager now tries to perform its scheduling. It sees there is a node manager running somewhere, but that node manager doesn't have the capacity to launch any containers. Assume that that this node is pretty thinly occupied; it has some capacity available. Mesos can offer resources from this node to Myriad and the schedule inside the resource manager receives these offers.
What Myriad scheduler does is it projects these offers as capacity available on that node manager. The resource manager can set things that there is capacity available on this node manager. It utilizes that capacity to schedule containers for YARN.
Once we tell Mesos that we are going to use some resources, Mesos is going to expand the capacity of the whole node manager process to accommodate the containers that are going to be launched by YARN.
When the node manager heartbeats to the resource manager, you get some containers allocated to here. By that time, we have expanded the capacity of the node manager. These containers can run without being killed by Mesos. Once these containers finish, the Myriad Executor component can listen to the statuses of these containers, and once they finish, it can actually report back to Mesos that these containers have finished. At this point, the resources occupied by these individual containers are given back to Mesos.
Just to clarify, this happens so dynamically that once the resources are given back to Mesos, the first level of scheduling at Mesos kicks in and it can actually figure out whether Myriad should receive these resources again or some other some other framework running in Mesos cluster should receive these resources again.
In the ideal scenario, when there are no Hadoop jobs running, we have very limited amount of resources taken by Myriad or YARN. If there are Hadoop jobs that continuously submitted to the cluster, then this model can basically help you to run those jobs. Once those jobs finish, you have the capacity available to run other Mesos frameworks.
That's all for today. Thanks for watching and please feel free to add any comments or questions. Thanks.