Editor's Note: In this week's Whiteboard Walkthrough, Jim Scott, Director of Enterprise Strategy and Architecture at MapR, walks you through HBase key design with OpenTSDB.
Here is the transcript:
Hi. I'm Jim Scott, Director of Enterprise Strategy and Architecture at MapR. Welcome to this Whiteboard Walkthrough. Today, I'm going to talk to you about HBase key design and I'm going to use OpenTSDB as an example for this.
Now one of the important things to keep in mind with HBase is that it is a linearly scaling, column-oriented key-value store. In order to get linearly scalable behavior out of HBase, you have to be very cognizant of your key design. That means you want to avoid creating what are called hot spots, and you want to prevent purely sequential writes. So what I've done is I've pre-drawn this diagram for you to show what happens in HBase if you write your keys sequentially: while you're writing keys one through five, they all land on the first server. When HBase splits the region, that data is now divided across two regions. The first region will hold keys one through five, and the next region will hold keys six through ten. So while you're writing, you'll end up hot spotting on that second region. Now, as these regions are distributed across your HBase cluster, you're going to have cold spots and hot spots. As regions continue to split, all of the previous regions go cold. Because HBase always stores keys in sorted order, constantly writing sequential data means you always end up writing at the very end of the table. The goal is to prevent that pattern: sequential key values that are always in sorted order, always writing at the end.
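To make the hot spotting concrete, here is a minimal Python sketch of how a write gets routed: the region owning a key is found by searching the sorted list of region start keys, so monotonically increasing keys always resolve to the last region. The region boundaries and key format below are made up for illustration, not taken from a real cluster.

```python
import bisect

# Hypothetical region layout: three regions, identified by their sorted
# start keys. Boundaries and key names are invented for illustration.
region_starts = [b"", b"row00005", b"row00010"]

def region_for(key: bytes) -> int:
    """Return the index of the region that owns `key`. HBase routes a
    write by the sorted region start keys, i.e. a binary search."""
    return bisect.bisect_right(region_starts, key) - 1

# Sequential writes: every new key sorts after all existing keys,
# so every single write lands in the last ("hot") region.
new_keys = [b"row%05d" % i for i in range(10, 15)]
targets = {region_for(k) for k in new_keys}
print(targets)  # {2}: all new writes funnel into the tail region
```

Older keys keep their regions, but those regions never receive new writes again, which is exactly the cold-spot pattern described above.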
In this model, we want to take advantage of HBase's linearly scaling capabilities, which means we have to be very thoughtful about our key design, and we have to be thoughtful about our access patterns. Now, I'll use OpenTSDB as an example. OpenTSDB was created to monitor data centers, which means it enables you to monitor things like CPU temperature, CPU load, memory utilization, anything in your data center that you care about, even perhaps database read times. So when you want to access this data, you have to be very cognizant of what data you're going after and when. OpenTSDB took the approach of saying the metric is the most important thing that you're always going to access. What this means is that, say, the I/O stats for your network bandwidth are something that you're going to track and view across your data center. OpenTSDB will give you an identifier back for this, so every time you query data from HBase, OpenTSDB has an ID for the metric you're looking at. When you come at it, you're always going to come in first with the metric ID.
When monitoring a data center, what's most important typically are time ranges of, say, the last fifteen minutes, maybe the last hour, maybe the last day at the long end. That being the case, you next want to come in based on the time of day. So what OpenTSDB did was say, let's take the base time. The base time in OpenTSDB is actually the base hour: 8:00am, 9:00am, 10:00am, whatever the time of day is, rounded down to the nearest hour. So if it is 9:45, the base hour is 9:00. Now, when you come into OpenTSDB on top of HBase, you come in with the metric and the base hour. From an access perspective, any time you query the database for a fifteen-minute window that spans two hours, say 8:50 to 9:05, you're going to get both the 8:00 hour and the 9:00 hour, so you get two rows back.
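The base-hour normalization described above is just modular arithmetic; this sketch shows the idea, not OpenTSDB's actual code:

```python
# Round an epoch timestamp down to the top of its hour, so every sample
# from the same hour shares one row key.
def base_time(epoch_seconds: int) -> int:
    return epoch_seconds - (epoch_seconds % 3600)

# 9:45 normalizes to the 9:00 row; 8:50 falls in the 8:00 row, so a
# query spanning 8:50 to 9:05 touches exactly two rows.
print(base_time(9 * 3600 + 45 * 60) == 9 * 3600)  # True
print(base_time(8 * 3600 + 50 * 60) == 8 * 3600)  # True
```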
Now, to make this more useful from a query perspective, OpenTSDB also added tag keys and tag values. These are repeating, and they're always in sorted order as well. The most dominant tag you would expect in a data center is the server name: a tag key of "server name" with a tag value identifying the specific server is what you could expect to see in this type of key layout. So as you query HBase, you'll come in with a metric ID, you'll have your base hour, and then you can filter and start slicing the data by the tags that you've applied to it.
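Putting the pieces together, a row key in this style concatenates the metric UID, the base hour, and the sorted tag pairs. Classic OpenTSDB uses fixed-width 3-byte UIDs and a 4-byte base timestamp; the specific UID numbers below are invented for the example.

```python
import struct

def make_row_key(metric_uid: int, base_time: int, tags: dict) -> bytes:
    """Concatenate: 3-byte metric UID, 4-byte base hour, then
    (tag key UID, tag value UID) pairs in sorted tag-key order."""
    key = struct.pack(">I", metric_uid)[1:]      # 3-byte metric UID
    key += struct.pack(">I", base_time)          # 4-byte base hour
    for tagk, tagv in sorted(tags.items()):      # tags always sorted
        key += struct.pack(">I", tagk)[1:] + struct.pack(">I", tagv)[1:]
    return key

# Hypothetical values: metric UID 1 at the 9:00 base hour, with one tag
# pair (say tag key UID 7 = "host", tag value UID 42 = "web01").
key = make_row_key(1, 9 * 3600, {7: 42})
print(len(key))  # 13 bytes: 3 + 4 + 3 + 3
```

Because the metric UID leads the key, a scan for one metric over a time range is a tight, contiguous read, which is exactly the access pattern the design optimizes for.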
Now, the important part of this is that the access pattern was defined based on the use case. In data center monitoring, you would likely never go in and look at all metrics for a single host across all the different types of sensors you're recording. The reason is that it would be an unwieldy amount of data, and it wouldn't mean anything in the context of time. So what we see is that the metric ID, a unique identifier assigned by OpenTSDB, ends up distributing rows across these servers. As new metrics get put in, the first metric may land here and the second metric may land there. Now, as I continue emitting samples for the first metric, they'll always land in the same place. But as you have more and more sensors, your sensor data will be distributed evenly across your cluster. So there's no real concern about hot spotting on a particular metric, or about one metric being written or sampled more frequently than others. What this gives you is a nice, even distribution: your writes are evenly distributed, and your reads are evenly distributed.
Now, importantly, this is not the end-all, be-all key design. What I mean is that this key design works fantastically for data center monitoring, because it fits the access pattern for what people are looking for when monitoring a data center. But say you have a use case like a user profile: you have a user visiting your website, and you want to keep track of what they've clicked on and what they've seen. You may be tempted to store that with a sequential ID, but in this model you really want something more random. One approach you can take is to take that unique ID of 1, 2, 3, et cetera and apply a hashing algorithm to it. That hashing algorithm will turn the ID into something that is evenly distributed across your cluster. However, the problem is that hashing over large data sets ends up becoming very slow. So while this is possible to do, it's not always the best approach. Hashing has been considered for many use cases, but again, your access pattern, and how many reads you're doing that require hashing the key, are going to weigh heavily on your overall throughput and capacity.
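As a sketch of the hashing idea, here is one common variant: prefixing the sequential ID with a few bytes of its hash, so neighboring IDs sort into unrelated parts of the keyspace. The four-byte prefix length and the choice of MD5 are illustrative, not a recommendation from the talk.

```python
import hashlib

def hashed_key(user_id: int) -> bytes:
    """Prefix the sequential ID with 4 bytes of its MD5 hash so that
    consecutive IDs no longer sort next to each other."""
    raw = str(user_id).encode()
    return hashlib.md5(raw).digest()[:4] + raw

k1, k2 = hashed_key(1), hashed_key(2)
print(k1[:4] != k2[:4])  # True: neighboring IDs no longer cluster
```

The trade-off mentioned above shows up directly here: range scans over consecutive IDs are gone, and every read must recompute the hash to locate the row.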
Some approaches that can be taken here would be coming up with an alternative design for that user's identifier. One option is byte-shifting: you move the low-order bytes over to the left side of the key. That actually creates an evenly distributed pattern, regardless of the sequential number that you started with.
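One way to read "moving the low-order bytes to the left side" is simply reversing the byte order of the ID, so the fastest-changing byte leads the key. This is a hypothetical sketch with an illustrative 8-byte width:

```python
def swapped_key(seq_id: int) -> bytes:
    """Reverse the byte order of an 8-byte big-endian ID so the
    fastest-changing (low-order) byte comes first in the key."""
    return seq_id.to_bytes(8, "big")[::-1]

# The low-order byte cycles through all 256 values as IDs increment,
# so new writes fan out across 256 distinct key prefixes instead of
# appending to one tail region.
print(swapped_key(1)[0], swapped_key(2)[0])  # 1 2
```

Unlike hashing, this transform is cheap and reversible, though like hashing it sacrifices meaningful range scans over the original sequential order.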
One final example would be designing a key for something like a dimension table concept, where you have details and dimensions. In this example, if I had a detail table, dimension A, and dimension B, I would say, well, let's just say dimension A is a product brand, and dimension B is the color of the product. We would create our equivalent of an index table, and, much like in a relational database, that index table's key could be structured as A, underscore, B, underscore, and then the detail. What this does is make our key representative of the dimension values. So it could be A1, A2, A3, whatever that key is, and B1, B2, B3 for the other. By putting them together in this order, we can come in and query this index table much like a relational database would do behind the scenes for you.
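The composite index key described above can be sketched as plain string concatenation; the table layout, separator, and values here are all illustrative.

```python
def index_key(dim_a: str, dim_b: str, detail_id: str) -> str:
    """Composite key for a hypothetical A_B_detail index table."""
    return f"{dim_a}_{dim_b}_{detail_id}"

# Keys sort first by brand (A*), then by color (B*), then by detail ID,
# so a prefix scan answers the dimension query, and counting the
# matching keys never touches the detail table at all.
keys = sorted(index_key(a, b, d)
              for a in ("A1", "A2")
              for b in ("B1", "B2")
              for d in ("d1", "d2"))
print(sum(1 for k in keys if k.startswith("A1_B2_")))  # 2
```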
In this use case, this then gives you the ability to do things like count over your index table without ever having to go to your detail table, to answer questions like: how many details exist where dimension A and dimension B are satisfied? This is a great example for concepts like data warehouse offloading, where you have the desire to implement something like a star schema, and it gives you the ability to start slowly moving things over and experimenting with your key design and your access patterns for reading that data out.
That's all today for our Whiteboard Walkthrough on HBase key design with OpenTSDB as an example. If you have any questions or comments on this session, implementations that you're trying to accomplish with HBase and getting good distribution for your keys, please comment below. We'd love to hear some feedback on this. If you have any other topics in mind that you'd like us to cover, please let us know. Don't forget to subscribe to this channel on YouTube for more videos coming soon. Please follow us on Twitter @MapR #WhiteboardWalkthrough. Thank you!