Working with the Open JSON Application Interface – Whiteboard Walkthrough

In this week's Whiteboard Walkthrough, Aditya Kishore, engineer on the MapR-DB team, explains how to use the OJAI API to insert, search, and update the document database.

Here's the unedited transcription:

Hello and welcome to the Whiteboard Walkthrough series. My name is Aditya Kishore and I'm a software engineer on the MapR-DB team. In today's episode we are going to talk about the OJAI API and its support in MapR-DB. Before we jump into the details of the API, let me talk briefly about MapR-DB. MapR-DB is a fast, highly scalable, enterprise-grade NoSQL database, which is inspired by Google's white paper on BigTable. Since its first release, MapR-DB has supported the HBase API, which allows the application to store and retrieve data in the form of a two-level map, with the keys and values of the map being uninterpreted sequences of bytes.

It's a fairly simple model for the application, in which the application has full control, as well as the responsibility, to decide how to encode its natural data model into these sequences of bytes. The server does not have any idea about what kind of data is being stored, so it cannot provide any kind of help in terms of query optimization when the data is queried back by the application.
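
To make the two-level byte map concrete, here is a minimal sketch in plain Python. It is not the HBase API: the `put` and `get` helpers, the row key, the column names, and the choice of encodings are all assumptions made for illustration. The point it shows is that the store sees only bytes, and the application owns both the encoding and the decoding.

```python
import struct

# Two-level map: row key -> column name -> value, all uninterpreted bytes.
# (Hypothetical in-memory stand-in, not a real HBase client.)
table = {}

def put(row, column, value):
    """Store raw bytes under a row key and a column name."""
    table.setdefault(row, {})[column] = value

def get(row, column):
    """Return the raw bytes exactly as they were stored."""
    return table[row][column]

# The application chooses the encoding: UTF-8 for the name,
# a big-endian 32-bit integer for the age.
put(b"user001", b"name", "Ada".encode("utf-8"))
put(b"user001", b"age", struct.pack(">i", 36))

# The store hands back bytes; only the application knows how to decode them.
raw_age = get(b"user001", b"age")
age = struct.unpack(">i", raw_age)[0]
name = get(b"user001", b"name").decode("utf-8")
```

Because the store cannot tell that `raw_age` holds an integer, it has no way to evaluate something like "age greater than 30" on its own, which is the limitation described above.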

In discussions with our users some time ago, we realized that most big data applications are moving towards semi-structured or unstructured data as their primary data model. Thus, the OJAI project was born. The OJAI API is essentially defined as a set of interfaces, which allow the application to manipulate structured, semi-structured, or unstructured data, and also to interchange this data across different systems. For example, you can have the result of a Drill query streamed directly into MapR-DB without any sort of intermediate encoding.

Let's take a look at some examples of how you can use the OJAI API to store and retrieve documents in MapR-DB. The first example, as we see, is how you insert new, fresh documents into MapR-DB. You start by getting a handle to the table where you want to store the document. A table in MapR-DB is a DocumentStore in the OJAI API. You then assemble a document by supplying all the attributes, and then simply call the insertOrReplace API on the table, and that sends the object to MapR-DB.
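
The three steps just described (get a table handle, assemble a document, hand it to the table) can be sketched with a small in-memory stand-in. This is plain Python, not the OJAI Java API: the `InMemoryTable` class, its `insert_or_replace` method (named after the `insertOrReplace` call mentioned in the talk), the `_id` key field, and the sample attributes are all assumptions for illustration.

```python
import copy

class InMemoryTable:
    """Hypothetical stand-in for a document table, keyed by each document's "_id"."""

    def __init__(self):
        self._docs = {}

    def insert_or_replace(self, doc):
        # Store a deep copy so later changes to the caller's dict do not leak in.
        self._docs[doc["_id"]] = copy.deepcopy(doc)

    def get(self, doc_id):
        return self._docs.get(doc_id)

# 1. Get a handle to the table (here, simply construct the stand-in).
table = InMemoryTable()

# 2. Assemble the document by supplying its attributes.
user = {
    "_id": "user001",
    "name": {"first": "Ada", "last": "Lovelace"},
    "age": 36,
    "interests": ["mathematics", "engines"],
}

# 3. Hand the document to the table.
table.insert_or_replace(user)

# Calling it again with the same _id replaces the whole stored document.
table.insert_or_replace({"_id": "user001", "name": {"first": "Ada"}, "age": 37})
stored = table.get("user001")
```

Note the "replace" half of the name: the second call overwrites the first document rather than merging with it, so `interests` is no longer present in `stored`.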

Once you have stored a set of documents, let's look at how you would retrieve them. MapR-DB tables provide a set of query APIs along with the query condition interface from the OJAI library. You basically specify a set of criteria that you want to run as a search against the stored table, and then run the find method, and you get a stream of documents that match the criteria you have specified. One thing that I would like to point out is that since the DB understands the structure and the type of the data that you have stored, all this filtering and query processing happens on the server side and the client gets only the filtered result back. This makes the query processing fast and efficient.
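
As a rough illustration of criteria-based search, the sketch below plays the server's role in plain Python: it evaluates every criterion against each stored document and yields only the matches, so the caller never sees the rest. The `(field, operator, value)` triple format, the `find` and `matches` functions, and the sample data are hypothetical and are not the OJAI query condition interface.

```python
import operator

# Comparison operators a criterion may name (illustrative set).
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}

def matches(doc, criteria):
    """Return True when doc satisfies every (field, op, value) triple."""
    for field, op, value in criteria:
        if field not in doc or not OPS[op](doc[field], value):
            return False
    return True

def find(stored_docs, criteria):
    """Server-side role: lazily yield only the documents that match."""
    for doc in stored_docs:
        if matches(doc, criteria):
            yield doc

stored_docs = [
    {"_id": "user001", "name": "Ada", "age": 36, "city": "London"},
    {"_id": "user002", "name": "Grace", "age": 45, "city": "New York"},
    {"_id": "user003", "name": "Alan", "age": 41, "city": "London"},
]

# "Lives in London AND is older than 40."
criteria = [("city", "==", "London"), ("age", ">", 40)]
results = list(find(stored_docs, criteria))
```

The typed comparison on `age` works only because the values are stored as numbers rather than opaque bytes, which is what lets a document database do this filtering itself.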

Now, moving to the third example, which is how you update a stored object. It's similar to how you store the original object, but you use a different interface called DocumentMutation. Here you just update or add the fields that you want to modify, and then call the update method on the table, and this sends the mutation to the server. One thing that I would again like to point out is that this mutation will only update the fields that you are changing, and it does not require the whole document to be reprocessed.
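
The idea of a field-level mutation can be sketched as follows, again in plain Python rather than the DocumentMutation interface itself. The `apply_mutation` helper, the dotted field-path notation, and the sample document are assumptions for illustration. The mutation names only the fields to set or add, and everything else in the stored document is left as it was.

```python
def apply_mutation(doc, mutation):
    """Set each dotted field path in `mutation` on `doc`, leaving other fields alone."""
    for path, value in mutation.items():
        *parents, leaf = path.split(".")
        target = doc
        for key in parents:
            # Walk down to the nested map, creating it if it does not exist yet.
            target = target.setdefault(key, {})
        target[leaf] = value
    return doc

stored = {
    "_id": "user001",
    "name": "Ada",
    "age": 36,
    "address": {"city": "London", "zip": "W1"},
}

# Only the changed or added fields travel in the mutation.
mutation = {
    "age": 37,                    # update an existing field
    "address.city": "Cambridge",  # update a nested field
    "email": "ada@example.com",   # add a new field
}
apply_mutation(stored, mutation)
```

After the call, `name` and `address.zip` are unchanged even though the mutation never mentioned them, which is the contrast with replacing the whole document.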

Now, there is a set of libraries included in the OJAI library that allow you to create a stream of documents from JSON files, which might be stored on a file system like MapR-FS. With three lines of code, you can transmit all the documents that are stored in JSON format into a table in MapR-DB. You start by getting the table handle, then you get a stream from the JSON file, and then you can use the insertOrReplace API on the table to stream entire document sets from the JSON files into the DB.
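
The same handle, stream, insert pattern can be sketched in plain Python. The assumptions here are that the file holds one JSON document per line, that a dict stands in for the table, and that `document_stream` is a hypothetical helper; none of this is the OJAI library's own file-reading API. The sample file is created only so the sketch runs on its own.

```python
import json
import os
import tempfile

def document_stream(path):
    """Lazily yield one document per non-empty line of a JSON file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield json.loads(line)

# Create a small sample file so the sketch is runnable by itself.
with tempfile.NamedTemporaryFile(
    "w", suffix=".json", delete=False, encoding="utf-8"
) as tmp:
    tmp.write('{"_id": "user001", "name": "Ada"}\n')
    tmp.write('{"_id": "user002", "name": "Grace"}\n')
    sample_path = tmp.name

table = {}                                 # 1. get the table handle (stand-in)
for doc in document_stream(sample_path):   # 2. open a stream over the JSON file
    table[doc["_id"]] = doc                # 3. insert or replace each document

os.remove(sample_path)                     # clean up the sample file
```

Because `document_stream` is a generator, documents are read and stored one at a time, so the whole file never has to fit in memory.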

This is essentially a short collection of examples that I had today. For more examples and documentation on how to use the OJAI API with MapR-DB, please visit us at www.maprdb.io, and I also welcome you to join us on the OJAI project at GitHub.com/OJAI for participation. Thank you!
