Getting Started with HBase Shell

The objective of this lab is to get you started with the HBase shell and use it to perform CRUD operations: create an HBase table, put data into the table, retrieve data from the table, and delete data from the table. The lab also gives a brief introduction to the MapR Control System (MCS), where we'll create an HBase (MapR-DB) table and add column families.

Exercise                                                    Required   Duration
1: Perform CRUD operations with HBase shell                 Yes        20 min
2: Create a MapR-DB Table using MapR Control System (MCS)   Yes        20 min

Background information on HBase Shell

HBase is a distributed NoSQL database that provides random, real-time read/write access to very large data sets. See the references on HBase for more information.

The HBase shell is a Ruby script that lets you interact with HBase from a command-line interface. The shell supports creating, altering, and deleting tables, as well as inserting, listing, and deleting data. You can get help with the shell commands here:
https://learnhbase.wordpress.com/2013/03/02/hbase-shell-commands/

Here are some reference notes on using the HBase shell:

To start the shell, type the following at the Linux command line:

$ hbase shell
# to get help on commands
hbase> help
# use 'create' to create a new table with two column families
hbase> create '/path/tablename', {NAME => 'cfname1'}, {NAME => 'cfname2'}
# use 'put' to insert data into the table
hbase> put '/path/tablename', 'rowkey', 'cfname1:colname', 'value'
# use 'get' to retrieve row data by rowkey
hbase> get '/path/tablename', 'rowkey'
# use 'scan' to retrieve multiple rows
hbase> scan '/path/tablename'
# use 'describe' to show the table's column families and their settings
hbase> describe '/path/tablename'

Connecting to the Sandbox or MapR cluster

Accessing MCS:
https://hostipaddress:
user: user01 ...
password: mapr

Use PuTTY on Windows

Log in to the remote host via PuTTY
user01@hostipaddress
password: mapr

Use ssh on Mac

ssh user01@hostipaddress

Exercise 1: Perform CRUD operations with HBase shell

The goal of this exercise is to use HBase shell commands to create a table 'customer' with two column families, address (columns city and state) and order (columns date and number), holding data like that shown below.

Table: customer

Row-key    address              order
(userid)   city        state    date         number
jsmith     nashville   tn       01/01/2015   12345
bjones     miami       fl       02/02/2015   56565


Use HBase shell

$ hbase shell

Once you are connected to the cluster and have started the HBase shell, explore the available commands, for example 'help' and 'help "put"'.

Note: if you are using a cluster, change user01 to your own user name. A text file with all of the commands is provided for your convenience in the LAB FILES directory.

Create a table in your home directory

create '/user/user01/customer', {NAME=>'addr'}, {NAME=>'order'}

Use ‘describe’ to get the description of the table.

describe '/user/user01/customer'

Execute the following statements to insert and get the records
Put some data into the table

put '/user/user01/customer', 'jsmith', 'addr:city', 'nashville'

Use ‘get’ to retrieve the data for ‘jsmith’

get '/user/user01/customer', 'jsmith'

Put more data into the table

put   '/user/user01/customer',  'jsmith',  'addr:state', 'TN'
put   '/user/user01/customer',  'jsmith',  'order:numb', '1234'
put   '/user/user01/customer',  'jsmith',  'order:date', '10-18-2014'

Use get to retrieve the data for ‘jsmith’

get '/user/user01/customer', 'jsmith'

Note that this gets all the data for the row. How can we limit this to only one column family?

get '/user/user01/customer', 'jsmith', {COLUMNS=>['addr']}

How can we limit this to a specific column?

get '/user/user01/customer', 'jsmith', {COLUMNS=>['order:numb']}
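
The effect of COLUMNS can be pictured with a small Python sketch. This is only a toy model, not how HBase is implemented: the row is assumed to be a plain dictionary keyed by 'family:qualifier', and the helper name get_row is made up for illustration.

```python
# Toy model of one HBase row: keys are 'family:qualifier', values are cell values.
jsmith = {
    "addr:city": "nashville",
    "addr:state": "TN",
    "order:numb": "1234",
    "order:date": "10-18-2014",
}

def get_row(row, columns=None):
    """Return the cells of a row, optionally limited like COLUMNS => [...].

    An entry without a qualifier ('addr') selects the whole column family;
    an entry with one ('order:numb') selects a single column.
    """
    if columns is None:
        return dict(row)
    selected = {}
    for name, value in row.items():
        family = name.split(":", 1)[0]
        if name in columns or family in columns:
            selected[name] = value
    return selected

print(get_row(jsmith, ["addr"]))        # both addr columns
print(get_row(jsmith, ["order:numb"]))  # a single column
```

Passing a family name returns every column in that family, while a full 'family:qualifier' name returns just that column, which mirrors the two get commands above.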

Alter table to store more versions in the order column family

alter '/user/user01/customer' , NAME => 'order', VERSIONS => 5

Use ‘describe’ to get the description of the table.

describe '/user/user01/customer'

put more order numbers

put   '/user/user01/customer',  'jsmith',  'order:numb', '1235'
put   '/user/user01/customer',  'jsmith',  'order:numb', '1236'
put   '/user/user01/customer',  'jsmith',  'order:numb', '1237'
put   '/user/user01/customer',  'jsmith',  'order:numb', '1238'

Get order number column cells

get '/user/user01/customer', 'jsmith', {COLUMNS=>['order:numb']}

Note that you are getting the data for only one version per cell. How can you get more versions?

get '/user/user01/customer', 'jsmith', {COLUMNS=>['order:numb'], VERSIONS => 5}
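
To see why the first get returns one value and the second returns five, here is a minimal Python sketch of a versioned cell. It is a simplification under stated assumptions: the class name VersionedCell is hypothetical, versions are ordered by insertion rather than by timestamp, and old versions are dropped immediately instead of at compaction time as in a real store.

```python
class VersionedCell:
    """Toy model of a cell that keeps up to max_versions values, newest first."""

    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        self.versions = []  # newest value at index 0

    def put(self, value):
        # A new put becomes the newest version; anything beyond the limit is dropped.
        self.versions.insert(0, value)
        del self.versions[self.max_versions:]

    def get(self, versions=1):
        # By default only the newest version is returned, like a plain 'get'.
        return self.versions[:versions]

# Mirrors the lab: order:numb with VERSIONS => 5 and five puts.
numb = VersionedCell(max_versions=5)
for value in ["1234", "1235", "1236", "1237", "1238"]:
    numb.put(value)

print(numb.get())            # newest version only
print(numb.get(versions=5))  # up to five versions, newest first
```

In this model a sixth put would push '1234' out, because the column family keeps at most five versions.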

put more data for different rowkey userids

put   '/user/user01/customer',  'njones',  'addr:city', 'miami'
put   '/user/user01/customer',  'njones',  'addr:state', 'FL'
put   '/user/user01/customer',  'njones',  'order:numb', '5555'
put   '/user/user01/customer',  'tsimmons',  'addr:city', 'dallas'
put   '/user/user01/customer',  'tsimmons',  'addr:state', 'TX'
put   '/user/user01/customer',  'jsmith',  'addr:city', 'denver'
put   '/user/user01/customer',  'jsmith',  'addr:state', 'CO'
put   '/user/user01/customer',  'jsmith',  'order:numb', '6666'
put   '/user/user01/customer',  'njones',  'addr:state', 'TX'
put   '/user/user01/customer',  'amiller', 'addr:state', 'TX'

Use ‘scan’ to retrieve rows of data for the table

retrieve all rows, all columns

scan '/user/user01/customer'

retrieve all rows, addr column family

scan '/user/user01/customer', {COLUMNS=>['addr']}

retrieve all rows for order number column, 5 versions

scan '/user/user01/customer', {COLUMNS=>['order:numb'], VERSIONS => 5}

retrieve rows from rowkey 'njo' onward (STARTROW is an inclusive lower bound, not a prefix filter), addr column family

scan '/user/user01/customer', {STARTROW => 'njo', COLUMNS=>['addr'] }

retrieve rows from rowkey 'j' onward, stopping before 't'

scan '/user/user01/customer', { STARTROW => 'j', STOPROW => 't'}

retrieve rows from rowkey 'a' onward

scan '/user/user01/customer', { STARTROW => 'a'}

retrieve rows from rowkey 't' onward

scan '/user/user01/customer', { STARTROW => 't'}
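
The STARTROW and STOPROW scans above can be sketched in Python as a range lookup over sorted rowkeys. This is an illustrative model only: the function name scan_range is made up, and it assumes plain ASCII rowkeys so that Python string order matches the byte order HBase sorts by.

```python
from bisect import bisect_left

def scan_range(rowkeys, startrow="", stoprow=None):
    """Return the rowkeys in [startrow, stoprow) in sorted order.

    startrow is inclusive and stoprow is exclusive; omitting stoprow
    scans to the end of the table.
    """
    keys = sorted(rowkeys)
    lo = bisect_left(keys, startrow)
    hi = len(keys) if stoprow is None else bisect_left(keys, stoprow)
    return keys[lo:hi]

# The four rowkeys put into the customer table in this exercise.
rowkeys = ["jsmith", "njones", "tsimmons", "amiller"]

print(scan_range(rowkeys, startrow="njo"))              # ['njones', 'tsimmons']
print(scan_range(rowkeys, startrow="j", stoprow="t"))   # ['jsmith', 'njones']
print(scan_range(rowkeys, startrow="t"))                # ['tsimmons']
```

Note that a start row of 'njo' also returns 'tsimmons': the scan begins at the first rowkey that sorts at or after the start row and continues until the stop row or the end of the table.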

Use ‘count’ to retrieve the number of rows in the table.

count '/user/user01/customer'

Delete data from the table.

get '/user/user01/customer', 'njones'

delete a column

delete '/user/user01/customer', 'njones', 'addr:city'

delete a column family

delete   '/user/user01/customer',  'jsmith',   'addr:'
get   '/user/user01/customer',  'jsmith'

delete a row (use 'deleteall'; 'delete' requires a column)

deleteall '/user/user01/customer', 'jsmith'

Bonus Activity

  1. Create a table and pre-split into 4 regions and then import data.

  2. create '/user/user01/mynewtable', 'CF', {SPLITS => ['F', 'L', 'S']}

  3. See the description of the table: describe '/user/user01/mynewtable'
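
The three split points above yield four regions. The following Python sketch shows how a rowkey would map to a region under that layout; it is a simplified model with a hypothetical helper name (region_index) and hypothetical sample rowkeys, and it assumes ASCII rowkeys compared in byte order.

```python
from bisect import bisect_right

# Split points from the create command; N split points give N + 1 regions:
# region 0: keys < 'F', region 1: ['F', 'L'), region 2: ['L', 'S'), region 3: keys >= 'S'
SPLITS = ["F", "L", "S"]

def region_index(rowkey, splits=SPLITS):
    """Return the index of the region whose key range contains rowkey."""
    return bisect_right(splits, rowkey)

for key in ["Adams", "F", "Garcia", "Miller", "Smith", "jsmith"]:
    print(key, "-> region", region_index(key))
```

One thing the sketch makes visible: lowercase letters sort after uppercase letters in byte order, so lowercase rowkeys such as 'jsmith' would all land in the last region with these uppercase split points. Split points should match the actual distribution of your rowkeys.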

Exercise 2: Create a MapR-DB Table using MapR Control System (MCS)

The goal of this exercise is to get familiar with MCS and use it to create a table and change the properties of its column families.

Connect to MapR Control System and see Table Properties

  1. Connect to the MapR cluster using MCS from a browser, using the host information in the notes from the instructor. Log in with the account username and password given to you in the class.

    https://ipaddress:8443

  2. Connect to the table '/user/user01/customer' and see how many rows and regions this table has. Also note the properties of the column families.

    1. Click on MapR-FS, then MapR Tables on the left. Then, in the Go To Table text box, enter '/user/user01/customer' as shown in the image.

Create a table using MCS

  1. Create a table called 'MCSNewTable'.

  2. Create column families similar to ‘mytesttable’. Change the Max versions, Compression, TTL.

  3. List the new table you created with pre-split and note the number of regions created.

  4. With MCS, you can edit and delete tables without disabling them.

Delete a Table with the shell

To delete a table with the HBase shell, first disable it and then drop it.

disable   '/user/user01/customer'
drop   '/user/user01/customer'