Setting up HiveServer2 Authentication using PAM

In this blog I will show you how to set up authentication for HiveServer2 (HS2) using Pluggable Authentication Modules (PAM). Once configured, all HS2 clients (JDBC and ODBC) will require a valid username and password to connect, and a validation error is thrown if invalid credentials are passed. This authentication doesn’t apply to the Hive CLI (command line interface), because it doesn’t go through HS2. Please remember that HS2 authentication only controls who can connect to Hive, not access to the actual data. Data stored on the Hadoop cluster is still authorized using file system permissions, and the identity used for those checks depends on whether impersonation is enabled.

Notes:

1. If your organization relies heavily on LDAP, you can also use LDAP authentication to control access to HS2. LDAP authentication configuration is not covered in this blog.

2. This blog is based on Hive-0.11 released by MapR. The default location for hive-site.xml is /opt/mapr/hive/hive-0.11/conf.

3. This blog is based on the MapR 3.0.2 release with an M5 license installed. In the MapR 3.1.0 security release, authentication is enabled by default and you do not need to configure it.

I highly recommend installing HS2 and the Hive METASTORE either on a separate control node where the mapr-core package is installed or on one of the data nodes. This brings HS2 and METASTORE under mapr-warden’s control, so they are stopped, started and restarted along with mapr-warden. In addition, maprcli, which lets you manage individual services running on a host, is part of the mapr-core package. Running HS2 and METASTORE on a MapR client node doesn’t give you the flexibility of maprcli and mapr-warden.

I also recommend enabling user impersonation for HS2. User impersonation enables HS2 to submit jobs as the connected user. Without impersonation, HS2 submits jobs as the user that started the HiveServer2 process. On a MapR cluster, this user is typically the mapr user or the user specified in the MAPR_USER environment variable. To enable impersonation, please follow the MapR installation documentation.

To set up HS2 authentication, perform the following steps:

  1. Configure HS2 authentication parameters in hive-site.xml
  2. Make sure libjpam.so is installed in the correct location
  3. Restart HS2
  4. Test HS2 Authentication

Configure HS2 authentication parameters in hive-site.xml

HS2 authentication is configured using three parameters defined in hive-site.xml. These parameters are:

hive.server2.authentication

This parameter defines the authentication mode that HS2 uses to authenticate the username and password. Four options are supported: NONE (default), KERBEROS, LDAP and CUSTOM. For the purpose of this blog we are going to use CUSTOM.

hive.server2.custom.authentication.class

When hive.server2.authentication is set to CUSTOM, you must specify the authentication class explicitly. For a MapR installation this value is org.apache.hive.service.auth.PamAuthenticationProvider.

hive.server2.authentication.pam.profiles

This parameter defines a comma-separated list of PAM profiles that are used to verify the username and password. In my configuration I am using sshd and sudo, as these are the defaults and are the PAM profiles we already use for password authentication, so the value for this parameter is sshd,sudo. To make these changes, open /opt/mapr/hive/hive-0.11/conf/hive-site.xml and add these parameters.

Below is the sample configuration:

<configuration>
  <property>
    <name>hive.server2.authentication</name>
    <value>CUSTOM</value>
  </property>
  <property>
    <name>hive.server2.custom.authentication.class</name>
    <value>org.apache.hive.service.auth.PamAuthenticationProvider</value>
  </property>
  <property>
    <name>hive.server2.authentication.pam.profiles</name>
    <value>sshd,sudo</value>
  </property>
</configuration>
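
As a quick sanity check after editing the file, the sketch below reads the PAM profile list back out of hive-site.xml and reports any profile that has no matching file under /etc/pam.d, where Linux PAM normally keeps its per-service configuration. The helper names hive_prop and missing_pam_profiles are made up for this example, and the property lookup is plain text matching rather than real XML parsing, so treat it as a rough aid and not as a validator.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Hive or MapR.

# Print the <value> of property $2 from the hive-site.xml style file $1.
# Plain text matching: assumes <name> is directly followed by <value>.
hive_prop() {
    tr -d '\n' < "$1" |
        sed -n "s|.*<name>$2</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p"
}

# Print each profile from the comma-separated list $1 that has no file
# in the PAM configuration directory $2 (default: /etc/pam.d).
missing_pam_profiles() {
    pam_dir="${2:-/etc/pam.d}"
    for profile in $(printf '%s' "$1" | tr ',' ' '); do
        [ -f "$pam_dir/$profile" ] || printf '%s\n' "$profile"
    done
}

# Path taken from this post; adjust it for your installation.
conf=/opt/mapr/hive/hive-0.11/conf/hive-site.xml
if [ -f "$conf" ]; then
    profiles=$(hive_prop "$conf" hive.server2.authentication.pam.profiles)
    echo "configured PAM profiles: ${profiles:-<none>}"
    missing_pam_profiles "$profiles"
fi
```

If the second function prints nothing, every configured profile has a file in the PAM directory. That does not prove the profile authenticates the way you expect; the Beeline test later in this post is still the real check.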

Optionally, you can also enable SSL to protect the user ID and password in transit. Please follow the MapR installation documentation to enable SSL for Hive.

Make sure libjpam.so is installed in the correct location

The most important piece for enabling HS2 authentication is the libjpam.so library file. The MapR installation automatically installs libjpam.so in the correct location. If you are running HS2 on a node that only has the mapr-client package installed and the library file is missing, you can copy it from one of the data nodes to the default location. I highly recommend contacting MapR support if you are not able to get this file yourself.

Note: The default location for libjpam.so is /opt/mapr/hadoop/hadoop-0.20.2/lib/native/Linux-amd64-64/ for a 64-bit installation and /opt/mapr/hadoop/hadoop-0.20.2/lib/native/Linux-i386-32/ for a 32-bit installation.
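
A small check along these lines can confirm the file is where HS2 expects it. This is only a sketch: the helper name jpam_dir is hypothetical, and it assumes the bitness of the installation matches what `uname -m` reports for the kernel, which may not hold if a 32-bit JVM runs on a 64-bit host.

```shell
#!/bin/sh
# Hypothetical helper for illustration; not part of MapR.

# Map a machine type, as printed by `uname -m`, to the default libjpam.so
# directory quoted in the note above. Returns non-zero for other types.
jpam_dir() {
    native=/opt/mapr/hadoop/hadoop-0.20.2/lib/native
    case "$1" in
        x86_64|amd64) echo "$native/Linux-amd64-64" ;;
        i386|i486|i586|i686) echo "$native/Linux-i386-32" ;;
        *) return 1 ;;
    esac
}

if dir=$(jpam_dir "$(uname -m)"); then
    if [ -f "$dir/libjpam.so" ]; then
        echo "found $dir/libjpam.so"
    else
        echo "libjpam.so not found in $dir"
    fi
else
    echo "no default libjpam.so location known for $(uname -m)"
fi
```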

Restart HS2

How you restart HS2 depends on how it is installed. HS2 can be installed in one of the following two modes:

Warden Managed: In this mode the HS2 process is completely managed by the mapr-warden service. This is the recommended method and requires HS2 to be installed either on one of the data nodes in your cluster or on a separate control node that has the mapr-core package installed. Use the following command to restart HS2:

maprcli node services -nodes <node list> -name hiveserver2 -action restart

Example:

maprcli node services -nodes host01 -name hiveserver2 -action restart

Unmanaged: In unmanaged mode HS2 is not managed by the mapr-warden service, and stopping, starting or restarting it requires manual intervention. Either of the following two methods can be used in unmanaged mode.

  1. HIVE CLI: To restart HS2 using the Hive CLI, do the following:
    If HS2 is not already running, just start it:
    /opt/mapr/hive/hive-0.11/bin/hive --service hiveserver2
    If HS2 is already running:
    Get the PID of HS2 using either the "jps -m" command or "ps -ef | grep hiveserver2".
    Kill the running HS2 process using the "kill" command. Example:
    kill -9 2343
    Start HS2 again:
    /opt/mapr/hive/hive-0.11/bin/hive --service hiveserver2
  2. Init scripts: If you have created init scripts for the HS2 process, you can use the service command to restart it:
    # service <init script name> restart
    Example:
    # service hive-server2 restart
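
The manual stop in method 1 can be sketched as a small script. Two things differ from the commands above and are my own assumptions, not MapR guidance. First, the PID lookup filters out the grep process itself, which a bare "ps -ef | grep hiveserver2" also lists. Second, the stop sends the default TERM signal first so the JVM can run its shutdown hooks, and falls back to kill -9 only if the process is still there after a timeout. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Hive or MapR.

# Read `ps -ef` style lines on stdin and print the PID (second column) of
# every line mentioning hiveserver2, ignoring case and skipping grep.
# The [2] bracket keeps this awk command line from matching itself.
hs2_pids() {
    awk 'tolower($0) ~ /hiveserver[2]/ && $0 !~ /grep/ { print $2 }'
}

# Stop process $1: send TERM, wait up to $2 seconds (default 10), and
# send KILL only if the process is still present after that.
stop_gracefully() {
    pid=$1
    remaining=${2:-10}
    kill "$pid" 2>/dev/null || return 0    # nothing to signal
    while [ "$remaining" -gt 0 ] && kill -0 "$pid" 2>/dev/null; do
        sleep 1
        remaining=$((remaining - 1))
    done
    if kill -0 "$pid" 2>/dev/null; then
        kill -9 "$pid"
    fi
}

# Usage sketch (left commented out so that sourcing this file stops nothing):
# for pid in $(ps -ef | hs2_pids); do stop_gracefully "$pid" 30; done
# /opt/mapr/hive/hive-0.11/bin/hive --service hiveserver2
```

It is worth running "ps -ef | hs2_pids" on its own and checking the PIDs it prints before feeding them to the stop function, since any other process with hiveserver2 in its command line would match as well.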

Test HS2 Authentication

JDBC Client BEELINE

Start the beeline command line:

/opt/mapr/hive/hive-0.11/bin/beeline

Issue the connect command:

!connect jdbc:hive2://localhost:10000/default

You will be prompted for a username and password. If you enter invalid credentials, the connection is rejected with an authentication error.

If you enter valid credentials, you will be connected successfully.
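
For a scripted check it can be handy to skip the interactive prompts. The sketch below only builds the JDBC URL used above and prints a Beeline command line that passes it with -u, together with -n for the user name and -p for the password. These are standard Beeline options, but I have not verified them against the MapR Hive-0.11 build, so treat that as an assumption. The helper name hs2_url and the user name jdoe are placeholders.

```shell
#!/bin/sh
# Hypothetical helper for illustration; not part of Hive or MapR.

# Build a HiveServer2 JDBC URL from host, port and database, defaulting
# to the values used in the !connect command above.
hs2_url() {
    printf 'jdbc:hive2://%s:%s/%s\n' "${1:-localhost}" "${2:-10000}" "${3:-default}"
}

beeline=/opt/mapr/hive/hive-0.11/bin/beeline
url=$(hs2_url localhost 10000 default)

# Only print the command; jdoe and the password are placeholders to replace.
echo "$beeline -u $url -n jdoe -p '<password>'"
```

Keep in mind that a password given on the command line is visible in the process list and in shell history, so this form suits a throwaway test account better than a real one.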

ODBC Client using MapR ODBC Connector for Hive

To test HS2 authentication using the ODBC driver, perform the following steps:

Note: I assume that you have already downloaded and installed the MapR Hive ODBC Connector. If not, please follow the MapR installation documentation on how to get the Hive ODBC Connector and install it.

Go to Start -> All Programs -> MapR ODBC Hive Connector 2.0 (32 Bit) -> 32 Bit ODBC Driver Manager

Note: If you have installed the 64 Bit driver, you will see 64 Bit instead of 32 Bit.

Click the “Add…” button, select MapR Hive ODBC Connector from the list of available drivers, and then click “Finish”.

Enter all the required connection information in the dialog that opens.

If you were able to connect through BEELINE, you should not have any problem connecting through ODBC either.
