RHive tutorial supplement 2 - Installing Hive
As an add-on to Hadoop, Hive cannot run independently of Hadoop.
This tutorial explains how to add Hive to the Hadoop environment built in
the Hadoop setup tutorial.
Hive does not need to be installed on every server where Hadoop is installed.
It is needed only on the server where the Hadoop client will run.
Here, Hive will be installed on the Hadoop namenode for convenience of setup.
Downloading Hive
Because Hive, like Hadoop, is written in Java, installing it amounts to
downloading the archive and decompressing it.
A stable version of Hive can be found at the URL below.
http://www.apache.org/dist//hive/hive-0.7.1/hive-0.7.1-bin.tar.gz
The latest stable version when this tutorial was written is 0.7.1; you may
also build a snapshot version yourself and use that instead.
If a 0.8.x version has been released, it is fine to use that.
Connect to the server where Hive will be installed and download the stable
release of Hive as follows.
ssh root@10.1.1.1
mkdir hive_stable
cd hive_stable
wget http://www.apache.org/dist//hive/hive-0.7.1/hive-0.7.1-bin.tar.gz
tar xvfz ./hive-0.7.1-bin.tar.gz
mkdir /service
mv ./hive-0.7.1-bin /service/hive-0.7.1
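Before moving on, it can help to confirm that the unpacked directory looks complete. The following is a minimal sketch, not part of Hive: check_hive_home is a hypothetical helper, and it assumes the release contains a bin/hive launcher plus conf and lib directories.

```shell
#!/bin/sh
# Sketch: verify that an unpacked Hive directory has the expected layout.
# check_hive_home is a hypothetical helper name, not something Hive ships.
check_hive_home() {
    hive_home="$1"
    # Entries assumed to exist in a Hive binary release.
    for entry in bin/hive conf lib; do
        if [ ! -e "$hive_home/$entry" ]; then
            echo "missing: $hive_home/$entry" >&2
            return 1
        fi
    done
    echo "ok: $hive_home"
}

# Example (path assumed from this tutorial):
#   check_hive_home /service/hive-0.7.1
#   export HIVE_HOME=/service/hive-0.7.1
```

Exporting HIVE_HOME afterwards is optional, but it lets the $HIVE_HOME paths used later in this tutorial be typed as written.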
Configuring MySQL
We will use MySQL as Hive's metastore repository.
Hive uses an embedded Derby database by default, but if you want to enable
multiple users to use Hive simultaneously, you need MySQL or another
standalone database as the repository.
Hence this tutorial uses MySQL.
MySQL can be installed on a server separate from the one where Hive is
installed, but this tutorial installs MySQL on the Hadoop namenode (that is,
on the same server as Hive).
Use yum as follows to install the MySQL client and server.
yum install mysql mysql-server
After installation, start the MySQL server.
/etc/init.d/mysqld start
Now create a database in MySQL for Hive to use.
This tutorial uses the name “metastore” for that database.
Start the mysql client and create the database as shown.
mysql> CREATE DATABASE metastore;
mysql> USE metastore;
mysql> SOURCE /service/hive-0.7.1/scripts/metastore/upgrade/mysql/hive-schema-0.7.0.mysql.sql;
Here, “/service/hive-0.7.1” is Hive’s home directory, and
“$HIVE_HOME/scripts/metastore/upgrade/mysql” contains SQL files that either
create the Hive schema in MySQL or upgrade an existing Hive database in
MySQL. Select and run the one suited to your version to complete the setup.
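To see which full-schema scripts your release ships before choosing one, something like the following sketch may help. list_schema_scripts is a hypothetical helper, and it assumes the scripts follow the hive-schema-<version>.mysql.sql naming seen in the SOURCE command above.

```shell
#!/bin/sh
# Sketch: list the full-schema MySQL scripts under a Hive home directory.
# list_schema_scripts is a hypothetical helper; the file-name pattern is an
# assumption based on the hive-schema-0.7.0.mysql.sql file used above.
list_schema_scripts() {
    script_dir="$1/scripts/metastore/upgrade/mysql"
    for f in "$script_dir"/hive-schema-*.mysql.sql; do
        # Skip the unexpanded pattern when nothing matches.
        if [ -e "$f" ]; then
            basename "$f"
        fi
    done
}

# Example (path assumed from this tutorial):
#   list_schema_scripts /service/hive-0.7.1
```

Any other files in that directory are ignored by this sketch, so only scripts that create a schema from scratch are listed.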
Now create a MySQL user for Hive to use, and grant it the privileges to
access the “metastore” database.
mysql> CREATE USER 'hiveuser'@'%' IDENTIFIED BY 'password';
mysql> GRANT SELECT,INSERT,UPDATE,DELETE ON metastore.* TO 'hiveuser'@'%';
mysql> REVOKE ALTER,CREATE ON metastore.* FROM 'hiveuser'@'%';
Configuring MySQL for Hive use is complete.
Now you need to configure Hive so that it can connect to MySQL.
Connecting to MySQL from Hive requires the MySQL JDBC driver (Connector/J),
which is not included in the MySQL packages installed above.
You must manually download it from the MySQL site and copy it into the
Hive installation.
The JDBC driver is available for download from here:
http://dev.mysql.com/downloads/
As shown below, download and decompress it, then copy the jar file to
Hive’s lib directory.
$ curl -L -o mysql-connector-java-5.1.18.tar.gz http://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.18.tar.gz/from/http://mirror.services.wisc.edu/mysql/
$ tar xvfz mysql-connector-java-5.1.18.tar.gz
$ cp ./mysql-connector-java-5.1.18/mysql-connector-java-5.1.18-bin.jar /service/hive-0.7.1/lib/
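A quick way to confirm the copy succeeded is to look for the connector jar in Hive's lib directory. This is only a sketch: find_mysql_driver is a hypothetical helper, and the mysql-connector-java-*.jar pattern is taken from the file name used above.

```shell
#!/bin/sh
# Sketch: report the first MySQL connector jar found in a lib directory.
# find_mysql_driver is a hypothetical helper name.
find_mysql_driver() {
    lib_dir="$1"
    for jar in "$lib_dir"/mysql-connector-java-*.jar; do
        if [ -f "$jar" ]; then
            echo "$jar"
            return 0
        fi
    done
    echo "no MySQL connector jar in $lib_dir" >&2
    return 1
}

# Example (path assumed from this tutorial):
#   find_mysql_driver /service/hive-0.7.1/lib
```

The check only looks at file names; it does not verify that the jar actually contains the driver class.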
Now modify Hive’s configuration.
Open $HIVE_HOME/conf/hive-site.xml with a text editor and adjust the items
pertaining to MySQL.
If you have installed Hive for the first time, the
$HIVE_HOME/conf/hive-site.xml file probably does not exist yet.
In that case, copy the hive-default.xml.template file in the same directory
to hive-site.xml as follows, then edit the copy:
cd /service/hive-0.7.1/conf
cp ./hive-default.xml.template ./hive-site.xml
Now look for the properties below in ./hive-site.xml and adjust them for
the MySQL account that will be used to connect to the MySQL server.
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://MYSQL_HOSTNAME/metastore</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hiveuser</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>password</value>
</property>
<property>
<name>datanucleus.autoCreateSchema</name>
<value>false</value>
</property>
<property>
<name>datanucleus.fixedDatastore</name>
<value>true</value>
</property>
In the properties above, replace the “MYSQL_HOSTNAME” string with the
hostname or IP address of the server where MySQL is installed. This tutorial
installed MySQL on the same server as Hive, so the value is 127.0.0.1.
If, for safer management or some other reason, you installed MySQL on
another server, replace the “MYSQL_HOSTNAME” string with that server’s
hostname or IP address.
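The hostname substitution can also be scripted and then read back to confirm it. The sketch below is illustrative only: set_metastore_host and get_hive_property are hypothetical helpers, and they assume that each <name> and <value> element sits on its own line and that the hostname contains no "|" character.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical and assume each <name> and
# <value> element is on its own line in hive-site.xml.

# Replace the MYSQL_HOSTNAME placeholder in the JDBC connection URL.
set_metastore_host() {
    site_file="$1"
    host="$2"
    # Write to a temporary file first so a failed sed leaves the original intact.
    sed "s|jdbc:mysql://MYSQL_HOSTNAME/|jdbc:mysql://$host/|" "$site_file" \
        > "$site_file.tmp" && mv "$site_file.tmp" "$site_file"
}

# Print the <value> that follows the given property <name>.
get_hive_property() {
    awk -v name="$2" '
        index($0, "<name>" name "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$1"
}

# Example (path and host assumed from this tutorial):
#   set_metastore_host /service/hive-0.7.1/conf/hive-site.xml 127.0.0.1
#   get_hive_property  /service/hive-0.7.1/conf/hive-site.xml javax.jdo.option.ConnectionURL
```

With 127.0.0.1 as the host, the connection URL becomes jdbc:mysql://127.0.0.1/metastore. This line-oriented approach is not a real XML parser, so a hand-edited file with a different layout may need to be checked manually.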
This concludes the installation and configuration of Hive.
Consult the documentation on Hive’s official site for a detailed usage guide.