This page shows you an example of using Apache Hive with a Dataproc Metastore service. In this example, you launch a Hive session on a Dataproc cluster and run some sample commands to create a database and table.
Before you begin
- Create a Dataproc Metastore service.
- Attach the Dataproc Metastore service to a Dataproc cluster.
Connect to Apache Hive
To start using Hive, use SSH to connect to the Dataproc cluster that's associated with your Dataproc Metastore service. After you connect to the cluster, you can run Hive commands to manage your metadata.
To connect to Hive
- In the Google Cloud console, go to the VM Instances page.
- In the list of virtual machine instances, click SSH in the row of the Dataproc VM instance that you want to connect to.
A browser window opens in your home directory on the node with an output similar to the following:
Connected, host fingerprint: ssh-rsa ...
Linux cluster-1-m 3.16.0-0.bpo.4-amd64 ...
...
example-cluster@cluster-1-m:~$
To start Hive and create a database and table, run the following commands in the SSH session:
Start Hive.
hive
Create a database called myDatabase.
create database myDatabase;
Show the database you created.
show databases;
Use the database you created.
use myDatabase;
Create a table called myTable.
create table myTable(id int,name string);
List the tables under myDatabase.
show tables;
Show the columns of the table you created.
desc myTable;
Running these commands shows an output similar to the following:
$ hive
hive> show databases;
OK
default
hive> create database myDatabase;
OK
hive> use myDatabase;
OK
hive> create table myTable(id int,name string);
OK
hive> show tables;
OK
myTable
hive> desc myTable;
OK
id int
name string
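The same statements can also be run non-interactively, which may be convenient when you want to repeat the steps. The following is a minimal sketch, not part of the original walkthrough. It assumes you run it in the SSH session on the cluster, the file name create_mytable.hql is a hypothetical choice, and the IF NOT EXISTS clauses are an addition so that the script can be re-run without errors. It relies on the Hive CLI's -f option, which executes the statements in a file.

```shell
#!/bin/sh
# Write the statements from the steps above to a script file.
# The file name is a hypothetical choice for this sketch.
HQL_FILE="${HQL_FILE:-create_mytable.hql}"

cat > "$HQL_FILE" <<'EOF'
-- IF NOT EXISTS is an addition so the script can be re-run without errors.
CREATE DATABASE IF NOT EXISTS myDatabase;
SHOW DATABASES;
USE myDatabase;
CREATE TABLE IF NOT EXISTS myTable (id INT, name STRING);
SHOW TABLES;
DESCRIBE myTable;
EOF

# Run the file only when the Hive CLI is on the PATH,
# for example in an SSH session on the Dataproc cluster.
if command -v hive >/dev/null 2>&1; then
  hive -f "$HQL_FILE"
else
  echo "hive not found; wrote $HQL_FILE without running it"
fi
```

Keeping the statements in a file leaves a record of how the database and table were created, and the guard around the hive call lets you generate and review the file on a machine where Hive is not installed.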