This page shows you an example of using Apache Hive with a Dataproc Metastore service. In this example, you launch a Hive session on a Dataproc cluster, and then run sample commands to create a database and table.
Before you begin
- Create a Dataproc Metastore service.
- Attach the Dataproc Metastore service to a Dataproc cluster.
Connect to Apache Hive
To start using Hive, use SSH to connect to the Dataproc cluster that's associated with your Dataproc Metastore service. Once connected, you can run Hive commands from the SSH terminal window in your browser to manage your metadata.
To connect to Hive
- In the Google Cloud console, go to the VM Instances page.
- In the list of virtual machine instances, click SSH in the row of the Dataproc VM instance that you want to connect to.
A browser window opens in your home directory on the node with an output similar to the following:
Connected, host fingerprint: ssh-rsa ...
Linux cluster-1-m 3.16.0-0.bpo.4-amd64 ...
...
example-cluster@cluster-1-m:~$
To start Hive and create a database and table, run the following commands in the SSH session:
Start Hive.
hive
Create a database named myDatabase.
create database myDatabase;
Show the database that you created.
show databases;
Use the database that you created.
use myDatabase;
Create a table named myTable.
create table myTable(id int,name string);
List the tables under myDatabase.
show tables;
Show the column names and types of the table that you created.
desc myTable;
Running the preceding commands generates output similar to the following:
$ hive
hive> show databases;
OK
default
hive> create database myDatabase;
OK
hive> use myDatabase;
OK
hive> create table myTable(id int,name string);
OK
hive> show tables;
OK
myTable
hive> desc myTable;
OK
id int
name string
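The interactive steps above can also be run as a single script, which may be convenient for repeating the example. The following is a minimal sketch, not part of the original procedure: the file name create_table.hql is hypothetical, and the IF NOT EXISTS clauses are an addition so that the script can be rerun without errors. It assumes that you run it in the SSH session on the Dataproc cluster, where the hive command is available.

```shell
#!/bin/sh
# Write the sample HiveQL statements to a script file.
# The file name create_table.hql is a hypothetical choice for this sketch.
cat > create_table.hql <<'EOF'
CREATE DATABASE IF NOT EXISTS myDatabase;
USE myDatabase;
CREATE TABLE IF NOT EXISTS myTable (id INT, name STRING);
SHOW TABLES;
DESCRIBE myTable;
EOF

# Run the file with the Hive CLI if it is on the PATH; otherwise print a hint.
if command -v hive >/dev/null 2>&1; then
  hive -f create_table.hql
else
  echo "hive not found; run this script in the SSH session on the cluster" >&2
fi
```

Because the metadata is stored in the attached Dataproc Metastore service, the database and table created this way should be the same ones that you see when you run show tables; in an interactive Hive session.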