When you create a Dataproc Metastore service, you must choose to use either the MySQL database type or the Spanner database type.
This choice affects the features that you can integrate and use with your Dataproc Metastore service. You can't change the database type after you create a Dataproc Metastore service, so choose the type that fits your needs before you create it.
This page explains the differences between these database types and how to select one for your service.
Differences between MySQL and Spanner
MySQL
The Dataproc Metastore MySQL database type is an implementation of Cloud SQL. Note the following when using a MySQL database:
- MySQL is the default database type when creating a Dataproc Metastore.
- MySQL is supported by all Hive versions.
- MySQL supports all Dataproc Metastore features.
- MySQL supports Dataproc Metastore encryption, such as using customer-managed encryption keys (CMEK).
Spanner
The Dataproc Metastore Spanner database type is an implementation of Spanner. Note the following when using a Spanner database:
- Spanner is only supported on Hive versions 2.3.6 and 3.1.2.
- Spanner only supports Avro imports.
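The two Spanner constraints listed above can be turned into a quick pre-check before you commit to a database type. The following is a minimal sketch, not part of any Google Cloud tooling: the function name `spanner_eligible` and the import-format labels (`avro`, `none`) are hypothetical names chosen for illustration.

```shell
# Hypothetical helper: succeeds (exit status 0) only when Spanner is an option.
# Usage: spanner_eligible HIVE_VERSION IMPORT_FORMAT
#   IMPORT_FORMAT is an illustrative label: "avro", or "none" if you plan no imports.
spanner_eligible() {
  # Spanner is only supported on Hive versions 2.3.6 and 3.1.2.
  case "$1" in
    2.3.6|3.1.2) ;;
    *) return 1 ;;
  esac

  # Spanner only supports Avro imports, so any other import format rules it out.
  [ "$2" = "avro" ] || [ "$2" = "none" ]
}

# Example: fall back to MySQL when Spanner is ruled out.
if spanner_eligible "3.1.2" "avro"; then
  echo "Spanner is an option; MySQL also works."
else
  echo "Use MySQL."
fi
```

A failed check only tells you that Spanner is ruled out. MySQL remains valid in every case because it is supported by all Hive versions and supports all Dataproc Metastore features.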
Additional details
The following table provides additional details about these differences.
Characteristic | MySQL | Spanner |
---|---|---|
Reliability (uptime) | Cloud SQL SLO 99.95%* | Spanner SLO 99.99%* |
Maintenance windows | Required | Not required |
Notes:
- *The Cloud SQL and Spanner SLOs don't directly translate to Dataproc Metastore SLOs. Your database type selection doesn't affect Dataproc Metastore SLOs.
- There's no pricing difference between the two database types.
Before you begin
- Enable Dataproc Metastore in your project.
- Understand networking requirements specific to your project.
Required Roles
To get the permission that you need to create a Dataproc Metastore, ask your administrator to grant you the following IAM roles on your project, based on the principle of least privilege:
- Grant full control of Dataproc Metastore resources (roles/metastore.editor)
- Grant full access to all Dataproc Metastore resources, including IAM policy administration (roles/metastore.admin)
For more information about granting roles, see Manage access to projects, folders, and organizations.
These predefined roles contain the metastore.services.create permission, which is required to create a Dataproc Metastore service.
You might also be able to get this permission with custom roles or other predefined roles.
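For an administrator, granting one of these roles at the project level might look like the following sketch. It only prints the command so that it can be reviewed first. The function name `print_grant_command` and the example project and member values are hypothetical; `gcloud projects add-iam-policy-binding` is the standard gcloud CLI command for adding a project-level role binding.

```shell
# Hypothetical helper: prints (does not run) a project-level role grant.
# Usage: print_grant_command PROJECT_ID MEMBER [ROLE]
#   ROLE defaults to roles/metastore.editor, the narrower of the two roles above.
print_grant_command() {
  project_id="$1"
  member="$2"
  role="${3:-roles/metastore.editor}"

  printf 'gcloud projects add-iam-policy-binding %s --member=%s --role=%s\n' \
    "$project_id" "$member" "$role"
}

# Example with placeholder values; review the output before running it.
print_grant_command "my-project" "user:alice@example.com"
```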
For more information about specific Dataproc Metastore roles and permissions, see Manage Dataproc access with IAM.
Choose your database type
You choose your database type when you first create a Dataproc Metastore service.
The following example shows an abbreviated version of the steps that you follow to choose a database type. For complete step-by-step instructions, see Create a Dataproc Metastore service.
Console
In the Google Cloud console, open the Dataproc Metastore page.
In the navigation bar, click Create.
The Create service page opens.
For Database type, select either MySQL or Spanner.
MySQL is the default database type.
Choose the remaining configurations for your service, as needed.
Click Submit.
gcloud CLI
Run the following gcloud metastore services create command:
gcloud metastore services create SERVICE_ID \
  --location=LOCATION \
  --database-type=DATABASE_TYPE
Replace the following:
- SERVICE_ID: the name or ID for your Dataproc Metastore service.
- LOCATION: the region that your Dataproc Metastore service resides in.
- DATABASE_TYPE: the database type that you want to set for your Dataproc Metastore service. Accepted values include mysql and spanner. The default value is mysql.
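Because the database type can't be changed after creation, it can help to validate the value before the command runs. The following sketch builds the create command and rejects anything other than the two accepted values. The function name `build_create_command` and the example service ID and region are hypothetical, and the `databaseType` field name in the final comment is an assumption about the service resource that you should confirm against the API reference.

```shell
# Hypothetical helper: validates the database type, then prints the create command.
# Usage: build_create_command SERVICE_ID LOCATION [DATABASE_TYPE]
build_create_command() {
  service_id="$1"
  location="$2"
  database_type="${3:-mysql}"   # mysql is the documented default

  # Only mysql and spanner are accepted values.
  case "$database_type" in
    mysql|spanner) ;;
    *)
      echo "unsupported database type: $database_type" >&2
      return 1
      ;;
  esac

  printf 'gcloud metastore services create %s --location=%s --database-type=%s\n' \
    "$service_id" "$location" "$database_type"
}

# Example with placeholder values; nothing is executed, the command is only printed.
build_create_command "my-metastore" "us-central1" "spanner"

# After creation, the type could be checked with something like the following
# (assumes the resource exposes a databaseType field):
#   gcloud metastore services describe my-metastore --location=us-central1 \
#     --format="value(databaseType)"
```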