This page provides guidance on diagnosing and resolving common connectivity issues when connecting Dataproc clusters or Dataproc Serverless workloads to a managed Dataproc Metastore service.
Common Symptoms and Error Messages
When Dataproc encounters connectivity problems with Dataproc Metastore, you might see errors such as:
- Unable to connect to Hive Metastore
- Connection refused
- Host unreachable
- javax.jdo.JDOException or similar database connection errors
- Timeout errors when attempting to list databases or tables, or when submitting Spark or Hive jobs that interact with the Metastore
Common Causes and Troubleshooting Steps
This section outlines common reasons why Dataproc Metastore connectivity issues occur and provides specific troubleshooting steps for each.
1. Network Configuration Issues
Network misconfigurations are the most frequent cause of connectivity failures between Dataproc workloads and Dataproc Metastore.
Virtual Private Cloud Network Peering or Private Service Access:
- Dataproc Metastore instances are typically accessed over a private IP address range through a Virtual Private Cloud network peering connection (specifically, Private Service Access).
- Verify Peering Status: Verify the Virtual Private Cloud peering connection between your Dataproc workload's Virtual Private Cloud network and the service producer network for Dataproc Metastore is active and healthy. You can check this in the Google Cloud console under VPC Network > VPC Network Peering.
- IP Range Allocation: Confirm that a sufficient IP range has been allocated for Private Service Access in your Virtual Private Cloud network.
Firewall Rules:
- Verify that firewall rules in your Dataproc workload's Virtual Private Cloud network allow outbound traffic on the port used by Dataproc Metastore (default is 9083).
- Verify there are no overly restrictive ingress rules on the service producer network side that would block traffic from your Dataproc workload.
DNS Resolution:
- Confirm that the Metastore endpoint hostname (e.g., your-metastore-endpoint.us-central1.dataproc.cloud.google.com) resolves correctly to a private IP address from your Dataproc cluster or Dataproc Serverless environment.
- Issues with Cloud DNS private zones or DNS forwarding can cause resolution failures.
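To spot-check resolution from inside the workload's network, you can resolve the endpoint hostname and confirm the returned address is private. A minimal sketch using only the Python standard library (the hostname shown is a placeholder for your own endpoint):

```python
import ipaddress
import socket

def resolve_and_check_private(hostname):
    """Resolve hostname to an IPv4 address and report whether it is private."""
    ip = socket.gethostbyname(hostname)
    return ip, ipaddress.ip_address(ip).is_private

# Placeholder endpoint; substitute the hostname from your instance's Endpoint URI:
# resolve_and_check_private(
#     "your-metastore-endpoint.us-central1.dataproc.cloud.google.com")
```

A resolution failure, or a public address, points at Cloud DNS private zone or forwarding configuration rather than the Metastore itself.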
Troubleshooting Steps (Network):
- Check Dataproc Metastore Connectivity Information:
- In the Google Cloud console, navigate to Dataproc Metastore and select your instance.
- Note the Endpoint URI and the Network it's connected to.
- Verify Virtual Private Cloud Peering or Private Service Access:
- Go to VPC Network > VPC Network Peering. Confirm that the peering connection to servicenetworking-googleapis-com is ACTIVE.
- Use Connectivity Tests: Use Google Cloud's Connectivity Tests to diagnose the network path from a Compute Engine VM in your Dataproc workload's subnet to the Dataproc Metastore endpoint IP address and port.
- Check Firewall Logs: If firewall rules are suspected, analyze Cloud Firewall logs for denied connections.
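Connectivity Tests exercise the configured network path end to end; as a quick manual complement, you can run a plain TCP probe of the Thrift port from a VM in the workload's subnet. A sketch, assuming the default Metastore port 9083:

```python
import socket

def can_reach(host, port=9083, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If this returns False while the Metastore instance itself is healthy, suspect firewall rules or the peering route rather than the service.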
2. IAM Permissions
The service account used by your Dataproc workload needs appropriate IAM roles to access Dataproc Metastore.
- Required Role: The service account must have the Dataproc Metastore User role (roles/metastore.user) on the Dataproc Metastore instance or project.
- Service Agent Permissions: Verify that the Dataproc service agent has sufficient permissions if Dataproc is implicitly accessing the Metastore.
Troubleshooting Steps (IAM):
- Identify Service Account: Determine the service account used by your Dataproc cluster or Dataproc Serverless batch.
- Verify IAM Roles: Go to IAM & Admin > IAM in the Google Cloud console. Check the roles assigned to the service account on the Dataproc Metastore project or instance. Grant roles/metastore.user if it is missing.
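If you export the project's IAM policy as JSON (for example, with gcloud projects get-iam-policy PROJECT_ID --format=json), you can check for the binding programmatically. A sketch against the standard IAM policy JSON shape; the service account name below is a hypothetical placeholder:

```python
def has_role(policy, member, role="roles/metastore.user"):
    """Check an IAM policy document for a role binding that includes member."""
    return any(
        binding.get("role") == role and member in binding.get("members", [])
        for binding in policy.get("bindings", [])
    )

# Example policy fragment (hypothetical service account):
policy = {
    "bindings": [
        {"role": "roles/metastore.user",
         "members": ["serviceAccount:my-dataproc-sa@my-project.iam.gserviceaccount.com"]},
    ]
}
```

Note that a binding granted on the Metastore instance itself will not appear in the project-level policy, so check both levels before concluding the role is missing.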
3. Incorrect Endpoint Configuration
The Dataproc workload must be configured with the correct Dataproc Metastore endpoint URI.
Troubleshooting Steps (Endpoint):
- Verify Endpoint URI: Double-check the hive.metastore.uris Spark property, or any other configuration used to specify the Dataproc Metastore endpoint in your workload submission. Verify that it matches the Endpoint URI from your Dataproc Metastore instance details.
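A quick sanity check on the configured value is to parse it and confirm the thrift scheme and expected port. A sketch with the standard library; the default port 9083 and the single-URI assumption may need adjusting for your setup (hive.metastore.uris can hold a comma-separated list):

```python
from urllib.parse import urlparse

def validate_metastore_uri(uri, expected_port=9083):
    """Return a list of problems found in a hive.metastore.uris value."""
    problems = []
    parsed = urlparse(uri)
    if parsed.scheme != "thrift":
        problems.append(f"unexpected scheme {parsed.scheme!r}; expected 'thrift'")
    if not parsed.hostname:
        problems.append("missing hostname")
    if parsed.port != expected_port:
        problems.append(f"unexpected port {parsed.port}; expected {expected_port}")
    return problems
```

An empty list means the value at least has the right shape; it does not prove the endpoint is reachable, which the network checks above cover.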
4. Other Considerations
- Metastore Status: Verify that your Dataproc Metastore instance is in a HEALTHY state in the Google Cloud console. If it's unhealthy, address the Metastore's internal issues first.
- Version Compatibility: While rare, verify there are no known compatibility issues between your Dataproc image version and the Dataproc Metastore version.
- SQL Proxy versus Managed Service: If you are using Cloud SQL as a Metastore via the cloud-sql-proxy.sh initialization action, refer to its specific troubleshooting in the Cloud SQL Proxy initialization action README.
What's next
- Review the Dataproc Metastore documentation.
- See Connectivity and networking error scenarios.
- Refer to general troubleshooting guides for network issues.