Service access

To work correctly, the Dataproc Metastore service requires full internal IP networking access, as described on this page.

Network connectivity

Dataproc Metastore uses private IP only, so no public IP is exposed. This means that only VMs on the provided Virtual Private Cloud (VPC) network, or on-premises hosts connected through Cloud VPN or Cloud Interconnect, can access the Dataproc Metastore service.

Dataproc Metastore leverages VPC network peering to provide IP address connectivity to the Dataproc Metastore service's endpointUri.

For more information, see Networking.

Firewall rules for your services

In non-default or private environments with an established security footprint, you might need to create your own firewall rules. If you do, make sure not to create a firewall rule that blocks the IP address range or port of your Dataproc Metastore services.

When you create a Dataproc Metastore service, you can accept the default network for the service. The default network ensures full internal IP networking access for your VMs.

When you use a custom network, make sure your firewall rules permit traffic to and from the Dataproc Metastore endpoint. To explicitly allow Dataproc Metastore traffic, run the following gcloud commands:

gcloud compute firewall-rules create dpms-allow-egress-DPMS_NETWORK-REGION --allow tcp --destination-ranges DPMS_NET_PREFIX/17 --network DPMS_NETWORK --direction OUT
gcloud compute firewall-rules create dpms-allow-ingress-DPMS_NETWORK-REGION --allow tcp,udp --source-ranges DPMS_NET_PREFIX/17 --network DPMS_NETWORK
  • For DPMS_NET_PREFIX, apply a /17 subnet mask to your Dataproc Metastore service IP.

    Note that you can find Dataproc Metastore IP information in the endpointUri configuration on the Service detail page.
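As a rough illustration of applying the /17 mask, the sketch below derives DPMS_NET_PREFIX from an IPv4 address. The function name net_prefix_17 and the sample address are assumptions made for this example, not part of the product. A /17 mask is 255.255.128.0, so the first two octets are kept, only the top bit of the third octet is kept, and the fourth octet is zeroed.

```shell
# Hypothetical helper: print the /17 network address that contains an IPv4 address.
# The body runs in a subshell so the IFS change does not leak into the caller.
net_prefix_17() (
  IFS=.
  # Split the dotted-quad address into its four octets ($1..$4).
  set -- $1
  # Mask 255.255.128.0: keep octets 1 and 2, keep the top bit of octet 3, zero octet 4.
  echo "$1.$2.$(( $3 & 128 )).0"
)

# Sample address only; use the IP from your own service's endpointUri.
net_prefix_17 "10.55.200.7"   # prints 10.55.128.0
```

The printed value is what you would substitute for DPMS_NET_PREFIX in the commands above, followed by /17.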

Networks have an implied allow egress rule that normally allows access from your network to Dataproc Metastore. If you create deny egress rules that override the implied allow egress rule, you should create an allow egress rule with a higher priority to permit egress to the Dataproc Metastore IP.
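A minimal sketch of such an allow egress rule follows. The network name, prefix, rule name, and priority value are placeholders chosen for illustration. In VPC firewall rules a lower number means a higher priority, and the default priority is 1000, so 900 is assumed here to outrank a deny rule created with the default. The sketch prints the command for review instead of running it.

```shell
# Placeholder values (assumptions for illustration); replace them with your own.
DPMS_NETWORK="my-network"
DPMS_NET_PREFIX="10.55.128.0"
# Lower number = higher priority; 900 outranks a deny rule left at the default of 1000.
PRIORITY=900

# Hypothetical helper that assembles the gcloud command as text.
build_egress_rule_cmd() {
  echo gcloud compute firewall-rules create "dpms-allow-egress-${DPMS_NETWORK}" \
    --network "${DPMS_NETWORK}" \
    --direction EGRESS \
    --action ALLOW \
    --rules tcp \
    --destination-ranges "${DPMS_NET_PREFIX}/17" \
    --priority "${PRIORITY}"
}

# Print the command so it can be checked before it is run.
build_egress_rule_cmd
```

If your deny egress rule uses a priority other than the default, pick a PRIORITY value numerically lower than that rule's priority.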

Some features, such as Kerberos, require Dataproc Metastore to initiate connections to hosts in your project network. All networks have an implied deny ingress rule that blocks these connections and prevents those features from working. You should create a firewall rule that allows TCP and UDP ingress on all ports from the /17 IP block that contains the Dataproc Metastore IP.

For more information about firewall rules, see VPC firewall rules and Using VPC firewall rules.

Dataproc Metastore endpoint access

After the service is created and the network is configured, you can access the Thrift endpoint, endpointUri, for your Dataproc Metastore service. To point your client to the new service, use the endpoint by following the instructions in either Creating a Dataproc cluster that uses the service or After you create a Dataproc Metastore service.
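To illustrate working with the endpoint, the sketch below splits an endpointUri value into host and port using plain shell parameter expansion. The sample URI, including its port, and the variable names are assumptions for this example. Copy the real value from the Service detail page.

```shell
# Example value only (assumed); the real URI appears on the Service detail page.
ENDPOINT_URI="thrift://10.55.200.7:9083"

# Strip the scheme, then split host and port around the colon.
host_port="${ENDPOINT_URI#thrift://}"
METASTORE_HOST="${host_port%%:*}"
METASTORE_PORT="${host_port##*:}"

echo "host=${METASTORE_HOST} port=${METASTORE_PORT}"   # prints host=10.55.200.7 port=9083

# Optional reachability check from a VM on the peered network (requires netcat):
# nc -zv "${METASTORE_HOST}" "${METASTORE_PORT}"
```

The commented-out nc line is one possible way to confirm that your firewall rules let a client reach the endpoint. It only succeeds from a host that has network access to the service.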

What's next