Using VPC Service Controls with Cloud Data Fusion

VPC Service Controls allows you to define a security perimeter around resources of Google-managed services to control communication to and between those services.

VPC Service Controls provides additional security for your Cloud Data Fusion instances and pipelines to help mitigate the risk of data exfiltration. This guide outlines the limitations and strategies to follow when using VPC Service Controls with Cloud Data Fusion:

  • If you create Cloud Data Fusion instances with a private IP address, you can further protect them using VPC Service Controls. Create your Cloud Data Fusion private instances in Google Cloud projects that are within your service perimeters. Within a private instance, plugins packaged with the instance follow the restrictions applied by the service perimeter.

  • Cloud Data Fusion pipelines are executed on Dataproc clusters. To protect a Dataproc cluster launched within the service perimeter, the cluster must have only an internal IP address (no public IP address) and must be in the same VPC network as your Cloud Data Fusion instance. By default, a Cloud Data Fusion instance with a private IP address creates a Dataproc cluster with an internal IP address when it executes a pipeline.

  • Don't use plugins that call Google Cloud APIs that VPC Service Controls doesn't support. If you use such a plugin, Cloud Data Fusion blocks its API calls, and pipeline preview and execution fail.
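
The last two points can be turned into a pre-flight check. The following Python is a minimal sketch under stated assumptions, not an official Cloud Data Fusion or VPC Service Controls tool. It assumes the cluster description is a plain dictionary shaped like the Dataproc REST resource (with `config.gceClusterConfig.internalIpOnly` and `networkUri`), and that you maintain your own mapping of plugins to the APIs they call. The function names, the plugin names, and the sample API lists are hypothetical and for illustration only.

```python
from typing import Dict, Iterable, List, Set


def cluster_is_perimeter_safe(cluster: dict, instance_network: str) -> bool:
    """Return True if the cluster has internal IPs only and shares the
    instance's VPC network.

    Assumption: `cluster` follows the Dataproc REST JSON layout, and the
    network is compared as an exact string (no URL normalization).
    """
    gce = cluster.get("config", {}).get("gceClusterConfig", {})
    internal_only = gce.get("internalIpOnly") is True
    same_network = gce.get("networkUri") == instance_network
    return internal_only and same_network


def unsupported_plugin_apis(
    plugin_apis: Dict[str, Iterable[str]],
    supported_apis: Set[str],
) -> Dict[str, List[str]]:
    """Map each plugin to the APIs it calls that are not in `supported_apis`.

    Plugins whose APIs are all supported are left out of the result.
    """
    problems: Dict[str, List[str]] = {}
    for plugin, apis in plugin_apis.items():
        missing = sorted(set(apis) - supported_apis)
        if missing:
            problems[plugin] = missing
    return problems


if __name__ == "__main__":
    # Illustrative values only; take the real list of supported services
    # from the VPC Service Controls documentation.
    supported = {"bigquery.googleapis.com", "storage.googleapis.com"}
    plugins = {
        "hypothetical-warehouse-sink": ["bigquery.googleapis.com"],
        "hypothetical-custom-source": ["example-unsupported.googleapis.com"],
    }
    cluster = {
        "config": {
            "gceClusterConfig": {
                "internalIpOnly": True,
                "networkUri": "my-vpc-network",  # hypothetical network name
            }
        }
    }
    print(cluster_is_perimeter_safe(cluster, "my-vpc-network"))
    print(unsupported_plugin_apis(plugins, supported))
```

A check like this only flags configuration that conflicts with the points above. It doesn't replace the enforcement performed by the service perimeter itself.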

What's next