Project Settings Page

Review and modify the settings related to your Google Cloud Platform project.

Setting | Description
Project ID

(read only) Your current project.

Use the Project menu to select a different project. For more information, see Projects Menu.

Disable Dataprep

To disable Cloud Dataprep by TRIFACTA INC. for this project, click the link.

NOTE: To remove a user and that user's assets from a project, contact Support.

Dataflow Execution Settings:

Setting | Description
Region

A region is a specific geographical location where you can run your resources.

Zone

A zone is a subdivision of a region and contains specific resources.

Select Auto Zone to allow the platform to choose the zone for you.

Machine Type

Choose the type of machine on which to run your job. The default is n1-standard-1.

Making changes to Region, Zone, or Machine Type can affect the time and cost of job executions. For more information, see https://cloud.google.com/dataflow/docs/concepts/regional-endpoints.

For more information on machine types, see https://cloud.google.com/compute/docs/machine-types.
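As a rough illustration of how these three settings relate, the sketch below models them as a small Python record. The class, field, and constant names are hypothetical and are not part of any Cloud Dataprep API; the only values taken from this page are the n1-standard-1 default and the Auto Zone option. The zone check relies on the Google Cloud naming convention in which a zone name is the region name followed by a letter suffix (for example, us-central1-a in us-central1).

```python
from dataclasses import dataclass

AUTO_ZONE = "auto"  # hypothetical marker for the "Auto Zone" choice


@dataclass(frozen=True)
class DataflowExecutionSettings:
    """Hypothetical container for the Region, Zone, and Machine Type settings."""

    region: str
    zone: str = AUTO_ZONE
    machine_type: str = "n1-standard-1"  # default stated on this page

    def zone_matches_region(self) -> bool:
        """Return True if the zone is Auto Zone or lies inside the region."""
        if self.zone == AUTO_ZONE:
            return True
        # Zone names are "<region>-<letter>", e.g. "us-central1-a".
        return self.zone.rsplit("-", 1)[0] == self.region
```

A check like this only catches a zone that does not belong to the selected region; it does not verify that the region or machine type actually exists.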

Advanced Settings:

Setting | Description
VPC network mode

Select the network mode to use.

If the network mode is set to Auto (default), the job is executed over publicly available IP addresses. In this mode, do not set values for Network, Subnetwork, or Worker IP address configuration.

NOTE: Unless you have specific reasons to modify these settings, you should leave them as the default values. These network settings apply to job execution. Preview and sampling use the default network settings.

For Custom VPC networks:

  1. Specify the name of the VPC network in your region.
  2. Specify the name of the Subnetwork of the VPC network. If both Network and Subnetwork are specified, Subnetwork is used.
  3. Review and specify the Worker IP address configuration setting. See below.

For more information:

Network

To use a different VPC network, enter the name of the VPC network to use as an override for this job. Click Save to apply the override.

Subnetwork

To specify a different subnetwork, enter the name of the subnetwork. Click Save to apply the override.

Worker IP address configuration

If the VPC network mode is set to Custom, then choose one of the following:

  • Allow public IP addresses - Use Cloud Dataflow workers that are available through public IP addresses. No further configuration is required.
  • Use internal IP addresses only - Cloud Dataflow workers use private IP addresses for all communication.
    • If a Subnetwork is specified, then the Network value is ignored.
    • The specified Network or Subnetwork must have Private Google Access enabled.
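The network rules above can be summarized in a short Python sketch. The function and parameter names are hypothetical and only mirror the fields on this page; they are not a Cloud Dataprep or Cloud Dataflow API. The requirement that Custom mode needs at least one of Network or Subnetwork is an assumption inferred from the steps above.

```python
from typing import Optional


def resolve_network_settings(
    mode: str,
    network: Optional[str] = None,
    subnetwork: Optional[str] = None,
    internal_ips_only: bool = False,
    private_google_access: bool = False,
) -> dict:
    """Return the network values that would take effect for a job (sketch)."""
    if mode == "auto":
        # Auto mode: Network, Subnetwork, and Worker IP settings stay unset.
        if network or subnetwork or internal_ips_only:
            raise ValueError("Leave network fields unset when mode is Auto")
        return {"network": None, "subnetwork": None, "public_ips": True}

    if mode != "custom":
        raise ValueError(f"Unknown VPC network mode: {mode!r}")
    # Assumption: Custom mode needs at least one of the two names.
    if not (network or subnetwork):
        raise ValueError("Custom mode needs a Network or a Subnetwork")
    # Internal-only workers require Private Google Access to be enabled.
    if internal_ips_only and not private_google_access:
        raise ValueError("Private Google Access must be enabled")

    # When a Subnetwork is given, it is used and the Network value is ignored.
    effective_network = None if subnetwork else network
    return {
        "network": effective_network,
        "subnetwork": subnetwork,
        "public_ips": not internal_ips_only,
    }
```

For example, calling the function with mode "custom" and both a network and a subnetwork returns a result whose network entry is None, reflecting that the Subnetwork value takes precedence.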

Feature Availability: This feature is available in Cloud Dataprep Premium by TRIFACTA® INC.

Setting | Description
Autoscaling Algorithms

The type of algorithm to use to scale the number of Google Compute Engine instances to accommodate the size of your job. Possible values:

  • Throughput based - Scaling is determined by the volume of data expected to be passed through Cloud Dataflow.
  • None - No autoscaling algorithm is applied.
    • If None is selected, use numWorkers to specify a fixed number of Google Compute Engine instances.
Initial number of workers

Number of Google Compute Engine instances with which to launch the job. This number may be adjusted as part of job execution. This number must be an integer between 1 and 1000, inclusive.

Maximum number of workers

Maximum number of Google Compute Engine instances to use during execution. This number must be an integer between 1 and 1000, inclusive, and must be greater than the initial number of workers.
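The limits on the two worker counts can be checked with a few lines of Python. This is a minimal sketch of the rules stated above; the function and constant names are hypothetical and do not correspond to any product API.

```python
MIN_WORKERS = 1
MAX_WORKERS = 1000


def validate_worker_counts(initial: int, maximum: int) -> None:
    """Raise ValueError if the counts break the rules stated on this page."""
    for name, value in (("initial", initial), ("maximum", maximum)):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{name} number of workers must be an integer")
        if not MIN_WORKERS <= value <= MAX_WORKERS:
            raise ValueError(
                f"{name} number of workers must be between "
                f"{MIN_WORKERS} and {MAX_WORKERS}, inclusive"
            )
    # The maximum must be strictly greater than the initial count.
    if maximum <= initial:
        raise ValueError("maximum must be greater than the initial number of workers")
```

Because the maximum must be strictly greater than the initial value and cannot exceed 1000, the largest initial value that can pass this check is 999.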

Service account

Email address of the service account under which to run the job.

Labels

Create or assign labels to apply to the billing for the Cloud Dataprep by TRIFACTA INC. jobs run in your project. You may reference up to 64 labels.

NOTE: Each label must have a unique key name.

For more information, see https://cloud.google.com/resource-manager/docs/creating-managing-labels.
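The two label constraints on this page (at most 64 labels, each with a unique key) can be sketched in Python as follows. The function name is hypothetical, and the sketch does not check the character or length rules that Google Cloud applies to label keys and values.

```python
from typing import Dict, Iterable, Tuple

MAX_LABELS = 64  # limit stated on this page


def validate_labels(labels: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Check (key, value) pairs and return them as a dict."""
    result: Dict[str, str] = {}
    for key, value in labels:
        # Each label must have a unique key name.
        if key in result:
            raise ValueError(f"Duplicate label key: {key!r}")
        result[key] = value
    if len(result) > MAX_LABELS:
        raise ValueError(f"At most {MAX_LABELS} labels are allowed")
    return result
```

Labels are passed as a sequence of pairs rather than a dict so that a duplicate key is reported instead of being silently overwritten.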

Notes on behavior:

  • Values specified here are applied to all jobs executed within the project.
  • If property values are not specified here, then the properties are not passed in with any job execution, and the default Cloud Dataprep by TRIFACTA INC. property values are used.
  • The property values specified here can be overridden by property values specified for individual jobs. For more information, see Dataflow Execution Settings.
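The order of precedence described in these notes can be illustrated with a small Python sketch: job-level values override project-level values, which in turn override the product defaults. The function name and the use of None to mean "not specified" are assumptions made for illustration, and the region names in the example are arbitrary.

```python
def effective_settings(platform_defaults: dict, project: dict, job: dict) -> dict:
    """Combine settings so that job values beat project values beat defaults."""
    merged = dict(platform_defaults)
    # A value of None stands for "not specified" at that level.
    for level in (project, job):
        merged.update({k: v for k, v in level.items() if v is not None})
    return merged


# Illustrative values only; the region names are arbitrary examples.
defaults = {"machine_type": "n1-standard-1", "region": "us-central1"}
project = {"machine_type": "n1-standard-4", "region": None}
job = {"region": "europe-west1"}
settings = effective_settings(defaults, project, job)
# settings == {"machine_type": "n1-standard-4", "region": "europe-west1"}
```

Here the project-level machine type replaces the default, the unspecified project-level region is skipped, and the job-level region wins.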