Use custom constraints

Google Cloud Organization Policy gives you centralized, programmatic control over your organization's resources. As the organization policy administrator, you can define an organization policy, which is a set of restrictions called constraints that apply to Google Cloud resources and descendants of those resources in the Google Cloud resource hierarchy. You can enforce organization policies at the organization, folder, or project level.

Organization Policy provides predefined constraints for various Google Cloud services. However, if you want more granular, customizable control over the specific fields that are restricted in your organization policies, you can also create custom constraints and use those custom constraints in an organization policy.

Benefits

You can use a custom organization policy to allow or deny specific operations on Dataproc Serverless batches. For example, if a request to create a batch workload fails to satisfy custom constraint validation as set by your organization policy, the request will fail, and an error will be returned to the caller.

Policy inheritance

By default, organization policies are inherited by the descendants of the resources on which you enforce the policy. For example, if you enforce a policy on a folder, Google Cloud enforces the policy on all projects in the folder. To learn more about this behavior and how to change it, refer to Hierarchy evaluation rules.

Pricing

The Organization Policy Service, including predefined and custom constraints, is offered at no charge.

Before you begin

  1. Set up your project
    1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
    2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

      Go to project selector

    3. Make sure that billing is enabled for your Google Cloud project.

    4. Enable the Dataproc Serverless API.

      Enable the API

    5. Install the Google Cloud CLI.
    6. To initialize the gcloud CLI, run the following command:

      gcloud init
    7. Ensure that you know your organization ID.
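
      If you don't know your organization ID, you can look it up by listing the organizations that your account can view:

      gcloud organizations list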

Required roles

To get the permissions that you need to manage organization policies, ask your administrator to grant you the Organization policy administrator (roles/orgpolicy.policyAdmin) IAM role on the organization resource. For more information about granting roles, see Manage access to projects, folders, and organizations.
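
If you have permission to grant IAM roles at the organization level, you can grant this role with the gcloud CLI. For example (the member email is illustrative):

    gcloud organizations add-iam-policy-binding ORGANIZATION_ID \
        --member=user:org-policy-admin@example.com \
        --role=roles/orgpolicy.policyAdmin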

This predefined role contains the permissions required to manage organization policies. To see the exact permissions that are required, expand the Required permissions section:

Required permissions

The following permissions are required to manage organization policies:

  • orgpolicy.constraints.list
  • orgpolicy.policies.create
  • orgpolicy.policies.delete
  • orgpolicy.policies.list
  • orgpolicy.policies.update
  • orgpolicy.policy.get
  • orgpolicy.policy.set

You might also be able to get these permissions with custom roles or other predefined roles.

Create a custom constraint

A custom constraint is defined in a YAML file by the resources, methods, conditions, and actions that it applies to. Dataproc Serverless supports custom constraints that are applied to the CREATE method of the Batch resource (see Dataproc Serverless constraints on resources and operations).

To create a YAML file for a Dataproc Serverless custom constraint:

name: organizations/ORGANIZATION_ID/customConstraints/CONSTRAINT_NAME
resourceTypes:
- dataproc.googleapis.com/Batch
methodTypes: 
- CREATE
condition: CONDITION
actionType: ACTION
displayName: DISPLAY_NAME
description: DESCRIPTION

Replace the following:

  • ORGANIZATION_ID: your organization ID, such as 123456789.

  • CONSTRAINT_NAME: the name you want for your new custom constraint. A custom constraint must start with custom., and can only include uppercase letters, lowercase letters, or numbers, for example, custom.batchMustHaveSpecifiedCategoryLabel. The maximum length of this field is 70 characters, not counting the prefix, for example, organizations/123456789/customConstraints/custom.

  • CONDITION: a CEL condition that is written against a representation of a supported service resource. This field has a maximum length of 1000 characters. See Supported resources for more information about the resources available to write conditions against. Sample condition: ("category" in resource.labels) && (resource.labels['category'] in ['retail', 'ads', 'service']).

  • ACTION: the action to take if the condition is met. This can be either ALLOW or DENY.

  • DISPLAY_NAME: a human-friendly name for the constraint. Sample display name: "Enforce batch 'category' label requirement". This field has a maximum length of 200 characters.

  • DESCRIPTION: a human-friendly description of the constraint to display as an error message when the policy is violated. This field has a maximum length of 2000 characters. Sample description: "Only allow Dataproc batch creation if it has a 'category' label with a 'retail', 'ads', or 'service' value".
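
For example, the following file combines the sample values above into a complete constraint definition (the organization ID and constraint name are the illustrative values used throughout this page):

    name: organizations/123456789/customConstraints/custom.batchMustHaveSpecifiedCategoryLabel
    resourceTypes:
    - dataproc.googleapis.com/Batch
    methodTypes:
    - CREATE
    condition: ("category" in resource.labels) && (resource.labels['category'] in ['retail', 'ads', 'service'])
    actionType: ALLOW
    displayName: Enforce batch "category" label requirement
    description: Only allow Dataproc batch creation if it has a "category" label with a "retail", "ads", or "service" value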

For more information about how to create a custom constraint, see Defining custom constraints.

Set up a custom constraint

After you have created the YAML file for a new custom constraint, you must set it up to make it available for organization policies in your organization. To set up a custom constraint, use the gcloud org-policies set-custom-constraint command:

    gcloud org-policies set-custom-constraint CONSTRAINT_PATH

Replace CONSTRAINT_PATH with the full path to your custom constraint file. For example, /home/user/customconstraint.yaml. Once completed, your custom constraints are available as organization policies in your list of Google Cloud organization policies. To verify that the custom constraint exists, use the gcloud org-policies list-custom-constraints command:

    gcloud org-policies list-custom-constraints --organization=ORGANIZATION_ID

Replace ORGANIZATION_ID with the ID of your organization resource. For more information, see Viewing organization policies.
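
For example, if the constraint file is saved at ~/constraints/batch-category.yaml (an illustrative path) and your organization ID is 123456789, the commands might look like the following:

    gcloud org-policies set-custom-constraint ~/constraints/batch-category.yaml
    gcloud org-policies list-custom-constraints --organization=123456789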

Enforce a custom constraint

You can enforce a boolean constraint by creating an organization policy that references it, and then applying that organization policy to a Google Cloud resource.

Console

  1. In the Google Cloud console, go to the Organization policies page.

    Go to Organization policies

  2. From the project picker, select the project for which you want to set the organization policy.
  3. From the list on the Organization policies page, select your constraint to view the Policy details page for that constraint.
  4. To configure the organization policy for this resource, click Manage policy.
  5. On the Edit policy page, select Override parent's policy.
  6. Click Add a rule.
  7. In the Enforcement section, select whether enforcement of this organization policy is on or off.
  8. Optional: To make the organization policy conditional on a tag, click Add condition. Note that if you add a conditional rule to an organization policy, you must add at least one unconditional rule or the policy cannot be saved. For more information, see Setting an organization policy with tags.
  9. If this is a custom constraint, you can click Test changes to simulate the effect of this organization policy. For more information, see Test organization policy changes with Policy Simulator.
  10. To finish and apply the organization policy, click Set policy. The policy requires up to 15 minutes to take effect.

gcloud

To create an organization policy that enforces a boolean constraint, create a policy YAML file that references the constraint:

      name: projects/PROJECT_ID/policies/CONSTRAINT_NAME
      spec:
        rules:
        - enforce: true
    

Replace the following:

  • PROJECT_ID: the project on which you want to enforce your constraint.
  • CONSTRAINT_NAME: the name you defined for your custom constraint. For example, custom.batchMustHaveSpecifiedCategoryLabel.
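
For example, a policy file that enforces the sample label constraint on a project (the project ID my-project is illustrative) might look like the following:

      name: projects/my-project/policies/custom.batchMustHaveSpecifiedCategoryLabel
      spec:
        rules:
        - enforce: true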

To enforce the organization policy containing the constraint, run the following command:

    gcloud org-policies set-policy POLICY_PATH
    

Replace POLICY_PATH with the full path to your organization policy YAML file. The policy requires up to 15 minutes to take effect.
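
For example, if you saved the policy file as ~/policy.yaml (an illustrative path):

    gcloud org-policies set-policy ~/policy.yaml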

Test the custom constraint

The following batch creation example assumes that a custom constraint has been created and enforced on batch creation to require that the batch has a "category" label with a value of "retail", "ads", or "service": ("category" in resource.labels) && (resource.labels['category'] in ['retail', 'ads', 'service']). Note that the "category" label in the example does not have one of the required values.

gcloud dataproc batches submit spark \
  --region us-west1 \
  --jars file:///usr/lib/spark/examples/jars/spark-examples.jar \
  --class org.apache.spark.examples.SparkPi  \
  --network default \
  --labels category=foo \
  -- 100

Sample output:

Operation denied by custom org policies: ["customConstraints/custom.batchMustHaveSpecifiedCategoryLabel": "Only allow Dataproc batch creation if it has a 'category' label with a 'retail', 'ads', or 'service' value"]
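
Conversely, a request that attaches one of the allowed label values passes the constraint check. A minimal sketch, assuming the same constraint is enforced:

gcloud dataproc batches submit spark \
  --region us-west1 \
  --jars file:///usr/lib/spark/examples/jars/spark-examples.jar \
  --class org.apache.spark.examples.SparkPi \
  --network default \
  --labels category=retail \
  -- 100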

Dataproc Serverless constraints on resources and operations

The following Dataproc Serverless custom constraints are available to use when you create (submit) a batch workload.

General

  • resource.labels

PySparkBatch

  • resource.pysparkBatch.mainPythonFileUri
  • resource.pysparkBatch.args
  • resource.pysparkBatch.pythonFileUris
  • resource.pysparkBatch.jarFileUris
  • resource.pysparkBatch.fileUris
  • resource.pysparkBatch.archiveUris

SparkBatch

  • resource.sparkBatch.mainJarFileUri
  • resource.sparkBatch.mainClass
  • resource.sparkBatch.args
  • resource.sparkBatch.jarFileUris
  • resource.sparkBatch.fileUris
  • resource.sparkBatch.archiveUris

SparkRBatch

  • resource.sparkRBatch.mainRFileUri
  • resource.sparkRBatch.args
  • resource.sparkRBatch.fileUris
  • resource.sparkRBatch.archiveUris

SparkSqlBatch

  • resource.sparkSqlBatch.queryFileUri
  • resource.sparkSqlBatch.queryVariables
  • resource.sparkSqlBatch.jarFileUris

RuntimeConfig

  • resource.runtimeConfig.version
  • resource.runtimeConfig.containerImage
  • resource.runtimeConfig.properties
  • resource.runtimeConfig.repositoryConfig.pypiRepositoryConfig.pypiRepository
  • resource.runtimeConfig.autotuningConfig.scenarios
  • resource.runtimeConfig.cohort

ExecutionConfig

  • resource.environmentConfig.executionConfig.serviceAccount
  • resource.environmentConfig.executionConfig.networkUri
  • resource.environmentConfig.executionConfig.subnetworkUri
  • resource.environmentConfig.executionConfig.networkTags
  • resource.environmentConfig.executionConfig.kmsKey
  • resource.environmentConfig.executionConfig.idleTtl
  • resource.environmentConfig.executionConfig.ttl
  • resource.environmentConfig.executionConfig.stagingBucket

PeripheralsConfig

  • resource.environmentConfig.peripheralsConfig.metastoreService
  • resource.environmentConfig.peripheralsConfig.sparkHistoryServerConfig.dataprocCluster

Example custom constraints for common use cases

The following examples show Dataproc Serverless batch custom constraints for common use cases. Each example gives a description followed by the constraint syntax:

Batch must attach a "category" label with allowed values.

    name: organizations/ORGANIZATION_ID/customConstraints/custom.batchMustHaveSpecifiedCategoryLabel
    resourceTypes:
    - dataproc.googleapis.com/Batch
    methodTypes:
    - CREATE
    condition: ("category" in resource.labels) && (resource.labels['category'] in ['retail', 'ads', 'service'])
    actionType: ALLOW
    displayName: Enforce batch "category" label requirement.
    description: Only allow batch creation if it attaches a "category" label with an allowable value.
Batch must set an allowed runtime version.

    name: organizations/ORGANIZATION_ID/customConstraints/custom.batchMustUseAllowedVersion
    resourceTypes:
    - dataproc.googleapis.com/Batch
    methodTypes:
    - CREATE
    condition:  (has(resource.runtimeConfig.version)) && (resource.runtimeConfig.version in ["2.0.45", "2.0.48"])
    actionType: ALLOW
    displayName: Enforce batch runtime version.
    description: Only allow batch creation if it sets an allowable runtime version.
Batch must use SparkSQL.

    name: organizations/ORGANIZATION_ID/customConstraints/custom.batchMustUseSparkSQL
    resourceTypes:
    - dataproc.googleapis.com/Batch
    methodTypes:
    - CREATE
    condition: (has(resource.sparkSqlBatch))
    actionType: ALLOW
    displayName: Enforce batch only use SparkSQL Batch.
    description: Only allow creation of SparkSQL Batch.
Batch must set a TTL of 2 hours or less.

    name: organizations/ORGANIZATION_ID/customConstraints/custom.batchMustSetLessThan2hTtl
    resourceTypes:
    - dataproc.googleapis.com/Batch
    methodTypes:
    - CREATE
    condition:  (has(resource.environmentConfig.executionConfig.ttl)) && (resource.environmentConfig.executionConfig.ttl <= duration('2h'))
    actionType: ALLOW
    displayName: Enforce batch TTL.
    description: Only allow batch creation if it sets an allowable TTL.
Batch cannot set more than 20 Spark initial executors.

    name: organizations/ORGANIZATION_ID/customConstraints/custom.batchInitialExecutorMax20
    resourceTypes:
    - dataproc.googleapis.com/Batch
    methodTypes:
    - CREATE
    condition: (has(resource.runtimeConfig.properties)) && ('spark.executor.instances' in resource.runtimeConfig.properties)
     && (int(resource.runtimeConfig.properties['spark.executor.instances'])>20)
    actionType: DENY
    displayName: Enforce maximum number of batch Spark executor instances.
    description: Deny batch creation if it specifies more than 20 Spark executor instances.
Batch cannot set more than 20 Spark dynamic allocation initial executors.

    name: organizations/ORGANIZATION_ID/customConstraints/custom.batchDynamicAllocationInitialExecutorMax20
    resourceTypes:
    - dataproc.googleapis.com/Batch
    methodTypes:
    - CREATE
    condition: (has(resource.runtimeConfig.properties)) && ('spark.dynamicAllocation.initialExecutors' in resource.runtimeConfig.properties)
     && (int(resource.runtimeConfig.properties['spark.dynamicAllocation.initialExecutors'])>20)
    actionType: DENY
    displayName: Enforce maximum number of batch dynamic allocation initial executors.
    description: Deny batch creation if it specifies more than 20 Spark dynamic allocation initial executors.
Batch must not allow more than 20 dynamic allocation executors.

    name: organizations/ORGANIZATION_ID/customConstraints/custom.batchDynamicAllocationMaxExecutorMax20
    resourceTypes:
    - dataproc.googleapis.com/Batch
    methodTypes:
    - CREATE
    condition: (resource.runtimeConfig.properties['spark.dynamicAllocation.enabled']=='false') || (('spark.dynamicAllocation.maxExecutors' in resource.runtimeConfig.properties) && (int(resource.runtimeConfig.properties['spark.dynamicAllocation.maxExecutors'])<=20))
    actionType: ALLOW
    displayName: Enforce batch maximum number of dynamic allocation executors.
    description:  Only allow batch creation if dynamic allocation is disabled or
    the maximum number of dynamic allocation executors is set to less than or equal to 20.
Batch must set the KMS key to an allowed pattern.

    name: organizations/ORGANIZATION_ID/customConstraints/custom.batchKmsPattern
    resourceTypes:
    - dataproc.googleapis.com/Batch
    methodTypes:
    - CREATE
    condition:  matches(resource.environmentConfig.executionConfig.kmsKey, '^keypattern[a-z]$')
    actionType: ALLOW
    displayName: Enforce batch KMS Key pattern.
    description: Only allow batch creation if it sets the KMS key to an allowable pattern.
Batch must set the staging bucket prefix to an allowed value.

    name: organizations/ORGANIZATION_ID/customConstraints/custom.batchStagingBucketPrefix
    resourceTypes:
    - dataproc.googleapis.com/Batch
    methodTypes:
    - CREATE
    condition:  resource.environmentConfig.executionConfig.stagingBucket.startsWith(ALLOWED_PREFIX)
    actionType: ALLOW
    displayName: Enforce batch staging bucket prefix.
    description: Only allow batch creation if it sets the staging bucket prefix to ALLOWED_PREFIX.
Batch executor memory setting must end with an 'm' suffix and be less than 20000 m.

    name: organizations/ORGANIZATION_ID/customConstraints/custom.batchExecutorMemoryMax
    resourceTypes:
    - dataproc.googleapis.com/Batch
    methodTypes:
    - CREATE
    condition:  ('spark.executor.memory' in resource.runtimeConfig.properties) && (resource.runtimeConfig.properties['spark.executor.memory'].endsWith('m')) && (int(resource.runtimeConfig.properties['spark.executor.memory'].split('m')[0])<20000)
    actionType: ALLOW
    displayName: Enforce batch executor maximum memory.
    description: Only allow batch creation if the executor memory setting ends with an 'm' suffix and is less than 20000 m.

What's next