Set up policies

This page gives a brief overview of folders and explains how to manage documents using folders.

Policy Engine and rules

In Document Warehouse, Policy Engine let users to define and execute common operations on documents (for example, validate or update) while creating or updating documents.

Rules and RuleSet

A Rule at a high level refers to a user-defined configuration that specifies the following:

  • what triggers the rules checking,
  • which condition is evaluated, and
  • what actions are run when the condition is satisfied.

Along with these specifications, a rule includes information of the description, source, target and triggering condition.

A logical collection of rules is called a RuleSet. For example, Rules operating on the same schema can be grouped together into a single RuleSet. Customers can define multiple RuleSets.

Rules are useful for automatically triggering predefined actions, while creating or updating documents.

A Rule consists of three main things:

  • TriggerType: Event on which the rule check should be initiated. Create and Update are the supported trigger types.
  • Rule Condition: The condition that is evaluated after a certain trigger type is detected. Conditions can be expressed using Common Expression Language (CEL). Each condition should evaluate to boolean output.
  • Actions: Set of steps that are executed when the rule is satisfied. When a rule condition is evaluated as true, then the corresponding action (configured in the rule) is executed. The following are the high-level details about specific actions implemented in Document Warehouse:
    • Data Validation action: Action that enables validating specific fields in the document during document creation or update.
    • Data Update action: Action that enables updating specific fields in the document during document creation or update. Such updates are run when the rule condition is satisfied.
    • Delete Document Action: Action that enables deleting the document during document update when certain fields meet the deletion criteria defined using rule conditions.
    • Folder inclusion action: Action that automatically adds a new document (or updated document) under specific folders. Such folders can be specified directly using their name.
    • Remove from folder action: Action that automatically removes a new document from given folders when a rule-level condition is satisfied.
    • Access Control Action: Action that allows updating the access control lists (groups and user bindings) during document creation. Such updates are run when the rule condition is satisfied.
    • Publish action: Action that publishes specific messages to the user's Pub/Sub channel when a rule-level condition is satisfied.

Manage RuleSets

Document Warehouse provides APIs to manage RuleSets (Create, Get, Update, Delete, List). This section provides samples for configuring different types for rules.

Create a RuleSet

To create a rule-set, do the following:

REST

Request:

# Create a RuleSet for data validation.
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets \
-d '{
  "rules": [
    {
      "trigger_type": "ON_CREATE",
      "condition": "documentType == \'W9\' && STATE ==\'CA\'",
      "actions": {
        "data_validation": {
          "conditions": {
            "NAME": "NAME != \'\'",
            "FILING_COST": "FILING_COST > 10.0"
          }
        }
      },
      "enabled": true
    }
  ],
  "description": "W9: Basic validation check rules."
}'

Response

{
  "description": "W9: Basic validation check rules.",
  "name": "RULE_SET_NAME",
  "rules": [
    {
      "actions": [
        {
          "actionId": "de0e6b84-106b-44ba-b1c4-0b3ad6ddc719",
          "dataValidation": {
            "conditions": {
              "FILING_COST": "FILING_COST > 10.0",
              "NAME": "NAME != ''"
            }
          }
        }
      ],
      "condition": "documentType == 'W9' && STATE =='CA'",
      "enabled": true,
      "triggerType": "ON_CREATE"
    }
  ]
}

Python

For more information, see the Document AI Warehouse Python API reference documentation.

To authenticate to Document AI Warehouse, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.


from google.cloud import contentwarehouse

# TODO(developer): Uncomment these variables before running the sample.
# project_number = "YOUR_PROJECT_NUMBER"
# location = "us" # Format is 'us' or 'eu'


def create_rule_set(project_number: str, location: str) -> None:
    # Create a client
    client = contentwarehouse.RuleSetServiceClient()

    # The full resource name of the location, e.g.:
    # projects/{project_number}/locations/{location}
    parent = client.common_location_path(project=project_number, location=location)

    actions = contentwarehouse.Action(
        delete_document_action=contentwarehouse.DeleteDocumentAction(
            enable_hard_delete=True
        )
    )

    rules = contentwarehouse.Rule(
        trigger_type="ON_CREATE",
        condition="documentType == 'W9' && STATE =='CA'",
        actions=[actions],
    )

    rule_set = contentwarehouse.RuleSet(
        description="W9: Basic validation check rules.",
        source="My Organization",
        rules=[rules],
    )

    # Initialize request argument(s)
    request = contentwarehouse.CreateRuleSetRequest(parent=parent, rule_set=rule_set)

    # Make the request
    response = client.create_rule_set(request=request)

    # Handle the response
    print(f"Rule Set Created: {response}")

    # Initialize request argument(s)
    request = contentwarehouse.ListRuleSetsRequest(
        parent=parent,
    )

    # Make the request
    page_result = client.list_rule_sets(request=request)

    # Handle the response
    for response in page_result:
        print(f"Rule Sets: {response}")

Java

For more information, see the Document AI Warehouse Java API reference documentation.

To authenticate to Document AI Warehouse, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.cloud.contentwarehouse.v1.Action;
import com.google.cloud.contentwarehouse.v1.ActionOrBuilder;
import com.google.cloud.contentwarehouse.v1.CreateRuleSetRequest;
import com.google.cloud.contentwarehouse.v1.CreateRuleSetRequestOrBuilder;
import com.google.cloud.contentwarehouse.v1.DeleteDocumentAction;
import com.google.cloud.contentwarehouse.v1.DeleteDocumentActionOrBuilder;
import com.google.cloud.contentwarehouse.v1.ListRuleSetsRequest;
import com.google.cloud.contentwarehouse.v1.ListRuleSetsRequestOrBuilder;
import com.google.cloud.contentwarehouse.v1.LocationName;
import com.google.cloud.contentwarehouse.v1.Rule;
import com.google.cloud.contentwarehouse.v1.Rule.TriggerType;
import com.google.cloud.contentwarehouse.v1.RuleOrBuilder;
import com.google.cloud.contentwarehouse.v1.RuleSet;
import com.google.cloud.contentwarehouse.v1.RuleSetOrBuilder;
import com.google.cloud.contentwarehouse.v1.RuleSetServiceClient;
import com.google.cloud.contentwarehouse.v1.RuleSetServiceClient.ListRuleSetsPagedResponse;
import com.google.cloud.contentwarehouse.v1.RuleSetServiceSettings;
import com.google.cloud.resourcemanager.v3.Project;
import com.google.cloud.resourcemanager.v3.ProjectName;
import com.google.cloud.resourcemanager.v3.ProjectsClient;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;


public class CreateRuleSet {

  public static void createRuleSet() throws IOException, 
        InterruptedException, ExecutionException, TimeoutException { 
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String location = "your-region"; // Format is "us" or "eu".
    createRuleSet(projectId, location);
  }

  public static void createRuleSet(String projectId, String location)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    String projectNumber = getProjectNumber(projectId);

    String endpoint = "contentwarehouse.googleapis.com:443";
    if (!"us".equals(location)) {
      endpoint = String.format("%s-%s", location, endpoint);
    }
    RuleSetServiceSettings ruleSetServiceSettings =
        RuleSetServiceSettings.newBuilder().setEndpoint(endpoint).build();

    // Create a Rule Set Service Client 
    try (RuleSetServiceClient ruleSetServiceClient = 
        RuleSetServiceClient.create(ruleSetServiceSettings)) {
      /*  The full resource name of the location, e.g.:
      projects/{project_number}/locations/{location} */
      String parent = LocationName.format(projectNumber, location); 

      // Create a Delete Document Action to be added to the Rule Set 
      DeleteDocumentActionOrBuilder deleteDocumentAction = 
          DeleteDocumentAction.newBuilder().setEnableHardDelete(true).build();

      // Add Delete Document Action to Action Object 
      ActionOrBuilder action = Action.newBuilder()
            .setDeleteDocumentAction((DeleteDocumentAction) deleteDocumentAction).build();

      // Create rule to add to rule set 
      RuleOrBuilder rule = Rule.newBuilder()
          .setTriggerType(TriggerType.ON_CREATE)
          .setCondition("documentType == 'W9' && STATE =='CA' ")
          .addActions(0, (Action) action).build();

      // Create rule set and add rule to it
      RuleSetOrBuilder ruleSetOrBuilder = RuleSet.newBuilder()
          .setDescription("W9: Basic validation check rules.")
          .setSource("My Organization")
          .addRules((Rule) rule).build();

      // Create and prepare rule set request to client
      CreateRuleSetRequestOrBuilder createRuleSetRequest = 
          CreateRuleSetRequest.newBuilder()
              .setParent(parent)
              .setRuleSet((RuleSet) ruleSetOrBuilder).build();

      RuleSet response = ruleSetServiceClient.createRuleSet(
          (CreateRuleSetRequest) createRuleSetRequest);

      System.out.println("Rule set created: " + response.toString());

      ListRuleSetsRequestOrBuilder listRuleSetsRequest = 
          ListRuleSetsRequest.newBuilder()
              .setParent(parent).build();

      ListRuleSetsPagedResponse listRuleSetsPagedResponse = 
          ruleSetServiceClient.listRuleSets((ListRuleSetsRequest) listRuleSetsRequest);

      listRuleSetsPagedResponse.iterateAll().forEach(
          (ruleSet -> System.out.print(ruleSet))
      );
    }
  }

  private static String getProjectNumber(String projectId) throws IOException { 
    try (ProjectsClient projectsClient = ProjectsClient.create()) { 
      ProjectName projectName = ProjectName.of(projectId); 
      Project project = projectsClient.getProject(projectName);
      String projectNumber = project.getName(); // Format returned is projects/xxxxxx
      return projectNumber.substring(projectNumber.lastIndexOf("/") + 1);
    } 
  }
}

List RuleSets

To list rule-sets under a project, do the following:

REST

Request:

# List all rule-sets for a project.
curl -X GET -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets

Response

{
  "ruleSets": [
    {
      "description": "W9: Basic validation check rules.",
      "rules": [
        {
          "triggerType": "ON_CREATE",
          "condition": "documentType == 'W9' && STATE =='CA'",
          "actions": [
            {
              "actionId": "fcf79ae8-9a1f-4462-9262-eb2e7161350c",
              "dataValidation": {
                "conditions": {
                  "NAME": "NAME != ''",
                  "FILING_COST": "FILING_COST > 10.0"
                }
              }
            }
          ],
          "enabled": true
        }
      ],
      "name": "RULE_SET_NAME"
    }
  ]
}

Get a RuleSet

To get a rule-set using the rule-set name, do the following:

REST

Request:

# Get a rule-set using rule-set ID.
curl -X GET -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets/RULE_SET

Response

{
  "description": "W9: Basic validation check rules.",
  "rules": [
    {
      "triggerType": "ON_CREATE",
      "condition": "documentType == 'W9' && STATE =='CA'",
      "actions": [
        {
          "actionId": "7559346b-ec9f-4143-ab1c-1912f5588807",
          "dataValidation": {
            "conditions": {
              "NAME": "NAME != ''",
              "FILING_COST": "FILING_COST > 10.0"
            }
          }
        }
      ],
      "enabled": true
    }
  ],
  "name": "RULE_SET_NAME"
}

Delete a RuleSet

To delete a rule-set using the rule-set name, do the following:

REST

Request:

# Get a rule-set using rule-set ID.
curl -X DELETE -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets/RULE_SET

Rule Actions

This section will go over the rule expressions and each rule action examples.

Sample conditions

A condition refers to the expression specified using Common Expression Language.

Examples:

  • String field expression
    • STATE == \'CA\'. Check if the value of STATE field is equal to CA
    • NAME != \'\'. Check that the value of the NAME field is not empty.
  • Numeric field expression
    • FILING_COST > 10.0. Check if the value of the FILING_COST field (defined as float) is greater than 10.0.

How to check if a document belongs to a specific schema

To refer to a specific schema type, use the special field name documentType (it is a reserved word). It is evaluated against the DisplayName field in the DocumentSchema.

Example:

  • documentType == \'W9\'

The preceding condition checks if the schema of the document (using the keyword documentType) has a display name of W9.

How to refer to old/existing document property values and new document property values

To support conditions that include existing and newly given properties, use the following two prefixes with a DOT operator to access the specific version of the property:

  • OLD_ to refer to existing document properties.
  • NEW_ to refer to new document properties in the request.

Example:

  • OLD_.state == \'TX\' && NEW_.state == \'CA\' Checks that the existing value of the state property is TX and the new value given is CA.

Date fields handling

For the DriverLicense Document, if the EXPIRATION_DATE is before a certain date

  • Update (or add new if absent) EXPIRATION_STATUS (enum field) with a value equal to EXPIRING_BEFORE_CLOSING_DATE.

To add date values use the timestamp function as shown in the following example.

REST

Request:

# Check if document expires before a date and update the status field
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets \
-d '{
  "rules":[
    {
      "trigger_type": "ON_CREATE",
      "description": "Expiration date check rule",
      "condition": "documentType==\'DriverLicense\' && EXPIRATION_DATE  < timestamp(\'2021-08-01T00:00:00Z\')",
      "actions": {
        "data_update": {
          "entries": {
            "EXPIRATION_STATUS": "EXPIRING_BEFORE_CLOSING_DATE"
          }
        }
      }
    }
  ]
}'

Data validation rule

Validate a W9 document for the STATE (text field) California:

  • Check that the NAME (text field) is non-empty.
  • Check that the FILING_COST (float field) is greater than 10.0.

REST

Request:

# Rules for data validation.
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets \
-d '{
  "rules": [
    {
      "trigger_type": "ON_CREATE",
      "condition": "documentType == \'W9\' && STATE ==\'CA\'",
      "actions": {
        "data_validation": {
          "conditions": {
            "NAME": "NAME != \'\'",
            "FILING_COST": "FILING_COST > 10.0"
          }
        }
      },
      "enabled": true
    }
  ],
  "description": "W9: Basic validation check rules."
}'

Data update rule

For a W9 document, if the BUSINESS_NAME field is Google:

  • Update (or add new if absent) an Address field equal to 1600 Amphitheatre Pkwy.
  • Update (or add new if absent) an EIN field equal to 77666666.

REST

Request:

# Rule for data update.
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets \
-d '{
  "rules":[
    {
      "description": "W9: Rule to update address data and EIN.",
      "trigger_type": "ON_CREATE",
      "condition": "documentType==\'W9\' && BUSINESS_NAME == \'Google\'",
      "actions": {
        "data_update": {
          "entries": {
            "Address": "1600 Amphitheatre Pkwy",
            "EIN": "776666666"
          }
        }
      }
    }
  ]
}'

Document delete rule

While updating the W9 document, if the BUSINESS_NAME field is changed to Google then delete the document.

REST

Request:

# Rule for deleting the document
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets \
-d '{
  "rules": [
    {
      "description": "W9: Rule to delete the document during update.",
      "trigger_type": "ON_UPDATE",
      "condition": "documentType == \'W9\' && BUSINESS_NAME == \'Google\'",
      "actions": {
        "delete_document_action": {
          "enable_hard_delete": true
        }
      }
    }
  ]
}'

Access control rule

While updating the W9 document, if the BUSINESS_NAME field is Google then update policy bindings that control access to document

Add new binding

When a document satisfies the rule condition:

  • Adds the Editor role for user:a@example.com and group:xxx@example.com
  • Adds the Viewer role for user:b@example.com and group:yyy@example.com

REST

Request:

# Rule for adding new policy binding while creating the document.
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets \
-d '{
  "rules": [
    {
      "description": "W9: Rule to add new policy binding."
      "trigger_type": "ON_CREATE",
      "condition": "documentType == \'aca13aa9-6d0d-4b6b-a1eb-315dcb876bd1\' && BUSINESS_NAME == \'Google\'",
      "actions": {
        "access_control": {
          "operation_type": "ADD_POLICY_BINDING",
          "policy": {
            "bindings": [
              {
                "role": "roles/contentwarehouse.documentEditor",
                "members": ["user:a@example.com", "group:xxx@example.com"]
              },
              {
                "role": "roles/contentwarehouse.documentViewer",
                "members": ["user:b@example.com", "group:yyy@example.com"]
              }
            ]
          }
        }
      }
    }
  ]
}'

Replace an existing binding

When a document satisfies the rule condition, replace the existing binding to include only the Editor role for user:a@example.com and group:xxx@example.com.

REST

Request:

# Rule for replacing existing policy bindings with newly given bindings.
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets \
-d '{
  "rules": [
    {
      "description": "W9: Rule to replace policy binding."
      "trigger_type": "ON_CREATE",
      "condition": "documentType == \'a9e37d07-9cfa-4b4d-b372-53162e3b8bd9\' && BUSINESS_NAME == \'Google\'",
      "actions": {
        "access_control": {
          "operation_type": "REPLACE_POLICY_BINDING",
          "policy": {
            "bindings": [
              {
                "role": "roles/contentwarehouse.documentEditor",
                "members": ["user:a@example.com", "group:xxx@example.com"]
              }
            ]
          }
        }
      }
    }
  ]
}'

Add to folder rule

When a folder is created or updated, it can be added under predefined static folders or folders matching certain search criteria.

Configure static folders

When a new DriverLicense is created, then add it under the already created folder.

REST

Request:

curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets \
-d '{
  "rules": [
    {
      "trigger_type": "ON_CREATE",
      "condition": "documentType == \'DriverLicense\'",
      "actions": {
        "add_to_folder": {
          "folders": ["projects/821411934445/locations/us/documents/445en119hqp70"]
        }
      }
    }
  ]
}'

Publish to Pub/Sub

When a document is created or updated, or a link is created or deleted, you can push a notification message to the Pub/Sub channel.

Steps to use

  • Create a Pub/Sub topic in Customer project.
  • Create a rule to trigger publish Pub/Sub action using the following request. (See the following example.)
  • Invoke Document AI Warehouse APIs.
  • Verify that messages are published on Pub/Sub channel.

Example rule

When a document is added under a folder (CreateLink API is invoked), the following rule can be used to send notification messages to Pub/Sub topic.

REST

Request:

curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://contentwarehouse.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION/ruleSets \
-d '{
  "rules": [
    {
      "trigger_type": "ON_CREATE_LINK",
      "condition": "documentType == \'DriverLicenseFolder\'",
      "actions": {
        "publish_to_pub_sub": {
          "topic_id": "<topic_name>"
          "messages": "Added document under a folder."
        }
      }
    }
  ]
}'

Rule details

  • This action is supported for the following trigger types:

    • ON_CRATE: When new document is created.
    • ON_UPDATE: When document is updated.
    • ON_CRATE_LINK: When a new link is created.
    • ON_DELETE_LINK: When a link is deleted.
  • For Create and Update Document triggers, the condition can include attributes of the document being created or updated.

  • For Create and Delete Link triggers, the condition can only include attributes of the Folder document from which the document is being added or removed.

  • Themessages field can be used to send list of messages to Pub/Sub channel. Note that along with these messages, by default, the following fields are also published:

    • Schema name, Document name, Trigger type, RuleSet name, Rule id, Action id.
    • For Create and Delete link triggers, the notifications include relevant link information that is being added or deleted.