Manage access control

This page provides an overview of how to manage access control for projects and documents.

Data Access Control overview

Data Access Control is a key feature of Document AI Warehouse. It controls who has access to which resource in Document AI Warehouse, and what level of access they have.

Document AI Warehouse APIs are built on Google Cloud. HTTPS is used to ensure secure data transmission over the internet. Authentication and authorization are enforced on the Document AI Warehouse APIs to protect the service and user data based on Google identities.

Document AI Warehouse APIs use OAuth2 for authentication with a user account. All the API methods require the https://www.googleapis.com/auth/cloud-platform OAuth scope.

Document AI Warehouse enforces access control for customer data based on Cloud IAM. It defines a set of roles and associated permissions that let you restrict different users' access to the data stored in the service. For more information, see the IAM roles and permissions section.

Use a service account to enforce basic access control

You need a service account that has been granted the required permissions to access the Document AI Warehouse API. If you complete the "Provision through Google Cloud console" step in the Quickstart guide, a service account is automatically provisioned with the Document AI Warehouse Admin role.

Access control mode

Document AI Warehouse provides three access control modes:

  • Universal access: No document-level access control
  • Document-level access control with your own identity service
  • Document-level access control with Cloud Identity

You must choose one of the access control modes during the provisioning process. The following sections outline the differences among the three modes and show how to enable each one.

Universal access

Universal access control lets you use Identity and Access Management (IAM) alone to manage permissions. IAM applies the same permissions to all documents under the project with the authenticated identity.

In this mode, after you finish the provisioning procedure in the Quickstart guide, you and all of your users can access every document under the selected Google Cloud project in Document AI Warehouse by using the service account, with the permissions associated with that service account.

The rest of this document discusses document-level access control. If you are using universal access, feel free to skip the rest of the document.

Document-level access control

With document-level access control, you can either:

  • Bring your own identity service
    • Both the end user and end-user membership groups are required in the request metadata. If your company has its own way of authenticating the user and identifying what groups the user belongs to, use this option.
  • Use Cloud Identity
    • Only the end user is required in the request metadata because Document AI Warehouse collects the membership groups from Cloud Identity for customers. The difference between this and using a custom identity service is that you manage the user's group memberships using Cloud Identity versus an in-house system.

There are a few limitations with using the document-level access mode:

  • Only members and roles in the ACL are supported. IAM conditions are ignored.
  • Custom roles are not supported in the ACL.
  • Document AI Warehouse does not verify end-user credentials. It only verifies the service account credentials to confirm that the calls come from the customer. End-user credentials need to be verified on the customer's side.
  • Customers need to provide the end user (and all the groups that the end user is a member of if not using the Cloud Identity option) in the request metadata to enforce the access control.
  • The number of membership groups for the end user should be less than 100.

Document-level access control with the customer's own identity service

You can choose this mode if you want to do the following:

  • Grant end users or groups different permissions on each document.
  • Use your own identity service.

This mode enables you to use IAM and access control lists (ACLs) together to manage permissions. Each document in Document AI Warehouse can be configured with a specific document-level ACL. Authentication and authorization happen as follows:

  • The service account credential is authenticated and authorized to access the service.
  • In the request metadata, include the end user and end-user membership groups. Either the end user or at least one of the groups the end user belongs to needs to have permission to access the document.

Document AI Warehouse grants access to the requested document only if both conditions in the preceding list are satisfied.

The UserInfo (including end user ID and user membership group IDs) of the RequestMetadata provided in the API call is used to validate if the end user is allowed to perform the corresponding action against the document resource requested. For example, the UserInfo provided in the GetDocument API is used to validate if the end user is allowed to view the document. If either the end user or one of the membership groups is allowed to view the document, then the end user is allowed to view the document.
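The two-condition check described above can be illustrated with a small sketch. It is not the service's implementation: the `UserInfo` class, the function names, and the idea of passing the allowed members as a plain set are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class UserInfo:
    """End user identity as supplied in RequestMetadata (hypothetical model)."""
    id: str
    group_ids: list = field(default_factory=list)


def end_user_has_access(user_info, allowed_members):
    """True if the end user or at least one of their groups is in allowed_members."""
    principals = {user_info.id, *user_info.group_ids}
    return not principals.isdisjoint(allowed_members)


def request_is_granted(service_account_ok, user_info, allowed_members):
    """Both conditions must hold: the service account check and the end-user check."""
    return service_account_ok and end_user_has_access(user_info, allowed_members)
```

For example, an end user who is not listed individually but belongs to a group that is listed passes the second condition, while the same request fails if the service account is not authorized.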

Sample RequestMetadata in JSON format:

{
    "request_metadata": {
        "user_info": {
            "id": "user:fake_user_id",
            "group_ids": [
                "group:fake_group_id_1",
                "group:fake_group_id_2",
                "group:fake_group_id_3"
            ]
        }
    }
}
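A minimal Python sketch of building this structure follows. The helper name `build_request_metadata` and the `MAX_GROUPS` constant are hypothetical; the field names mirror the sample above, and the REST reference may spell them differently (for example in lowerCamelCase), so check them before use. The group count check reflects the limitation stated earlier that the number of membership groups should be less than 100.

```python
import json

MAX_GROUPS = 100  # the group count should be less than this, per the limitations list


def build_request_metadata(user_id, group_ids=()):
    """Return a request_metadata dict for one end user.

    group_ids can be left empty when Cloud Identity supplies the memberships.
    """
    if len(group_ids) >= MAX_GROUPS:
        raise ValueError(
            f"expected fewer than {MAX_GROUPS} groups, got {len(group_ids)}"
        )
    user_info = {"id": f"user:{user_id}"}
    if group_ids:
        user_info["group_ids"] = [f"group:{g}" for g in group_ids]
    return {"request_metadata": {"user_info": user_info}}


print(json.dumps(build_request_metadata("fake_user_id", ["fake_group_id_1"]), indent=4))
```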

In addition to the steps in the Quickstart guide, this access control mode requires a few more steps before you start sending requests to Document AI Warehouse:

  1. Fetch group memberships for a given end user from your directory service (for example, Azure Active Directory or Okta).
  2. Follow the instructions under the Configure access control section to set a default project policy. You could also set a document-level ACL for specific documents after creation.

After completing the preceding steps, you are now ready to use the service account to make API calls to Document AI Warehouse with end user and group membership info in the RequestMetadata section of the request body.

In this mode, you should deploy a proxy to authenticate and authorize the end users. The proxy uses a service account that has been granted the admin role to access the service. Protect the service account key so that only the proxy can use it.

As an out-of-the-box solution, the Document AI Warehouse console acts as a proxy that can store the service account key, authenticate the end users through Google identities, and forward the requests to Document AI Warehouse.
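If you deploy your own proxy, its core flow might look like the following sketch. Everything here is illustrative: `forward_as_end_user` and the three callables it receives (`verify_token`, `fetch_groups`, `call_warehouse`) are hypothetical hooks that you would implement against your own identity service and your own client for the service, not part of any Document AI Warehouse library.

```python
def forward_as_end_user(token, verify_token, fetch_groups, call_warehouse):
    """Authenticate the end user, then call the service on their behalf.

    Hypothetical hooks supplied by the deployer:
      verify_token(token)   -> user id string, or None if the token is invalid
      fetch_groups(user_id) -> list of group ids from the directory service
      call_warehouse(meta)  -> response; the call itself uses the service account
    """
    # End-user credentials are verified here, on the customer's side.
    user_id = verify_token(token)
    if user_id is None:
        raise PermissionError("end user could not be authenticated")

    # Attach the end user and their groups so the document ACL can be enforced.
    request_metadata = {
        "user_info": {
            "id": f"user:{user_id}",
            "group_ids": [f"group:{g}" for g in fetch_groups(user_id)],
        }
    }
    return call_warehouse(request_metadata)
```

Keeping the service account key inside `call_warehouse` means that no other component needs access to it.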

Document-level access control with Cloud Identity

As an alternative to using your own identity service, you can opt in to Cloud Identity to simplify the process.

To centrally manage users and groups, Google Cloud customers can set up Cloud Identity from scratch or federate identities between Google and other identity providers, such as Active Directory and Azure Active Directory.

The UserInfo section of RequestMetadata provided in the API call is used to validate if the end user is allowed to perform the corresponding action against the document resource requested. Using Cloud Identity, only the end user ID is required in the RequestMetadata, and Document AI Warehouse collects the membership group information from the Cloud Identity service. If either the end user or one of the membership groups is allowed to access the document, then the end user is allowed to access the document.

Sample RequestMetadata in JSON format:

{
    "request_metadata": {
        "user_info": {
            "id": "user:fake_user_id"
        }
    }
}

In addition to the steps in the Quickstart guide, this access control mode requires a few more steps before you start sending requests to Document AI Warehouse:

  1. Integrate with Cloud Identity for the end users and groups.
  2. Follow the instructions under the Configure access control section to set a default project policy. You could also set a document-level ACL for specific documents after creation.

After completing the preceding steps, you are now ready to use the service account to make API calls to Document AI Warehouse with end-user information in the RequestMetadata section of the request body.

Configure access control

Before you begin

Before you begin, make sure you have completed the steps in the Quickstart guide.

SetAcl and FetchAcl

When a new project is created, no project ACL is set. Using the service account, the project owner can call the Document AI Warehouse SetAcl API with the projectOwner field set to true to set a default project policy that uses predefined roles. Members in the project policy have access to all the documents under the project, according to the roles granted. For example, you can grant access to admin users or groups in the default project policy.
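As a rough illustration, a SetAcl request body for the default project policy could be assembled as shown below. The `projectOwner` field comes from the text above, and the `policy` and `bindings` layout follows the standard IAM policy shape. The helper name `build_project_acl_request` and the member `group:admins@example.com` are placeholders, and the exact request schema should be confirmed against the API reference.

```python
import json


def build_project_acl_request(role_members):
    """Build a SetAcl request body that sets the default project policy.

    role_members maps a predefined role name to an iterable of members.
    """
    bindings = [
        {"role": role, "members": sorted(members)}
        for role, members in sorted(role_members.items())
    ]
    # projectOwner=True marks the call as coming from the project owner.
    return {"policy": {"bindings": bindings}, "projectOwner": True}


body = build_project_acl_request(
    {"roles/contentwarehouse.documentAdmin": ["group:admins@example.com"]}
)
print(json.dumps(body, indent=2))
```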

The following table summarizes the required role for each document action. For more information about the permissions granted to each role, see IAM roles and permissions.

To make calls to the Document Schema API using the service account, see projects.locations.documentSchemas.

Document API method                       Required role
CreateDocument                            roles/contentwarehouse.documentCreator
UpdateDocument                            roles/contentwarehouse.documentEditor
DeleteDocument, SetAcl                    roles/contentwarehouse.documentAdmin
GetDocument, FetchAcl, SearchDocuments    roles/contentwarehouse.documentViewer

CreateDocument

Grant the end user or group Creator access if it has not been granted already:

  • Optional: Fetch membership groups for the admin end user from your identity service. This step can be skipped if you use Cloud Identity.
  • Grant end user A (or a group that user A is a member of) the role roles/contentwarehouse.documentCreator at the project level by calling SetAcl using the service account, with the admin end user (and membership groups) in the request metadata. Here, the admin end user is a user who has documentAdmin access at the project level.

Create a document:

  • Optional: Fetch membership groups for end user A from your identity service. This step can be skipped if you use Cloud Identity.
  • Make the call to CreateDocument using the service account, with end user A (and membership groups) in the request metadata, to create a document. After the document is created, end user A can view and edit the document by default. You can also specify a default policy during creation to grant access to users or groups. For example, you can grant groupX the documentViewer access, groupY the documentEditor access, and groupZ the documentAdmin access.
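The groupX, groupY, and groupZ example can be sketched as a request body. This is an assumption-laden illustration: the helper name `build_create_document_request`, the top-level keys `document` and `policy`, and the placeholder document fields are not taken from this page, and the request metadata keys simply mirror the samples shown earlier.

```python
def build_create_document_request(document, creator_id, group_roles):
    """Build a CreateDocument request body with a default document policy.

    document:    dict of document fields (left opaque in this sketch)
    creator_id:  id of the end user creating the document
    group_roles: maps a group id to a short role name such as "documentViewer"
    """
    bindings = [
        {"role": f"roles/contentwarehouse.{role}", "members": [f"group:{group}"]}
        for group, role in sorted(group_roles.items())
    ]
    return {
        "document": document,
        "request_metadata": {"user_info": {"id": f"user:{creator_id}"}},
        "policy": {"bindings": bindings},
    }


request = build_create_document_request(
    {"display_name": "Example document"},  # placeholder document fields
    "A",
    {"groupX": "documentViewer", "groupY": "documentEditor", "groupZ": "documentAdmin"},
)
```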

GetDocument and FetchAcl

After the document is created, end user A or the members of groupX, groupY, or groupZ are able to call GetDocument to view the document, or call FetchAcl to view the ACL of the document. Here are the steps:

  1. Optional: Fetch membership groups for end user A from your identity service. This step can be skipped if you use Cloud Identity.
  2. Make the call to GetDocument or FetchAcl using the service account with end user A (and membership groups) in the request metadata.

The call from end user B is rejected if B is not a member of groupX, groupY, or groupZ.

UpdateDocument, DeleteDocument, and SetAcl

After the document is created, only the end user A or members of groupY or groupZ are allowed to call UpdateDocument to update the document; only the end user A or members of groupZ are allowed to call DeleteDocument to delete the document or SetAcl to share the document with other end users or groups. Here are the steps:

  1. Optional: Fetch membership groups for end user A from your identity service. This step can be skipped if you use Cloud Identity.
  2. Make the call to UpdateDocument, DeleteDocument, or SetAcl using the service account with end user A (and membership groups) in the request metadata.

The call from members of groupX will be denied because they only have documentViewer access to the document.
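The outcomes described in the preceding sections can be reproduced with a small worked model. Two assumptions are made so that the model matches the text: roles are treated as a hierarchy (documentAdmin includes documentEditor, which includes documentViewer), and the creator, end user A, is modeled as holding documentAdmin on the document. The names `ROLE_RANK`, `REQUIRED_ROLE`, `DOC_ACL`, and `can_call` are illustrative only.

```python
# Higher rank includes the access of every lower rank (assumed hierarchy).
ROLE_RANK = {"documentViewer": 1, "documentEditor": 2, "documentAdmin": 3}

# Required role per method, taken from the table of document API methods.
REQUIRED_ROLE = {
    "GetDocument": "documentViewer",
    "FetchAcl": "documentViewer",
    "SearchDocuments": "documentViewer",
    "UpdateDocument": "documentEditor",
    "DeleteDocument": "documentAdmin",
    "SetAcl": "documentAdmin",
}

# Document ACL from the running example (creator modeled as admin).
DOC_ACL = {
    "user:A": "documentAdmin",
    "group:groupX": "documentViewer",
    "group:groupY": "documentEditor",
    "group:groupZ": "documentAdmin",
}


def can_call(method, principals, acl=DOC_ACL):
    """True if any principal (user or group) holds a sufficient role."""
    needed = ROLE_RANK[REQUIRED_ROLE[method]]
    return any(ROLE_RANK.get(acl.get(p), 0) >= needed for p in principals)
```

With this model, `can_call("UpdateDocument", ["user:B", "group:groupX"])` is false because groupX only has viewer access, while the same call with `group:groupY` succeeds.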

SearchDocuments

The documents returned depend on the roles granted to the end user. For example, for an empty search query, all documents under the project are returned if the end user has documentViewer access at the project level. Otherwise, only the documents for which the end user has the contentwarehouse.documents.get permission are returned.

To make a call to the SearchDocuments API, perform the following steps:

  1. Optional: Fetch membership groups for end user A from your identity service. This step can be skipped if you use Cloud Identity.
  2. Make the call to SearchDocuments using the service account with end user A (and membership groups) in the request metadata.
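The filtering behavior for an empty query can be pictured with the following sketch. It models the rule stated above, not the service's search implementation; `visible_documents` and its inputs are hypothetical.

```python
def visible_documents(principals, project_viewers, doc_viewers):
    """Return the document names an end user would see for an empty query.

    principals:      the end user id plus the ids of their groups
    project_viewers: members with documentViewer access at the project level
    doc_viewers:     maps document name -> members with get permission on it
    """
    principals = set(principals)
    if principals & set(project_viewers):
        # Project-level viewers see every document under the project.
        return sorted(doc_viewers)
    # Otherwise only documents whose ACL grants get permission are returned.
    return sorted(
        name for name, members in doc_viewers.items() if principals & set(members)
    )
```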

Document Link API method                  Required roles
CreateDocumentLink                        Source: roles/contentwarehouse.documentEditor
                                          Target: roles/contentwarehouse.documentViewer
ListLinkedTargets, ListLinkedSources      roles/contentwarehouse.documentViewer
DeleteDocumentLink                        Source: roles/contentwarehouse.documentEditor

With CreateDocumentLink, end users can link document doc1 (the source) to document doc2 (the target) if they have the contentwarehouse.documents.update permission for doc1 and the contentwarehouse.documents.get permission for doc2.

ListLinkedTargets and ListLinkedSources

End users can list only the target or source documents on which they have the contentwarehouse.documents.get permission.

With DeleteDocumentLink, end users can delete the links if they have the contentwarehouse.documents.update permission on the source documents.
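The document link rules can be summarized in a short sketch. The permission strings come from the text above; the function names and the `perms` mapping (document name to the set of permissions one end user holds on it) are assumptions for illustration.

```python
UPDATE = "contentwarehouse.documents.update"
GET = "contentwarehouse.documents.get"


def can_create_link(perms, source, target):
    """Linking needs update on the source and get on the target."""
    return UPDATE in perms.get(source, set()) and GET in perms.get(target, set())


def can_delete_link(perms, source):
    """Deleting a link needs update on the source document."""
    return UPDATE in perms.get(source, set())


def listable_documents(perms, linked_docs):
    """Only linked documents the end user can get are listed."""
    return [doc for doc in linked_docs if GET in perms.get(doc, set())]
```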