Method: dataset.importDocuments

Full name: projects.locations.processors.dataset.importDocuments

Import documents into a dataset.

HTTP request

POST https://{endpoint}/v1beta3/{dataset}:importDocuments

Where {endpoint} is one of the supported service endpoints.

Path parameters

Parameters
dataset

string

Required. The dataset resource name. Format: projects/{project}/locations/{location}/processors/{processor}/dataset
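As a minimal sketch, the resource name and the request URL can be assembled from their parts as shown below. The helper names and the project, location, processor, and endpoint values are placeholders chosen for illustration; they are not defined by this reference.

```python
def dataset_name(project: str, location: str, processor: str) -> str:
    """Return the dataset resource name in the documented format."""
    return f"projects/{project}/locations/{location}/processors/{processor}/dataset"


def import_documents_url(endpoint: str, dataset: str) -> str:
    """Return the URL for POST https://{endpoint}/v1beta3/{dataset}:importDocuments."""
    return f"https://{endpoint}/v1beta3/{dataset}:importDocuments"


# Placeholder values (assumptions): substitute your own project, location,
# processor ID, and one of the supported service endpoints.
name = dataset_name("my-project", "us", "my-processor")
url = import_documents_url("us-documentai.googleapis.com", name)
print(url)
```

Keeping the two helpers separate lets the same resource name be reused for other methods on the dataset.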

Request body

The request body contains data with the following structure:

JSON representation
{
  "batchDocumentsImportConfigs": [
    {
      object (BatchDocumentsImportConfig)
    }
  ]
}
Fields
batchDocumentsImportConfigs[]

object (BatchDocumentsImportConfig)

Required. The import configurations, each specifying the Cloud Storage URIs of the raw documents to import.
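The following sketch serializes a request body with one import configuration. The helper name is hypothetical, and the inner shape of batchInputConfig (a gcsPrefix with a gcsUriPrefix), the bucket path, and the enum value DATASET_SPLIT_TRAIN are assumptions, since BatchDocumentsInputConfig and DatasetSplitType are not defined in this section. Rejecting an empty list is also an assumption based on the field being marked Required.

```python
import json
from typing import Any, Dict, List


def build_import_body(configs: List[Dict[str, Any]]) -> str:
    """Serialize the importDocuments request body as JSON."""
    if not configs:
        # Assumption: a Required repeated field needs at least one entry.
        raise ValueError("batchDocumentsImportConfigs must not be empty")
    return json.dumps({"batchDocumentsImportConfigs": configs}, indent=2)


# Assumed shapes and values; check the BatchDocumentsInputConfig and
# DatasetSplitType references before relying on them.
config = {
    "batchInputConfig": {"gcsPrefix": {"gcsUriPrefix": "gs://my-bucket/raw-docs/"}},
    "datasetSplit": "DATASET_SPLIT_TRAIN",
}
body = build_import_body([config])
print(body)
```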

Response body

If successful, the response body contains an instance of Operation.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.
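A token carrying this scope is normally sent as an OAuth 2.0 bearer token in the Authorization header. The sketch below only constructs the request and does not send it. The function name, the token string, and the URL are placeholders; obtaining a real access token is outside the scope of this example.

```python
import json
import urllib.request
from typing import Any, Dict


def build_authorized_request(
    url: str, body: Dict[str, Any], access_token: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request with a bearer token."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder URL and token (assumptions), for illustration only.
req = build_authorized_request(
    "https://us-documentai.googleapis.com/v1beta3/"
    "projects/my-project/locations/us/processors/my-processor/dataset:importDocuments",
    {"batchDocumentsImportConfigs": []},
    "ACCESS_TOKEN",
)
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would send it; on success the response body would be the Operation described above.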

IAM Permissions

Requires the following IAM permission on the dataset resource:

  • documentai.datasets.createDocuments

For more information, see the IAM documentation.

BatchDocumentsImportConfig

Config for importing documents. Each batch can have its own dataset split type.

JSON representation
{
  "batchInputConfig": {
    object (BatchDocumentsInputConfig)
  },

  // Union field split_type_config can be only one of the following:
  "datasetSplit": enum (DatasetSplitType),
  "autoSplitConfig": {
    object (AutoSplitConfig)
  }
  // End of list of possible types for union field split_type_config.
}
Fields
batchInputConfig

object (BatchDocumentsInputConfig)

The common config to specify a set of documents used as input.

Union field split_type_config.

split_type_config can be only one of the following:

datasetSplit

enum (DatasetSplitType)

Target dataset split where the documents must be stored.

autoSplitConfig

object (AutoSplitConfig)

If set, documents are automatically split into the training and test splits according to the specified ratio.
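Because split_type_config is a union field, a client can guard against setting both members before sending a request. The helper below is a hypothetical sketch, not part of any client library. It allows neither member to be set, since this section does not say whether one is mandatory. The sample batchInputConfig shape and the enum value are assumptions.

```python
from typing import Any, Dict, Optional


def make_import_config(
    batch_input_config: Dict[str, Any],
    dataset_split: Optional[str] = None,
    auto_split_config: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a BatchDocumentsImportConfig dict with at most one union member."""
    if dataset_split is not None and auto_split_config is not None:
        raise ValueError("set only one of datasetSplit or autoSplitConfig")
    config: Dict[str, Any] = {"batchInputConfig": batch_input_config}
    if dataset_split is not None:
        config["datasetSplit"] = dataset_split
    if auto_split_config is not None:
        config["autoSplitConfig"] = auto_split_config
    return config


# Assumed input shape and enum value, for illustration only.
source = {"gcsPrefix": {"gcsUriPrefix": "gs://my-bucket/raw-docs/"}}
explicit = make_import_config(source, dataset_split="DATASET_SPLIT_TEST")
automatic = make_import_config(source, auto_split_config={"trainingSplitRatio": 0.8})
```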

AutoSplitConfig

The config for auto-split.

JSON representation
{
  "trainingSplitRatio": number
}
Fields
trainingSplitRatio

number

Ratio of documents assigned to the training dataset split.
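To illustrate how a ratio translates into split sizes, the sketch below estimates the counts for a given number of documents. It assumes the ratio is a fraction between 0 and 1 and that the remainder goes to the test split. The rounding rule and the function name are illustrative only; the service may assign documents differently.

```python
from typing import Tuple


def expected_split_sizes(num_documents: int, training_split_ratio: float) -> Tuple[int, int]:
    """Estimate (training, test) counts for a given ratio.

    Assumptions: the ratio lies in [0, 1] and the training count is rounded
    to the nearest integer. Neither is stated by this reference.
    """
    if not 0.0 <= training_split_ratio <= 1.0:
        raise ValueError("training_split_ratio is assumed to lie in [0, 1]")
    training = round(num_documents * training_split_ratio)
    return training, num_documents - training


auto_split_config = {"trainingSplitRatio": 0.8}
training, test = expected_split_sizes(10, auto_split_config["trainingSplitRatio"])
print(training, test)
```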