Class DocumentUnderstandingServiceClient (0.3.0)

DocumentUnderstandingServiceClient(*, credentials: Optional[google.auth.credentials.Credentials] = None, transport: Optional[Union[str, google.cloud.documentai_v1beta2.services.document_understanding_service.transports.base.DocumentUnderstandingServiceTransport]] = None, client_options: Optional[google.api_core.client_options.ClientOptions] = None, client_info: google.api_core.gapic_v1.client_info.ClientInfo = <google.api_core.gapic_v1.client_info.ClientInfo object>)

Service to parse structured information from unstructured or semi-structured documents using state-of-the-art Google AI such as natural language, computer vision, and translation.

Methods

DocumentUnderstandingServiceClient

DocumentUnderstandingServiceClient(*, credentials: Optional[google.auth.credentials.Credentials] = None, transport: Optional[Union[str, google.cloud.documentai_v1beta2.services.document_understanding_service.transports.base.DocumentUnderstandingServiceTransport]] = None, client_options: Optional[google.api_core.client_options.ClientOptions] = None, client_info: google.api_core.gapic_v1.client_info.ClientInfo = <google.api_core.gapic_v1.client_info.ClientInfo object>)

Instantiate the document understanding service client.

Parameters
NameDescription
credentials Optional[google.auth.credentials.Credentials]

The authorization credentials to attach to requests. These credentials identify the application to the service; if none are specified, the client will attempt to ascertain the credentials from the environment.

transport Union[str, .DocumentUnderstandingServiceTransport]

The transport to use. If set to None, a transport is chosen automatically.

client_options client_options_lib.ClientOptions

Custom options for the client. It won't take effect if a transport instance is provided. (1) The api_endpoint property can be used to override the default endpoint provided by the client. GOOGLE_API_USE_MTLS_ENDPOINT environment variable can also be used to override the endpoint: "always" (always use the default mTLS endpoint), "never" (always use the default regular endpoint) and "auto" (auto switch to the default mTLS endpoint if client certificate is present, this is the default value). However, the api_endpoint property takes precedence if provided. (2) If GOOGLE_API_USE_CLIENT_CERTIFICATE environment variable is "true", then the client_cert_source property can be used to provide client certificate for mutual TLS transport. If not provided, the default SSL client certificate will be used if present. If GOOGLE_API_USE_CLIENT_CERTIFICATE is "false" or not set, no client certificate will be used.

client_info google.api_core.gapic_v1.client_info.ClientInfo

The client info used to send a user-agent string along with API requests. If None, then default info will be used. Generally, you only need to set this if you're developing your own client library.

Exceptions
TypeDescription
google.auth.exceptions.MutualTLSChannelErrorIf mutual TLS transport creation failed for any reason.

batch_process_documents

batch_process_documents(request: Optional[google.cloud.documentai_v1beta2.types.document_understanding.BatchProcessDocumentsRequest] = None, *, requests: Optional[Sequence[google.cloud.documentai_v1beta2.types.document_understanding.ProcessDocumentRequest]] = None, retry: google.api_core.retry.Retry = <_MethodDefault._DEFAULT_VALUE: <object object>>, timeout: Optional[float] = None, metadata: Sequence[Tuple[str, str]] = ())

LRO endpoint to batch process many documents. The output is written to Cloud Storage as JSON in the [Document] format.

Parameters
NameDescription
request .document_understanding.BatchProcessDocumentsRequest

The request object. Request to batch process documents as an asynchronous operation. The output is written to Cloud Storage as JSON in the [Document] format.

requests :class:Sequence[.document_understanding.ProcessDocumentRequest]

Required. Individual requests for each document. This corresponds to the requests field on the request instance; if request is provided, this should not be set.

retry google.api_core.retry.Retry

Designation of what errors, if any, should be retried.

timeout float

The timeout for this request.

metadata Sequence[Tuple[str, str]]

Strings which should be sent along with the request as metadata.

Returns
TypeDescription
.operation.OperationAn object representing a long-running operation. The result type for the operation will be
.document_understanding.BatchProcessDocumentsResponse
: Response to an batch document processing request. This is returned in the LRO Operation after the operation is complete.

from_service_account_file

from_service_account_file(filename: str, *args, **kwargs)

Creates an instance of this client using the provided credentials file.

Parameter
NameDescription
filename str

The path to the service account private key json file.

Returns
TypeDescription
{@api.name}The constructed client.

from_service_account_json

from_service_account_json(filename: str, *args, **kwargs)

Creates an instance of this client using the provided credentials file.

Parameter
NameDescription
filename str

The path to the service account private key json file.

Returns
TypeDescription
{@api.name}The constructed client.

process_document

process_document(request: Optional[google.cloud.documentai_v1beta2.types.document_understanding.ProcessDocumentRequest] = None, *, retry: google.api_core.retry.Retry = <_MethodDefault._DEFAULT_VALUE: <object object>>, timeout: Optional[float] = None, metadata: Sequence[Tuple[str, str]] = ())

Processes a single document.

Parameters
NameDescription
request .document_understanding.ProcessDocumentRequest

The request object. Request to process one document.

retry google.api_core.retry.Retry

Designation of what errors, if any, should be retried.

timeout float

The timeout for this request.

metadata Sequence[Tuple[str, str]]

Strings which should be sent along with the request as metadata.

Returns
TypeDescription
.document.DocumentDocument represents the canonical document resource in Document Understanding AI. It is an interchange format that provides insights into documents and allows for collaboration between users and Document Understanding AI to iterate and optimize for quality.