Resource: DataStore
DataStore captures global settings and configs at the DataStore level.
JSON representation |
---|
{ "name": string, "displayName": string, "industryVertical": enum ( |
Fields | |
---|---|
name | Immutable. The full resource name of the data store. Format: This field must be a UTF-8 encoded string with a length limit of 1024 characters. |
displayName | Required. The data store display name. This field must be a UTF-8 encoded string with a length limit of 128 characters. Otherwise, an INVALID_ARGUMENT error is returned. |
industryVertical | Immutable. The industry vertical that the data store registers. |
solutionTypes[] | The solutions that the data store enrolls. Available solutions for each |
defaultSchemaId | Output only. The ID of the default |
contentConfig | Immutable. The content config of the data store. If this field is unset, the server behavior defaults to |
createTime | Output only. Timestamp the DataStore was created at. A timestamp in RFC 3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: |
languageInfo | Language info for DataStore. |
documentProcessingConfig | Configuration for Document understanding and enrichment. |
startingSchema | The start schema to use for this DataStore. This field is only used by the dataStores.create API and is ignored if used in other APIs. It is omitted from all API responses, including those of the dataStores.create API. To retrieve a schema of a The provided schema will be validated against certain rules on schema. Learn more from this doc. |
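
The length limits stated for name and displayName can be checked on the client before a request is sent. The Python sketch below is illustrative only: the helper name `validate_data_store_body` is hypothetical, and it assumes the limits count Unicode characters (code points), which this reference does not spell out. The server remains the authority on what is valid.

```python
from typing import List

MAX_NAME_CHARS = 1024          # limit stated for `name`
MAX_DISPLAY_NAME_CHARS = 128   # limit stated for `displayName`


def validate_data_store_body(body: dict) -> List[str]:
    """Return a list of problems found in a DataStore body (empty if none).

    Hypothetical client-side helper; it only mirrors the two length
    limits and the "Required" marker documented in the Fields table.
    """
    problems = []

    display_name = body.get("displayName")
    if not isinstance(display_name, str) or not display_name:
        problems.append("displayName is required")
    elif len(display_name) > MAX_DISPLAY_NAME_CHARS:
        problems.append(
            f"displayName has {len(display_name)} characters; "
            f"limit is {MAX_DISPLAY_NAME_CHARS}"
        )

    # `name` is optional here; only its length is checked when present.
    name = body.get("name")
    if isinstance(name, str) and len(name) > MAX_NAME_CHARS:
        problems.append(
            f"name has {len(name)} characters; limit is {MAX_NAME_CHARS}"
        )

    return problems


# Example call with a body that satisfies both limits.
print(validate_data_store_body({"displayName": "Product manuals"}))
```

Returning a list of problems rather than raising on the first one lets a caller report every violation at once.
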
ContentConfig
Content config of the data store.
Enums | |
---|---|
CONTENT_CONFIG_UNSPECIFIED | Default value. |
NO_CONTENT | Only contains documents without any Document.content. |
CONTENT_REQUIRED | Only contains documents with Document.content. |
PUBLIC_WEBSITE | The data store is used for public website search. |
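
The NO_CONTENT and CONTENT_REQUIRED values can be read as a rule about which documents a data store holds. The following Python sketch mirrors the enum and applies that rule. The `ContentConfig` class and the `accepts_document` helper are hypothetical names, and a document is assumed to be a dict whose `content` key stands in for Document.content. The table above does not describe the rule for CONTENT_CONFIG_UNSPECIFIED or PUBLIC_WEBSITE, so the sketch returns `None` for those values rather than guess.

```python
from enum import Enum
from typing import Optional


class ContentConfig(Enum):
    CONTENT_CONFIG_UNSPECIFIED = "CONTENT_CONFIG_UNSPECIFIED"
    NO_CONTENT = "NO_CONTENT"
    CONTENT_REQUIRED = "CONTENT_REQUIRED"
    PUBLIC_WEBSITE = "PUBLIC_WEBSITE"


def accepts_document(config: ContentConfig, document: dict) -> Optional[bool]:
    """Apply the documented content rule to one document.

    Returns True/False for NO_CONTENT and CONTENT_REQUIRED, and None
    for values whose rule is not described in this reference.
    """
    has_content = document.get("content") is not None
    if config is ContentConfig.NO_CONTENT:
        return not has_content
    if config is ContentConfig.CONTENT_REQUIRED:
        return has_content
    return None


# Example: a metadata-only document checked against NO_CONTENT.
print(accepts_document(ContentConfig.NO_CONTENT, {"id": "doc-1"}))
```
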
LanguageInfo
Language info for DataStore.
JSON representation |
---|
{ "languageCode": string, "normalizedLanguageCode": string, "language": string, "region": string } |
Fields | |
---|---|
languageCode | The language code for the DataStore. |
normalizedLanguageCode | Output only. This is the normalized form of languageCode. E.g.: languageCode of |
language | Output only. Language part of normalizedLanguageCode. E.g.: |
region | Output only. Region part of normalizedLanguageCode, if present. E.g.: |
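
The language and region fields are described as parts of normalizedLanguageCode. The Python sketch below shows one way such a split could work. It assumes the normalized code has the form `language` or `language-REGION` with a hyphen separator (for example `en-GB`); that format is an assumption, not something this reference states, and the server's own normalization of languageCode is not reproduced. The helper name `split_normalized_code` is hypothetical.

```python
from typing import Optional, Tuple


def split_normalized_code(normalized: str) -> Tuple[str, Optional[str]]:
    """Split an assumed `language[-REGION]` code into its two parts.

    The region is None when the code carries no region part.
    """
    language, _, region = normalized.partition("-")
    return language, (region or None)


# Example calls: one code with a region part, one without.
print(split_normalized_code("en-GB"))
print(split_normalized_code("fr"))
```
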
DocumentProcessingConfig
A singleton resource of DataStore. It is empty when the DataStore is created, in which case parsing defaults to the digital parser. The first call to the DataStoreService.UpdateDocumentProcessingConfig method initializes the config.
JSON representation |
---|
{ "name": string, "chunkingConfig": { object ( |
Fields | |
---|---|
name | The full resource name of the Document Processing Config. Format: |
chunkingConfig | Whether chunking mode is enabled. |
defaultParsingConfig | Configurations for the default Document parser. If not specified, it is configured as the default DigitalParsingConfig, and the default parsing config is applied to all file types for Document parsing. |
parsingConfigOverrides | Map from file type to the parsing configuration that overrides the default for that file type. Supported keys: |
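
The relationship between defaultParsingConfig and parsingConfigOverrides can be read as a lookup with two fallbacks. The Python sketch below illustrates that reading; it is not the service's implementation. The helper name `resolve_parsing_config` is hypothetical, and the `"pdf"` key in the example is an assumed file-type key, since the list of supported keys is not reproduced here.

```python
def resolve_parsing_config(processing_config: dict, file_type: str) -> dict:
    """Pick the ParsingConfig that would apply to one file type.

    Order: per-file-type override, then defaultParsingConfig, then the
    digital parser, which the reference names as the default.
    """
    overrides = processing_config.get("parsingConfigOverrides", {})
    if file_type in overrides:
        return overrides[file_type]

    default = processing_config.get("defaultParsingConfig")
    if default is not None:
        return default

    # DigitalParsingConfig has no fields, so it is an empty object.
    return {"digitalParsingConfig": {}}


config = {
    "defaultParsingConfig": {"digitalParsingConfig": {}},
    # "pdf" is an assumed key, used only for illustration.
    "parsingConfigOverrides": {"pdf": {"ocrParsingConfig": {"useNativeText": True}}},
}
print(resolve_parsing_config(config, "pdf"))
print(resolve_parsing_config(config, "html"))
```
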
ChunkingConfig
Configuration for chunking.
JSON representation |
---|
{ // Union field |
Fields | |
---|---|
Union field chunk_mode. Additional configs that define the behavior of the chunking. chunk_mode can be only one of the following: | |
layoutBasedChunkingConfig | Configuration for the layout based chunking. |
LayoutBasedChunkingConfig
Configuration for the layout based chunking.
JSON representation |
---|
{ "chunkSize": integer, "includeAncestorHeadings": boolean } |
Fields | |
---|---|
chunkSize | The token size limit for each chunk. Supported values: 100-500 (inclusive). Default value: 500. |
includeAncestorHeadings | Whether to append headings from different levels to chunks taken from the middle of the document, to prevent context loss. Default value: False. |
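
The supported range and default values above translate directly into a small builder. The Python sketch below is a minimal illustration under that reading; the function name `layout_based_chunking_config` is hypothetical, and the returned dict follows the ChunkingConfig JSON shape with layoutBasedChunkingConfig as the selected chunk_mode.

```python
def layout_based_chunking_config(
    chunk_size: int = 500,
    include_ancestor_headings: bool = False,
) -> dict:
    """Build a ChunkingConfig body that selects layout based chunking.

    Defaults mirror the documented ones (500 tokens, no ancestor
    headings); values outside 100-500 inclusive are rejected locally.
    """
    if not 100 <= chunk_size <= 500:
        raise ValueError(
            f"chunkSize must be between 100 and 500 inclusive, got {chunk_size}"
        )
    return {
        "layoutBasedChunkingConfig": {
            "chunkSize": chunk_size,
            "includeAncestorHeadings": include_ancestor_headings,
        }
    }


# Example: smaller chunks that carry their ancestor headings.
print(layout_based_chunking_config(chunk_size=200, include_ancestor_headings=True))
```
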
ParsingConfig
Related configurations applied to a specific type of document parser.
JSON representation |
---|
{ // Union field |
Fields | |
---|---|
Union field type_dedicated_config. Configs for document processing types. type_dedicated_config can be only one of the following: | |
digitalParsingConfig | Configurations applied to digital parser. |
ocrParsingConfig | Configurations applied to OCR parser. Currently it only applies to PDFs. |
layoutParsingConfig | Configurations applied to layout parser. |
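
Because type_dedicated_config is a union field, a ParsingConfig body should carry at most one of the three parser fields. The Python sketch below checks that constraint on the client side; the helper name `selected_parser` is hypothetical and the check is only an illustration of the "only one of the following" rule.

```python
from typing import Optional

PARSER_FIELDS = ("digitalParsingConfig", "ocrParsingConfig", "layoutParsingConfig")


def selected_parser(parsing_config: dict) -> Optional[str]:
    """Return the single parser field set in a ParsingConfig body.

    Returns None when no parser field is set and raises ValueError
    when more than one member of the union is present.
    """
    present = [field for field in PARSER_FIELDS if field in parsing_config]
    if len(present) > 1:
        raise ValueError(
            "type_dedicated_config accepts only one of: " + ", ".join(present)
        )
    return present[0] if present else None


# Example: a body that selects the OCR parser.
print(selected_parser({"ocrParsingConfig": {"useNativeText": True}}))
```
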
DigitalParsingConfig
The digital parsing configurations for documents.
This type has no fields.
OcrParsingConfig
The OCR parsing configurations for documents.
JSON representation |
---|
{ "enhancedDocumentElements": [ string ], "useNativeText": boolean } |
Fields | |
---|---|
enhancedDocumentElements[] | [DEPRECATED] This field is deprecated. To use the additional enhanced document elements processing, please switch to |
useNativeText | If true, native text is used instead of OCR text on pages that contain native text. |
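
The effect of useNativeText can be pictured as a per-page choice between two text sources. The Python sketch below illustrates that reading of the description; it is not the parser's actual implementation. The helper name `page_text` and its string arguments are hypothetical, and treating an unset useNativeText as false is an assumption.

```python
def page_text(ocr_config: dict, native_text: str, ocr_text: str) -> str:
    """Choose the text for one page under an OcrParsingConfig.

    Native text wins only when useNativeText is true and the page
    actually contains native text; otherwise the OCR text is used.
    """
    # Assumption: an unset useNativeText behaves like false.
    if ocr_config.get("useNativeText", False) and native_text:
        return native_text
    return ocr_text


# Example: a page with embedded text, parsed with useNativeText enabled.
print(page_text({"useNativeText": True}, "embedded text", "recognized text"))
```
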
LayoutParsingConfig
The layout parsing configurations for documents.
This type has no fields.
Methods |
|
---|---|
|
Completes the specified user input with keyword suggestions. |
|
Creates a DataStore. |
|
Deletes a DataStore. |
|
Gets a DataStore. |
|
Gets the SiteSearchEngine. |
|
Lists all the DataStores associated with the project. |
|
Updates a DataStore. |
|
Trains a custom model. |