Google Cloud Discovery Engine V1 Client - Class LayoutParsingConfig (1.6.0)

Reference documentation and code samples for the Google Cloud Discovery Engine V1 Client class LayoutParsingConfig.

The layout parsing configurations for documents.

Generated from protobuf message google.cloud.discoveryengine.v1.DocumentProcessingConfig.ParsingConfig.LayoutParsingConfig

Namespace

Google \ Cloud \ DiscoveryEngine \ V1 \ DocumentProcessingConfig \ ParsingConfig

Methods

__construct

Constructor.

Parameters
Name Description
data array

Optional. Data for populating the Message object.

↳ enable_table_annotation bool

Optional. If true, LLM-based annotation is added to the table during parsing.

↳ enable_image_annotation bool

Optional. If true, LLM-based annotation is added to the image during parsing.

↳ structured_content_types array

Optional. Contains the required structure types to extract from the document. Supported values: shareholder-structure

↳ exclude_html_elements array

Optional. List of HTML elements to exclude from the parsed content.

↳ exclude_html_classes array

Optional. List of HTML classes to exclude from the parsed content.

↳ exclude_html_ids array

Optional. List of HTML ids to exclude from the parsed content.
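As a minimal sketch of the constructor described above, the message can be populated from an associative array keyed by the field names listed here. The sketch assumes the google/cloud-discoveryengine package is installed through Composer; the element, class, and id values are hypothetical placeholders, and the exact string format the service expects for them is not stated in this reference.

```php
<?php
// Assumes Composer's autoloader is available in the working directory.
require 'vendor/autoload.php';

use Google\Cloud\DiscoveryEngine\V1\DocumentProcessingConfig\ParsingConfig\LayoutParsingConfig;

// Keys match the field names documented for the $data array.
// All values below are illustrative assumptions, not recommended settings.
$config = new LayoutParsingConfig([
    'enable_table_annotation' => true,
    'enable_image_annotation' => false,
    'exclude_html_elements'   => ['nav', 'footer'],   // hypothetical element names
    'exclude_html_classes'    => ['ad-banner'],       // hypothetical class name
    'exclude_html_ids'        => ['cookie-notice'],   // hypothetical id
]);

// Scalar fields come back as plain PHP values.
var_dump($config->getEnableTableAnnotation());
```

Fields that are omitted from the array keep their protobuf defaults (false for the booleans, empty for the repeated fields).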

getEnableTableAnnotation

Optional. If true, LLM-based annotation is added to the table during parsing.

Returns
Type Description
bool

setEnableTableAnnotation

Optional. If true, LLM-based annotation is added to the table during parsing.

Parameter
Name Description
var bool
Returns
Type Description
$this

getEnableImageAnnotation

Optional. If true, LLM-based annotation is added to the image during parsing.

Returns
Type Description
bool

setEnableImageAnnotation

Optional. If true, LLM-based annotation is added to the image during parsing.

Parameter
Name Description
var bool
Returns
Type Description
$this

getStructuredContentTypes

Optional. Contains the required structure types to extract from the document. Supported values:

  • shareholder-structure

Returns
Type Description
Google\Protobuf\Internal\RepeatedField

setStructuredContentTypes

Optional. Contains the required structure types to extract from the document. Supported values:

  • shareholder-structure

Parameter
Name Description
var string[]
Returns
Type Description
$this

getExcludeHtmlElements

Optional. List of HTML elements to exclude from the parsed content.

Returns
Type Description
Google\Protobuf\Internal\RepeatedField

setExcludeHtmlElements

Optional. List of HTML elements to exclude from the parsed content.

Parameter
Name Description
var string[]
Returns
Type Description
$this

getExcludeHtmlClasses

Optional. List of HTML classes to exclude from the parsed content.

Returns
Type Description
Google\Protobuf\Internal\RepeatedField

setExcludeHtmlClasses

Optional. List of HTML classes to exclude from the parsed content.

Parameter
Name Description
var string[]
Returns
Type Description
$this

getExcludeHtmlIds

Optional. List of HTML ids to exclude from the parsed content.

Returns
Type Description
Google\Protobuf\Internal\RepeatedField

setExcludeHtmlIds

Optional. List of HTML ids to exclude from the parsed content.

Parameter
Name Description
var string[]
Returns
Type Description
$this
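
Because each setter returns $this, calls can be chained, and the repeated-field getters return a RepeatedField that can be counted and iterated like an array. The following is a small sketch under the same assumption that the google/cloud-discoveryengine package is installed through Composer; the "sidebar" id is a hypothetical value.

```php
<?php
// Assumes Composer's autoloader is available in the working directory.
require 'vendor/autoload.php';

use Google\Cloud\DiscoveryEngine\V1\DocumentProcessingConfig\ParsingConfig\LayoutParsingConfig;

// Build the config with chained setters instead of a constructor array.
$config = (new LayoutParsingConfig())
    ->setEnableImageAnnotation(true)
    ->setStructuredContentTypes(['shareholder-structure'])
    ->setExcludeHtmlIds(['sidebar']); // hypothetical id value

// RepeatedField supports foreach and count().
foreach ($config->getStructuredContentTypes() as $type) {
    echo $type, PHP_EOL;
}

// Convert to a plain PHP array when one is needed.
$ids = iterator_to_array($config->getExcludeHtmlIds());
echo count($ids), PHP_EOL;
```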