Google Cloud Document AI V1 Client - Class OcrConfig (2.0.0-RC1)

Reference documentation and code samples for the Google Cloud Document AI V1 Client class OcrConfig.

Config for Document OCR.

Generated from protobuf message google.cloud.documentai.v1.OcrConfig

Namespace

Google \ Cloud \ DocumentAI \ V1

Methods

__construct

Constructor.

Parameters
Name Description
data array

Optional. Data for populating the Message object.

↳ hints Google\Cloud\DocumentAI\V1\OcrConfig\Hints

Hints for the OCR model.

↳ enable_native_pdf_parsing bool

Enables special handling for PDFs with existing text information. Results in better text extraction quality for such PDF inputs.

↳ enable_image_quality_scores bool

Enables intelligent document quality scores after OCR. Can help with diagnosing why OCR responses are of poor quality for a given input. Adds latency comparable to regular OCR to the process call.

↳ advanced_ocr_options array

A list of advanced OCR options to further fine-tune OCR behavior. Current valid values are:

  • legacy_layout: a heuristic layout detection algorithm, which serves as an alternative to the current ML-based layout detection algorithm.

Customers can choose the most suitable layout algorithm for their situation.

↳ enable_symbol bool

Includes symbol-level OCR information if set to true.

↳ compute_style_info bool

Turns on the font identification model and returns font style information. Deprecated; use PremiumFeatures.compute_style_info instead.

↳ disable_character_boxes_detection bool

Turns off the character box detector in the OCR engine. Character box detection is enabled by default in OCR 2.0 (and later) processors.

↳ premium_features Google\Cloud\DocumentAI\V1\OcrConfig\PremiumFeatures

Configurations for premium OCR features.
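As a minimal sketch of how the constructor's data array might be used, the following builds an OcrConfig from a few of the keys listed above. It assumes the google/cloud-document-ai package is installed through Composer and that the autoloader lives at vendor/autoload.php; both are assumptions about the project layout, not something this page specifies.

```php
<?php
// Assumption: dependencies were installed with Composer and the
// autoloader is located next to this script.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\DocumentAI\V1\OcrConfig;

// Populate the message through the optional data array.
// The keys are the snake_case field names listed in the parameter table.
$config = new OcrConfig([
    'enable_native_pdf_parsing'   => true,
    'enable_image_quality_scores' => false,
    'enable_symbol'               => true,
]);

// Each field can then be read back through its getter.
var_dump($config->getEnableNativePdfParsing());
var_dump($config->getEnableImageQualityScores());
var_dump($config->getEnableSymbol());
```

Fields left out of the array keep their protobuf defaults, so only the options that differ from the default need to be listed.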

getHints

Hints for the OCR model.

Returns
Type Description
Google\Cloud\DocumentAI\V1\OcrConfig\Hints|null

hasHints

clearHints

setHints

Hints for the OCR model.

Parameter
Name Description
var Google\Cloud\DocumentAI\V1\OcrConfig\Hints
Returns
Type Description
$this
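The hints accessors could be exercised roughly as follows. This is a sketch under assumptions: the Hints class is documented on its own page, and the language_hints key used here is taken from the OcrConfig.Hints protobuf message rather than from this page. The language codes are illustrative values only.

```php
<?php
// Assumption: Composer autoloader is available at this path.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\DocumentAI\V1\OcrConfig;
use Google\Cloud\DocumentAI\V1\OcrConfig\Hints;

$config = new OcrConfig();

// No hints have been set yet, so hasHints() reports false
// and getHints() returns null.
$before = $config->hasHints();

// Assumption: Hints accepts a 'language_hints' list of language codes.
$hints = new Hints(['language_hints' => ['en', 'de']]);

// setHints() returns $this, so calls can be chained.
$config->setHints($hints)->setEnableSymbol(true);
$after = $config->hasHints();

// clearHints() removes the sub-message again.
$config->clearHints();
$cleared = $config->hasHints();

var_dump($before, $after, $cleared);
```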

getEnableNativePdfParsing

Enables special handling for PDFs with existing text information. Results in better text extraction quality for such PDF inputs.

Returns
Type Description
bool

setEnableNativePdfParsing

Enables special handling for PDFs with existing text information. Results in better text extraction quality for such PDF inputs.

Parameter
Name Description
var bool
Returns
Type Description
$this

getEnableImageQualityScores

Enables intelligent document quality scores after OCR. Can help with diagnosing why OCR responses are of poor quality for a given input.

Adds latency comparable to regular OCR to the process call.

Returns
Type Description
bool

setEnableImageQualityScores

Enables intelligent document quality scores after OCR. Can help with diagnosing why OCR responses are of poor quality for a given input.

Adds latency comparable to regular OCR to the process call.

Parameter
Name Description
var bool
Returns
Type Description
$this

getAdvancedOcrOptions

A list of advanced OCR options to further fine-tune OCR behavior. Current valid values are:

  • legacy_layout: a heuristic layout detection algorithm, which serves as an alternative to the current ML-based layout detection algorithm.

Customers can choose the most suitable layout algorithm for their situation.

Returns
Type Description
Google\Protobuf\Internal\RepeatedField

setAdvancedOcrOptions

A list of advanced OCR options to further fine-tune OCR behavior. Current valid values are:

  • legacy_layout: a heuristic layout detection algorithm, which serves as an alternative to the current ML-based layout detection algorithm.

Customers can choose the most suitable layout algorithm for their situation.

Parameter
Name Description
var string[]
Returns
Type Description
$this
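Because the setter takes a plain string array while the getter returns a RepeatedField, a short round trip may help. The sketch below assumes the Composer autoloader path shown and uses legacy_layout, the only value this page lists as valid.

```php
<?php
// Assumption: Composer autoloader is available at this path.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\DocumentAI\V1\OcrConfig;

$config = new OcrConfig();

// The setter accepts an ordinary PHP array of strings.
$config->setAdvancedOcrOptions(['legacy_layout']);

// The getter returns a RepeatedField, which is countable and iterable.
$options = $config->getAdvancedOcrOptions();
$count = count($options);

// Copy into a plain array when array functions are needed.
$asArray = iterator_to_array($options);

foreach ($options as $option) {
    echo $option, PHP_EOL;
}
```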

getEnableSymbol

Includes symbol-level OCR information if set to true.

Returns
Type Description
bool

setEnableSymbol

Includes symbol-level OCR information if set to true.

Parameter
Name Description
var bool
Returns
Type Description
$this

getComputeStyleInfo

Turns on the font identification model and returns font style information.

Deprecated; use PremiumFeatures.compute_style_info instead.

Returns
Type Description
bool

setComputeStyleInfo

Turns on the font identification model and returns font style information.

Deprecated; use PremiumFeatures.compute_style_info instead.

Parameter
Name Description
var bool
Returns
Type Description
$this

getDisableCharacterBoxesDetection

Turns off the character box detector in the OCR engine. Character box detection is enabled by default in OCR 2.0 (and later) processors.

Returns
Type Description
bool

setDisableCharacterBoxesDetection

Turns off the character box detector in the OCR engine. Character box detection is enabled by default in OCR 2.0 (and later) processors.

Parameter
Name Description
var bool
Returns
Type Description
$this

getPremiumFeatures

Configurations for premium OCR features.

Returns
Type Description
Google\Cloud\DocumentAI\V1\OcrConfig\PremiumFeatures|null

hasPremiumFeatures

clearPremiumFeatures

setPremiumFeatures

Configurations for premium OCR features.

Parameter
Name Description
var Google\Cloud\DocumentAI\V1\OcrConfig\PremiumFeatures
Returns
Type Description
$this
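Since compute_style_info on OcrConfig is deprecated in favor of PremiumFeatures.compute_style_info, a migration might look like the sketch below. It assumes the Composer autoloader path shown; the PremiumFeatures class is documented separately, and only the compute_style_info field named on this page is used.

```php
<?php
// Assumption: Composer autoloader is available at this path.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\DocumentAI\V1\OcrConfig;
use Google\Cloud\DocumentAI\V1\OcrConfig\PremiumFeatures;

// Request font style information through the premium features message
// instead of the deprecated OcrConfig::setComputeStyleInfo().
$premium = new PremiumFeatures(['compute_style_info' => true]);

$config = (new OcrConfig())->setPremiumFeatures($premium);

// hasPremiumFeatures() distinguishes "unset" from "set to defaults".
$hasPremium = $config->hasPremiumFeatures();
$styleInfo  = $config->getPremiumFeatures()->getComputeStyleInfo();

var_dump($hasPremium, $styleInfo);
```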