Google Cloud Document AI V1 Client - Class OcrConfig (1.9.0)

Reference documentation and code samples for the Google Cloud Document AI V1 Client class OcrConfig.

Config for Document OCR.

Generated from protobuf message google.cloud.documentai.v1.OcrConfig

Namespace

Google \ Cloud \ DocumentAI \ V1

Methods

__construct

Constructor.

Parameters
Name    Description
data    array

Optional. Data for populating the Message object. The supported keys are listed below.

↳ hints Google\Cloud\DocumentAI\V1\OcrConfig\Hints

Hints for the OCR model.

↳ enable_native_pdf_parsing bool

Enables special handling for PDFs with existing text information, which results in better text extraction quality for such inputs.

↳ enable_image_quality_scores bool

Enables intelligent document quality scores after OCR. These scores can help diagnose why OCR responses are of poor quality for a given input. Enabling them adds latency to the process call comparable to that of regular OCR.

↳ advanced_ocr_options array

A list of advanced OCR options to further fine-tune OCR behavior. Current valid values are:

  • legacy_layout: a heuristics-based layout detection algorithm, which serves as an alternative to the current ML-based layout detection algorithm.

Customers can choose the layout algorithm that best suits their situation.

↳ enable_symbol bool

Includes symbol-level OCR information if set to true.

↳ compute_style_info bool

Turns on the font identification model and returns font style information. Deprecated: use PremiumFeatures.compute_style_info instead.

↳ disable_character_boxes_detection bool

Turns off the character box detector in the OCR engine. Character box detection is enabled by default in OCR 2.0 (and later) processors.

↳ premium_features Google\Cloud\DocumentAI\V1\OcrConfig\PremiumFeatures

Configurations for premium OCR features.
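As an illustration of the constructor's data array, the following minimal sketch populates a few of the fields listed above by their snake_case keys. It assumes the google/cloud-document-ai package is installed with Composer and that the autoloader lives at vendor/autoload.php; the variable name $ocrConfig and the chosen values are only examples.

```php
<?php
// Assumption: dependencies were installed with Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\DocumentAI\V1\OcrConfig;

// Keys of the data array are the snake_case field names documented above.
$ocrConfig = new OcrConfig([
    'enable_native_pdf_parsing' => true,
    'enable_symbol'             => true,
    'advanced_ocr_options'      => ['legacy_layout'],
]);

// Fields passed to the constructor are readable through the matching getters.
var_dump($ocrConfig->getEnableNativePdfParsing());
var_dump($ocrConfig->getEnableSymbol());

// Fields that were not passed keep their protobuf default value.
var_dump($ocrConfig->getEnableImageQualityScores());
```

Passing an array is equivalent to constructing an empty message and calling the individual setters described below.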

getHints

Hints for the OCR model.

Returns
Type    Description
Google\Cloud\DocumentAI\V1\OcrConfig\Hints|null

hasHints

clearHints

setHints

Hints for the OCR model.

Parameter
Name    Description
var     Google\Cloud\DocumentAI\V1\OcrConfig\Hints

Returns
Type    Description
$this
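The four hints accessors can be used together as in the sketch below. It assumes the Composer autoloader path shown, and it assumes the Hints message exposes a language_hints field (that field is not described on this page, so check the Hints class reference before relying on it). The language codes are placeholders.

```php
<?php
// Assumption: dependencies were installed with Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\DocumentAI\V1\OcrConfig;
use Google\Cloud\DocumentAI\V1\OcrConfig\Hints;

$config = new OcrConfig();

// A message field that was never set is reported as absent.
var_dump($config->hasHints());

// Assumption: 'language_hints' is a field of the Hints message.
$hints = new Hints(['language_hints' => ['en', 'fr']]);

// setHints() returns $this, so further setters can be chained onto it.
$config->setHints($hints);
var_dump($config->hasHints());

// clearHints() removes the field; getHints() then returns null.
$config->clearHints();
var_dump($config->getHints());
```

Checking hasHints() before calling getHints() avoids having to handle the null return value.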

getEnableNativePdfParsing

Enables special handling for PDFs with existing text information, which results in better text extraction quality for such inputs.

Returns
Type    Description
bool

setEnableNativePdfParsing

Enables special handling for PDFs with existing text information, which results in better text extraction quality for such inputs.

Parameter
Name    Description
var     bool

Returns
Type    Description
$this

getEnableImageQualityScores

Enables intelligent document quality scores after OCR. These scores can help diagnose why OCR responses are of poor quality for a given input.

Enabling them adds latency to the process call comparable to that of regular OCR.

Returns
Type    Description
bool

setEnableImageQualityScores

Enables intelligent document quality scores after OCR. These scores can help diagnose why OCR responses are of poor quality for a given input.

Enabling them adds latency to the process call comparable to that of regular OCR.

Parameter
Name    Description
var     bool

Returns
Type    Description
$this

getAdvancedOcrOptions

A list of advanced OCR options to further fine-tune OCR behavior. Current valid values are:

  • legacy_layout: a heuristics-based layout detection algorithm, which serves as an alternative to the current ML-based layout detection algorithm.

Customers can choose the layout algorithm that best suits their situation.

Returns
Type    Description
Google\Protobuf\Internal\RepeatedField

setAdvancedOcrOptions

A list of advanced OCR options to further fine-tune OCR behavior. Current valid values are:

  • legacy_layout: a heuristics-based layout detection algorithm, which serves as an alternative to the current ML-based layout detection algorithm.

Customers can choose the layout algorithm that best suits their situation.

Parameter
Name    Description
var     string[]

Returns
Type    Description
$this
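Because the setter takes a plain string[] while the getter returns a RepeatedField, a short sketch of the round trip may help. It assumes the Composer autoloader path shown. RepeatedField is countable and iterable, so it can be used with count(), foreach, and iterator_to_array().

```php
<?php
// Assumption: dependencies were installed with Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\DocumentAI\V1\OcrConfig;

// Each setter returns $this, so calls can be chained.
$config = (new OcrConfig())
    ->setEnableNativePdfParsing(true)
    ->setAdvancedOcrOptions(['legacy_layout']);

// The getter returns a RepeatedField rather than a native array.
$options = $config->getAdvancedOcrOptions();
echo count($options), PHP_EOL;

foreach ($options as $option) {
    echo $option, PHP_EOL;
}

// Convert to a native PHP array when array functions are needed.
$optionsArray = iterator_to_array($options);
var_dump(in_array('legacy_layout', $optionsArray, true));
```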

getEnableSymbol

Includes symbol-level OCR information if set to true.

Returns
Type    Description
bool

setEnableSymbol

Includes symbol-level OCR information if set to true.

Parameter
Name    Description
var     bool

Returns
Type    Description
$this

getComputeStyleInfo

Turns on the font identification model and returns font style information.

Deprecated: use PremiumFeatures.compute_style_info instead.

Returns
Type    Description
bool

setComputeStyleInfo

Turns on the font identification model and returns font style information.

Deprecated: use PremiumFeatures.compute_style_info instead.

Parameter
Name    Description
var     bool

Returns
Type    Description
$this

getDisableCharacterBoxesDetection

Turns off the character box detector in the OCR engine. Character box detection is enabled by default in OCR 2.0 (and later) processors.

Returns
Type    Description
bool

setDisableCharacterBoxesDetection

Turns off the character box detector in the OCR engine. Character box detection is enabled by default in OCR 2.0 (and later) processors.

Parameter
Name    Description
var     bool

Returns
Type    Description
$this

getPremiumFeatures

Configurations for premium OCR features.

Returns
Type    Description
Google\Cloud\DocumentAI\V1\OcrConfig\PremiumFeatures|null

hasPremiumFeatures

clearPremiumFeatures

setPremiumFeatures

Configurations for premium OCR features.

Parameter
Name    Description
var     Google\Cloud\DocumentAI\V1\OcrConfig\PremiumFeatures

Returns
Type    Description
$this
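Since compute_style_info on OcrConfig is deprecated in favor of PremiumFeatures.compute_style_info, the sketch below shows one way the replacement might look. It assumes the Composer autoloader path shown and that PremiumFeatures follows the same generated constructor and getter conventions as OcrConfig (a compute_style_info key and a getComputeStyleInfo() accessor). Premium features may also depend on the processor version in use.

```php
<?php
// Assumption: dependencies were installed with Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\DocumentAI\V1\OcrConfig;
use Google\Cloud\DocumentAI\V1\OcrConfig\PremiumFeatures;

// Request font style information through PremiumFeatures rather than the
// deprecated OcrConfig::setComputeStyleInfo().
$premium = new PremiumFeatures(['compute_style_info' => true]);
$config = (new OcrConfig())->setPremiumFeatures($premium);

// getPremiumFeatures() may return null, so check presence first.
if ($config->hasPremiumFeatures()) {
    var_dump($config->getPremiumFeatures()->getComputeStyleInfo());
}

// clearPremiumFeatures() unsets the field again.
$config->clearPremiumFeatures();
var_dump($config->hasPremiumFeatures());
```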