Reference documentation and code samples for the Google Cloud Document AI V1 Client class OcrConfig.
Config for Document OCR.
Generated from protobuf message google.cloud.documentai.v1.OcrConfig
Namespace
Google \ Cloud \ DocumentAI \ V1
Methods
__construct
Constructor.
Parameters | |
---|---|
Name | Description |
data |
array
Optional. Data for populating the Message object. |
↳ hints |
Google\Cloud\DocumentAI\V1\OcrConfig\Hints
Hints for the OCR model. |
↳ enable_native_pdf_parsing |
bool
Enables special handling for PDFs with existing text information. Results in better text extraction quality in such PDF inputs. |
↳ enable_image_quality_scores |
bool
Enables intelligent document quality scores after OCR. Can help with diagnosing why OCR responses are of poor quality for a given input. Adds latency to the process call comparable to that of regular OCR. |
↳ advanced_ocr_options |
array
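A list of advanced OCR options to further fine-tune OCR behavior. Current valid values are: - legacy_layout: a heuristic layout detection algorithm, which serves as an alternative to the current ML-based layout detection algorithm. |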
↳ enable_symbol |
bool
Includes symbol-level OCR information if set to true. |
↳ compute_style_info |
bool
Turns on the font identification model and returns font style information. Deprecated; use PremiumFeatures.compute_style_info instead. |
↳ disable_character_boxes_detection |
bool
Turns off the character box detector in the OCR engine. Character box detection is enabled by default in OCR 2.0 (and later) processors. |
↳ premium_features |
Google\Cloud\DocumentAI\V1\OcrConfig\PremiumFeatures
Configurations for premium OCR features. |
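The constructor's data array can be sketched as follows. This is a minimal example, assuming the google/cloud-document-ai package was installed with Composer and that the script sits next to the vendor/ directory; the particular option values are illustrative only.

```php
<?php
// Assumption: Composer autoloader is located next to this script.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\DocumentAI\V1\OcrConfig;

// Keys of the data array are the snake_case field names from the table above.
$config = new OcrConfig([
    'enable_native_pdf_parsing' => true,
    'enable_symbol' => true,
    'advanced_ocr_options' => ['legacy_layout'],
]);

// Fields that were set are returned by the matching getters.
var_dump($config->getEnableNativePdfParsing());

// Boolean fields that were not set fall back to the protobuf default (false).
var_dump($config->getEnableImageQualityScores());
```

Every key in the array has an equivalent setter documented below, so the same object can also be built step by step.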
getHints
Hints for the OCR model.
Returns | |
---|---|
Type | Description |
Google\Cloud\DocumentAI\V1\OcrConfig\Hints|null |
hasHints
clearHints
setHints
Hints for the OCR model.
Parameter | |
---|---|
Name | Description |
var |
Google\Cloud\DocumentAI\V1\OcrConfig\Hints
|
Returns | |
---|---|
Type | Description |
$this |
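The four hints accessors work together roughly as sketched here. The example assumes a Composer install of google/cloud-document-ai; setLanguageHints belongs to the separate Hints class and is not documented on this page, so check the Hints reference before relying on it. The language codes are placeholders.

```php
<?php
// Assumption: Composer autoloader is located next to this script.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\DocumentAI\V1\OcrConfig;
use Google\Cloud\DocumentAI\V1\OcrConfig\Hints;

$config = new OcrConfig();

// Before any hints are set, hasHints() is false and getHints() is null.
$hadHintsBefore = $config->hasHints();
$initialHints = $config->getHints();

// Assumption: language_hints is a field of Hints (see the Hints reference).
$hints = new Hints();
$hints->setLanguageHints(['en', 'fr']);
$config->setHints($hints);
$hasHintsAfterSet = $config->hasHints();

// clearHints() removes the sub-message again.
$config->clearHints();
$hasHintsAfterClear = $config->hasHints();
```

Using hasHints() rather than comparing getHints() to null makes the intent explicit when deciding whether to send hints at all.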
getEnableNativePdfParsing
Enables special handling for PDFs with existing text information. Results in better text extraction quality in such PDF inputs.
Returns | |
---|---|
Type | Description |
bool |
setEnableNativePdfParsing
Enables special handling for PDFs with existing text information. Results in better text extraction quality in such PDF inputs.
Parameter | |
---|---|
Name | Description |
var |
bool
|
Returns | |
---|---|
Type | Description |
$this |
getEnableImageQualityScores
Enables intelligent document quality scores after OCR. Can help with diagnosing why OCR responses are of poor quality for a given input.
Adds latency to the process call comparable to that of regular OCR.
Returns | |
---|---|
Type | Description |
bool |
setEnableImageQualityScores
Enables intelligent document quality scores after OCR. Can help with diagnosing why OCR responses are of poor quality for a given input.
Adds latency to the process call comparable to that of regular OCR.
Parameter | |
---|---|
Name | Description |
var |
bool
|
Returns | |
---|---|
Type | Description |
$this |
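Because each setter returns $this, the boolean options can be chained. A minimal sketch, assuming a Composer install of google/cloud-document-ai; which flags to enable depends on the input documents and on whether the extra latency of quality scoring is acceptable.

```php
<?php
// Assumption: Composer autoloader is located next to this script.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\DocumentAI\V1\OcrConfig;

// Each setter returns the same OcrConfig instance, so calls can be chained.
$config = (new OcrConfig())
    ->setEnableNativePdfParsing(true)    // use the text layer embedded in PDFs
    ->setEnableImageQualityScores(true); // adds latency to the process call

$nativePdf = $config->getEnableNativePdfParsing();
$qualityScores = $config->getEnableImageQualityScores();
```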
getAdvancedOcrOptions
A list of advanced OCR options to further fine-tune OCR behavior. Current valid values are:
- legacy_layout: a heuristic layout detection algorithm, which serves as an alternative to the current ML-based layout detection algorithm. Customers can choose the most suitable layout algorithm for their situation.
Returns | |
---|---|
Type | Description |
Google\Protobuf\Internal\RepeatedField |
setAdvancedOcrOptions
A list of advanced OCR options to further fine-tune OCR behavior. Current valid values are:
- legacy_layout: a heuristic layout detection algorithm, which serves as an alternative to the current ML-based layout detection algorithm. Customers can choose the most suitable layout algorithm for their situation.
Parameter | |
---|---|
Name | Description |
var |
string[]
|
Returns | |
---|---|
Type | Description |
$this |
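The setter takes a plain string array while the getter returns a RepeatedField, which can be converted back when an array is needed. A short sketch, assuming a Composer install of google/cloud-document-ai:

```php
<?php
// Assumption: Composer autoloader is located next to this script.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\DocumentAI\V1\OcrConfig;

$config = new OcrConfig();

// Select the heuristic layout algorithm instead of the ML-based one.
$config->setAdvancedOcrOptions(['legacy_layout']);

// The getter returns a RepeatedField; it is countable and iterable,
// so iterator_to_array() yields an ordinary PHP array.
$field = $config->getAdvancedOcrOptions();
$options = iterator_to_array($field);

foreach ($field as $option) {
    echo $option, PHP_EOL;
}
```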
getEnableSymbol
Includes symbol-level OCR information if set to true.
Returns | |
---|---|
Type | Description |
bool |
setEnableSymbol
Includes symbol-level OCR information if set to true.
Parameter | |
---|---|
Name | Description |
var |
bool
|
Returns | |
---|---|
Type | Description |
$this |
getComputeStyleInfo
Turns on the font identification model and returns font style information.
Deprecated; use PremiumFeatures.compute_style_info instead.
Returns | |
---|---|
Type | Description |
bool |
setComputeStyleInfo
Turns on the font identification model and returns font style information.
Deprecated; use PremiumFeatures.compute_style_info instead.
Parameter | |
---|---|
Name | Description |
var |
bool
|
Returns | |
---|---|
Type | Description |
$this |
getDisableCharacterBoxesDetection
Turns off the character box detector in the OCR engine. Character box detection is enabled by default in OCR 2.0 (and later) processors.
Returns | |
---|---|
Type | Description |
bool |
setDisableCharacterBoxesDetection
Turns off the character box detector in the OCR engine. Character box detection is enabled by default in OCR 2.0 (and later) processors.
Parameter | |
---|---|
Name | Description |
var |
bool
|
Returns | |
---|---|
Type | Description |
$this |
getPremiumFeatures
Configurations for premium OCR features.
Returns | |
---|---|
Type | Description |
Google\Cloud\DocumentAI\V1\OcrConfig\PremiumFeatures|null |
hasPremiumFeatures
clearPremiumFeatures
setPremiumFeatures
Configurations for premium OCR features.
Parameter | |
---|---|
Name | Description |
var |
Google\Cloud\DocumentAI\V1\OcrConfig\PremiumFeatures
|
Returns | |
---|---|
Type | Description |
$this |
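Since compute_style_info on OcrConfig is deprecated in favor of PremiumFeatures.compute_style_info, font style information would be requested through the premium features sub-message. The following is a sketch under the assumption that google/cloud-document-ai is installed with Composer; the PremiumFeatures setter is documented on that class's own reference page, not here.

```php
<?php
// Assumption: Composer autoloader is located next to this script.
require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\DocumentAI\V1\OcrConfig;
use Google\Cloud\DocumentAI\V1\OcrConfig\PremiumFeatures;

// Request font style information through PremiumFeatures rather than
// the deprecated OcrConfig::setComputeStyleInfo().
$premium = (new PremiumFeatures())->setComputeStyleInfo(true);

$config = (new OcrConfig())->setPremiumFeatures($premium);

$hasPremium = $config->hasPremiumFeatures();
$styleInfo = $config->getPremiumFeatures()->getComputeStyleInfo();

// The deprecated top-level flag is left at its default.
$config->clearPremiumFeatures();
$hasPremiumAfterClear = $config->hasPremiumFeatures();
```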