Interface OcrConfigOrBuilder (2.35.0)

public interface OcrConfigOrBuilder extends MessageOrBuilder

Implements

MessageOrBuilder

Methods

getAdvancedOcrOptions(int index)

public abstract String getAdvancedOcrOptions(int index)

A list of advanced OCR options to further fine-tune OCR behavior. Current valid values are:

  • legacy_layout: a heuristic layout detection algorithm that serves as an alternative to the current ML-based layout detection algorithm. Customers can choose the layout algorithm best suited to their situation.

repeated string advanced_ocr_options = 5;

Parameter
Name | Description
index | int: The index of the element to return.

Returns
Type | Description
String | The advancedOcrOptions at the given index.

getAdvancedOcrOptionsBytes(int index)

public abstract ByteString getAdvancedOcrOptionsBytes(int index)

A list of advanced OCR options to further fine-tune OCR behavior. Current valid values are:

  • legacy_layout: a heuristic layout detection algorithm that serves as an alternative to the current ML-based layout detection algorithm. Customers can choose the layout algorithm best suited to their situation.

repeated string advanced_ocr_options = 5;

Parameter
Name | Description
index | int: The index of the value to return.

Returns
Type | Description
ByteString | The bytes of the advancedOcrOptions at the given index.

getAdvancedOcrOptionsCount()

public abstract int getAdvancedOcrOptionsCount()

A list of advanced OCR options to further fine-tune OCR behavior. Current valid values are:

  • legacy_layout: a heuristic layout detection algorithm that serves as an alternative to the current ML-based layout detection algorithm. Customers can choose the layout algorithm best suited to their situation.

repeated string advanced_ocr_options = 5;

Returns
Type | Description
int | The count of advancedOcrOptions.

getAdvancedOcrOptionsList()

public abstract List<String> getAdvancedOcrOptionsList()

A list of advanced OCR options to further fine-tune OCR behavior. Current valid values are:

  • legacy_layout: a heuristic layout detection algorithm that serves as an alternative to the current ML-based layout detection algorithm. Customers can choose the layout algorithm best suited to their situation.

repeated string advanced_ocr_options = 5;

Returns
Type | Description
List<String> | A list containing the advancedOcrOptions.
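
The count, indexed, and list accessors above expose the same repeated field in different ways. The following is a minimal sketch of how they relate. To stay runnable without the Document AI client library, it uses a hypothetical stand-in interface (AdvancedOptionsView) and a hypothetical factory method (of) that mirror only the three accessors documented above; with the real library, an OcrConfig or its builder would be used instead.

```java
import java.util.ArrayList;
import java.util.List;

final class AdvancedOcrOptionsSketch {
    // Hypothetical stand-in for the repeated-field accessors documented above.
    // It is NOT the generated OcrConfigOrBuilder interface.
    interface AdvancedOptionsView {
        int getAdvancedOcrOptionsCount();
        String getAdvancedOcrOptions(int index);
        List<String> getAdvancedOcrOptionsList();
    }

    // Hypothetical factory used only to build test data for this sketch.
    static AdvancedOptionsView of(List<String> options) {
        List<String> copy = List.copyOf(options);
        return new AdvancedOptionsView() {
            public int getAdvancedOcrOptionsCount() { return copy.size(); }
            public String getAdvancedOcrOptions(int index) { return copy.get(index); }
            public List<String> getAdvancedOcrOptionsList() { return copy; }
        };
    }

    // Checks whether the legacy_layout option is present in the list view.
    static boolean usesLegacyLayout(AdvancedOptionsView config) {
        return config.getAdvancedOcrOptionsList().contains("legacy_layout");
    }

    // Walks the field with the count and indexed getters; the result should
    // match what getAdvancedOcrOptionsList() returns.
    static List<String> collectByIndex(AdvancedOptionsView config) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < config.getAdvancedOcrOptionsCount(); i++) {
            out.add(config.getAdvancedOcrOptions(i));
        }
        return out;
    }

    public static void main(String[] args) {
        AdvancedOptionsView config = of(List.of("legacy_layout"));
        System.out.println(usesLegacyLayout(config));
        System.out.println(collectByIndex(config));
    }
}
```

In most code the list accessor is the simpler choice; the indexed form is mainly useful when only one element is needed.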

getComputeStyleInfo() (deprecated)

public abstract boolean getComputeStyleInfo()

Deprecated. google.cloud.documentai.v1beta3.OcrConfig.compute_style_info is deprecated. See google/cloud/documentai/v1beta3/document_io.proto;l=165

Turns on the font identification model and returns font style information. Deprecated: use PremiumFeatures.compute_style_info instead.

bool compute_style_info = 8 [deprecated = true];

Returns
Type | Description
boolean | The computeStyleInfo.
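
Because this flag is deprecated in favor of PremiumFeatures.compute_style_info, code that reads a configuration may want to honor both locations during a migration. The sketch below shows one possible way to do that. The interfaces ConfigView and PremiumFeaturesView and the factory of are hypothetical stand-ins so that the example compiles without the client library, and the getComputeStyleInfo() accessor on the premium-features stand-in is an assumption inferred from the field name in the deprecation note.

```java
final class StyleInfoSketch {
    // Hypothetical stand-in for OcrConfig.PremiumFeatures; the accessor name
    // is assumed from the compute_style_info field named in the deprecation note.
    interface PremiumFeaturesView {
        boolean getComputeStyleInfo();
    }

    // Hypothetical stand-in mirroring three accessors documented on this page.
    interface ConfigView {
        boolean getComputeStyleInfo();            // deprecated flag
        boolean hasPremiumFeatures();
        PremiumFeaturesView getPremiumFeatures();
    }

    // Treats style info as requested if either the new or the deprecated flag is set.
    static boolean styleInfoRequested(ConfigView config) {
        boolean viaPremium =
            config.hasPremiumFeatures() && config.getPremiumFeatures().getComputeStyleInfo();
        return viaPremium || config.getComputeStyleInfo();
    }

    // Hypothetical factory: premiumFlag == null models an unset premium_features field.
    static ConfigView of(boolean deprecatedFlag, Boolean premiumFlag) {
        return new ConfigView() {
            public boolean getComputeStyleInfo() { return deprecatedFlag; }
            public boolean hasPremiumFeatures() { return premiumFlag != null; }
            public PremiumFeaturesView getPremiumFeatures() {
                // An unset message field reads as its default value (false).
                return () -> premiumFlag != null && premiumFlag;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(styleInfoRequested(of(true, null)));
        System.out.println(styleInfoRequested(of(false, true)));
        System.out.println(styleInfoRequested(of(false, null)));
    }
}
```

New code that writes a configuration should set only the premium-features field; the fallback here is intended for reading configurations that may predate the deprecation.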

getDisableCharacterBoxesDetection()

public abstract boolean getDisableCharacterBoxesDetection()

Turns off the character box detector in the OCR engine. Character box detection is enabled by default in OCR 2.0 (and later) processors.

bool disable_character_boxes_detection = 10;

Returns
Type | Description
boolean | The disableCharacterBoxesDetection.

getEnableImageQualityScores()

public abstract boolean getEnableImageQualityScores()

Enables intelligent document quality scores after OCR. These scores can help diagnose why OCR responses are of poor quality for a given input. Adds latency to the process call comparable to that of regular OCR.

bool enable_image_quality_scores = 4;

Returns
Type | Description
boolean | The enableImageQualityScores.

getEnableNativePdfParsing()

public abstract boolean getEnableNativePdfParsing()

Enables special handling for PDFs with existing text information. Results in better text extraction quality for such PDF inputs.

bool enable_native_pdf_parsing = 3;

Returns
Type | Description
boolean | The enableNativePdfParsing.

getEnableSymbol()

public abstract boolean getEnableSymbol()

Includes symbol-level OCR information if set to true.

bool enable_symbol = 6;

Returns
Type | Description
boolean | The enableSymbol.

getHints()

public abstract OcrConfig.Hints getHints()

Hints for the OCR model.

.google.cloud.documentai.v1beta3.OcrConfig.Hints hints = 2;

Returns
Type | Description
OcrConfig.Hints | The hints.

getHintsOrBuilder()

public abstract OcrConfig.HintsOrBuilder getHintsOrBuilder()

Hints for the OCR model.

.google.cloud.documentai.v1beta3.OcrConfig.Hints hints = 2;

Returns
Type | Description
OcrConfig.HintsOrBuilder

getPremiumFeatures()

public abstract OcrConfig.PremiumFeatures getPremiumFeatures()

Configurations for premium OCR features.

.google.cloud.documentai.v1beta3.OcrConfig.PremiumFeatures premium_features = 11;

Returns
Type | Description
OcrConfig.PremiumFeatures | The premiumFeatures.

getPremiumFeaturesOrBuilder()

public abstract OcrConfig.PremiumFeaturesOrBuilder getPremiumFeaturesOrBuilder()

Configurations for premium OCR features.

.google.cloud.documentai.v1beta3.OcrConfig.PremiumFeatures premium_features = 11;

Returns
Type | Description
OcrConfig.PremiumFeaturesOrBuilder

hasHints()

public abstract boolean hasHints()

Hints for the OCR model.

.google.cloud.documentai.v1beta3.OcrConfig.Hints hints = 2;

Returns
Type | Description
boolean | Whether the hints field is set.
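
Generated protobuf message getters in Java do not return null: when a message field is unset, the getter returns a default instance, so hasHints() is the way to distinguish "unset" from "set but empty". The sketch below illustrates that pattern under stated assumptions. ConfigView, HintsView, and the factory of are hypothetical stand-ins so that the example runs without the client library, and the getLanguageHintsList() accessor on the stand-in is an assumption that does not come from this page.

```java
import java.util.List;

final class HintsPresenceSketch {
    // Hypothetical stand-in for OcrConfig.Hints; the accessor is assumed.
    interface HintsView {
        List<String> getLanguageHintsList();
    }

    // Hypothetical stand-in mirroring hasHints() and getHints() from this page.
    interface ConfigView {
        boolean hasHints();
        HintsView getHints();
    }

    // Returns the configured language hints, or the fallback when the hints
    // field is unset or carries no entries.
    static List<String> languageHintsOr(ConfigView config, List<String> fallback) {
        if (!config.hasHints()) {
            return fallback;
        }
        List<String> hints = config.getHints().getLanguageHintsList();
        return hints.isEmpty() ? fallback : hints;
    }

    // Hypothetical factory: a null argument models an unset hints field.
    static ConfigView of(List<String> hintsOrNull) {
        return new ConfigView() {
            public boolean hasHints() { return hintsOrNull != null; }
            public HintsView getHints() {
                // Mirrors generated getters: never null, empty default when unset.
                return () -> hintsOrNull == null ? List.of() : hintsOrNull;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(languageHintsOr(of(null), List.of("en")));
        System.out.println(languageHintsOr(of(List.of("fr")), List.of("en")));
    }
}
```

The same presence check applies to hasPremiumFeatures() and getPremiumFeatures().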

hasPremiumFeatures()

public abstract boolean hasPremiumFeatures()

Configurations for premium OCR features.

.google.cloud.documentai.v1beta3.OcrConfig.PremiumFeatures premium_features = 11;

Returns
Type | Description
boolean | Whether the premiumFeatures field is set.