请试用 Gemini 1.5 Pro（Vertex AI 中最先进的多模态模型），看看您可以通过包含 100 万个词元的上下文窗口构建什么。 请试用 Gemini 1.5 Pro（Vertex AI 中最先进的多模态模型），看看您可以通过包含 100 万个词元的上下文窗口构建什么。

检测文档中的文本：边界

返回文档中检测到的文本周围的方框边界。

深入探索

如需查看包含此代码示例的详细文档，请参阅以下内容：

密集文档文本检测教程

代码示例

Python

试用此示例之前，请按照《Vision 快速入门：使用客户端库》中的 Python 设置说明进行操作。如需了解详情，请参阅 Vision Python API 参考文档。

如需向 Vision 进行身份验证，请设置应用默认凭据。如需了解详情，请参阅为本地开发环境设置身份验证。

def get_document_bounds(image_file, feature):
    """Finds the document bounds given an image and feature type.

    Args:
        image_file: path to the image file.
        feature: feature type to detect.

    Returns:
        List of coordinates for the corresponding feature type.
    """
    client = vision.ImageAnnotatorClient()

    bounds = []

    with open(image_file, "rb") as image_file:
        content = image_file.read()

    image = vision.Image(content=content)

    response = client.document_text_detection(image=image)
    document = response.full_text_annotation

    # Collect specified feature bounds by enumerating all document features
    for page in document.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    for symbol in word.symbols:
                        if feature == FeatureType.SYMBOL:
                            bounds.append(symbol.bounding_box)

                    if feature == FeatureType.WORD:
                        bounds.append(word.bounding_box)

                if feature == FeatureType.PARA:
                    bounds.append(paragraph.bounding_box)

            if feature == FeatureType.BLOCK:
                bounds.append(block.bounding_box)

    # The list `bounds` contains the coordinates of the bounding boxes.
    return bounds

后续步骤

如需搜索和过滤其他 Google Cloud 产品的代码示例，请参阅 Google Cloud 示例浏览器。