Detect text in a document: Bounds

Returns the bounding boxes of the text detected in a document at the requested level: block, paragraph, word, or symbol.

Code sample


Before trying this sample, follow the Python setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Python API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

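from enum import Enum

from google.cloud import vision


# Assumed definition: the sample references FeatureType without defining it.
class FeatureType(Enum):
    BLOCK = 1
    PARA = 2
    WORD = 3
    SYMBOL = 4


def get_document_bounds(image_file, feature):
    """Finds the document bounds given an image and feature type.

    Args:
        image_file: path to the image file.
        feature: feature type to detect.

    Returns:
        List of coordinates for the corresponding feature type.
    """
    client = vision.ImageAnnotatorClient()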

    bounds = []

    # Read the raw image bytes; a separate name avoids shadowing the path argument.
    with open(image_file, "rb") as f:
        content = f.read()

    image = vision.Image(content=content)

    response = client.document_text_detection(image=image)
    document = response.full_text_annotation

    # Collect specified feature bounds by enumerating all document features
    for page in document.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    for symbol in word.symbols:
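                        if feature == FeatureType.SYMBOL:
                            bounds.append(symbol.bounding_box)

                    if feature == FeatureType.WORD:
                        bounds.append(word.bounding_box)

                if feature == FeatureType.PARA:
                    bounds.append(paragraph.bounding_box)

            if feature == FeatureType.BLOCK:
                bounds.append(block.bounding_box)

    # Each entry in `bounds` is a bounding box whose vertices hold the coordinates.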
    return bounds
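
Each bounding box returned by the sample is a polygon of vertices, not an axis-aligned rectangle. For cropping or drawing, a rectangle is often easier to work with. The following is a minimal sketch of that conversion. The helper name `bounding_box_to_rect` is hypothetical and not part of the Vision client library, and the stand-in object only mimics the `vertices`, `x`, and `y` attributes so that the example runs without credentials.

```python
from types import SimpleNamespace


def bounding_box_to_rect(bounding_box):
    """Return (left, top, right, bottom) enclosing all vertices of a bounding box.

    Assumes `bounding_box` exposes a `vertices` sequence whose items have
    numeric `x` and `y` attributes.
    """
    xs = [vertex.x for vertex in bounding_box.vertices]
    ys = [vertex.y for vertex in bounding_box.vertices]
    return (min(xs), min(ys), max(xs), max(ys))


# Stand-in for illustration only; real boxes would come from get_document_bounds().
sample_box = SimpleNamespace(
    vertices=[
        SimpleNamespace(x=10, y=20),
        SimpleNamespace(x=110, y=22),
        SimpleNamespace(x=108, y=60),
        SimpleNamespace(x=8, y=58),
    ]
)

sample_rect = bounding_box_to_rect(sample_box)
print(sample_rect)  # (8, 20, 110, 60)
```

Taking the minimum and maximum over all vertices keeps the result correct when the detected text is slightly rotated and the polygon is not a perfect rectangle.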
