
Text Detection

Text Detection performs Optical Character Recognition (OCR) to detect visible text in the frames of a video or of selected video segments. It returns the detected text along with its frame-level location and its timestamp in the video.
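As a rough illustration of the text, location, and timestamp information described above, the sketch below walks a result dictionary and flattens it into one row per text segment. It assumes the JSON shape of the REST response as we understand it (`textAnnotations`, `segments`, `frames`, `rotatedBoundingBox`, and offsets encoded as strings such as `"12.5s"`); check these field names against the API reference before relying on them. `SAMPLE_RESULT` is hand-written for illustration and is not real API output, and the helper names are hypothetical.

```python
def parse_offset(offset):
    """Convert a duration string such as "12.5s" to float seconds."""
    return float(offset.rstrip("s"))


def summarize_text_annotations(annotation_result):
    """Return one row per detected text segment.

    Assumes `annotation_result` is one entry of the response's
    annotation results, shaped like SAMPLE_RESULT below.
    """
    rows = []
    for annotation in annotation_result.get("textAnnotations", []):
        for text_segment in annotation.get("segments", []):
            span = text_segment.get("segment", {})
            # Use the first frame only, as a representative location.
            first_frame = (text_segment.get("frames") or [{}])[0]
            box = first_frame.get("rotatedBoundingBox", {})
            rows.append({
                "text": annotation.get("text", ""),
                "start_s": parse_offset(span.get("startTimeOffset", "0s")),
                "end_s": parse_offset(span.get("endTimeOffset", "0s")),
                "confidence": text_segment.get("confidence"),
                # Vertices are assumed to be normalized to the range [0, 1].
                "first_box": [
                    (v.get("x", 0.0), v.get("y", 0.0))
                    for v in box.get("vertices", [])
                ],
            })
    return rows


# Hand-written sample for illustration only (assumed shape, not real output).
SAMPLE_RESULT = {
    "textAnnotations": [
        {
            "text": "DIRECTED BY",
            "segments": [
                {
                    "segment": {"startTimeOffset": "12.5s", "endTimeOffset": "15s"},
                    "confidence": 0.9,
                    "frames": [
                        {
                            "timeOffset": "12.5s",
                            "rotatedBoundingBox": {
                                "vertices": [
                                    {"x": 0.1, "y": 0.2},
                                    {"x": 0.4, "y": 0.2},
                                    {"x": 0.4, "y": 0.3},
                                    {"x": 0.1, "y": 0.3},
                                ]
                            },
                        }
                    ],
                }
            ],
        }
    ]
}

for row in summarize_text_annotations(SAMPLE_RESULT):
    print(row)
```

Flattening to rows like this makes it straightforward to sort detections by start time, for example when extracting an end-credits cast list.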

Text Detection is particularly useful for media and entertainment use cases, such as detecting and extracting cast lists at the end of shows and movies, or detecting the presence of burnt-in subtitles.

Text detection is available for the languages supported by the Cloud Vision API.

To detect visible text from a video or video segments, call the annotate method and specify TEXT_DETECTION in the features field.
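The request described above can be sketched as a JSON body for the REST `videos:annotate` method. This is a minimal sketch under stated assumptions, not the official sample: the field names (`inputUri`, `features`, `videoContext`, `segments`, `textDetectionConfig`, `languageHints`) follow our reading of the v1 REST reference, the bucket path is a hypothetical placeholder, and `build_text_detection_request` is a helper invented for this example. The code only builds and prints the body; actually sending it requires an authenticated POST.

```python
import json

# Assumed v1 REST endpoint for the annotate method.
ANNOTATE_URL = "https://videointelligence.googleapis.com/v1/videos:annotate"


def build_text_detection_request(input_uri, segments=None, language_hints=None):
    """Build a request body asking for TEXT_DETECTION on a video.

    segments: optional list of (start_seconds, end_seconds) pairs that
        restrict analysis to parts of the video.
    language_hints: optional list of language codes such as "en-US".
    """
    body = {"inputUri": input_uri, "features": ["TEXT_DETECTION"]}
    video_context = {}
    if segments:
        video_context["segments"] = [
            {"startTimeOffset": f"{start}s", "endTimeOffset": f"{end}s"}
            for start, end in segments
        ]
    if language_hints:
        video_context["textDetectionConfig"] = {
            "languageHints": list(language_hints)
        }
    if video_context:
        body["videoContext"] = video_context
    return body


# Hypothetical Cloud Storage path; replace with a video you own.
request_body = build_text_detection_request(
    "gs://example-bucket/video.mp4",
    segments=[(0, 30)],
    language_hints=["en-US"],
)
print("POST", ANNOTATE_URL)
print(json.dumps(request_body, indent=2))
```

Omitting `segments` leaves out the video context entirely, in which case the whole video would be analyzed.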

Check out the Video Intelligence API visualizer to see this feature in action.

For examples of requesting text detection and getting the annotated results, see Text Detection.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-08-01 UTC.
