Text Detection

Text Detection performs Optical Character Recognition (OCR) on the frames of a video, or of selected video segments. It returns the detected text along with its frame-level location and its timestamp in the video.
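
To make the shape of those results concrete, here is a minimal sketch that walks a text detection response and lists each piece of text with its timestamp and bounding-box corners. It assumes the JSON layout of the REST response (annotationResults, textAnnotations, segments, frames, rotatedBoundingBox, timeOffset). SAMPLE_RESPONSE is hand-written for illustration and is not real API output, and the helper names parse_offset and iter_text_frames are hypothetical.

```python
def parse_offset(offset):
    """Convert a duration string such as '61.5s' into float seconds."""
    return float(offset.rstrip("s"))


def iter_text_frames(response):
    """Yield (text, seconds, vertices) for every frame where text was detected.

    Assumes the REST JSON layout; field names may differ in client libraries.
    """
    for result in response.get("annotationResults", []):
        for annotation in result.get("textAnnotations", []):
            for segment in annotation.get("segments", []):
                for frame in segment.get("frames", []):
                    # Vertices are normalized to the frame size (0.0 to 1.0).
                    vertices = [
                        (v.get("x", 0.0), v.get("y", 0.0))
                        for v in frame["rotatedBoundingBox"]["vertices"]
                    ]
                    yield annotation["text"], parse_offset(frame["timeOffset"]), vertices


# Hand-written illustrative sample, NOT real API output.
SAMPLE_RESPONSE = {
    "annotationResults": [
        {
            "textAnnotations": [
                {
                    "text": "Directed by",
                    "segments": [
                        {
                            "segment": {
                                "startTimeOffset": "61.5s",
                                "endTimeOffset": "64s",
                            },
                            "confidence": 0.98,
                            "frames": [
                                {
                                    "rotatedBoundingBox": {
                                        "vertices": [
                                            {"x": 0.1, "y": 0.8},
                                            {"x": 0.4, "y": 0.8},
                                            {"x": 0.4, "y": 0.9},
                                            {"x": 0.1, "y": 0.9},
                                        ]
                                    },
                                    "timeOffset": "61.5s",
                                }
                            ],
                        }
                    ],
                }
            ]
        }
    ]
}

if __name__ == "__main__":
    for text, seconds, vertices in iter_text_frames(SAMPLE_RESPONSE):
        print(f"{seconds:7.2f}s  {text!r}  {vertices}")
```

Flattening the response into one row per frame makes it straightforward to filter by time range, for example to keep only text that appears during the closing credits.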

Text Detection is particularly useful for media and entertainment use cases, such as detecting and extracting cast lists at the end of shows and movies, or detecting the presence of burnt-in subtitles.

Text detection is available for the languages supported by the Cloud Vision API.

To detect visible text from a video or video segments, call the annotate method and specify TEXT_DETECTION in the features field.
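
As a rough illustration of that request, the sketch below builds the JSON body for a REST videos:annotate call with TEXT_DETECTION in the features field, optionally restricted to segments. It only constructs the body and does not send it or handle authentication. The helper name build_text_detection_request and the bucket path are hypothetical, and the segment fields assume the REST videoContext layout.

```python
import json

# REST endpoint that the request body below would be POSTed to.
ANNOTATE_URL = "https://videointelligence.googleapis.com/v1/videos:annotate"


def build_text_detection_request(input_uri, segments=None):
    """Build a videos:annotate request body that asks for text detection.

    input_uri: Cloud Storage URI of the video (hypothetical path in the demo).
    segments:  optional list of (start_seconds, end_seconds) tuples; when
               omitted, the whole video is analyzed.
    """
    body = {"inputUri": input_uri, "features": ["TEXT_DETECTION"]}
    if segments:
        # Assumed layout: durations are strings in seconds, such as "30s".
        body["videoContext"] = {
            "segments": [
                {"startTimeOffset": f"{start}s", "endTimeOffset": f"{end}s"}
                for start, end in segments
            ]
        }
    return body


if __name__ == "__main__":
    request_body = build_text_detection_request(
        "gs://my-bucket/my-video.mp4",  # hypothetical bucket and object
        segments=[(0, 30)],
    )
    print(f"POST {ANNOTATE_URL}")
    print(json.dumps(request_body, indent=2))
```

Keeping the body construction separate from the HTTP call makes it easy to inspect the request before sending it with whichever authenticated client you prefer.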

Check out the Video Intelligence API visualizer to see this feature in action.

For examples of requesting text detection and getting the annotated results, see Text Detection.