Vision Warehouse overview

Vision Warehouse is an API that enables developers to integrate storage and AI-based search of unstructured media content (streaming video, images, and batch videos) into existing tools and applications.

Vision Warehouse is a major component of Vertex AI Vision. It serves as the storage repository and provides advanced search capabilities for multiple data types and use cases. Specifically:

  • Streaming video: You can import live video streams and live video analytics data using the Vertex AI Vision platform application or the Vision Warehouse API, and search the stored video using the Vision Warehouse API or the Google Cloud console.
  • Image: You can import images and metadata using the Vision Warehouse API, analyze images using the Vision Warehouse API, and search for images using the Vision Warehouse API or the Google Cloud console.
  • Batch video: You can import batch videos and metadata using the Vision Warehouse API, analyze batch videos using the Vision Warehouse API, and search for batch videos using the Vision Warehouse API or the Google Cloud console (a minimal search sketch follows this list).
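
For example, a semantic text search against a deployed image or batch video index can be as small as the following sketch. It assumes the google-cloud-visionai Python client (visionai_v1.WarehouseClient) and the warehouse-visionai.googleapis.com endpoint; PROJECT_NUMBER, LOCATION_ID, and INDEX_ENDPOINT_ID are placeholders, and exact field names may vary by client version.

  from google.cloud import visionai_v1

  # Assumption: the warehouse API is served at warehouse-visionai.googleapis.com.
  client = visionai_v1.WarehouseClient(
      client_options={"api_endpoint": "warehouse-visionai.googleapis.com"}
  )

  endpoint = (
      "projects/PROJECT_NUMBER/locations/LOCATION_ID/"
      "indexEndpoints/INDEX_ENDPOINT_ID"
  )

  # Text-based semantic search over a deployed index (image or batch video).
  response = client.search_index_endpoint(
      request=visionai_v1.SearchIndexEndpointRequest(
          index_endpoint=endpoint,
          text_query="person wearing a red jacket",
      )
  )
  for item in response:
      print(item.asset)  # resource name of each matching asset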

API resources overview

API resource diagram

Storage API resources

Corpus: A container that holds media assets of a particular type. You can create multiple corpora to organize different types of media assets.
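
As a sketch (same assumed visionai_v1 client as above; the Corpus type enum and the long-running behavior of CreateCorpus are assumptions), creating an image corpus could look like this:

  from google.cloud import visionai_v1

  client = visionai_v1.WarehouseClient(
      client_options={"api_endpoint": "warehouse-visionai.googleapis.com"}
  )

  corpus = visionai_v1.Corpus(
      display_name="product-images",           # hypothetical corpus name
      description="Product catalog images",
      type_=visionai_v1.Corpus.Type.IMAGE,     # or STREAM_VIDEO, VIDEO_ON_DEMAND
  )

  # CreateCorpus is a long-running operation; result() blocks until it finishes.
  operation = client.create_corpus(
      request=visionai_v1.CreateCorpusRequest(
          parent="projects/PROJECT_NUMBER/locations/us-central1",
          corpus=corpus,
      )
  )
  print(operation.result().name)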

Asset: A media object stored within a corpus. Assets can be images, batch videos, or video streams. A corpus typically contains many assets of the same type. You can specify annotations associated with assets. Assets can also be grouped into collections for management.
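
A hedged sketch of creating an asset record inside a corpus follows (the media bytes themselves are imported separately, for example from Cloud Storage; CORPUS_ID and the asset ID are placeholders):

  from google.cloud import visionai_v1

  client = visionai_v1.WarehouseClient(
      client_options={"api_endpoint": "warehouse-visionai.googleapis.com"}
  )

  corpus_name = "projects/PROJECT_NUMBER/locations/us-central1/corpora/CORPUS_ID"

  asset = client.create_asset(
      request=visionai_v1.CreateAssetRequest(
          parent=corpus_name,
          asset=visionai_v1.Asset(),      # stream assets can also set a TTL here
          asset_id="asset-001",           # hypothetical user-chosen ID
      )
  )
  print(asset.name)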

Collection: A resource within a corpus that serves as a container of references to assets.
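
For example, adding a reference to an existing asset to an existing collection might look like the following sketch (the CollectionItem field names are assumptions based on the public resource model):

  from google.cloud import visionai_v1

  client = visionai_v1.WarehouseClient(
      client_options={"api_endpoint": "warehouse-visionai.googleapis.com"}
  )

  corpus_name = "projects/PROJECT_NUMBER/locations/us-central1/corpora/CORPUS_ID"

  # A collection holds references to assets, not the assets themselves.
  client.add_collection_item(
      request=visionai_v1.AddCollectionItemRequest(
          item=visionai_v1.CollectionItem(
              collection=f"{corpus_name}/collections/COLLECTION_ID",
              type_=visionai_v1.CollectionItem.Type.ASSET,
              item_resource=f"{corpus_name}/assets/ASSET_ID",
          )
      )
  )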

Annotation: User-supplied metadata or data derived from Vertex AI Vision that is associated with an asset. An asset can have multiple annotations.

  • Example 1: Specify a text annotation named "video-title" for batch video assets.
  • Example 2: Store analyzed data from Vertex AI Vision models as annotations. For example, object recognition labels in different video time frames can be stored as annotations.
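
Example 1 could be expressed roughly as follows (a sketch with the same assumed visionai_v1 client; the corpus must already contain a data schema for the key "video-title"):

  from google.cloud import visionai_v1

  client = visionai_v1.WarehouseClient(
      client_options={"api_endpoint": "warehouse-visionai.googleapis.com"}
  )

  asset_name = (
      "projects/PROJECT_NUMBER/locations/us-central1/"
      "corpora/CORPUS_ID/assets/ASSET_ID"
  )

  # A user-specified text annotation keyed "video-title" on a batch video asset.
  annotation = visionai_v1.Annotation(
      user_specified_annotation=visionai_v1.UserSpecifiedAnnotation(
          key="video-title",
          value=visionai_v1.AnnotationValue(str_value="Spring collection teaser"),
      )
  )
  created = client.create_annotation(
      request=visionai_v1.CreateAnnotationRequest(
          parent=asset_name, annotation=annotation
      )
  )
  print(created.name)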

Data schema: Defines how an annotation is interpreted within a corpus. A data schema defines one annotation type and its search strategy. Each annotation must be associated with a data schema.
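
A sketch of a matching data schema for the "video-title" annotations above (the enum names are assumptions; SMART_SEARCH marks the key for semantic search):

  from google.cloud import visionai_v1

  client = visionai_v1.WarehouseClient(
      client_options={"api_endpoint": "warehouse-visionai.googleapis.com"}
  )

  corpus_name = "projects/PROJECT_NUMBER/locations/us-central1/corpora/CORPUS_ID"

  # Declares that "video-title" annotations are asset-level strings that are
  # searchable with the smart (semantic) search strategy.
  data_schema = visionai_v1.DataSchema(
      key="video-title",
      schema_details=visionai_v1.DataSchemaDetails(
          type_=visionai_v1.DataSchemaDetails.DataType.STRING,
          granularity=visionai_v1.DataSchemaDetails.Granularity.GRANULARITY_ASSET_LEVEL,
          search_strategy=visionai_v1.DataSchemaDetails.SearchStrategy(
              search_strategy_type=(
                  visionai_v1.DataSchemaDetails.SearchStrategy.SearchStrategyType.SMART_SEARCH
              )
          ),
      ),
  )
  client.create_data_schema(
      request=visionai_v1.CreateDataSchemaRequest(
          parent=corpus_name, data_schema=data_schema
      )
  )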

Search API resources

Index (available to image and batch video verticals): A corpus-level resource that is a managed representation of analyzed assets and annotations. An index can be viewed as a dataset of embedding vectors and semantic restrictions that represents the meaning of the media content. Indexes can be deployed into index endpoints for search.
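
Creating an index over an entire image or batch video corpus might look like this sketch (the entire_corpus field and the long-running behavior are assumptions):

  from google.cloud import visionai_v1

  client = visionai_v1.WarehouseClient(
      client_options={"api_endpoint": "warehouse-visionai.googleapis.com"}
  )

  corpus_name = "projects/PROJECT_NUMBER/locations/us-central1/corpora/CORPUS_ID"

  # CreateIndex is a long-running operation; building the index can take a while.
  operation = client.create_index(
      request=visionai_v1.CreateIndexRequest(
          parent=corpus_name,
          index_id="my-index",                            # hypothetical ID
          index=visionai_v1.Index(entire_corpus=True),    # index every asset
      )
  )
  index = operation.result()
  print(index.name)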

Index endpoint (available to image and batch video verticals): A managed environment that serves Vision Warehouse indexes. Index endpoints provide a single point of access for sending search requests.
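
A sketch of creating an index endpoint and then deploying an index to it (both calls are treated as long-running operations here; the request field names are assumptions):

  from google.cloud import visionai_v1

  client = visionai_v1.WarehouseClient(
      client_options={"api_endpoint": "warehouse-visionai.googleapis.com"}
  )

  parent = "projects/PROJECT_NUMBER/locations/us-central1"
  index_name = f"{parent}/corpora/CORPUS_ID/indexes/INDEX_ID"

  endpoint = client.create_index_endpoint(
      request=visionai_v1.CreateIndexEndpointRequest(
          parent=parent,
          index_endpoint_id="my-endpoint",    # hypothetical ID
          index_endpoint=visionai_v1.IndexEndpoint(display_name="Search endpoint"),
      )
  ).result()

  # Deploy the index so that the endpoint can serve search requests against it.
  client.deploy_index(
      request=visionai_v1.DeployIndexRequest(
          index_endpoint=endpoint.name,
          deployed_index=visionai_v1.DeployedIndex(index=index_name),
      )
  ).result()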

Search configuration: Stores properties that control search behavior and search results.

  • Facet property (available to the streaming video vertical): Enables facet-based (histogram) groupings in search results.
  • Search criteria property (available to the streaming video and batch video verticals): Maps a custom search criterion to one or more data schema keys (see the sketch after this list).
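
For example, a search criteria property that lets one custom criterion match either of two data schema keys might be configured like this sketch (field names are assumptions; the keys are hypothetical):

  from google.cloud import visionai_v1

  client = visionai_v1.WarehouseClient(
      client_options={"api_endpoint": "warehouse-visionai.googleapis.com"}
  )

  corpus_name = "projects/PROJECT_NUMBER/locations/us-central1/corpora/CORPUS_ID"

  # A custom criterion "title-or-description" that searches across two keys.
  client.create_search_config(
      request=visionai_v1.CreateSearchConfigRequest(
          parent=corpus_name,
          search_config_id="title-or-description",
          search_config=visionai_v1.SearchConfig(
              search_criteria_property=visionai_v1.SearchCriteriaProperty(
                  mapped_fields=["video-title", "video-description"],
              )
          ),
      )
  )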

Search hypernym: A type of search configuration that lets you customize how the search service recognizes hypernyms of words. For example, you can specify "animal" as a hypernym of "cat" and "dog"; a search for "animal" then also returns results that contain "cat" and "dog" in the indexed data.
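
The "animal" example above could be registered roughly as follows (a sketch; field names are assumptions):

  from google.cloud import visionai_v1

  client = visionai_v1.WarehouseClient(
      client_options={"api_endpoint": "warehouse-visionai.googleapis.com"}
  )

  corpus_name = "projects/PROJECT_NUMBER/locations/us-central1/corpora/CORPUS_ID"

  # After this, a search for "animal" also matches "cat" and "dog" annotations.
  client.create_search_hypernym(
      request=visionai_v1.CreateSearchHypernymRequest(
          parent=corpus_name,
          search_hypernym_id="animal",
          search_hypernym=visionai_v1.SearchHypernym(
              hypernym="animal",
              hyponyms=["cat", "dog"],
          ),
      )
  )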

Supported languages

Batch Video Warehouse and Image Warehouse support the following languages for semantic search:

  • English
  • Spanish
  • Portuguese
  • French
  • Japanese
  • Chinese

Streaming Warehouse has no language restrictions.

What's next