Google Cloud

Building an image search application using Cloud Vision API

Cloud Vision API enables developers to incorporate powerful image content analysis features into new and existing applications. In this new solution, we demonstrate how Cloud Vision API's label detection feature can be used in conjunction with App Engine, enabling features such as keyword search and faceted navigation on images.


Using label detection, Cloud Vision API can broadly classify images into thousands of categories. Developers often wonder how to associate the many labels describing an image with the known or predetermined categories that users or downstream applications expect. Consider the hypothetical example of a stock photography website with an image search function. If Vision API labels an image of a beach with "Coastal and Oceanic Landforms," "Sea," and "Tropics," you may also want that image to appear in search results when users search for "Seaside Destinations," or you may want to let users define "Seaside Destinations" as a new image category.
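As a rough illustration of the simplest version of this idea, the sketch below maps label descriptions to site categories with a hand-written lookup table. The table contents, the `categories_for_labels` helper, the `min_score` threshold, and the label scores are all assumptions made up for this example; they are not taken from the solution's sample code.

```python
# Hypothetical mapping from Vision API label descriptions (lowercased)
# to the categories a stock photography site might expose to users.
LABEL_TO_CATEGORIES = {
    "coastal and oceanic landforms": {"Seaside Destinations"},
    "sea": {"Seaside Destinations"},
    "tropics": {"Seaside Destinations", "Tropical Getaways"},
}


def categories_for_labels(labels, min_score=0.5):
    """Return the categories matched by (description, score) label pairs.

    Labels scoring below min_score are ignored; unknown labels match nothing.
    """
    matched = set()
    for description, score in labels:
        if score < min_score:
            continue
        matched |= LABEL_TO_CATEGORIES.get(description.lower(), set())
    return matched


# Illustrative labels for a beach photo; the scores are invented.
beach_labels = [
    ("Coastal and Oceanic Landforms", 0.93),
    ("Sea", 0.91),
    ("Tropics", 0.72),
    ("Sky", 0.40),
]

print(sorted(categories_for_labels(beach_labels)))
# prints ['Seaside Destinations', 'Tropical Getaways']
```

A lookup table like this is easy to reason about, but every new label or category has to be added by hand, which is what motivates the more flexible approaches discussed later.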


Associating labels with higher-level groupings is also useful in non-search scenarios. For example, you may want to trigger content workflows and other automated processing when specific labels are returned by Vision API.
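One way to picture such a trigger is a small registry that maps labels to handler functions. Everything in this sketch is hypothetical: the handler names, the registry contents, and the `dispatch` helper are placeholders for whatever workflow system you actually use.

```python
def add_to_travel_collection(image_id):
    """Placeholder workflow step: returns a string describing the action."""
    return f"travel-collection:{image_id}"


def queue_for_text_extraction(image_id):
    """Placeholder workflow step: returns a string describing the action."""
    return f"text-extraction:{image_id}"


# Hypothetical registry: label description (lowercased) -> handlers to run.
LABEL_HANDLERS = {
    "beach": [add_to_travel_collection],
    "document": [queue_for_text_extraction],
}


def dispatch(image_id, label_descriptions):
    """Run every handler registered for any of the image's labels."""
    results = []
    for description in label_descriptions:
        for handler in LABEL_HANDLERS.get(description.lower(), []):
            results.append(handler(image_id))
    return results


print(dispatch("img-001", ["Beach", "Sky"]))
# prints ['travel-collection:img-001']
```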

This solution illustrates different approaches to solving this challenge, from mapping Vision API labels to specific categories, to more sophisticated approaches combining Vision API's broad understanding of images with natural language processing techniques (using word vectors from GloVe) to match images to categories.
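To give a feel for the word-vector approach, here is a minimal sketch that averages the vectors of an image's labels and picks the category whose vector is closest by cosine similarity. The three-dimensional vectors are toy values invented for this example and stand in for real GloVe embeddings, which have far more dimensions; the helper names are also assumptions, and the solution's sample code may work differently.

```python
import math

# Toy vectors standing in for real GloVe embeddings (values are made up).
TOY_VECTORS = {
    "sea": [0.9, 0.1, 0.0],
    "tropics": [0.8, 0.3, 0.1],
    "seaside": [0.85, 0.2, 0.05],
    "skyscraper": [0.0, 0.1, 0.95],
    "city": [0.1, 0.2, 0.9],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mean_vector(words):
    """Average the vectors of the known words; None if no word is known."""
    vectors = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not vectors:
        return None
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def best_category(labels, categories):
    """Return the category whose vector is most similar to the labels' mean."""
    label_vector = mean_vector(labels)
    if label_vector is None:
        return None
    scores = {}
    for category in categories:
        category_vector = TOY_VECTORS.get(category)
        if category_vector is not None:
            scores[category] = cosine(label_vector, category_vector)
    return max(scores, key=scores.get) if scores else None


print(best_category(["sea", "tropics"], ["seaside", "city"]))
# prints seaside
```

Unlike a fixed lookup table, this kind of matching can relate a label to a category it has never been explicitly paired with, as long as both words have vectors.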

To get started, check out the concept and then proceed to the tutorial and sample code. Aside from label detection, Cloud Vision API provides a wide range of capabilities that can be applied to image content analysis, including text extraction, landmark detection, image attributes, and explicit content detection.

If you come up with an interesting application of Cloud Vision API, we'd love to hear about it!