Web Detection Tutorial

Audience

The goal of this tutorial is to help you develop applications using the Google Cloud Vision API Web Detection feature. It assumes you are familiar with basic programming constructs and techniques, but even if you are a beginning programmer, you should be able to follow along and run this tutorial without difficulty, then use the Cloud Vision API reference documentation to create basic applications.

This tutorial steps through a Vision API application, showing you how to make a call to the Vision API to use its Web Detection feature.

Prerequisites

Python

Overview

This tutorial walks you through a basic Vision API application that uses a Web Detection request. A Web Detection response annotates the image sent in the request with:

  • labels obtained from the Web
  • site URLs that have matching images
  • URLs to Web images that partially or fully match the image in the request
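
To make these three kinds of annotation concrete, the sketch below shows roughly how the webDetection part of a response looks once it has been loaded into a Python dict. The key names are the ones read by the code listing in this tutorial, and the sample values are copied from the console output in the final section. Treat it as an illustration of the shape only: a real response may carry additional fields.

```python
# Illustrative shape only. Values are copied from this tutorial's sample
# output; a real response may contain more fields than are shown here.
web_detection = {
    'webEntities': [
        {'description': 'Car', 'score': 1.47328},
        {'description': 'Ferrari F430', 'score': 1.1068367},
    ],
    'pagesWithMatchingImages': [
        {'url': 'https://www.pexels.com/search/race%20car/', 'score': 7.068652},
    ],
    'partialMatchingImages': [
        {'url': 'http://wallppr.net/wp-content/uploads/2016/10/Car-4K-Wallpaper-10.jpeg',
         'score': 1},
    ],
    'fullMatchingImages': [
        {'url': 'http://wallppr.net/wp-content/uploads/2016/10/Car-4K-Wallpaper-10.jpeg',
         'score': 1},
    ],
}

# Report how many items each annotation type holds.
for key in sorted(web_detection):
    print(key + ': ' + str(len(web_detection[key])) + ' item(s)')
```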

Code listing

As you read the code, we recommend that you follow along by referring to the Cloud Vision API Python reference.

import argparse
import base64
import json

from googleapiclient import discovery
from oauth2client.client import GoogleCredentials

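def checkAndPrintKey(keyName, urlResponse):
    """Print the URL and score of each item listed under keyName."""
    print(keyName)
    if keyName not in urlResponse:
        print("no key found:" + keyName)
    else:
        matchingPages = urlResponse[keyName]
        for page in matchingPages:
            # Use .get() so an item without a score prints None
            # instead of raising KeyError.
            print(page['url'] + ":" + str(page.get('score')))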

def main(image_url):
    """Run a request on a single image"""

    credentials = GoogleCredentials.get_application_default()
    service = discovery.build('vision', 'v1', credentials=credentials)

    service_request = service.images().annotate(body={
        'requests':[{
            'image':{
                'source':{
                    'imageUri':image_url
                }
            },
            'features':[
                {
                    'type':'WEB_DETECTION'
                }]
        }]
    })

    # Print a list of Web entities with description and score.
    # Print a list of matching pages.
    # Print a list of partially-matching images.
    # Print a list of fully-matching images.
    apiresponse = service_request.execute()
    data = json.dumps(apiresponse)
    urlresponse = json.loads(data)
    for key, value in urlresponse.items():
        responses = urlresponse[key]
        for response in responses:
            if 'webDetection' not in response:
                print("nothing found")
                return
            webDetect = response['webDetection']
            if 'webEntities' not in webDetect:
                print("no entity found")
                return
            webEntities = webDetect['webEntities']
            for entity in webEntities:
                # Use .get() in case an entity lacks a description or score.
                print(entity.get('description', '') + ":" + str(entity.get('score')))
            checkAndPrintKey('pagesWithMatchingImages', webDetect)
            checkAndPrintKey('partialMatchingImages', webDetect)
            checkAndPrintKey('fullMatchingImages', webDetect)

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('image_url', help='The Web URL of the image to detect.')
    args = parser.parse_args()
    main(args.image_url)

This simple application performs the following tasks:

  • Imports the libraries necessary to run the application
  • Takes a Web image URL as an argument and passes it to the main() function
  • Gets credentials to run the Cloud Vision API service
  • Creates a Cloud Vision annotate image request to send to the service
  • Sends the request and receives the response
  • Parses the response from the service and loops over the results
  • Prints a list of Web entities with description and score
  • Prints a list of matching pages
  • Prints a list of partially-matching images
  • Prints a list of fully-matching images

A closer look

Importing libraries

import argparse
import base64
import json

from googleapiclient import discovery
from oauth2client.client import GoogleCredentials

We import standard libraries:

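  • argparse to allow the application to accept the image URL as a command-line argument
  • base64, which can encode image data as text for a JSON request (this listing does not call it, because the image is passed by URL)
  • json to serialize and parse the response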

Other imports:

  • The discovery module within the googleapiclient library builds a client for the service from the API's discovery document.
  • The GoogleCredentials class within the oauth2client.client module handles authentication to the service.

Running the application

def main(image_url):
    """Run a request on a single image"""
    ...

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('image_url', help='The Web URL of the image to detect.')
    args = parser.parse_args()
    main(args.image_url)

Here, we simply parse the passed-in argument that specifies the URL of the web image, and pass it to the main() function.
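
The argument handling can be tried without touching the API by handing argparse an explicit argument list instead of the real command line. This is only a sketch: the helper name parse_image_url and the example.com URL are made up for illustration and are not part of the tutorial's listing.

```python
import argparse


def parse_image_url(argv):
    """Parse a list of command-line arguments and return the image URL.

    parse_image_url is a hypothetical helper used only for this example.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument('image_url', help='The Web URL of the image to detect.')
    # Passing argv explicitly keeps argparse from reading sys.argv.
    return parser.parse_args(argv).image_url


# Placeholder URL, used here only to exercise the parser.
print(parse_image_url(['http://example.com/car.jpg']))
```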

Authenticating to the API

    credentials = GoogleCredentials.get_application_default()
    service = discovery.build('vision', 'v1', credentials=credentials)

Before communicating with the Vision API service, you must authenticate your application using previously acquired credentials. Within an application, the simplest way to obtain credentials is to use Application Default Credentials (ADC), which we obtain by calling the get_application_default() method. By default, this method attempts to obtain credentials from the GOOGLE_APPLICATION_CREDENTIALS environment variable, which should be set to point to your service account's JSON key file (see Set Up a Service Account for more information).

We then call discovery.build('vision', 'v1', credentials=credentials) to build a client object for version v1 of the Vision API. The resulting service object provides the images().annotate() method that we use to send the request.
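
Because a missing or mistyped key file path is a common cause of authentication failures, it can help to check the environment variable before building the service. The sketch below uses only the standard library and is an optional addition, not part of the tutorial's listing. The function name credentials_file_problem is hypothetical, and since Application Default Credentials can also come from other sources, an unset variable is not necessarily an error in every environment.

```python
import os


def credentials_file_problem(environ=None):
    """Return a message describing a key file problem, or None if none is found.

    credentials_file_problem is a hypothetical helper for this example.
    It checks only the GOOGLE_APPLICATION_CREDENTIALS variable; it does not
    validate the contents of the key file.
    """
    environ = os.environ if environ is None else environ
    path = environ.get('GOOGLE_APPLICATION_CREDENTIALS')
    if not path:
        return 'GOOGLE_APPLICATION_CREDENTIALS is not set'
    if not os.path.isfile(path):
        return 'key file not found: ' + path
    return None


print(credentials_file_problem() or 'key file setting looks usable')
```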

Constructing the request

    service_request = service.images().annotate(body={
        'requests':[{
            'image':{
                'source':{
                    'imageUri':image_url
                }
            },
            'features':[
                {
                    'type':'WEB_DETECTION'
                }]
        }]
    })

Now that our Vision API service is ready, we can construct a request to the service. Requests to the Google Cloud Vision API are provided as JSON objects. See the Vision API Reference for complete information on the structure of a request.

This code snippet performs the following tasks:

  1. Constructs the JSON for a POST request to the images().annotate() method.
  2. Injects the Web URL of the image that we want to send to the service.
  3. Indicates that our annotate method should perform WEB_DETECTION.
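
The request body is an ordinary Python dict, so it can be built and inspected on its own before it is handed to annotate(). The sketch below factors the three steps above into a helper. The name build_web_detection_body and the example.com URL are hypothetical. The optional maxResults field is, to the best of our knowledge, the Feature field that limits the number of results, but it does not appear in the tutorial's listing, so confirm it against the Vision API Reference before relying on it.

```python
import json


def build_web_detection_body(image_url, max_results=None):
    """Build the body for an images().annotate() Web Detection request.

    build_web_detection_body is a hypothetical helper for this example.
    """
    feature = {'type': 'WEB_DETECTION'}
    if max_results is not None:
        # Assumption: maxResults caps the number of results for this feature.
        feature['maxResults'] = max_results
    return {
        'requests': [{
            'image': {'source': {'imageUri': image_url}},
            'features': [feature],
        }]
    }


# Placeholder URL; print the body as the JSON that would be sent.
body = build_web_detection_body('http://example.com/car.jpg', max_results=5)
print(json.dumps(body, indent=2, sort_keys=True))
```

The resulting dict can be passed as the body argument in place of the inline literal, for example service.images().annotate(body=body).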

Parsing the response

    apiresponse = service_request.execute()
    data = json.dumps(apiresponse)
    urlresponse = json.loads(data)
    for key, value in urlresponse.items():
        responses = urlresponse[key]
        for response in responses:
            if 'webDetection' not in response:
                print("nothing found")
                return
            webDetect = response['webDetection']
            if 'webEntities' not in webDetect:
                print("no entity found")
                return
            webEntities = webDetect['webEntities']
            for entity in webEntities:
                # Use .get() in case an entity lacks a description or score.
                print(entity.get('description', '') + ":" + str(entity.get('score')))
            checkAndPrintKey('pagesWithMatchingImages', webDetect)
            checkAndPrintKey('partialMatchingImages', webDetect)
            checkAndPrintKey('fullMatchingImages', webDetect)

Once the operation has completed, the response contains a responses list that holds one AnnotateImageResponse for each image sent in the request. Because we sent only one image in the request, the list has a single entry. We walk through its webDetection annotation and print the entities and URLs it contains (sample results for each annotation type are shown in the next section).
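
The parsing logic can be exercised without calling the service by running it against a hand-written response dict. The following is a minimal sketch rather than the tutorial's own code: the function name summarize_web_detection is hypothetical, the sample dict reuses values from this tutorial's console output, and the function collects lines in a list instead of printing as it goes.

```python
def summarize_web_detection(api_response):
    """Return the lines the tutorial prints for a Web Detection response.

    summarize_web_detection is a hypothetical helper for this example.
    """
    lines = []
    for response in api_response.get('responses', []):
        web_detect = response.get('webDetection', {})
        for entity in web_detect.get('webEntities', []):
            lines.append('{}:{}'.format(entity.get('description', ''),
                                        entity.get('score')))
        for key in ('pagesWithMatchingImages', 'partialMatchingImages',
                    'fullMatchingImages'):
            lines.append(key)
            # A missing key is treated as an empty list of matches.
            for item in web_detect.get(key, []):
                lines.append('{}:{}'.format(item['url'], item.get('score')))
    return lines


# Hand-written sample in the same shape as a parsed API response.
sample = {'responses': [{'webDetection': {
    'webEntities': [{'description': 'Car', 'score': 1.47328}],
    'pagesWithMatchingImages': [
        {'url': 'https://www.pexels.com/search/race%20car/', 'score': 7.068652},
    ],
}}]}

for line in summarize_web_detection(sample):
    print(line)
```

Unlike the listing, this variant does not return early when webEntities is absent, so any matching pages and images are still reported.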

Running the application

To run the application, we pass in the Web URL (http://wallppr.net/wp-content/uploads/2016/10/Car-4K-Wallpaper-10.jpeg) of the following car image.

Here is the Python command with the passed-in Web URL of the car image, followed by console output. A relevancy score follows each listed entity and URL after a colon (for example, :1.1068367). Scores are not normalized and are not comparable across different image queries.

python web_detect.py "http://wallppr.net/wp-content/uploads/2016/10/Car-4K-Wallpaper-10.jpeg"
Car:1.47328
Ferrari F430:1.1068367
Ferrari S.p.A.:0.75906
Auto racing:0.73261
Sports car:0.5722
pagesWithMatchingImages
https://www.pexels.com/search/race%20car/:7.068652
https://www.pexels.com/photo/road-blue-car-vehicle-50704/:3.7592764
...
partialMatchingImages
https://i1.wp.com/representeveryone.com/wp-content/uploads/2016/05/car_new.jpg?fit=5016%2C3344&ssl=1:1
http://wallppr.net/wp-content/uploads/2016/10/Car-4K-Wallpaper-10.jpeg:1
...
fullMatchingImages
https://i1.wp.com/representeveryone.com/wp-content/uploads/2016/05/car_new.jpg?fit=5016%2C3344&ssl=1:1
http://wallppr.net/wp-content/uploads/2016/10/Car-4K-Wallpaper-10.jpeg:1
...
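
If you post-process this output, note that URLs themselves contain colons, so a result line has to be split at its last colon rather than its first. A small sketch follows, with the hypothetical helper name split_result_line. It applies only to result lines, not to section headers such as pagesWithMatchingImages or to the elided "..." lines.

```python
def split_result_line(line):
    """Split 'label:score' at the last colon and return (label, score).

    split_result_line is a hypothetical helper for this example.
    """
    label, _, score = line.rpartition(':')
    return label, float(score)


print(split_result_line('Ferrari F430:1.1068367'))
print(split_result_line('https://www.pexels.com/search/race%20car/:7.068652'))
```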

Congratulations! You've performed Web Detection using the Google Cloud Vision API!
