Using Python to send requests

This example uses a Python script to construct a Cloud Vision AnnotateImageRequest, then sends the request to the Vision API from the Python interactive shell.

import argparse
import base64
import json


def main(input_file, output_filename):
    """Translates the input file into a JSON output file.

    Args:
        input_file: a file object, containing lines of input to convert.
        output_filename: the name of the file to write the JSON to.
    """
    request_list = []
    for line in input_file:
        image_filename, features = line.strip().split(' ', 1)

        with open(image_filename, 'rb') as image_file:
            content_json_obj = {
                'content': base64.b64encode(image_file.read()).decode('UTF-8')
            }

        feature_json_obj = []
        for word in features.split(' '):
            feature, max_results = word.split(':', 1)
            feature_json_obj.append({
                'type': get_detection_type(feature),
                'maxResults': int(max_results),
            })
            # Display a summary of each requested annotation on the console.
            print('detect:%s' % get_detection_type(feature))
            print('results: %s' % int(max_results))

        request_list.append({
            'features': feature_json_obj,
            'image': content_json_obj,
        })

    with open(output_filename, 'w') as output_file:
        json.dump({'requests': request_list}, output_file)


DETECTION_TYPES = [
    'TYPE_UNSPECIFIED',
    'FACE_DETECTION',
    'LANDMARK_DETECTION',
    'LOGO_DETECTION',
    'LABEL_DETECTION',
    'TEXT_DETECTION',
    'SAFE_SEARCH_DETECTION',
]


def get_detection_type(detect_num):
    """Return the Vision API symbol corresponding to the given number."""
    detect_num = int(detect_num)
    if 0 < detect_num < len(DETECTION_TYPES):
        return DETECTION_TYPES[detect_num]
    else:
        return DETECTION_TYPES[0]


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '-i', dest='input_file', required=True,
        type=argparse.FileType('r'),
        help='The input file to convert to JSON.')
    parser.add_argument(
        '-o', dest='output_filename', required=True,
        help='The name of the JSON file to write.')
    args = parser.parse_args()
    main(args.input_file, args.output_filename)

The Python script reads an input text file that specifies the feature detection to perform on a set of images. Each line in the input file contains the path to an image followed by one or more feature specifiers of the form feature:max_results. Features are specified by an integer value from 1 to 6:

1: FACE_DETECTION
2: LANDMARK_DETECTION
3: LOGO_DETECTION
4: LABEL_DETECTION
5: TEXT_DETECTION
6: SAFE_SEARCH_DETECTION

For example, the following input file to the script requests face and label detection annotations for image1, and landmark and logo detection annotations for image2; each with a maximum of 10 results per annotation.

filepath_to_image1.jpg 1:10 4:10
filepath_to_image2.png 2:10 3:10
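For reference, a minimal sketch of how one such line is parsed into an image path plus (feature name, max results) pairs; `parse_line` is a hypothetical helper that mirrors the script's logic, and DETECTION_TYPES is copied from the script:

```python
# Mapping copied from the script: index 1-6 selects a Vision API feature type.
DETECTION_TYPES = [
    'TYPE_UNSPECIFIED',
    'FACE_DETECTION',
    'LANDMARK_DETECTION',
    'LOGO_DETECTION',
    'LABEL_DETECTION',
    'TEXT_DETECTION',
    'SAFE_SEARCH_DETECTION',
]


def parse_line(line):
    """Parse 'path feature:max ...' into (path, [(feature_name, max_results), ...])."""
    image_path, specs = line.strip().split(' ', 1)
    features = []
    for spec in specs.split(' '):
        num, max_results = spec.split(':', 1)
        features.append((DETECTION_TYPES[int(num)], int(max_results)))
    return image_path, features


print(parse_line('filepath_to_image1.jpg 1:10 4:10'))
# → ('filepath_to_image1.jpg', [('FACE_DETECTION', 10), ('LABEL_DETECTION', 10)])
```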

You can download the script and run it, specifying the input and output files as follows:

python generatejson.py -i inputfile -o outputfile

This example requests Cloud Vision annotations for an image of the Google logo, google.jpg.

The input file to the script, visioninfile.txt, contains the image file path and three feature specifiers: logo (3), label (4), and text (5) annotations, with a maximum of 10 results for each annotation:

/Users/username/testdata/google.jpg 3:10 4:10 5:10

The script is executed with input and output file arguments, and the output is written to vision.json. Note: to aid readability, line-continuation "\" characters are used below to split the command across several lines; when executing the command, supply the command and all arguments on a single line.

python generatejson.py \
    -i /Users/username/testdata/visioninfile.txt \
    -o /Users/username/testdata/vision.json

The script processes the input file and displays a summary of the requested feature annotations on the console,

detect:LOGO_DETECTION
results: 10
detect:LABEL_DETECTION
results: 10
detect:TEXT_DETECTION
results: 10

and writes the assembled JSON request to vision.json (the content below is truncated and reformatted for readability):

{
  "requests": [
    {
      "image": {
        "content": "/9j/4...A//9k="
      },
      "features": [
        {
          "type": "LOGO_DETECTION",
          "maxResults": 10
        },
        {
          "type": "LABEL_DETECTION",
          "maxResults": 10
        },
        {
          "type": "TEXT_DETECTION",
          "maxResults": 10
        }
      ]
    }
  ]
}
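The same request body can also be assembled directly in Python without an intermediate file. The sketch below uses placeholder image bytes, since google.jpg is not included here; in practice you would read the real file:

```python
import base64
import json

# Placeholder bytes stand in for the real image contents
# (in practice: open('google.jpg', 'rb').read()).
image_bytes = b'\xff\xd8\xff\xe0fake-jpeg-data'

request_body = {
    'requests': [{
        'image': {
            'content': base64.b64encode(image_bytes).decode('UTF-8')
        },
        'features': [
            {'type': 'LOGO_DETECTION', 'maxResults': 10},
            {'type': 'LABEL_DETECTION', 'maxResults': 10},
            {'type': 'TEXT_DETECTION', 'maxResults': 10},
        ],
    }]
}

data = json.dumps(request_body)  # JSON string ready to POST to images:annotate
```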

The Python interactive shell is then used to send the request to Cloud Vision and display the response. Note that the label annotations in the response are sorted from highest to lowest confidence score.

python
...
>>> import requests
>>> data = open('/Users/<username>/testdata/vision.json', 'rb').read()
>>> response = requests.post(url='https://vision.googleapis.com/v1/images:annotate?key=<API-key>',
    data=data,
    headers={'Content-Type': 'application/json'})
>>> print(response.text)
{
  "responses": [
    {
      "logoAnnotations": [
        {
          "mid": "/m/045c7b",
          "description": "Google",
          "score": 0.35000956,
          "boundingPoly": {
            "vertices": [
              {
                "x": 158,
                "y": 50
              },
              {
                "x": 515,
                "y": 50
              },
              {
                "x": 515,
                "y": 156
              },
              {
                "x": 158,
                "y": 156
              }
            ]
          }
        }
      ],
      "labelAnnotations": [
        {
          "mid": "/m/021sdg",
          "description": "graphics",
          "score": 0.67143095
        },
        {
          "mid": "/m/0dgsmq8",
          "description": "artwork",
          "score": 0.66358012
        },
        {
          "mid": "/m/0dwx7",
          "description": "logo",
          "score": 0.31318793
        },
        {
          "mid": "/m/01mf0",
          "description": "software",
          "score": 0.23124418
        },
        {
          "mid": "/m/03g09t",
          "description": "clip art",
          "score": 0.20368107
        },
        {
          "mid": "/m/02ngh",
          "description": "emoticon",
          "score": 0.19831011
        },
        {
          "mid": "/m/0h8npc5",
          "description": "digital content software",
          "score": 0.1769385
        },
        {
          "mid": "/m/03tqj",
          "description": "icon",
          "score": 0.097528793
        },
        {
          "mid": "/m/0hr95w1",
          "description": "pointer",
          "score": 0.03663468
        },
        {
          "mid": "/m/0n0j",
          "description": "area",
          "score": 0.033584446
        }
      ],
      "textAnnotations": [
        {
          "locale": "en",
          "description": "Google\n",
          "boundingPoly": {
            "vertices": [
              {
                "x": 61,
                "y": 26
              },
              {
                "x": 598,
                "y": 26
              },
              {
                "x": 598,
                "y": 227
              },
              {
                "x": 61,
                "y": 227
              }
            ]
          }
        }
      ]
    }
  ]
}
>>>
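To work with the response programmatically rather than printing raw text, parse it as JSON. The sketch below extracts label descriptions and scores from a trimmed sample of the response above; in a live session you would use response.json() from the requests library instead of the hard-coded dict:

```python
# Trimmed sample of the labelAnnotations shown in the response above.
sample_response = {
    'responses': [{
        'labelAnnotations': [
            {'mid': '/m/021sdg', 'description': 'graphics', 'score': 0.67143095},
            {'mid': '/m/0dgsmq8', 'description': 'artwork', 'score': 0.66358012},
            {'mid': '/m/0dwx7', 'description': 'logo', 'score': 0.31318793},
        ],
    }]
}

labels = sample_response['responses'][0]['labelAnnotations']
# Labels arrive sorted highest-to-lowest; sorting explicitly makes no assumption.
for label in sorted(labels, key=lambda l: l['score'], reverse=True):
    print('%-10s %.3f' % (label['description'], label['score']))
```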
