Audience
The goal of this tutorial is to help you develop applications using the Vision API Crop Hints feature. It assumes you are familiar with basic programming constructs and techniques. However, even if you are a beginning programmer, you should be able to follow along and run this tutorial without difficulty, then use the Vision API reference documentation to create basic applications.
This tutorial steps through a Vision API application, showing you how to make a call to the Vision API to use its Crop Hints feature.
Prerequisites
- Set up a Vision API project in the Google Cloud console.
- Set up your environment for using Application Default Credentials.
Python
- Install Python.
- Install pip.
- Install the Google Cloud Client Library.
- Install the Python Imaging Library.
Overview
This tutorial walks you through a basic Vision API application that uses a Crop Hints request. You can provide the image to be processed either through a Cloud Storage URI (Cloud Storage bucket location) or embedded in the request. A successful Crop Hints response returns the coordinates for a bounding box cropped around the dominant object or face in the image.
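To illustrate the two ways of supplying the image, the sketch below builds the JSON body of a Crop Hints request by hand, once with a Cloud Storage URI and once with the image bytes embedded as base64. The helper name build_crop_hints_request and the bucket path are hypothetical. The field names follow the Vision API REST images:annotate request format as we understand it; the client library normally assembles this structure for you.

```python
import base64
import json


def build_crop_hints_request(image_uri=None, content=None, aspect_ratios=(1.77,)):
    """Build a REST-style Crop Hints request body (hypothetical helper).

    Exactly one of `image_uri` (for example a gs:// path) or `content`
    (raw image bytes) must be given.
    """
    if (image_uri is None) == (content is None):
        raise ValueError("Provide exactly one of image_uri or content.")

    if image_uri is not None:
        # Image referenced by location, such as a Cloud Storage object.
        image = {"source": {"imageUri": image_uri}}
    else:
        # Image embedded in the request as base64-encoded bytes.
        image = {"content": base64.b64encode(content).decode("ascii")}

    return {
        "requests": [
            {
                "image": image,
                "features": [{"type": "CROP_HINTS"}],
                "imageContext": {
                    "cropHintsParams": {"aspectRatios": list(aspect_ratios)}
                },
            }
        ]
    }


if __name__ == "__main__":
    # The bucket and object names below are placeholders.
    body = build_crop_hints_request(image_uri="gs://my-bucket/cat.jpg")
    print(json.dumps(body, indent=2))
```

Passing content=open("cat.jpg", "rb").read() instead of image_uri would produce the embedded form of the same request.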
Code listing
As you read the code, we recommend that you follow along by referring to the Cloud Vision API Python reference.
A closer look
Importing libraries
We import standard libraries:
- argparse to allow the application to accept input filenames as arguments
- io for file I/O
Other imports:
- The ImageAnnotatorClient class within the google.cloud.vision library for accessing the Vision API.
- The types module within the google.cloud.vision library for constructing requests.
- The Image and ImageDraw modules from the Python Imaging Library (PIL) to draw a bounding box on the input image.
Running the application
Here, we parse the command-line arguments: the local image filename and a mode that selects whether to crop the image or draw the hint. The filename is then passed to the matching function.
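A minimal sketch of that argument handling, using only argparse, might look like the following. The argument names image_file and mode, and the helper names build_parser and main, are assumptions made for illustration; the dispatch to the real crop and draw functions is replaced by a print statement so the snippet runs on its own.

```python
import argparse


def build_parser():
    """Create the command-line parser (argument names are assumptions)."""
    parser = argparse.ArgumentParser(description="Vision API Crop Hints demo")
    parser.add_argument("image_file", help="Path to the local image file.")
    parser.add_argument(
        "mode",
        choices=["crop", "draw"],
        help="'crop' to crop the image, 'draw' to outline the hint.",
    )
    return parser


def main(argv=None):
    """Parse arguments; argv=None makes argparse read sys.argv."""
    args = build_parser().parse_args(argv)
    # A full application would dispatch to crop_to_hint or draw_hint here;
    # this sketch only reports the choice.
    print(f"Would {args.mode} {args.image_file}")
    return args


if __name__ == "__main__":
    # Explicit argument list for demonstration; call main() with no
    # arguments to read the real command line instead.
    main(["cat.jpg", "crop"])
```

Restricting mode with choices means an unsupported mode is rejected by argparse with a usage message before any API call is made.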
Authenticating to the API
Before communicating with the Vision API service, you must authenticate your service using previously acquired credentials. Within an application, the simplest way to obtain credentials is to use Application Default Credentials (ADC). By default, the client library attempts to obtain credentials from the GOOGLE_APPLICATION_CREDENTIALS environment variable, which should be set to point to your service account's JSON key file (see Set Up a Service Account for more information).
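Before making the first API call, it can help to check what the environment variable points to. The sketch below is one way to do that; describe_credentials_source and make_client are hypothetical helper names, and the client is created without explicit credentials on the assumption that the library resolves them through ADC as described above.

```python
import os


def describe_credentials_source(environ=os.environ):
    """Report what GOOGLE_APPLICATION_CREDENTIALS currently points to."""
    key_path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not os.path.isfile(key_path):
        return f"GOOGLE_APPLICATION_CREDENTIALS points to a missing file: {key_path}"
    return f"Using service account key file: {key_path}"


def make_client():
    """Create a Vision client; credentials are resolved by the library."""
    # Imported lazily so the check above works without the library installed.
    from google.cloud import vision

    return vision.ImageAnnotatorClient()


if __name__ == "__main__":
    print(describe_credentials_source())
```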
Getting crop hint annotations for the image
Now that the Vision client library is authenticated, we can access the service by calling the crop_hints method of the ImageAnnotatorClient instance.
The aspect ratio for the output is specified in an ImageContext object; if multiple aspect ratios are passed in, then multiple crop hints are returned, one for each aspect ratio.
The client library encapsulates the details for requests and responses to the API. See the Vision API Reference for complete information on the structure of a request.
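The call described above can be sketched as follows. This assumes google-cloud-vision 2.x or later, where Image, CropHintsParams, and ImageContext are available directly on the vision module (older releases exposed them through the types module), and it assumes working credentials. The function name get_crop_hint_vertices is hypothetical.

```python
def get_crop_hint_vertices(path, aspect_ratios=(1.77,)):
    """Request crop hints for a local image and return the first hint's vertices.

    Assumes google-cloud-vision 2.x or later and working credentials.
    """
    # Imported inside the function so the module loads without the library.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()

    # Embed the image bytes in the request.
    with open(path, "rb") as image_file:
        content = image_file.read()
    image = vision.Image(content=content)

    # One crop hint is requested per aspect ratio (width divided by height).
    params = vision.CropHintsParams(aspect_ratios=list(aspect_ratios))
    image_context = vision.ImageContext(crop_hints_params=params)

    response = client.crop_hints(image=image, image_context=image_context)
    if response.error.message:
        raise RuntimeError(response.error.message)

    hints = response.crop_hints_annotation.crop_hints
    return [(vertex.x, vertex.y) for vertex in hints[0].bounding_poly.vertices]
```

Returning plain (x, y) tuples rather than the library's vertex objects keeps the later cropping and drawing steps independent of the client library.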
Using the response to crop or draw the hint's bounding box
Once the operation has been completed successfully, the API response contains the bounding box coordinates of one or more cropHints. The draw_hint method draws lines around the CropHints bounding box, then writes the image to output-hint.jpg.
The crop_to_hint method crops the image using the suggested crop hint.
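The two operations can be sketched with Pillow as shown below. The signatures here are assumptions: these versions take the vertices as a list of (x, y) tuples rather than calling the API themselves, the helper hint_to_box is hypothetical, and the output filename output-crop.jpg is an assumed name chosen to mirror output-hint.jpg. The sketch also assumes a JPEG-compatible input image.

```python
def hint_to_box(vertices):
    """Convert bounding-polygon (x, y) vertices to (left, top, right, bottom)."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (min(xs), min(ys), max(xs), max(ys))


def crop_to_hint(image_path, vertices, out_path="output-crop.jpg"):
    """Crop the image to the hint and save it (requires Pillow)."""
    from PIL import Image  # lazy import: only needed when actually cropping

    with Image.open(image_path) as im:
        im.crop(hint_to_box(vertices)).save(out_path, "JPEG")


def draw_hint(image_path, vertices, out_path="output-hint.jpg"):
    """Outline the hint on the image and save it (requires Pillow)."""
    from PIL import Image, ImageDraw

    with Image.open(image_path) as im:
        draw = ImageDraw.Draw(im)
        # Repeat the first vertex so the outline is closed.
        outline = list(vertices) + [vertices[0]]
        draw.line(outline, fill="red", width=3)
        im.save(out_path, "JPEG")
```

Taking the minimum and maximum of the coordinates, instead of relying on vertex order, keeps hint_to_box correct even if the vertices arrive in a different order.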
Running the application
To run the application, you can download this cat.jpg file (you may need to right-click the link), then pass the location where you downloaded the file on your local machine to the tutorial application (crop_hints.py).
Here is the Python command, followed by console output, which displays the JSON cropHintsAnnotation response. This response includes the coordinates of the cropHints bounding box. We requested a crop area with a 1.77 width-to-height aspect ratio, and the returned top-left and bottom-right x,y coordinates of the crop rectangle are 0,336 and 1100,967.
python crop_hints.py cat.jpg crop
{
  "responses": [
    {
      "cropHintsAnnotation": {
        "cropHints": [
          {
            "boundingPoly": {
              "vertices": [
                { "y": 336 },
                { "x": 1100, "y": 336 },
                { "x": 1100, "y": 967 },
                { "y": 967 }
              ]
            },
            "confidence": 0.79999995,
            "importanceFraction": 0.69
          }
        ]
      }
    }
  ]
}
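Two of the vertices in this output have no x key. JSON output of this kind commonly omits fields whose value is 0, which is consistent with the 0,336 top-left corner quoted above. The sketch below reads the response with the standard json module and fills in the missing coordinates as 0; the function name crop_rectangle is hypothetical.

```python
import json

# The response shown above, reproduced as a string for this example.
RESPONSE_JSON = """
{"responses": [{"cropHintsAnnotation": {"cropHints": [{
  "boundingPoly": {"vertices": [
    {"y": 336}, {"x": 1100, "y": 336}, {"x": 1100, "y": 967}, {"y": 967}
  ]},
  "confidence": 0.79999995,
  "importanceFraction": 0.69
}]}}]}
"""


def crop_rectangle(response_text):
    """Return ((left, top), (right, bottom)) for the first crop hint."""
    data = json.loads(response_text)
    hint = data["responses"][0]["cropHintsAnnotation"]["cropHints"][0]
    # Missing "x" or "y" keys are treated as 0.
    points = [
        (vertex.get("x", 0), vertex.get("y", 0))
        for vertex in hint["boundingPoly"]["vertices"]
    ]
    # Vertices are listed top-left, top-right, bottom-right, bottom-left.
    return points[0], points[2]


if __name__ == "__main__":
    top_left, bottom_right = crop_rectangle(RESPONSE_JSON)
    print(top_left, bottom_right)
```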
And here is the cropped image.
Congratulations! You've run the Cloud Vision Crop Hints API to return the optimized bounding box coordinates around the dominant object detected in the image!