Detect multiple objects

The Cloud Vision API can detect and extract multiple objects in an image with Object Localization.

Object localization identifies multiple objects in an image and provides a LocalizedObjectAnnotation for each one. Each LocalizedObjectAnnotation identifies the object (its name and a Knowledge Graph mid), a confidence score, and the normalized rectangular bounds for the region of the image that contains the object.

Object localization identifies both significant and less-prominent objects in an image.

Object information is returned in English only. The Cloud Translation API can translate English labels into a number of other languages.
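For example, the following sketch sends a returned label to the Cloud Translation v2 REST endpoint; the label Bicycle wheel and target language fr are illustrative:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
https://translation.googleapis.com/language/translate/v2 \
-d '{"q": "Bicycle wheel", "target": "fr"}'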

[Image: a street scene containing a bicycle, with detected objects outlined by bounding boxes]
Image credit: Bogdan Dada, Unsplash (annotations added).

For example, the API might return the following information and bounding location data for the objects in the image above:

Name            mid          Score        Bounds
Bicycle wheel   /m/01bqk0    0.89648587   (0.32076266, 0.78941387), (0.43812272, 0.78941387), (0.43812272, 0.97331065), (0.32076266, 0.97331065)
Bicycle         /m/0199g     0.886761     (0.312, 0.6616471), (0.638353, 0.6616471), (0.638353, 0.9705882), (0.312, 0.9705882)
Bicycle wheel   /m/01bqk0    0.6345275    (0.5125398, 0.760708), (0.6256646, 0.760708), (0.6256646, 0.94601655), (0.5125398, 0.94601655)
Picture frame   /m/06z37_    0.6207608    (0.79177403, 0.16160682), (0.97047985, 0.16160682), (0.97047985, 0.31348917), (0.79177403, 0.31348917)
Tire            /m/0h9mv     0.55886006   (0.32076266, 0.78941387), (0.43812272, 0.78941387), (0.43812272, 0.97331065), (0.32076266, 0.97331065)
Door            /m/02dgv     0.5160098    (0.77569866, 0.37104446), (0.9412425, 0.37104446), (0.9412425, 0.81507325), (0.77569866, 0.81507325)

mid contains a machine-generated identifier (MID) corresponding to a label's Google Knowledge Graph entry. For information on inspecting mid values, see the Google Knowledge Graph Search API documentation.
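For example, you could look up the Bicycle wheel entity from the table above with a request like the following sketch (API_KEY is a placeholder for your own API key):

curl "https://kgsearch.googleapis.com/v1/entities:search?ids=/m/01bqk0&key=API_KEY&limit=1"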

Object Localization requests

Before making requests, set up your GCP project and authentication (see the Vision API Quickstart Using Client Libraries).

Detect objects in a local image

The Vision API can perform feature detection on a local image file by sending the contents of the image file as a base64 encoded string in the body of your request.
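For example, you might create the base64 string with the base64 command-line tool before pasting it into the request body; note that the flag syntax differs between the GNU (Linux) and macOS versions:

base64 -w 0 ./image.jpg > image.b64    # Linux (GNU coreutils); -w 0 disables line wrapping
base64 -i ./image.jpg -o image.b64     # macOS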

Command-line

To make an object detection and localization request, make a POST request to the following endpoint:

POST https://vision.googleapis.com/v1/images:annotate

To make the request using curl from the Linux or macOS command line, specify OBJECT_LOCALIZATION as the value of features.type, as shown in the following example:

curl -X POST \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
-H "Content-Type: application/json; charset=utf-8" \
https://vision.googleapis.com/v1/images:annotate -d "{
  'requests': [
    {
      'image': {
        'content': '/9j/7QBEUGhvdG9zaG9...base64-encoded-image-content...fXNWzvDEeYxxxzj/Coa6Bax//Z'
      },
      'features': [
        {
          'maxResults': 10,
          'type': 'OBJECT_LOCALIZATION'
        }
      ]
    }
  ]
}"

See the AnnotateImageRequest reference documentation for more information on configuring the request body.

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format.
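
For the bicycle image above, an abridged successful response has the following form; the values are taken from the first row of the table above, and the remaining annotations are elided:

{
  "responses": [
    {
      "localizedObjectAnnotations": [
        {
          "mid": "/m/01bqk0",
          "name": "Bicycle wheel",
          "score": 0.89648587,
          "boundingPoly": {
            "normalizedVertices": [
              {"x": 0.32076266, "y": 0.78941387},
              {"x": 0.43812272, "y": 0.78941387},
              {"x": 0.43812272, "y": 0.97331065},
              {"x": 0.32076266, "y": 0.97331065}
            ]
          }
        },
        ...
      ]
    }
  ]
}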

C#

Before trying this sample, follow the C# setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API C# API reference documentation.

// Assumes `using Google.Cloud.Vision.V1;` and that filePath holds the path
// to your local image file.
var image = Image.FromFile(filePath);

var client = ImageAnnotatorClient.Create();
var response = client.DetectLocalizedObjects(image);

Console.WriteLine($"Number of objects found: {response.Count}");
foreach (var localizedObject in response)
{
    Console.Write($"\n{localizedObject.Name}");
    Console.WriteLine($" (confidence: {localizedObject.Score})");
    Console.WriteLine("Normalized bounding polygon vertices: ");

    foreach (var vertex in localizedObject.BoundingPoly.NormalizedVertices)
    {
        Console.WriteLine($" - ({vertex.X}, {vertex.Y})");
    }
}

Go

Before trying this sample, follow the Go setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API Go API reference documentation.


// localizeObjects gets objects and bounding boxes from the Vision API for an image at the given file path.
func localizeObjects(w io.Writer, file string) error {
	ctx := context.Background()

	client, err := vision.NewImageAnnotatorClient(ctx)
	if err != nil {
		return err
	}

	f, err := os.Open(file)
	if err != nil {
		return err
	}
	defer f.Close()

	image, err := vision.NewImageFromReader(f)
	if err != nil {
		return err
	}
	annotations, err := client.LocalizeObjects(ctx, image, nil)
	if err != nil {
		return err
	}

	if len(annotations) == 0 {
		fmt.Fprintln(w, "No objects found.")
		return nil
	}

	fmt.Fprintln(w, "Objects:")
	for _, annotation := range annotations {
		fmt.Fprintln(w, annotation.Name)
		fmt.Fprintln(w, annotation.Score)

		for _, v := range annotation.BoundingPoly.NormalizedVertices {
			fmt.Fprintf(w, "(%f,%f)\n", v.X, v.Y)
		}
	}

	return nil
}

Java

Before trying this sample, follow the Java setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API Java API reference documentation.

/**
 * Detects localized objects in the specified local image.
 *
 * @param filePath The path to the file to perform localized object detection on.
 * @param out A {@link PrintStream} to write detected objects to.
 * @throws Exception on errors while closing the client.
 * @throws IOException on Input/Output errors.
 */
public static void detectLocalizedObjects(String filePath, PrintStream out)
    throws Exception, IOException {
  List<AnnotateImageRequest> requests = new ArrayList<>();

  ByteString imgBytes = ByteString.readFrom(new FileInputStream(filePath));

  Image img = Image.newBuilder().setContent(imgBytes).build();
  AnnotateImageRequest request =
      AnnotateImageRequest.newBuilder()
          .addFeatures(Feature.newBuilder().setType(Type.OBJECT_LOCALIZATION))
          .setImage(img)
          .build();
  requests.add(request);

  // Perform the request
  try (ImageAnnotatorClient client = ImageAnnotatorClient.create()) {
    BatchAnnotateImagesResponse response = client.batchAnnotateImages(requests);
    List<AnnotateImageResponse> responses = response.getResponsesList();

    // Display the results
    for (AnnotateImageResponse res : responses) {
      for (LocalizedObjectAnnotation entity : res.getLocalizedObjectAnnotationsList()) {
        out.format("Object name: %s\n", entity.getName());
        out.format("Confidence: %s\n", entity.getScore());
        out.format("Normalized Vertices:\n");
        entity
            .getBoundingPoly()
            .getNormalizedVerticesList()
            .forEach(vertex -> out.format("- (%s, %s)\n", vertex.getX(), vertex.getY()));
      }
    }
  }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API Node.js API reference documentation.

// Imports the Google Cloud client libraries
const vision = require('@google-cloud/vision');
const fs = require('fs');

// Creates a client
const client = new vision.ImageAnnotatorClient();

async function localizeObjects() {
  /**
   * TODO(developer): Uncomment the following line before running the sample.
   */
  // const fileName = `/path/to/localImage.png`;
  const request = {
    image: {content: fs.readFileSync(fileName)},
  };

  const [result] = await client.objectLocalization(request);
  const objects = result.localizedObjectAnnotations;
  objects.forEach(object => {
    console.log(`Name: ${object.name}`);
    console.log(`Confidence: ${object.score}`);
    const vertices = object.boundingPoly.normalizedVertices;
    vertices.forEach(v => console.log(`x: ${v.x}, y: ${v.y}`));
  });
}

// await is only valid inside an async function, so the sample is wrapped in one.
localizeObjects().catch(console.error);

PHP

Before trying this sample, follow the PHP setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API PHP API reference documentation.

namespace Google\Cloud\Samples\Vision;

use Google\Cloud\Vision\V1\ImageAnnotatorClient;

// $path = 'path/to/your/image.jpg'

function detect_object($path)
{
    $imageAnnotator = new ImageAnnotatorClient();

    # annotate the image
    $image = file_get_contents($path);
    $response = $imageAnnotator->objectLocalization($image);
    $objects = $response->getLocalizedObjectAnnotations();

    foreach ($objects as $object) {
        $name = $object->getName();
        $score = $object->getScore();
        $vertices = $object->getBoundingPoly()->getNormalizedVertices();

        printf('%s (confidence: %f)' . PHP_EOL, $name, $score);
        print('normalized bounding polygon vertices: ');
        foreach ($vertices as $vertex) {
            printf(' (%f, %f)', $vertex->getX(), $vertex->getY());
        }
        }
        print(PHP_EOL);
    }

    $imageAnnotator->close();
}

Python

Before trying this sample, follow the Python setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API Python API reference documentation.

def localize_objects(path):
    """Localize objects in the local image.

    Args:
    path: The path to the local file.
    """
    from google.cloud import vision
    client = vision.ImageAnnotatorClient()

    with open(path, 'rb') as image_file:
        content = image_file.read()
    image = vision.types.Image(content=content)

    objects = client.object_localization(
        image=image).localized_object_annotations

    print('Number of objects found: {}'.format(len(objects)))
    for object_ in objects:
        print('\n{} (confidence: {})'.format(object_.name, object_.score))
        print('Normalized bounding polygon vertices: ')
        for vertex in object_.bounding_poly.normalized_vertices:
            print(' - ({}, {})'.format(vertex.x, vertex.y))

Ruby

Before trying this sample, follow the Ruby setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API Ruby API reference documentation.

# image_path = "Path to local image file, eg. './image.png'"

require "google/cloud/vision"

image_annotator = Google::Cloud::Vision::ImageAnnotator.new

response = image_annotator.object_localization_detection image: image_path

response.responses.each do |res|
  res.localized_object_annotations.each do |object|
    puts "#{object.name} (confidence: #{object.score})"
    puts "Normalized bounding polygon vertices:"
    object.bounding_poly.normalized_vertices.each do |vertex|
      puts " - (#{vertex.x}, #{vertex.y})"
    end
  end
end

Detect objects in a remote image

For your convenience, the Vision API can perform feature detection directly on an image file located in Google Cloud Storage or on the Web without the need to send the contents of the image file in the body of your request.
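In the request body, specify the image with a source field instead of content: use gcsImageUri for a Cloud Storage object (a gs:// URI), or imageUri for either a gs:// URI or a publicly accessible HTTP/HTTPS image URL. For example:

'image': {
  'source': {
    'imageUri': 'https://cloud.google.com/vision/docs/images/bicycle_example.png'
  }
}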

curl command

To make an object detection and localization request, make a POST request to the following endpoint:

POST https://vision.googleapis.com/v1/images:annotate

To make the request using curl from the Linux or macOS command line, specify OBJECT_LOCALIZATION as the value of features.type, as shown in the following example:

curl -X POST \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
-H "Content-Type: application/json; charset=utf-8" \
https://vision.googleapis.com/v1/images:annotate -d "{
  'requests': [
    {
      'image': {
        'source': {
          'imageUri': 'https://cloud.google.com/vision/docs/images/bicycle_example.png'
        }
      },
      'features': [
        {
          'maxResults': 10,
          'type': 'OBJECT_LOCALIZATION'
        }
      ]
    }
  ]
}"

See the AnnotateImageRequest reference documentation for more information on configuring the request body.

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format.

C#

Before trying this sample, follow the C# setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API C# API reference documentation.

// Assumes `using Google.Cloud.Vision.V1;` and that uri holds the image URI
// (a Cloud Storage gs:// URI or a public HTTP/HTTPS URL).
var image = Image.FromUri(uri);

var client = ImageAnnotatorClient.Create();
var response = client.DetectLocalizedObjects(image);

Console.WriteLine($"Number of objects found: {response.Count}");
foreach (var localizedObject in response)
{
    Console.Write($"\n{localizedObject.Name}");
    Console.WriteLine($" (confidence: {localizedObject.Score})");
    Console.WriteLine("Normalized bounding polygon vertices: ");

    foreach (var vertex in localizedObject.BoundingPoly.NormalizedVertices)
    {
        Console.WriteLine($" - ({vertex.X}, {vertex.Y})");
    }
}

Go

Before trying this sample, follow the Go setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API Go API reference documentation.


// localizeObjectsURI gets objects and bounding boxes from the Vision API for an image at the given URI.
func localizeObjectsURI(w io.Writer, file string) error {
	ctx := context.Background()

	client, err := vision.NewImageAnnotatorClient(ctx)
	if err != nil {
		return err
	}

	image := vision.NewImageFromURI(file)
	annotations, err := client.LocalizeObjects(ctx, image, nil)
	if err != nil {
		return err
	}

	if len(annotations) == 0 {
		fmt.Fprintln(w, "No objects found.")
		return nil
	}

	fmt.Fprintln(w, "Objects:")
	for _, annotation := range annotations {
		fmt.Fprintln(w, annotation.Name)
		fmt.Fprintln(w, annotation.Score)

		for _, v := range annotation.BoundingPoly.NormalizedVertices {
			fmt.Fprintf(w, "(%f,%f)\n", v.X, v.Y)
		}
	}

	return nil
}

Java

Before trying this sample, follow the Java setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API Java API reference documentation.

/**
 * Detects localized objects in a remote image on Google Cloud Storage.
 *
 * @param gcsPath The path to the remote file on Google Cloud Storage to detect localized objects
 *     on.
 * @param out A {@link PrintStream} to write detected objects to.
 * @throws Exception on errors while closing the client.
 * @throws IOException on Input/Output errors.
 */
public static void detectLocalizedObjectsGcs(String gcsPath, PrintStream out)
    throws Exception, IOException {
  List<AnnotateImageRequest> requests = new ArrayList<>();

  ImageSource imgSource = ImageSource.newBuilder().setGcsImageUri(gcsPath).build();
  Image img = Image.newBuilder().setSource(imgSource).build();

  AnnotateImageRequest request =
      AnnotateImageRequest.newBuilder()
          .addFeatures(Feature.newBuilder().setType(Type.OBJECT_LOCALIZATION))
          .setImage(img)
          .build();
  requests.add(request);

  // Perform the request
  try (ImageAnnotatorClient client = ImageAnnotatorClient.create()) {
    BatchAnnotateImagesResponse response = client.batchAnnotateImages(requests);
    List<AnnotateImageResponse> responses = response.getResponsesList();

    // Display the results
    for (AnnotateImageResponse res : responses) {
      for (LocalizedObjectAnnotation entity : res.getLocalizedObjectAnnotationsList()) {
        out.format("Object name: %s\n", entity.getName());
        out.format("Confidence: %s\n", entity.getScore());
        out.format("Normalized Vertices:\n");
        entity
            .getBoundingPoly()
            .getNormalizedVerticesList()
            .forEach(vertex -> out.format("- (%s, %s)\n", vertex.getX(), vertex.getY()));
      }
    }
  }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API Node.js API reference documentation.

// Imports the Google Cloud client libraries
const vision = require('@google-cloud/vision');

// Creates a client
const client = new vision.ImageAnnotatorClient();

async function localizeObjectsGcs() {
  /**
   * TODO(developer): Uncomment the following line before running the sample.
   */
  // const gcsUri = `gs://bucket/bucketImage.png`;

  const [result] = await client.objectLocalization(gcsUri);
  const objects = result.localizedObjectAnnotations;
  objects.forEach(object => {
    console.log(`Name: ${object.name}`);
    console.log(`Confidence: ${object.score}`);
    const vertices = object.boundingPoly.normalizedVertices;
    vertices.forEach(v => console.log(`x: ${v.x}, y: ${v.y}`));
  });
}

// await is only valid inside an async function, so the sample is wrapped in one.
localizeObjectsGcs().catch(console.error);

PHP

Before trying this sample, follow the PHP setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API PHP API reference documentation.

namespace Google\Cloud\Samples\Vision;

use Google\Cloud\Vision\V1\ImageAnnotatorClient;

// $path = 'gs://path/to/your/image.jpg'

function detect_object_gcs($path)
{
    $imageAnnotator = new ImageAnnotatorClient();

    # annotate the image
    $response = $imageAnnotator->objectLocalization($path);
    $objects = $response->getLocalizedObjectAnnotations();

    foreach ($objects as $object) {
        $name = $object->getName();
        $score = $object->getScore();
        $vertices = $object->getBoundingPoly()->getNormalizedVertices();

        printf('%s (confidence: %f)' . PHP_EOL, $name, $score);
        print('normalized bounding polygon vertices: ');
        foreach ($vertices as $vertex) {
            printf(' (%f, %f)', $vertex->getX(), $vertex->getY());
        }
        }
        print(PHP_EOL);
    }

    $imageAnnotator->close();
}

Python

Before trying this sample, follow the Python setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API Python API reference documentation.

def localize_objects_uri(uri):
    """Localize objects in the image on Google Cloud Storage

    Args:
    uri: The path to the file in Google Cloud Storage (gs://...)
    """
    from google.cloud import vision
    client = vision.ImageAnnotatorClient()

    image = vision.types.Image()
    image.source.image_uri = uri

    objects = client.object_localization(
        image=image).localized_object_annotations

    print('Number of objects found: {}'.format(len(objects)))
    for object_ in objects:
        print('\n{} (confidence: {})'.format(object_.name, object_.score))
        print('Normalized bounding polygon vertices: ')
        for vertex in object_.bounding_poly.normalized_vertices:
            print(' - ({}, {})'.format(vertex.x, vertex.y))

Ruby

Before trying this sample, follow the Ruby setup instructions in the Vision API Quickstart Using Client Libraries. For more information, see the Vision API Ruby API reference documentation.

# image_path = "Google Cloud Storage URI, eg. 'gs://my-bucket/image.png'"

require "google/cloud/vision"

image_annotator = Google::Cloud::Vision::ImageAnnotator.new

response = image_annotator.object_localization_detection image: image_path

response.responses.each do |res|
  res.localized_object_annotations.each do |object|
    puts "#{object.name} (confidence: #{object.score})"
    puts "Normalized bounding polygon vertices:"
    object.bounding_poly.normalized_vertices.each do |vertex|
      puts " - (#{vertex.x}, #{vertex.y})"
    end
  end
end
# image_path = "URI, eg. 'https://site.tld/image.png'"

require "google/cloud/vision"

image_annotator = Google::Cloud::Vision::ImageAnnotator.new

response = image_annotator.object_localization_detection image: image_path

response.responses.each do |res|
  res.localized_object_annotations.each do |object|
    puts "#{object.name} (confidence: #{object.score})"
    puts "Normalized bounding polygon vertices:"
    object.bounding_poly.normalized_vertices.each do |vertex|
      puts " - (#{vertex.x}, #{vertex.y})"
    end
  end
end

gcloud command

To detect objects in an image, use the gcloud ml vision detect-objects command, as shown in the following example:

gcloud ml vision detect-objects https://cloud.google.com/vision/docs/images/bicycle_example.png
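
The command also accepts a local file path or a Cloud Storage URI in place of the web URL; the path below is a placeholder:

gcloud ml vision detect-objects ./path/to/local/image.png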

PowerShell

To make an object detection and localization request using Windows PowerShell, make a POST request to the https://vision.googleapis.com/v1/images:annotate endpoint and specify OBJECT_LOCALIZATION as the value of features.type.

$cred = gcloud auth application-default print-access-token
$headers = @{ Authorization = "Bearer $cred" }

Invoke-WebRequest `
  -Method Post `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -Body "{
      'requests': [
        {
          'image': {
            'source': {
              'imageUri': 'https://cloud.google.com/vision/docs/images/bicycle_example.png'
            }
          },
          'features': [
            {
              'type': 'OBJECT_LOCALIZATION'
            }
          ]
        }
      ]
    }" `
  -Uri "https://vision.googleapis.com/v1/images:annotate" | Select-Object -Expand Content
