Mehrere Objekte erkennen

Mit der Objektlokalisierung der Vision API lassen sich mehrere Objekte in einem Bild erkennen und extrahieren.

Die Objektlokalisierung kann in einem Bild mehrere Objekte identifizieren und eine LocalizedObjectAnnotation für jedes Objekt im Bild bereitstellen. Mit jeder LocalizedObjectAnnotation lassen sich Informationen über das Objekt, die Position des Objekts und die rechteckigen Begrenzungen für den Bereich des Bildes ermitteln, in dem sich das Objekt befindet.

Die Objektlokalisierung identifiziert sowohl wichtige als auch weniger wichtige Objekte in einem Bild.

Objektinformationen werden nur auf Englisch zurückgegeben. Mit Cloud Translation lassen sich englische Labels in verschiedene Sprachen übersetzen.

Bild mit Begrenzungsrahmen — *Bildnachweis:* Bogdan Dada auf Unsplash (*Anmerkungen hinzugefügt*).

Die API gibt beispielsweise die folgenden Informationen und Begrenzungsstandortdaten für die Objekte im vorherigen Bild zurück:

Name	mid	Bewertung	Grenzwerte
Laufrad	/m/01bqk0	0,89648587	(0,32076266, 0,78941387), (0,43812272, 0,78941387), (0,43812272, 0,97331065), (0,32076266, 0,97331065)
Fahrrad	/m/0199g	0,886761	(0,312, 0,6616471), (0,638353, 0,6616471), (0,638353, 0,9705882), (0,312, 0,9705882)
Laufrad	/m/01bqk0	0,6345275	(0,5125398, 0,760708), (0,6256646, 0,760708), (0,6256646, 0,94601655), (0,5125398, 0,94601655)
Bilderrahmen	/m/06z37_	0,6207608	(0,79177403, 0,16160682), (0,97047985, 0,16160682), (0,97047985, 0,31348917), (0,79177403, 0,31348917)
Reifen	/m/0h9mv	0,55886006	(0,32076266, 0,78941387), (0,43812272, 0,78941387), (0,43812272, 0,97331065), (0,32076266, 0,97331065)
Tür	/m/02dgv	0,5160098	(0,77569866, 0,37104446), (0,9412425, 0,37104446), (0,9412425, 0,81507325), (0,77569866, 0,81507325)

mid enthält die maschinengenerierte Kennzeichnung (Machine-generated Identifier, MID), die dem Google Knowledge Graph-Eintrag des Labels entspricht. Weitere Informationen zur Untersuchung dieser MID-Werte finden Sie in der Dokumentation zur Google Knowledge Graph Search API.

Überzeugen Sie sich selbst

Wenn Sie mit Google Cloud noch nicht vertraut sind, erstellen Sie ein Konto, um sich von der Leistungsfähigkeit der Cloud Vision API in der Praxis zu überzeugen. Neukunden erhalten außerdem ein Guthaben von 300 $, um Arbeitslasten auszuführen, zu testen und bereitzustellen.

Cloud Vision API kostenlos testen

Objektlokalisierungsanfragen

Google Cloud -Projekt und Authentifizierung einrichten

Wenn Sie noch kein Google Cloud Projekt erstellt haben, tun Sie dies jetzt. Maximieren Sie diesen Abschnitt, um die Anleitung einzublenden.

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Roles required to select or create a project

Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
Create a project: To create a project, you need the Project Creator (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

Go to project selector

Verify that billing is enabled for your Google Cloud project.

Enable the Vision API.

Roles required to enable APIs

To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

Enable the API

Install the Google Cloud CLI.

Wenn Sie einen externen Identitätsanbieter (IdP) verwenden, müssen Sie sich zuerst mit Ihrer föderierten Identität in der gcloud CLI anmelden.

Führen Sie folgenden Befehl aus, um die gcloud CLI zu initialisieren:

gcloud init

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Roles required to select or create a project

Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
Create a project: To create a project, you need the Project Creator (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

Go to project selector

Verify that billing is enabled for your Google Cloud project.

Enable the Vision API.

Roles required to enable APIs

To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

Enable the API

Install the Google Cloud CLI.

Wenn Sie einen externen Identitätsanbieter (IdP) verwenden, müssen Sie sich zuerst mit Ihrer föderierten Identität in der gcloud CLI anmelden.

Führen Sie folgenden Befehl aus, um die gcloud CLI zu initialisieren:

gcloud init

Objekte in einem lokalen Bild erkennen

Sie können die Vision API für die Erkennung von Features in einer lokalen Bilddatei verwenden.

Senden Sie bei REST-Anfragen den Inhalt der Bilddatei als base64-codierten String im Text Ihrer Anfrage.

Geben Sie für Anfragen mit gcloud und Clientbibliotheken den Pfad zu einem lokalen Bild in Ihrer Anfrage an.

Hinweis: Dieses Feature gibt Ergebnisse mit normalizedVertices [0,1] und nicht mit echten Pixelwerten (vertices) zurück.

REST

Ersetzen Sie folgende Werte in den Anfragedaten:

BASE64_ENCODED_IMAGE: die Base64-Darstellung (ASCII-String) der Binärbilddaten. Dieser String sollte in etwa so aussehen:
- /9j/4QAYRXhpZgAA...9tAVx/zDQDlGxn//2Q==
Weitere Informationen erhalten Sie unter Base64-Codierung.
RESULTS_INT: (optional) ein ganzzahliger Wert der Ergebnisse, die zurückgegeben werden sollen. Wenn Sie das Feld "maxResults" und seinen Wert weglassen, gibt die API den Standardwert von 10 Ergebnissen zurück. Dieses Feld gilt nicht für die folgenden Featuretypen: TEXT_DETECTION, DOCUMENT_TEXT_DETECTION oder CROP_HINTS.
PROJECT_ID: Ihre Google Cloud Projekt-ID

HTTP-Methode und URL:

POST https://vision.googleapis.com/v1/images:annotate

JSON-Text der Anfrage:

{
  "requests": [
    {
      "image": {
        "content": "BASE64_ENCODED_IMAGE"
      },
      "features": [
        {
          "maxResults": RESULTS_INT,
          "type": "OBJECT_LOCALIZATION"
        },
      ]
    }
  ]
}

Wenn Sie die Anfrage senden möchten, wählen Sie eine der folgenden Optionen aus:

curl

Hinweis: Der folgende Befehl setzt voraus, dass Sie sich mit Ihrem Nutzerkonto in der gcloud-Befehlszeile angemeldet haben, indem Sie gcloud init oder gcloud auth login ausgeführt oder die Cloud Shell genutzt haben, die Sie automatisch in der gcloud-Befehlszeile anmeldet. Um zu prüfen, welches Konto gerade aktiv ist, führen Sie gcloud auth list aus.

Speichern Sie den Anfragetext in einer Datei mit dem Namen request.json und führen Sie den folgenden Befehl aus:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_ID" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://vision.googleapis.com/v1/images:annotate"

PowerShell

Hinweis: Der folgende Befehl setzt voraus, dass Sie sich mit Ihrem Nutzerkonto in der gcloud-Befehlszeile angemeldet haben, indem Sie gcloud init oder gcloud auth login ausgeführt haben. Um zu prüfen, welches Konto gerade aktiv ist, führen Sie gcloud auth list aus.

Speichern Sie den Anfragetext in einer Datei mit dem Namen request.json und führen Sie den folgenden Befehl aus:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://vision.googleapis.com/v1/images:annotate" | Select-Object -Expand Content

Wenn die Anfrage erfolgreich ist, gibt der Server den HTTP-Statuscode 200 OK und die Antwort im JSON-Format zurück.

Antwort:

Antwort

{
  "responses": [
    {
      "localizedObjectAnnotations": [
        {
          "mid": "/m/01bqk0",
          "name": "Bicycle wheel",
          "score": 0.89648587,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.32076266,
                "y": 0.78941387
              },
              {
                "x": 0.43812272,
                "y": 0.78941387
              },
              {
                "x": 0.43812272,
                "y": 0.97331065
              },
              {
                "x": 0.32076266,
                "y": 0.97331065
              }
            ]
          }
        },
        {
          "mid": "/m/0199g",
          "name": "Bicycle",
          "score": 0.886761,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.312,
                "y": 0.6616471
              },
              {
                "x": 0.638353,
                "y": 0.6616471
              },
              {
                "x": 0.638353,
                "y": 0.9705882
              },
              {
                "x": 0.312,
                "y": 0.9705882
              }
            ]
          }
        },
        {
          "mid": "/m/01bqk0",
          "name": "Bicycle wheel",
          "score": 0.6345275,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.5125398,
                "y": 0.760708
              },
              {
                "x": 0.6256646,
                "y": 0.760708
              },
              {
                "x": 0.6256646,
                "y": 0.94601655
              },
              {
                "x": 0.5125398,
                "y": 0.94601655
              }
            ]
          }
        },
        {
          "mid": "/m/06z37_",
          "name": "Picture frame",
          "score": 0.6207608,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.79177403,
                "y": 0.16160682
              },
              {
                "x": 0.97047985,
                "y": 0.16160682
              },
              {
                "x": 0.97047985,
                "y": 0.31348917
              },
              {
                "x": 0.79177403,
                "y": 0.31348917
              }
            ]
          }
        },
        {
          "mid": "/m/0h9mv",
          "name": "Tire",
          "score": 0.55886006,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.32076266,
                "y": 0.78941387
              },
              {
                "x": 0.43812272,
                "y": 0.78941387
              },
              {
                "x": 0.43812272,
                "y": 0.97331065
              },
              {
                "x": 0.32076266,
                "y": 0.97331065
              }
            ]
          }
        },
        {
          "mid": "/m/02dgv",
          "name": "Door",
          "score": 0.5160098,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.77569866,
                "y": 0.37104446
              },
              {
                "x": 0.9412425,
                "y": 0.37104446
              },
              {
                "x": 0.9412425,
                "y": 0.81507325
              },
              {
                "x": 0.77569866,
                "y": 0.81507325
              }
            ]
          }
        }
      ]
    }
  ]
}

Go

Bevor Sie dieses Beispiel ausprobieren, folgen Sie der Go-Einrichtungsanleitung in der Vision-Kurzanleitung zur Verwendung von Clientbibliotheken. Weitere Informationen finden Sie in der Go-Referenzdokumentation zur Vision API.

Richten Sie zur Authentifizierung bei Vision die Standardanmeldedaten für Anwendungen (ADC) ein. Weitere Informationen finden Sie unter ADC für eine lokale Entwicklungsumgebung einrichten.


// localizeObjects gets objects and bounding boxes from the Vision API for an image at the given file path.
func localizeObjects(w io.Writer, file string) error {
	ctx := context.Background()

	client, err := vision.NewImageAnnotatorClient(ctx)
	if err != nil {
		return err
	}

	f, err := os.Open(file)
	if err != nil {
		return err
	}
	defer f.Close()

	image, err := vision.NewImageFromReader(f)
	if err != nil {
		return err
	}
	annotations, err := client.LocalizeObjects(ctx, image, nil)
	if err != nil {
		return err
	}

	if len(annotations) == 0 {
		fmt.Fprintln(w, "No objects found.")
		return nil
	}

	fmt.Fprintln(w, "Objects:")
	for _, annotation := range annotations {
		fmt.Fprintln(w, annotation.Name)
		fmt.Fprintln(w, annotation.Score)

		for _, v := range annotation.BoundingPoly.NormalizedVertices {
			fmt.Fprintf(w, "(%f,%f)\n", v.X, v.Y)
		}
	}

	return nil
}

Java

Bevor Sie dieses Beispiel ausprobieren, folgen Sie der Einrichtungsanleitung für Java in der Vision API-Kurzanleitung zur Verwendung von Clientbibliotheken. Weitere Informationen finden Sie in der Java-Referenzdokumentation zur Vision API.

/**
 * Detects localized objects in the specified local image.
 *
 * @param filePath The path to the file to perform localized object detection on.
 * @throws Exception on errors while closing the client.
 * @throws IOException on Input/Output errors.
 */
public static void detectLocalizedObjects(String filePath) throws IOException {
  List<AnnotateImageRequest> requests = new ArrayList<>();

  ByteString imgBytes = ByteString.readFrom(new FileInputStream(filePath));

  Image img = Image.newBuilder().setContent(imgBytes).build();
  AnnotateImageRequest request =
      AnnotateImageRequest.newBuilder()
          .addFeatures(Feature.newBuilder().setType(Type.OBJECT_LOCALIZATION))
          .setImage(img)
          .build();
  requests.add(request);

  // Initialize client that will be used to send requests. This client only needs to be created
  // once, and can be reused for multiple requests. After completing all of your requests, call
  // the "close" method on the client to safely clean up any remaining background resources.
  try (ImageAnnotatorClient client = ImageAnnotatorClient.create()) {
    // Perform the request
    BatchAnnotateImagesResponse response = client.batchAnnotateImages(requests);
    List<AnnotateImageResponse> responses = response.getResponsesList();

    // Display the results
    for (AnnotateImageResponse res : responses) {
      for (LocalizedObjectAnnotation entity : res.getLocalizedObjectAnnotationsList()) {
        System.out.format("Object name: %s%n", entity.getName());
        System.out.format("Confidence: %s%n", entity.getScore());
        System.out.format("Normalized Vertices:%n");
        entity
            .getBoundingPoly()
            .getNormalizedVerticesList()
            .forEach(vertex -> System.out.format("- (%s, %s)%n", vertex.getX(), vertex.getY()));
      }
    }
  }
}

Node.js

Bevor Sie dieses Beispiel ausprobieren, folgen Sie der Einrichtungsanleitung für Node.js in der Vision-Kurzanleitung zur Verwendung von Clientbibliotheken. Weitere Informationen finden Sie in der Node.js-Referenzdokumentation zur Vision API.

Richten Sie zur Authentifizierung bei Vision die Standardanmeldedaten für Anwendungen (ADC) ein. Weitere Informationen finden Sie unter ADC für eine lokale Entwicklungsumgebung einrichten.

// Imports the Google Cloud client libraries
const vision = require('@google-cloud/vision');
const fs = require('fs');

// Creates a client
const client = new vision.ImageAnnotatorClient();

/**
 * TODO(developer): Uncomment the following line before running the sample.
 */
// const fileName = `/path/to/localImage.png`;
const request = {
  image: {content: fs.readFileSync(fileName)},
};

const [result] = await client.objectLocalization(request);
const objects = result.localizedObjectAnnotations;
objects.forEach(object => {
  console.log(`Name: ${object.name}`);
  console.log(`Confidence: ${object.score}`);
  const vertices = object.boundingPoly.normalizedVertices;
  vertices.forEach(v => console.log(`x: ${v.x}, y:${v.y}`));
});

Python

Bevor Sie dieses Beispiel ausprobieren, folgen Sie der Einrichtungsanleitung für Python in der Vision-Kurzanleitung zur Verwendung von Clientbibliotheken. Weitere Informationen finden Sie in der Python-Referenzdokumentation zur Vision API.

Richten Sie zur Authentifizierung bei Vision die Standardanmeldedaten für Anwendungen (ADC) ein. Weitere Informationen finden Sie unter ADC für eine lokale Entwicklungsumgebung einrichten.

def localize_objects(path):
    """Localize objects in the local image.

    Args:
    path: The path to the local file.
    """
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()

    with open(path, "rb") as image_file:
        content = image_file.read()
    image = vision.Image(content=content)

    objects = client.object_localization(image=image).localized_object_annotations

    print(f"Number of objects found: {len(objects)}")
    for object_ in objects:
        print(f"\n{object_.name} (confidence: {object_.score})")
        print("Normalized bounding polygon vertices: ")
        for vertex in object_.bounding_poly.normalized_vertices:
            print(f" - ({vertex.x}, {vertex.y})")

Weitere Sprachen

C#: Folgen Sie der Einrichtungsanleitung für C# auf der Seite der Clientbibliotheken und rufen Sie dann die Vision-Referenzdokumentation für .NET auf.

PHP: Folgen Sie der Einrichtungsanleitung für PHP auf der Seite der Clientbibliotheken und rufen Sie dann die Vision-Referenzdokumentation für PHP auf.

Ruby: Folgen Sie der Einrichtungsanleitung für Ruby auf der Seite der Clientbibliotheken und rufen Sie dann die Vision-Referenzdokumentation für Ruby auf.

Objekte in einem Remote-Bild erkennen

Sie können die Vision API für die Erkennung von Features in einer Remote-Bilddatei verwenden, die sich in Cloud Storage oder im Web befindet. Zum Senden einer Remote-Dateianfrage geben Sie die URL oder den Cloud Storage-URI der Datei im Anfragetext an.

Hinweis: Dieses Feature gibt Ergebnisse mit normalizedVertices [0,1] und nicht mit echten Pixelwerten (vertices) zurück.

REST

Ersetzen Sie folgende Werte in den Anfragedaten:

CLOUD_STORAGE_IMAGE_URI: der Pfad zu einer gültigen Bilddatei in einem Cloud Storage-Bucket. Sie müssen zumindest Leseberechtigungen für die Datei haben. Beispiel:
- ```
https://cloud.google.com/vision/docs/images/bicycle_example.png
```
RESULTS_INT: (optional) ein ganzzahliger Wert der Ergebnisse, die zurückgegeben werden sollen. Wenn Sie das Feld "maxResults" und seinen Wert weglassen, gibt die API den Standardwert von 10 Ergebnissen zurück. Dieses Feld gilt nicht für die folgenden Featuretypen: TEXT_DETECTION, DOCUMENT_TEXT_DETECTION oder CROP_HINTS.
PROJECT_ID: Ihre Google Cloud Projekt-ID

HTTP-Methode und URL:

POST https://vision.googleapis.com/v1/images:annotate

JSON-Text der Anfrage:

{
  "requests": [
    {
      "image": {
        "source": {
          "imageUri": "CLOUD_STORAGE_IMAGE_URI"
        }
      },
      "features": [
        {
          "maxResults": RESULTS_INT,
          "type": "OBJECT_LOCALIZATION"
        },
      ]
    }
  ]
}

Wenn Sie die Anfrage senden möchten, wählen Sie eine der folgenden Optionen aus:

curl

Speichern Sie den Anfragetext in einer Datei mit dem Namen request.json und führen Sie den folgenden Befehl aus:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_ID" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://vision.googleapis.com/v1/images:annotate"

PowerShell

Hinweis: Der folgende Befehl setzt voraus, dass Sie sich mit Ihrem Nutzerkonto in der gcloud-Befehlszeile angemeldet haben, indem Sie gcloud init oder gcloud auth login ausgeführt haben. Um zu prüfen, welches Konto gerade aktiv ist, führen Sie gcloud auth list aus.

Speichern Sie den Anfragetext in einer Datei mit dem Namen request.json und führen Sie den folgenden Befehl aus:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://vision.googleapis.com/v1/images:annotate" | Select-Object -Expand Content

Wenn die Anfrage erfolgreich ist, gibt der Server den HTTP-Statuscode 200 OK und die Antwort im JSON-Format zurück.

Antwort:

Antwort

{
  "responses": [
    {
      "localizedObjectAnnotations": [
        {
          "mid": "/m/01bqk0",
          "name": "Bicycle wheel",
          "score": 0.89648587,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.32076266,
                "y": 0.78941387
              },
              {
                "x": 0.43812272,
                "y": 0.78941387
              },
              {
                "x": 0.43812272,
                "y": 0.97331065
              },
              {
                "x": 0.32076266,
                "y": 0.97331065
              }
            ]
          }
        },
        {
          "mid": "/m/0199g",
          "name": "Bicycle",
          "score": 0.886761,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.312,
                "y": 0.6616471
              },
              {
                "x": 0.638353,
                "y": 0.6616471
              },
              {
                "x": 0.638353,
                "y": 0.9705882
              },
              {
                "x": 0.312,
                "y": 0.9705882
              }
            ]
          }
        },
        {
          "mid": "/m/01bqk0",
          "name": "Bicycle wheel",
          "score": 0.6345275,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.5125398,
                "y": 0.760708
              },
              {
                "x": 0.6256646,
                "y": 0.760708
              },
              {
                "x": 0.6256646,
                "y": 0.94601655
              },
              {
                "x": 0.5125398,
                "y": 0.94601655
              }
            ]
          }
        },
        {
          "mid": "/m/06z37_",
          "name": "Picture frame",
          "score": 0.6207608,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.79177403,
                "y": 0.16160682
              },
              {
                "x": 0.97047985,
                "y": 0.16160682
              },
              {
                "x": 0.97047985,
                "y": 0.31348917
              },
              {
                "x": 0.79177403,
                "y": 0.31348917
              }
            ]
          }
        },
        {
          "mid": "/m/0h9mv",
          "name": "Tire",
          "score": 0.55886006,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.32076266,
                "y": 0.78941387
              },
              {
                "x": 0.43812272,
                "y": 0.78941387
              },
              {
                "x": 0.43812272,
                "y": 0.97331065
              },
              {
                "x": 0.32076266,
                "y": 0.97331065
              }
            ]
          }
        },
        {
          "mid": "/m/02dgv",
          "name": "Door",
          "score": 0.5160098,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.77569866,
                "y": 0.37104446
              },
              {
                "x": 0.9412425,
                "y": 0.37104446
              },
              {
                "x": 0.9412425,
                "y": 0.81507325
              },
              {
                "x": 0.77569866,
                "y": 0.81507325
              }
            ]
          }
        }
      ]
    }
  ]
}

Go

Richten Sie zur Authentifizierung bei Vision die Standardanmeldedaten für Anwendungen (ADC) ein. Weitere Informationen finden Sie unter ADC für eine lokale Entwicklungsumgebung einrichten.


// localizeObjects gets objects and bounding boxes from the Vision API for an image at the given file path.
func localizeObjectsURI(w io.Writer, file string) error {
	ctx := context.Background()

	client, err := vision.NewImageAnnotatorClient(ctx)
	if err != nil {
		return err
	}

	image := vision.NewImageFromURI(file)
	annotations, err := client.LocalizeObjects(ctx, image, nil)
	if err != nil {
		return err
	}

	if len(annotations) == 0 {
		fmt.Fprintln(w, "No objects found.")
		return nil
	}

	fmt.Fprintln(w, "Objects:")
	for _, annotation := range annotations {
		fmt.Fprintln(w, annotation.Name)
		fmt.Fprintln(w, annotation.Score)

		for _, v := range annotation.BoundingPoly.NormalizedVertices {
			fmt.Fprintf(w, "(%f,%f)\n", v.X, v.Y)
		}
	}

	return nil
}

Java

/**
 * Detects localized objects in a remote image on Google Cloud Storage.
 *
 * @param gcsPath The path to the remote file on Google Cloud Storage to detect localized objects
 *     on.
 * @throws Exception on errors while closing the client.
 * @throws IOException on Input/Output errors.
 */
public static void detectLocalizedObjectsGcs(String gcsPath) throws IOException {
  List<AnnotateImageRequest> requests = new ArrayList<>();

  ImageSource imgSource = ImageSource.newBuilder().setGcsImageUri(gcsPath).build();
  Image img = Image.newBuilder().setSource(imgSource).build();

  AnnotateImageRequest request =
      AnnotateImageRequest.newBuilder()
          .addFeatures(Feature.newBuilder().setType(Type.OBJECT_LOCALIZATION))
          .setImage(img)
          .build();
  requests.add(request);

  // Initialize client that will be used to send requests. This client only needs to be created
  // once, and can be reused for multiple requests. After completing all of your requests, call
  // the "close" method on the client to safely clean up any remaining background resources.
  try (ImageAnnotatorClient client = ImageAnnotatorClient.create()) {
    // Perform the request
    BatchAnnotateImagesResponse response = client.batchAnnotateImages(requests);
    List<AnnotateImageResponse> responses = response.getResponsesList();
    client.close();
    // Display the results
    for (AnnotateImageResponse res : responses) {
      for (LocalizedObjectAnnotation entity : res.getLocalizedObjectAnnotationsList()) {
        System.out.format("Object name: %s%n", entity.getName());
        System.out.format("Confidence: %s%n", entity.getScore());
        System.out.format("Normalized Vertices:%n");
        entity
            .getBoundingPoly()
            .getNormalizedVerticesList()
            .forEach(vertex -> System.out.format("- (%s, %s)%n", vertex.getX(), vertex.getY()));
      }
    }
  }
}

Node.js

Richten Sie zur Authentifizierung bei Vision die Standardanmeldedaten für Anwendungen (ADC) ein. Weitere Informationen finden Sie unter ADC für eine lokale Entwicklungsumgebung einrichten.

// Imports the Google Cloud client libraries
const vision = require('@google-cloud/vision');

// Creates a client
const client = new vision.ImageAnnotatorClient();

/**
 * TODO(developer): Uncomment the following line before running the sample.
 */
// const gcsUri = `gs://bucket/bucketImage.png`;

const [result] = await client.objectLocalization(gcsUri);
const objects = result.localizedObjectAnnotations;
objects.forEach(object => {
  console.log(`Name: ${object.name}`);
  console.log(`Confidence: ${object.score}`);
  const veritices = object.boundingPoly.normalizedVertices;
  veritices.forEach(v => console.log(`x: ${v.x}, y:${v.y}`));
});

Python

Richten Sie zur Authentifizierung bei Vision die Standardanmeldedaten für Anwendungen (ADC) ein. Weitere Informationen finden Sie unter ADC für eine lokale Entwicklungsumgebung einrichten.

def localize_objects_uri(uri):
    """Localize objects in the image on Google Cloud Storage

    Args:
    uri: The path to the file in Google Cloud Storage (gs://...)
    """
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()

    image = vision.Image()
    image.source.image_uri = uri

    objects = client.object_localization(image=image).localized_object_annotations

    print(f"Number of objects found: {len(objects)}")
    for object_ in objects:
        print(f"\n{object_.name} (confidence: {object_.score})")
        print("Normalized bounding polygon vertices: ")
        for vertex in object_.bounding_poly.normalized_vertices:
            print(f" - ({vertex.x}, {vertex.y})")

gcloud

Verwenden Sie für die Erkennung von Labels in einem Bild den Befehl gcloud ml vision detect-objects, wie im folgenden Beispiel gezeigt:

gcloud ml vision detect-objects https://cloud.google.com/vision/docs/images/bicycle_example.png

Weitere Sprachen

C#: Folgen Sie der Einrichtungsanleitung für C# auf der Seite der Clientbibliotheken und rufen Sie dann die Vision-Referenzdokumentation für .NET auf.

PHP: Folgen Sie der Einrichtungsanleitung für PHP auf der Seite der Clientbibliotheken und rufen Sie dann die Vision-Referenzdokumentation für PHP auf.

Ruby: Folgen Sie der Einrichtungsanleitung für Ruby auf der Seite der Clientbibliotheken und rufen Sie dann die Vision-Referenzdokumentation für Ruby auf.

Jetzt testen

Probieren Sie die Objekterkennung und -lokalisierung mit dem folgenden Tool aus. Sie können das bereits angegebene Bild verwenden (https://cloud.google.com/vision/docs/images/bicycle_example.png) oder stattdessen ein eigenes Bild angeben. Wählen Sie zum Senden der Anfrage Ausführen aus.

Bild ohne Begrenzungsrahmen — *Bildnachweis:* Bogdan Dada auf Unsplash.

Anfragetext:

{
  "requests": [
    {
      "features": [
        {
          "maxResults": 10,
          "type": "OBJECT_LOCALIZATION"
        }
      ],
      "image": {
        "source": {
          "imageUri": "https://cloud.google.com/vision/docs/images/bicycle_example.png"
        }
      }
    }
  ]
}