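This legacy version of AutoML Vision is deprecated and will no longer be available on Google Cloud after March 31, 2024. All the functionality of legacy AutoML Vision, along with new features, is available on the Vertex AI platform. See Migrate to Vertex AI to learn how to migrate your resources.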

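Make batch predictions

After you have created (trained) a model, you can use the batchPredict method to make an asynchronous prediction request for a batch of images. The batchPredict method applies labels to each image based on the primary object of the image that your model predicts.

As of the GA release, the maximum lifespan of a custom model is 18 months. After that time, you must create and train a new model to continue annotating content.

Batch prediction

You can request annotations (predictions) for images with the batchPredict command. The batchPredict command takes as input a CSV file, stored in your Google Cloud Storage bucket, that contains the paths of the images to annotate. Each row specifies a separate path to one image in Google Cloud Storage.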

batch_prediction.csv:

gs://my-cloud-storage-bucket/prediction_files/image1.jpg
gs://my-cloud-storage-bucket/prediction_files/image2.jpg
gs://my-cloud-storage-bucket/prediction_files/image3.jpg
gs://my-cloud-storage-bucket/prediction_files/image4.jpg
gs://my-cloud-storage-bucket/prediction_files/image5.jpg
gs://my-cloud-storage-bucket/prediction_files/image6.png
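
As a rough sketch of how such an input file could be produced, the following Python snippet writes one Cloud Storage path per row. The helper name build_batch_csv and the example paths are illustrative assumptions and not part of the AutoML API; uploading the resulting text to your bucket is a separate step that is not shown.

```python
import csv
import io


def build_batch_csv(image_uris):
    """Return CSV text with one gs:// image path per row (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for uri in image_uris:
        # Reject anything that is not a Cloud Storage path before writing it.
        if not uri.startswith("gs://"):
            raise ValueError(f"not a Cloud Storage path: {uri}")
        writer.writerow([uri])
    return buffer.getvalue()


csv_text = build_batch_csv(
    [
        "gs://my-cloud-storage-bucket/prediction_files/image1.jpg",
        "gs://my-cloud-storage-bucket/prediction_files/image2.jpg",
    ]
)
print(csv_text, end="")
```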

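A batch prediction task can take some time to complete, depending on the number of images you specified in the CSV file. Even for a small number of images, batch prediction takes at least 30 minutes to complete.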

REST

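Before using any of the request data, make the following replacements:

PROJECT_ID: your GCP project ID.
LOCATION_ID: a valid location identifier. Currently, you must use the following value:
- us-central1
MODEL_ID: the ID of your model, taken from the response returned when you created the model. The ID is the last element of the model name. For example:
- Model name: projects/PROJECT_ID/locations/LOCATION_ID/models/IOD4412217016962778756
- Model ID: IOD4412217016962778756
INPUT_STORAGE_PATH: the path to a CSV file stored in Google Cloud Storage. The requesting user must have at least read permission on the bucket.
OUTPUT_STORAGE_BUCKET: the Google Cloud Storage bucket/directory in which to save the output files, expressed in the form gs://bucket/directory/. The requesting user must have write permission on the bucket.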
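
Because the ID is simply the last element of the resource name, it can be recovered with plain string handling. The snippet below is a minimal sketch of that idea; the helper name resource_id is an assumption made for illustration and does not come from the client libraries.

```python
def resource_id(resource_name):
    """Return the last path element of a resource name (hypothetical helper)."""
    # Drop any trailing slash, then keep what follows the final "/".
    return resource_name.rstrip("/").rsplit("/", 1)[-1]


model_name = "projects/PROJECT_ID/locations/LOCATION_ID/models/IOD4412217016962778756"
print(resource_id(model_name))  # prints IOD4412217016962778756
```

The same helper applies to operation names, whose last element is the operation ID.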

Field-specific considerations:

params.score_threshold - a value from 0.0 to 1.0. Only results with a score greater than or equal to this value are returned.
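
To make the "greater than or equal to" rule concrete, the sketch below applies the same comparison on the client side. This is only an illustration of the documented semantics: the service performs the filtering itself, the helper name filter_by_score is hypothetical, and the annotation shape is assumed to match the output JSONL examples later on this page.

```python
def filter_by_score(annotations, score_threshold):
    """Keep annotations whose score is >= score_threshold (hypothetical helper)."""
    if not 0.0 <= score_threshold <= 1.0:
        raise ValueError("score_threshold must be between 0.0 and 1.0")
    return [
        annotation
        for annotation in annotations
        if annotation["classification"]["score"] >= score_threshold
    ]


annotations = [
    {"display_name": "daisy", "classification": {"score": 0.99906391}},
    {"display_name": "dandelion", "classification": {"score": 0.00085875636}},
]
kept = filter_by_score(annotations, 0.5)
print([annotation["display_name"] for annotation in kept])  # prints ['daisy']
```

With a threshold of 0.0, as in the request body on this page, every annotation is kept.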

HTTP method and URL:

POST https://automl.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/models/MODEL_ID:batchPredict

Request JSON body:

{
  "inputConfig": {
    "gcsSource": {
       "inputUris": [ "INPUT_STORAGE_PATH" ]
    }
  },
  "outputConfig": {
    "gcsDestination": {
      "outputUriPrefix": "OUTPUT_STORAGE_BUCKET"
    }
  },
  "params": {
    "score_threshold": "0.0"
  }
}
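
If you prefer to generate the request body rather than edit it by hand, something along these lines may help. It is a sketch using only the Python standard library; the function name build_batch_predict_body is hypothetical, and the threshold is serialized as a string to mirror the request body shown above.

```python
import json


def build_batch_predict_body(input_uri, output_uri_prefix, score_threshold=0.0):
    """Build a batchPredict request body as a dict (hypothetical helper)."""
    return {
        "inputConfig": {"gcsSource": {"inputUris": [input_uri]}},
        "outputConfig": {"gcsDestination": {"outputUriPrefix": output_uri_prefix}},
        # The request body above passes the threshold as a string value.
        "params": {"score_threshold": str(score_threshold)},
    }


body = build_batch_predict_body(
    "gs://my-cloud-storage-bucket/batch_prediction.csv",  # illustrative path
    "gs://my-cloud-storage-bucket/results/",  # illustrative path
)
print(json.dumps(body, indent=2))
```

The printed JSON can be saved as request.json for the curl or PowerShell commands.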

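To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you in to the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: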

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_ID" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://automl.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/models/MODEL_ID:batchPredict"

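PowerShell

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: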

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://automl.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/models/MODEL_ID:batchPredict" | Select-Object -Expand Content

Response:

You should see output similar to the following:

{
  "name": "projects/PROJECT_ID/locations/LOCATION_ID/operations/ICN926615623331479552",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.automl.v1.OperationMetadata",
    "createTime": "2019-06-19T21:28:35.302067Z",
    "updateTime": "2019-06-19T21:28:35.302067Z",
    "batchPredictDetails": {
      "inputConfig": {
        "gcsSource": {
          "inputUris": [
            "INPUT_STORAGE_PATH"
          ]
        }
      }
    }
  }
}

You can use the operation ID (ICN926615623331479552 in this example) to get the status of the task. For an example, see Working with long-running operations.

When the operation completes, the response contains "done": true, and the results are written to the Google Cloud Storage location you specified:

{
  "name": "projects/PROJECT_ID/locations/LOCATION_ID/operations/ICN926615623331479552",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.automl.v1.OperationMetadata",
    "createTime": "2019-06-19T21:28:35.302067Z",
    "updateTime": "2019-06-19T21:57:18.310033Z",
    "batchPredictDetails": {
      "inputConfig": {
        "gcsSource": {
          "inputUris": [
            "INPUT_STORAGE_PATH"
          ]
        }
      },
      "outputInfo": {
        "gcsOutputDirectory": "gs://STORAGE_BUCKET_VCM/SUBDIRECTORY/prediction-8370559933346329705-YYYY-MM-DDThh:mm:ss.sssZ"
      }
    }
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.automl.v1.BatchPredictResult"
  }
}
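
A client that polls the operation typically needs two things from this response: whether it is done, and where the output landed. The sketch below reads both from a parsed response; the helper name batch_predict_output_dir and the abbreviated sample payload (including its output directory) are illustrative assumptions.

```python
import json


def batch_predict_output_dir(operation):
    """Return the output directory of a finished batchPredict operation.

    Returns None while the operation is still running (hypothetical helper).
    """
    if not operation.get("done", False):
        return None
    if "error" in operation:
        error = operation["error"]
        raise RuntimeError(
            f"operation failed: {error.get('code')} {error.get('message')}"
        )
    details = operation.get("metadata", {}).get("batchPredictDetails", {})
    return details.get("outputInfo", {}).get("gcsOutputDirectory")


# Abbreviated, illustrative response; real responses carry more fields.
finished = json.loads(
    """
    {
      "name": "projects/PROJECT_ID/locations/LOCATION_ID/operations/ICN926615623331479552",
      "metadata": {
        "batchPredictDetails": {
          "outputInfo": {
            "gcsOutputDirectory": "gs://STORAGE_BUCKET_VCM/SUBDIRECTORY/prediction-EXAMPLE"
          }
        }
      },
      "done": true
    }
    """
)
print(batch_predict_output_dir(finished))
```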

For a sample output file, see the Output JSONL file section below.

Java

Before trying this sample, follow the setup instructions for this language on the client libraries page.

import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.automl.v1.BatchPredictInputConfig;
import com.google.cloud.automl.v1.BatchPredictOutputConfig;
import com.google.cloud.automl.v1.BatchPredictRequest;
import com.google.cloud.automl.v1.BatchPredictResult;
import com.google.cloud.automl.v1.GcsDestination;
import com.google.cloud.automl.v1.GcsSource;
import com.google.cloud.automl.v1.ModelName;
import com.google.cloud.automl.v1.OperationMetadata;
import com.google.cloud.automl.v1.PredictionServiceClient;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

abstract class BatchPredict {

  static void batchPredict() throws IOException, ExecutionException, InterruptedException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    String modelId = "YOUR_MODEL_ID";
    String inputUri = "gs://YOUR_BUCKET_ID/path_to_your_input_csv_or_jsonl";
    String outputUri = "gs://YOUR_BUCKET_ID/path_to_save_results/";
    batchPredict(projectId, modelId, inputUri, outputUri);
  }

  static void batchPredict(String projectId, String modelId, String inputUri, String outputUri)
      throws IOException, ExecutionException, InterruptedException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (PredictionServiceClient client = PredictionServiceClient.create()) {
      // Get the full path of the model.
      ModelName name = ModelName.of(projectId, "us-central1", modelId);
      GcsSource gcsSource = GcsSource.newBuilder().addInputUris(inputUri).build();
      BatchPredictInputConfig inputConfig =
          BatchPredictInputConfig.newBuilder().setGcsSource(gcsSource).build();
      GcsDestination gcsDestination =
          GcsDestination.newBuilder().setOutputUriPrefix(outputUri).build();
      BatchPredictOutputConfig outputConfig =
          BatchPredictOutputConfig.newBuilder().setGcsDestination(gcsDestination).build();
      BatchPredictRequest request =
          BatchPredictRequest.newBuilder()
              .setName(name.toString())
              .setInputConfig(inputConfig)
              .setOutputConfig(outputConfig)
              .build();

      OperationFuture<BatchPredictResult, OperationMetadata> future =
          client.batchPredictAsync(request);

      System.out.println("Waiting for operation to complete...");
      future.get();
      System.out.println("Batch Prediction results saved to specified Cloud Storage bucket.");
    }
  }
}

Node.js

Before trying this sample, follow the setup instructions for this language on the client libraries page.

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const modelId = 'YOUR_MODEL_ID';
// const inputUri = 'gs://YOUR_BUCKET_ID/path_to_your_input_csv_or_jsonl';
// const outputUri = 'gs://YOUR_BUCKET_ID/path_to_save_results/';

// Imports the Google Cloud AutoML library
const {PredictionServiceClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new PredictionServiceClient();

async function batchPredict() {
  // Construct request
  const request = {
    name: client.modelPath(projectId, location, modelId),
    inputConfig: {
      gcsSource: {
        inputUris: [inputUri],
      },
    },
    outputConfig: {
      gcsDestination: {
        outputUriPrefix: outputUri,
      },
    },
  };

  const [operation] = await client.batchPredict(request);

  console.log('Waiting for operation to complete...');
  // Wait for operation to complete.
  const [response] = await operation.promise();
  console.log(
    `Batch Prediction results saved to Cloud Storage bucket. ${response}`
  );
}

batchPredict();

Python

Before trying this sample, follow the setup instructions for this language on the client libraries page.

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"
# model_id = "YOUR_MODEL_ID"
# input_uri = "gs://YOUR_BUCKET_ID/path/to/your/input/csv_or_jsonl"
# output_uri = "gs://YOUR_BUCKET_ID/path/to/save/results/"

prediction_client = automl.PredictionServiceClient()

# Get the full path of the model.
model_full_id = f"projects/{project_id}/locations/us-central1/models/{model_id}"

gcs_source = automl.GcsSource(input_uris=[input_uri])

input_config = automl.BatchPredictInputConfig(gcs_source=gcs_source)
gcs_destination = automl.GcsDestination(output_uri_prefix=output_uri)
output_config = automl.BatchPredictOutputConfig(gcs_destination=gcs_destination)

response = prediction_client.batch_predict(
    name=model_full_id, input_config=input_config, output_config=output_config
)

print("Waiting for operation to complete...")
print(
    f"Batch Prediction results saved to Cloud Storage bucket. {response.result()}"
)

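Additional languages

C#: Follow the C# setup instructions on the client libraries page, and then visit the AutoML Vision reference documentation for .NET.

PHP: Follow the PHP setup instructions on the client libraries page, and then visit the AutoML Vision reference documentation for PHP.

Ruby: Follow the Ruby setup instructions on the client libraries page, and then visit the AutoML Vision reference documentation for Ruby.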

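Output JSONL file

When the batch prediction task completes, the prediction output is stored in the Google Cloud Storage bucket that you specified in the command.

In your output bucket (and, where applicable, in the directory you specified), the files image_classification_1.jsonl, image_classification_2.jsonl, …, image_classification_N.jsonl are created, where N may be 1 and depends on the total number of successfully predicted images and annotations.

A single image and all of its annotations are listed only once, and the annotations for an image are never split across files.

In each JSONL file, every line contains the JSON representation of a proto that wraps the image's "ID" : "<id_value>", followed by zero or more AnnotationPayload protos (called annotations) with the classification details populated.

If prediction fails for any image (partially or completely), additional files errors_1.jsonl, errors_2.jsonl, …, errors_N.jsonl are created (N depends on the total number of failed predictions). These files contain the JSON representation of a proto that wraps the same "ID" : "<id_value>", but followed by exactly one google.rpc.Status that contains only the code and message fields.

Sample JSONL file:

image_image_classification_0.jsonl - a single .jsonl file containing 4 lines, each of which is the annotation JSON for one image file. Each record is shown pretty-printed below for readability.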

Line 1 (annotation JSON for daisy1.jpg)

{
  "ID": "gs://storage-bucket-vcm/img/daisy1.jpg",
  "annotations": [
    {
      "annotation_spec_id": "daisy",
      "classification": {
        "score": 0.99906391
      },
      "display_name": "daisy"
    },
    {
      "annotation_spec_id": "dandelion",
      "classification": {
        "score": 0.00085875636
      },
      "display_name": "dandelion"
    },
    {
      "annotation_spec_id": "roses",
      "classification": {
        "score": 0.000018997729
      },
      "display_name": "roses"
    },
    {
      "annotation_spec_id": "sunflowers",
      "classification": {
        "score": 0.0000041045291
      },
      "display_name": "sunflowers"
    },
    {
      "annotation_spec_id": "tulips",
      "classification": {
        "score": 0.000039702507
      },
      "display_name": "tulips"
    },
    {
      "annotation_spec_id": "--other--",
      "classification": {
        "score": 0.000014527803
      },
      "display_name": "--other--"
    }
  ]
}

Line 2 (annotation JSON for daisy2.jpg)

{
  "ID": "gs://storage-bucket-vcm/img/daisy2.jpg",
  "annotations": [
    {
      "annotation_spec_id": "daisy",
      "classification": {
        "score": 0.99953115
      },
      "display_name": "daisy"
    },
    {
      "annotation_spec_id": "dandelion",
      "classification": {
        "score": 0.00014155755
      },
      "display_name": "dandelion"
    },
    {
      "annotation_spec_id": "roses",
      "classification": {
        "score": 0.000011171558
      },
      "display_name": "roses"
    },
    {
      "annotation_spec_id": "sunflowers",
      "classification": {
        "score": 0.00030725187
      },
      "display_name": "sunflowers"
    },
    {
      "annotation_spec_id": "tulips",
      "classification": {
        "score": 7.7882828e-7
      },
      "display_name": "tulips"
    },
    {
      "annotation_spec_id": "--other--",
      "classification": {
        "score": 0.0000081920462
      },
      "display_name": "--other--"
    }
  ]
}

Line 3 (annotation JSON for dandelion1.jpg)

{
  "ID": "gs://storage-bucket-vcm/img/dandelion1.jpg",
  "annotations": [
    {
      "annotation_spec_id": "daisy",
      "classification": {
        "score": 0.0000041204139
      },
      "display_name": "daisy"
    },
    {
      "annotation_spec_id": "dandelion",
      "classification": {
        "score": 0.99971503
      },
      "display_name": "dandelion"
    },
    {
      "annotation_spec_id": "roses",
      "classification": {
        "score": 4.9584577e-7
      },
      "display_name": "roses"
    },
    {
      "annotation_spec_id": "sunflowers",
      "classification": {
        "score": 0.00027974427
      },
      "display_name": "sunflowers"
    },
    {
      "annotation_spec_id": "tulips",
      "classification": {
        "score": 3.8392983e-7
      },
      "display_name": "tulips"
    },
    {
      "annotation_spec_id": "--other--",
      "classification": {
        "score": 2.6729541e-7
      },
      "display_name": "--other--"
    }
  ]
}

Line 4 (annotation JSON for dandelion2.jpg)

{
  "ID": "gs://automl-batch-iod-vcm/img/dandelion2.jpg",
  "annotations": [
    {
      "annotation_spec_id": "daisy",
      "classification": {
        "score": 0.00023957422
      },
      "display_name": "daisy"
    },
    {
      "annotation_spec_id": "dandelion",
      "classification": {
        "score": 0.99976045
      },
      "display_name": "dandelion"
    },
    {
      "annotation_spec_id": "roses",
      "classification": {
        "score": 1.7562879e-8
      },
      "display_name": "roses"
    },
    {
      "annotation_spec_id": "sunflowers",
      "classification": {
        "score": 3.2643279e-9
      },
      "display_name": "sunflowers"
    },
    {
      "annotation_spec_id": "tulips",
      "classification": {
        "score": 1.3378423e-8
      },
      "display_name": "tulips"
    },
    {
      "annotation_spec_id": "--other--",
      "classification": {
        "score": 4.6433613e-9
      },
      "display_name": "--other--"
    }
  ]
}
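
Once an output file has been downloaded, a common next step is to pick the highest-scoring label for each image. The following is a minimal sketch under two assumptions: each record occupies exactly one line in the file, and the field names match the examples above. The helper name top_labels is hypothetical, and the sample data is abbreviated from lines 1 and 3.

```python
import json


def top_labels(jsonl_text):
    """Map each image ID to its (display_name, score) with the highest score."""
    results = {}
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        annotations = record.get("annotations", [])
        if not annotations:
            # A record may carry zero annotations; skip it.
            continue
        best = max(annotations, key=lambda a: a["classification"]["score"])
        results[record["ID"]] = (best["display_name"], best["classification"]["score"])
    return results


# Two abbreviated records, serialized one per line as in a JSONL file.
sample_records = [
    {
        "ID": "gs://storage-bucket-vcm/img/daisy1.jpg",
        "annotations": [
            {"display_name": "daisy", "classification": {"score": 0.99906391}},
            {"display_name": "dandelion", "classification": {"score": 0.00085875636}},
        ],
    },
    {
        "ID": "gs://storage-bucket-vcm/img/dandelion1.jpg",
        "annotations": [
            {"display_name": "daisy", "classification": {"score": 0.0000041204139}},
            {"display_name": "dandelion", "classification": {"score": 0.99971503}},
        ],
    },
]
sample_jsonl = "\n".join(json.dumps(record) for record in sample_records)

for image_id, (label, score) in top_labels(sample_jsonl).items():
    print(f"{image_id}: {label} ({score:.4f})")
```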

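Working with long-running operations

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: your GCP project ID.
OPERATION_ID: the ID of your operation. The ID is the last element of the operation name. For example:
- Operation name: projects/PROJECT_ID/locations/LOCATION_ID/operations/IOD5281059901324392598
- Operation ID: IOD5281059901324392598

HTTP method and URL: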

GET https://automl.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/operations/OPERATION_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_ID" \
     "https://automl.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/operations/OPERATION_ID"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://automl.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/operations/OPERATION_ID" | Select-Object -Expand Content

After an import operation completes, you should see output similar to the following:

{
  "name": "projects/PROJECT_ID/locations/us-central1/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.automl.v1.OperationMetadata",
    "createTime": "2018-10-29T15:56:29.176485Z",
    "updateTime": "2018-10-29T16:10:41.326614Z",
    "importDataDetails": {}
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.protobuf.Empty"
  }
}

After a create model operation completes, you should see output similar to the following:

{
  "name": "projects/PROJECT_ID/locations/us-central1/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.automl.v1.OperationMetadata",
    "createTime": "2019-07-22T18:35:06.881193Z",
    "updateTime": "2019-07-22T19:58:44.972235Z",
    "createModelDetails": {}
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.automl.v1.Model",
    "name": "projects/PROJECT_ID/locations/us-central1/models/MODEL_ID"
  }
}
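
Since a single GET only reports the current state, callers usually repeat the request until "done" is true. The loop below sketches that pattern without tying it to any HTTP library: fetch_operation is a caller-supplied function (an assumption of this example) that returns the parsed JSON of one status request, and the names wait_for_operation, poll_seconds, and timeout_seconds are likewise hypothetical.

```python
import time


def wait_for_operation(fetch_operation, poll_seconds=60.0, timeout_seconds=3600.0,
                       sleep=time.sleep):
    """Poll until the operation reports done, then return it (hypothetical helper)."""
    waited = 0.0
    while True:
        operation = fetch_operation()
        if operation.get("done"):
            if "error" in operation:
                error = operation["error"]
                raise RuntimeError(
                    f"operation failed: {error.get('code')} {error.get('message')}"
                )
            return operation
        if waited >= timeout_seconds:
            raise TimeoutError("operation did not finish in time")
        sleep(poll_seconds)
        waited += poll_seconds


# Simulated status responses stand in for real GET requests here.
_responses = iter(
    [
        {"name": "operations/OPERATION_ID"},
        {"name": "operations/OPERATION_ID", "done": True, "response": {}},
    ]
)
result = wait_for_operation(
    lambda: next(_responses), poll_seconds=1.0, sleep=lambda seconds: None
)
print(result["done"])  # prints True
```

Passing the sleep function as a parameter keeps the loop easy to exercise in tests without real waiting.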

Go

Before trying this sample, follow the setup instructions for this language on the APIs & Reference > Client Libraries page.

import (
	"context"
	"fmt"
	"io"

	automl "cloud.google.com/go/automl/apiv1"
	"cloud.google.com/go/automl/apiv1/automlpb"
)

// getOperationStatus starts a model-creation long-running operation, waits for it to
// finish, and reports the outcome.
func getOperationStatus(w io.Writer, projectID string, location string, datasetID string, modelName string) error {
	// projectID := "my-project-id"
	// location := "us-central1"
	// datasetID := "ICN123456789..."
	// modelName := "model_display_name"

	ctx := context.Background()
	client, err := automl.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("NewClient: %w", err)
	}
	defer client.Close()

	req := &automlpb.CreateModelRequest{
		Parent: fmt.Sprintf("projects/%s/locations/%s", projectID, location),
		Model: &automlpb.Model{
			DisplayName: modelName,
			DatasetId:   datasetID,
			ModelMetadata: &automlpb.Model_ImageClassificationModelMetadata{
				ImageClassificationModelMetadata: &automlpb.ImageClassificationModelMetadata{
					TrainBudgetMilliNodeHours: 1000, // 1000 milli-node hours are 1 hour
				},
			},
		},
	}

	op, err := client.CreateModel(ctx, req)
	if err != nil {
		return err
	}
	fmt.Fprintf(w, "Name: %v\n", op.Name())

	// Wait for the long-running operation to complete.
	resp, err := op.Wait(ctx)
	if err != nil && !op.Done() {
		fmt.Println("failed to fetch operation status", err)
		return err
	}
	if err != nil && op.Done() {
		fmt.Println("operation completed with error", err)
		return err
	}
	fmt.Fprintf(w, "Response: %v\n", resp)

	return nil
}

Java

Before trying this sample, follow the setup instructions for this language on the APIs & Reference > Client Libraries page.

import com.google.cloud.automl.v1.AutoMlClient;
import com.google.longrunning.Operation;
import java.io.IOException;

class GetOperationStatus {

  static void getOperationStatus() throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String operationFullId = "projects/[projectId]/locations/us-central1/operations/[operationId]";
    getOperationStatus(operationFullId);
  }

  // Get the status of an operation
  static void getOperationStatus(String operationFullId) throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (AutoMlClient client = AutoMlClient.create()) {
      // Get the latest state of a long-running operation.
      Operation operation = client.getOperationsClient().getOperation(operationFullId);

      // Display operation details.
      System.out.println("Operation details:");
      System.out.format("\tName: %s\n", operation.getName());
      System.out.format("\tMetadata Type Url: %s\n", operation.getMetadata().getTypeUrl());
      System.out.format("\tDone: %s\n", operation.getDone());
      if (operation.hasResponse()) {
        System.out.format("\tResponse Type Url: %s\n", operation.getResponse().getTypeUrl());
      }
      if (operation.hasError()) {
        System.out.println("\tResponse:");
        System.out.format("\t\tError code: %s\n", operation.getError().getCode());
        System.out.format("\t\tError message: %s\n", operation.getError().getMessage());
      }
    }
  }
}

Node.js

Before trying this sample, follow the setup instructions for this language on the APIs & Reference > Client Libraries page.

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const operationId = 'YOUR_OPERATION_ID';

// Imports the Google Cloud AutoML library
const {AutoMlClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new AutoMlClient();

async function getOperationStatus() {
  // Construct request
  const request = {
    name: `projects/${projectId}/locations/${location}/operations/${operationId}`,
  };

  const [response] = await client.operationsClient.getOperation(request);

  console.log(`Name: ${response.name}`);
  console.log('Operation details:');
  console.log(`${response}`);
}

getOperationStatus();

Python

Before trying this sample, follow the setup instructions for this language on the APIs & Reference > Client Libraries page.

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# operation_full_id = \
#     "projects/[projectId]/locations/us-central1/operations/[operationId]"

client = automl.AutoMlClient()
# Get the latest state of a long-running operation.
response = client._transport.operations_client.get_operation(operation_full_id)

print(f"Name: {response.name}")
print("Operation details:")
print(response)

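Additional languages

C#: Follow the C# setup instructions on the client libraries page, and then visit the AutoML Vision reference documentation for .NET.

PHP: Follow the PHP setup instructions on the client libraries page, and then visit the AutoML Vision reference documentation for PHP.

Ruby: Follow the Ruby setup instructions on the client libraries page, and then visit the AutoML Vision reference documentation for Ruby.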