
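Import models to Vertex AI

This guide describes how to import models into Model Registry. After you import a model, it appears in Model Registry, from which you can deploy it to an endpoint and run predictions.

Required roles

To get the permissions that you need to import models, ask your administrator to grant you the Vertex AI User (roles/aiplatform.user) IAM role on the project. For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.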

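Prebuilt or custom containers

When you import a model, you associate it with a container that Vertex AI uses to run prediction requests. You can use a prebuilt container provided by Vertex AI, or a custom container that you build and push to Artifact Registry.

You can use a prebuilt container if your model meets the following requirements:

Trained in Python 3.7 or later
Trained using TensorFlow, PyTorch, scikit-learn, or XGBoost
Exported to meet the framework-specific requirements of one of the prebuilt prediction containers

If you are importing a tabular AutoML model that you previously exported, you must use a specific custom container provided by Vertex AI.

Otherwise, create a new custom container, or use an existing custom container that you have in Artifact Registry.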

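Upload model artifacts to Cloud Storage

You must store your model artifacts in a Cloud Storage bucket whose region matches the regional endpoint that you're using.

If your Cloud Storage bucket is in a different Google Cloud project, you need to grant Vertex AI access to read your model artifacts.

If you're using a prebuilt container, make sure that your model artifacts have filenames that exactly match the following examples:

TensorFlow SavedModel: saved_model.pb
PyTorch: model.mar
scikit-learn: model.joblib or model.pkl
XGBoost: model.bst, model.joblib, or model.pkl

Learn more about exporting model artifacts for prediction.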
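
The exact-match filename requirement for prebuilt containers can be checked locally before anything is uploaded. The following Python sketch is one possible way to do that. The function name `find_model_artifact` and the framework keys (`"tensorflow"`, `"pytorch"`, `"sklearn"`, `"xgboost"`) are hypothetical names chosen for illustration and are not part of any Vertex AI API. Only the filenames come from the list above.

```python
from typing import Iterable

# Artifact filenames that each prebuilt container expects, per the list above.
# The dictionary keys are illustrative labels, not Vertex AI identifiers.
EXPECTED_ARTIFACT_NAMES = {
    "tensorflow": {"saved_model.pb"},
    "pytorch": {"model.mar"},
    "sklearn": {"model.joblib", "model.pkl"},
    "xgboost": {"model.bst", "model.joblib", "model.pkl"},
}


def find_model_artifact(framework: str, file_names: Iterable[str]) -> str:
    """Return the first filename that exactly matches an expected name.

    Raises ValueError if the framework is unknown or nothing matches.
    The comparison is case-sensitive because the match must be exact.
    """
    try:
        expected = EXPECTED_ARTIFACT_NAMES[framework]
    except KeyError:
        raise ValueError(f"Unknown framework: {framework!r}") from None
    for name in file_names:
        if name in expected:
            return name
    raise ValueError(
        f"No artifact found for {framework!r}; expected one of {sorted(expected)}"
    )
```

For example, `find_model_artifact("sklearn", os.listdir("model_dir"))` would return `model.joblib` or `model.pkl` if either file is present, and raise otherwise.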

Import a model using the Google Cloud console

To import a model using the Google Cloud console, do the following:

In the Google Cloud console, go to the Vertex AI Models page.

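Click Import.
Select Import as new model to import a new model.
Select Import as new version to import a model as a version of an existing model. To learn more about model versioning, see Model versioning.
Name and region: Enter a name for your model. Select the region that matches both your bucket's region and the Vertex AI regional endpoint that you're using. Click Continue.
If you expand Advanced options, you can optionally add a customer-managed encryption key.

Depending on the type of container that you're using, follow the matching set of steps below.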

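Prebuilt container

Select Import model artifacts into a new prebuilt container.
Select the Model framework and Model framework version that you used to train your model.
If you want to use GPUs to serve predictions, set the Accelerator type to GPUs.

You select the type of GPU later, when you deploy the model to an endpoint.
Specify the Cloud Storage path to the directory that contains your model artifacts.

For example, gs://BUCKET_NAME/models/.
Leave the Predict schemata blank.
To import your model without Vertex Explainable AI settings, click Import.

After the import has completed, your model appears on the Models page.

Otherwise, continue configuring your model by entering your explainability settings in the Explainability tab. Learn more about the Explainability settings.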

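Custom container

Select Import an existing custom container.
Set the Container image URI.
If you want to provide model artifacts in addition to a container image, specify the Cloud Storage path to the directory that contains your model artifacts.

For example, gs://BUCKET_NAME/models/.
Specify values for any of the other fields.

Learn more about these optional fields.
To import your model without Vertex Explainable AI settings, click Import.

After the import has completed, your model appears on the Models page.

Otherwise, continue configuring your model by entering your explainability settings in the Explainability tab. Learn more about the Explainability settings.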

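AutoML tabular container

Select Import an existing custom container.
In the Container image field, enter MULTI_REGION-docker.pkg.dev/vertex-ai/automl-tabular/prediction-server-v1:latest.

Replace MULTI_REGION with us, europe, or asia to select which Docker repository you want to pull the Docker image from. Each repository provides the same Docker image, but choosing the Artifact Registry multi-region closest to the machine where you are running Docker might reduce latency.
In the Package location field, specify the Cloud Storage path to the directory that contains your model artifacts.

The path looks similar to the following example:

gs://BUCKET_NAME/models-MODEL_ID/tf-saved-model/TIMESTAMP/
Leave all other fields blank.
Click Import.

After the import has completed, your model appears on the Models page. You can use this model just like other AutoML tabular models, except that imported AutoML tabular models don't support Vertex Explainable AI.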
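
The MULTI_REGION substitution in the AutoML tabular container image can be expressed as a small helper. The following Python sketch is illustrative only. The function name `automl_tabular_image_uri` is a hypothetical name, while the image path and the three allowed multi-regions are taken from the steps above.

```python
# Multi-regions named in this guide for the AutoML tabular prediction image.
ALLOWED_MULTI_REGIONS = ("us", "europe", "asia")


def automl_tabular_image_uri(multi_region: str) -> str:
    """Build the AutoML tabular container image URI for a multi-region.

    Raises ValueError for any value other than us, europe, or asia.
    """
    if multi_region not in ALLOWED_MULTI_REGIONS:
        raise ValueError(
            f"multi_region must be one of {ALLOWED_MULTI_REGIONS}, got {multi_region!r}"
        )
    return (
        f"{multi_region}-docker.pkg.dev/vertex-ai/automl-tabular/"
        "prediction-server-v1:latest"
    )
```

Rejecting unknown values early avoids pasting a URI into the Container image field that points at a repository that might not exist.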

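Import a model programmatically

The following examples show how to import a model using various tools:

gcloud

The following example uses the gcloud ai models upload command: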

gcloud ai models upload \
  --region=LOCATION_ID \
  --display-name=MODEL_NAME \
  --container-image-uri=IMAGE_URI \
  --artifact-uri=PATH_TO_MODEL_ARTIFACT_DIRECTORY

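Replace the following:

LOCATION_ID: The region where you are using Vertex AI.
MODEL_NAME: A display name for the Model.
IMAGE_URI: The URI of the container image to use for serving predictions. For example, us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-1:latest. Use a prebuilt container or a custom container.
PATH_TO_MODEL_ARTIFACT_DIRECTORY: The Cloud Storage URI (beginning with gs://) of the directory in Cloud Storage that contains your model artifacts.

The preceding example shows all the flags needed to import most models. If you are not using a prebuilt container for prediction, you probably need to specify some additional optional flags so that Vertex AI can use your container image. These flags, which begin with --container-, correspond to fields of your Model's containerSpec.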
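
To illustrate how the optional --container- flags extend the basic command, the following Python sketch assembles an argument list for gcloud ai models upload. The function name `build_upload_command` and its parameters are hypothetical. The three optional flags shown (--container-predict-route, --container-health-route, --container-ports) are given as examples of the --container- family, so confirm the exact flag set against `gcloud ai models upload --help` for your installed gcloud CLI version.

```python
import shlex
from typing import List, Optional, Sequence


def build_upload_command(
    region: str,
    display_name: str,
    image_uri: str,
    artifact_uri: Optional[str] = None,
    predict_route: Optional[str] = None,
    health_route: Optional[str] = None,
    ports: Optional[Sequence[int]] = None,
) -> List[str]:
    """Return the argument list for a `gcloud ai models upload` call."""
    cmd = [
        "gcloud", "ai", "models", "upload",
        f"--region={region}",
        f"--display-name={display_name}",
        f"--container-image-uri={image_uri}",
    ]
    # Optional flags are appended only when a value is supplied.
    if artifact_uri:
        cmd.append(f"--artifact-uri={artifact_uri}")
    if predict_route:
        cmd.append(f"--container-predict-route={predict_route}")
    if health_route:
        cmd.append(f"--container-health-route={health_route}")
    if ports:
        cmd.append("--container-ports=" + ",".join(str(p) for p in ports))
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    # The image URI and bucket below are placeholder values.
    print(shlex.join(build_upload_command(
        "us-central1", "my-model",
        "us-docker.pkg.dev/my-project/my-repo/my-image:latest",
        artifact_uri="gs://my-bucket/models/",
        predict_route="/predict", health_route="/health", ports=[8080],
    )))
```

Returning a list rather than a single string lets the result be passed directly to `subprocess.run` without shell quoting concerns.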

REST

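Use the following code sample to upload a model using the upload method of the model resource.

Before using any of the request data, make the following replacements:

LOCATION_ID: The region where you are using Vertex AI.
PROJECT_ID: Your project ID.
MODEL_NAME: A display name for the Model.
MODEL_DESCRIPTION: Optional. A description for the model.
IMAGE_URI: The URI of the container image to use for serving predictions. For example, us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-1:latest. Use a prebuilt container or a custom container.
PATH_TO_MODEL_ARTIFACT_DIRECTORY: The Cloud Storage URI (beginning with gs://) of the directory in Cloud Storage that contains your model artifacts. This variable and the artifactUri field are optional if you're using a custom container.
labels: Optional. Any set of key-value pairs to organize your models. For example:
- "env": "prod"
- "tier": "backend"
Specify the LABEL_NAME and LABEL_VALUE for any labels that you want to apply to this model.

HTTP method and URL: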

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/models:upload

Request JSON body:

{
  "model": {
    "displayName": "MODEL_NAME",
    "predictSchemata": {},
    "containerSpec": {
      "imageUri": "IMAGE_URI"
    },
    "artifactUri": "PATH_TO_MODEL_ARTIFACT_DIRECTORY",
    "labels": {
      "LABEL_NAME_1": "LABEL_VALUE_1",
      "LABEL_NAME_2": "LABEL_VALUE_2"
    }
  }
}
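
The placeholders in the URL and request body above can also be filled in programmatically. The following Python sketch builds both using only the standard library. The function name `build_upload_request` and its parameters are hypothetical. The sketch includes optional fields (description, artifactUri, labels) only when values are supplied, and it leaves out the empty predictSchemata object shown above, which is an assumption about what the API accepts rather than documented behavior.

```python
import json
from typing import Any, Dict, Optional, Tuple


def build_upload_request(
    project_id: str,
    location_id: str,
    display_name: str,
    image_uri: str,
    artifact_uri: Optional[str] = None,
    labels: Optional[Dict[str, str]] = None,
    description: Optional[str] = None,
) -> Tuple[str, Dict[str, Any]]:
    """Return the models:upload URL and the JSON-serializable request body."""
    model: Dict[str, Any] = {
        "displayName": display_name,
        "containerSpec": {"imageUri": image_uri},
    }
    # Optional fields are added only when a value is supplied.
    if description:
        model["description"] = description
    if artifact_uri:
        model["artifactUri"] = artifact_uri
    if labels:
        model["labels"] = dict(labels)
    url = (
        f"https://{location_id}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location_id}/models:upload"
    )
    return url, {"model": model}


if __name__ == "__main__":
    # Placeholder values; replace them with your own before use.
    url, body = build_upload_request(
        "my-project", "us-central1", "my-model",
        "us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-1:latest",
        artifact_uri="gs://my-bucket/models/",
        labels={"env": "prod", "tier": "backend"},
    )
    with open("request.json", "w", encoding="utf-8") as f:
        json.dump(body, f, indent=2)
    print(url)
```

Running the script writes request.json in the current directory and prints the URL to post it to.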

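To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you in to the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: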

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/models:upload"

PowerShell

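Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: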

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/models:upload" | Select-Object -Expand Content

Response

{
  "name": "projects/PROJECT_ID/locations/LOCATION_ID/models/MODEL_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.UploadModelOperationMetadata",
    "genericMetadata": {
      "createTime": "2020-11-10T23:44:21.777760Z",
      "updateTime": "2020-11-10T23:44:21.777760Z"
    }
  }
}
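
The name field in the response identifies the long-running upload operation and embeds the model ID. The following Python sketch shows one way to split it into its parts. The function name `parse_operation_name` is hypothetical, and the pattern assumes that the name follows the projects/.../locations/.../models/.../operations/... layout shown in the response above.

```python
import re
from typing import Dict

# Pattern for the operation name layout shown in the sample response.
_OPERATION_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/models/(?P<model>[^/]+)"
    r"/operations/(?P<operation>[^/]+)$"
)


def parse_operation_name(name: str) -> Dict[str, str]:
    """Split an upload operation name into project, location, model, operation.

    Raises ValueError if the name does not follow the expected layout.
    """
    match = _OPERATION_NAME.match(name)
    if match is None:
        raise ValueError(f"Unexpected operation name: {name!r}")
    return match.groupdict()
```

For example, `parse_operation_name(response["name"])["model"]` yields the ID of the model being created.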

Java

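Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.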


import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.aiplatform.v1.LocationName;
import com.google.cloud.aiplatform.v1.Model;
import com.google.cloud.aiplatform.v1.ModelContainerSpec;
import com.google.cloud.aiplatform.v1.ModelServiceClient;
import com.google.cloud.aiplatform.v1.ModelServiceSettings;
import com.google.cloud.aiplatform.v1.UploadModelOperationMetadata;
import com.google.cloud.aiplatform.v1.UploadModelResponse;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class UploadModelSample {
  public static void main(String[] args)
      throws InterruptedException, ExecutionException, TimeoutException, IOException {
    // TODO(developer): Replace these variables before running the sample.
    String project = "YOUR_PROJECT_ID";
    String modelDisplayName = "YOUR_MODEL_DISPLAY_NAME";
    String metadataSchemaUri =
        "gs://google-cloud-aiplatform/schema/trainingjob/definition/custom_task_1.0.0.yaml";
    String imageUri = "YOUR_IMAGE_URI";
    String artifactUri = "gs://your-gcs-bucket/artifact_path";
    uploadModel(project, modelDisplayName, metadataSchemaUri, imageUri, artifactUri);
  }

  static void uploadModel(
      String project,
      String modelDisplayName,
      String metadataSchemaUri,
      String imageUri,
      String artifactUri)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    ModelServiceSettings modelServiceSettings =
        ModelServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (ModelServiceClient modelServiceClient = ModelServiceClient.create(modelServiceSettings)) {
      String location = "us-central1";
      LocationName locationName = LocationName.of(project, location);

      ModelContainerSpec modelContainerSpec =
          ModelContainerSpec.newBuilder().setImageUri(imageUri).build();

      Model model =
          Model.newBuilder()
              .setDisplayName(modelDisplayName)
              .setMetadataSchemaUri(metadataSchemaUri)
              .setArtifactUri(artifactUri)
              .setContainerSpec(modelContainerSpec)
              .build();

      OperationFuture<UploadModelResponse, UploadModelOperationMetadata> uploadModelResponseFuture =
          modelServiceClient.uploadModelAsync(locationName, model);
      System.out.format(
          "Operation name: %s\n", uploadModelResponseFuture.getInitialFuture().get().getName());
      System.out.println("Waiting for operation to finish...");
      UploadModelResponse uploadModelResponse = uploadModelResponseFuture.get(5, TimeUnit.MINUTES);

      System.out.println("Upload Model Response");
      System.out.format("Model: %s\n", uploadModelResponse.getModel());
    }
  }
}

Node.js

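Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.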

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */

// const modelDisplayName = 'YOUR_MODEL_DISPLAY_NAME';
// const metadataSchemaUri = 'YOUR_METADATA_SCHEMA_URI';
// const imageUri = 'YOUR_IMAGE_URI';
// const artifactUri = 'YOUR_ARTIFACT_URI';
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';

// Imports the Google Cloud Model Service Client library
const {ModelServiceClient} = require('@google-cloud/aiplatform');

// Specifies the location of the API endpoint; its region should match `location`
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const modelServiceClient = new ModelServiceClient(clientOptions);

async function uploadModel() {
  // Configure the parent resources
  const parent = `projects/${project}/locations/${location}`;
  // Configure the model resources
  const model = {
    displayName: modelDisplayName,
    metadataSchemaUri: '',
    artifactUri: artifactUri,
    containerSpec: {
      imageUri: imageUri,
      command: [],
      args: [],
      env: [],
      ports: [],
      predictRoute: '',
      healthRoute: '',
    },
  };
  const request = {
    parent,
    model,
  };

  console.log('PARENT AND MODEL');
  console.log(parent, model);
  // Upload Model request
  const [response] = await modelServiceClient.uploadModel(request);
  console.log(`Long running operation : ${response.name}`);

  // Wait for operation to complete
  await response.promise();
  const result = response.result;

  console.log('Upload model response ');
  console.log(`\tModel : ${result.model}`);
}
uploadModel();

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from typing import Dict, Optional, Sequence

from google.cloud import aiplatform
from google.cloud.aiplatform import explain


def upload_model_sample(
    project: str,
    location: str,
    display_name: str,
    serving_container_image_uri: str,
    artifact_uri: Optional[str] = None,
    serving_container_predict_route: Optional[str] = None,
    serving_container_health_route: Optional[str] = None,
    description: Optional[str] = None,
    serving_container_command: Optional[Sequence[str]] = None,
    serving_container_args: Optional[Sequence[str]] = None,
    serving_container_environment_variables: Optional[Dict[str, str]] = None,
    serving_container_ports: Optional[Sequence[int]] = None,
    instance_schema_uri: Optional[str] = None,
    parameters_schema_uri: Optional[str] = None,
    prediction_schema_uri: Optional[str] = None,
    explanation_metadata: Optional[explain.ExplanationMetadata] = None,
    explanation_parameters: Optional[explain.ExplanationParameters] = None,
    sync: bool = True,
):
    """Upload a model to Vertex AI Model Registry and return the Model object."""

    aiplatform.init(project=project, location=location)

    model = aiplatform.Model.upload(
        display_name=display_name,
        artifact_uri=artifact_uri,
        serving_container_image_uri=serving_container_image_uri,
        serving_container_predict_route=serving_container_predict_route,
        serving_container_health_route=serving_container_health_route,
        instance_schema_uri=instance_schema_uri,
        parameters_schema_uri=parameters_schema_uri,
        prediction_schema_uri=prediction_schema_uri,
        description=description,
        serving_container_command=serving_container_command,
        serving_container_args=serving_container_args,
        serving_container_environment_variables=serving_container_environment_variables,
        serving_container_ports=serving_container_ports,
        explanation_metadata=explanation_metadata,
        explanation_parameters=explanation_parameters,
        sync=sync,
    )

    # Block until the upload finishes (matters when sync=False).
    model.wait()

    print(model.display_name)
    print(model.resource_name)
    return model

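To import a model with Vertex Explainable AI settings enabled, refer to the Vertex Explainable AI model import examples.

Get operation status

Some requests start long-running operations that take time to complete. These requests return an operation name, which you can use to view the operation's status or cancel the operation. Vertex AI provides helper methods to make calls against long-running operations. For more information, see Working with long-running operations.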
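
As a rough illustration of checking an operation's status, the following Python sketch polls an operation until it reports completion. It assumes the standard long-running operation shape, with a done flag and either a response or an error field. It also assumes that the status is read with a GET request to the regional endpoint followed by /v1/ and the operation name. The function names `operation_url` and `wait_for_operation` are hypothetical, and the HTTP call is passed in as a `fetch` callable (assumed to return the parsed JSON as a dict) so that the sketch does not depend on any particular HTTP or authentication library.

```python
import time
from typing import Any, Callable, Dict


def operation_url(location_id: str, operation_name: str) -> str:
    """Build the assumed status URL for a long-running operation."""
    return f"https://{location_id}-aiplatform.googleapis.com/v1/{operation_name}"


def wait_for_operation(
    fetch: Callable[[str], Dict[str, Any]],
    url: str,
    poll_seconds: float = 5.0,
    timeout_seconds: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll `url` with `fetch` until the operation is done.

    Returns the operation's response payload, raises RuntimeError if the
    operation finished with an error, or TimeoutError if the deadline passes.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        operation = fetch(url)
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError(f"Operation failed: {operation['error']}")
            return operation.get("response", {})
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Operation not done after {timeout_seconds} seconds")
        sleep(poll_seconds)
```

In practice, `fetch` would issue an authenticated GET request and return the decoded JSON. The client libraries shown earlier already wrap this loop, so a hand-written poller is mainly useful with the REST interface.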

Limitations

The maximum supported model size is 10 GiB.
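
A local size check before uploading can catch oversized models early. The following Python sketch sums the file sizes under a local artifact directory and compares the total with 10 GiB. The function names are hypothetical, and the sketch assumes that the limit applies to the combined size of the files in the artifact directory, which this guide does not state explicitly.

```python
import os

# 10 GiB limit from this guide, expressed in bytes.
MAX_MODEL_BYTES = 10 * 1024**3


def directory_size_bytes(path: str) -> int:
    """Return the total size in bytes of all files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for file_name in files:
            total += os.path.getsize(os.path.join(root, file_name))
    return total


def check_model_size(path: str, limit: int = MAX_MODEL_BYTES) -> int:
    """Return the directory size, or raise ValueError if it exceeds `limit`."""
    size = directory_size_bytes(path)
    if size > limit:
        raise ValueError(f"Model artifacts total {size} bytes, over the {limit}-byte limit")
    return size
```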

What's next

Deploy your model to an endpoint, programmatically or by using the Google Cloud console.