AutoML Natural Language API 教程

本教程演示了如何使用 AutoML Natural Language 创建对内容进行分类的自定义模型。该应用程序使用来自 Kaggle 开源数据集 HappyDB 的众包“快乐时刻”语料库训练自定义模型。由此产生的模型将幸福时刻分成反映幸福原因的不同类别。

该数据通过知识共享 CCO:公共领域许可提供。

本教程介绍如何训练自定义模型、评估其性能以及对新内容进行分类。

前提条件

配置项目环境

  1. 登录您的 Google 帐号。

    如果您还没有 Google 帐号,请注册新帐号

  2. 在 Google Cloud Console 的项目选择器页面上,选择或创建一个 Google Cloud 项目。

    转到项目选择器页面

  3. 确保您的 Cloud 项目已启用结算功能。 了解如何确认您的项目是否已启用结算功能

  4. 启用 AutoML Natural Language API。

    启用 API

  5. 安装 gcloud 命令行工具
  6. 按照说明创建服务帐号并下载密钥文件
  7. GOOGLE_APPLICATION_CREDENTIALS 环境变量设置为如下路径:指向您在创建服务帐号时所下载的服务帐号密钥文件的路径。例如:
         export GOOGLE_APPLICATION_CREDENTIALS=key-file
  8. 使用以下命令将您的新服务帐号添加到 IAM 角色 AutoML 编辑者。将 project-id 替换为您的 GCP 项目的名称,并将 service-account-name 替换为新服务帐号的名称,例如 service-account1@myproject.iam.gserviceaccount.com
         gcloud auth login
         gcloud config set project project-id
         gcloud projects add-iam-policy-binding project-id 
    --member=serviceAccount:service-account-name
    --role='roles/automl.editor'
  9. 允许 AutoML Natural Language 服务帐号访问您的 Google Cloud 项目资源:
    gcloud projects add-iam-policy-binding project-id 
    --member="serviceAccount:custom-vision@appspot.gserviceaccount.com"
    --role="roles/storage.admin"
  10. 安装客户端库
  11. 设置 PROJECT_ID 和 REGION_NAME 环境变量。

    project-id 替换为您的 Google Cloud Platform 项目 ID。 AutoML Natural Language 目前要求将位置设为 us-central1
         export PROJECT_ID="project-id"
         export REGION_NAME="us-central1"
         
  12. 创建一个 Google Cloud Storage 存储分区以存储您将用于训练自定义模型的文档。

    存储分区名称必须采用以下格式: $PROJECT_ID-lcm. 以下命令将在 us-central1 区域中创建一个名为 $PROJECT_ID-lcm 的存储分区。
    gsutil mb -p $PROJECT_ID -c regional -l $REGION_NAME gs://$PROJECT_ID-lcm/
  13. happiness.csv 文件从公共存储分区复制到您的 Google Cloud Storage 存储分区。

    happiness.csv 文件位于公共存储分区 cloud-ml-dataNL-classification 文件夹中。

源代码文件位置

如果您需要源代码,可以在此处找到。您可以随时将源代码文件复制到您的 Google Cloud Platform 项目文件夹中。否则,我们建议您在执行每个步骤时直接从本页面复制代码。

Python

本教程包含以下 Python 程序

  • language_text_classification_create_dataset.py - 包含创建数据集的功能
  • import_dataset.py - 包含导入数据集的功能
  • language_text_classification_create_model.py - 包含创建模型的功能
  • list_model_evaluations.py - 包含列出模型评估的功能
  • language_text_classification_predict.py - 包含与预测相关的功能
  • delete_model.py - 包含删除模型的功能

Java

本教程包含以下 Java 文件

  • LanguageTextClassificationCreateDataset.java - 包含创建数据集的功能
  • ImportDataset.java - 包含导入数据集的功能
  • LanguageTextClassificationCreateModel.java - 包含创建模型的功能
  • ListModelEvaluations.java - 包含列出模型评估的功能
  • LanguageTextClassificationPredict.java - 包含与预测相关的功能
  • DeleteModel.java - 包含删除模型的功能

Node.js

本教程包含以下 Node.js 程序

  • language_text_classification_create_dataset.js - 包含创建数据集的功能
  • import_dataset.js - 包含导入数据集的功能
  • language_text_classification_create_model.js - 包含创建模型的功能
  • list_model_evaluations.js - 包含列出模型评估的功能
  • language_text_classification_predict.js - 包含与预测相关的功能
  • delete_model.js - 包含删除模型的功能

运行应用

第 1 步:创建数据集

如要创建自定义模型,首先需要创建一个空数据集,该数据集最终用于保存模型的训练数据。创建数据集时,您可以指定希望自定义模型执行的分类类型:

  • 多类别 (MULTICLASS) 分类为每个已分类文档分配一个标签
  • 多标签 (MULTILABEL) 允许为一个文档分配多个标签

本教程会创建一个名为“happydb”的数据集并使用多类别分类。

复制代码

Python

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"
# display_name = "YOUR_DATASET_NAME"

client = automl.AutoMlClient()

# A resource that represents Google Cloud Platform location.
project_location = f"projects/{project_id}/locations/us-central1"
# Specify the classification type
# Types:
# MultiLabel: Multiple labels are allowed for one example.
# MultiClass: At most one label is allowed per example.
metadata = automl.TextClassificationDatasetMetadata(
    classification_type=automl.ClassificationType.MULTICLASS
)
dataset = automl.Dataset(
    display_name=display_name,
    text_classification_dataset_metadata=metadata,
)

# Create a dataset with the dataset metadata in the region.
response = client.create_dataset(parent=project_location, dataset=dataset)

created_dataset = response.result()

# Display the dataset information
print("Dataset name: {}".format(created_dataset.name))
print("Dataset id: {}".format(created_dataset.name.split("/")[-1]))

Java

import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.automl.v1.AutoMlClient;
import com.google.cloud.automl.v1.ClassificationType;
import com.google.cloud.automl.v1.Dataset;
import com.google.cloud.automl.v1.LocationName;
import com.google.cloud.automl.v1.OperationMetadata;
import com.google.cloud.automl.v1.TextClassificationDatasetMetadata;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

class LanguageTextClassificationCreateDataset {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    String displayName = "YOUR_DATASET_NAME";
    createDataset(projectId, displayName);
  }

  // Create a dataset
  static void createDataset(String projectId, String displayName)
      throws IOException, ExecutionException, InterruptedException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (AutoMlClient client = AutoMlClient.create()) {
      // A resource that represents Google Cloud Platform location.
      LocationName projectLocation = LocationName.of(projectId, "us-central1");

      // Specify the classification type
      // Types:
      // MultiLabel: Multiple labels are allowed for one example.
      // MultiClass: At most one label is allowed per example.
      ClassificationType classificationType = ClassificationType.MULTILABEL;

      // Specify the text classification type for the dataset.
      TextClassificationDatasetMetadata metadata =
          TextClassificationDatasetMetadata.newBuilder()
              .setClassificationType(classificationType)
              .build();
      Dataset dataset =
          Dataset.newBuilder()
              .setDisplayName(displayName)
              .setTextClassificationDatasetMetadata(metadata)
              .build();
      OperationFuture<Dataset, OperationMetadata> future =
          client.createDatasetAsync(projectLocation, dataset);

      Dataset createdDataset = future.get();

      // Display the dataset information.
      System.out.format("Dataset name: %s\n", createdDataset.getName());
      // To get the dataset id, you have to parse it out of the `name` field. As dataset Ids are
      // required for other methods.
      // Name Form: `projects/{project_id}/locations/{location_id}/datasets/{dataset_id}`
      String[] names = createdDataset.getName().split("/");
      String datasetId = names[names.length - 1];
      System.out.format("Dataset id: %s\n", datasetId);
    }
  }
}

Node.js

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const displayName = 'YOUR_DISPLAY_NAME';

// Imports the Google Cloud AutoML library
const {AutoMlClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new AutoMlClient();

async function createDataset() {
  // Construct request
  const request = {
    parent: client.locationPath(projectId, location),
    dataset: {
      displayName: displayName,
      textClassificationDatasetMetadata: {
        classificationType: 'MULTICLASS',
      },
    },
  };

  // Create dataset
  const [operation] = await client.createDataset(request);

  // Wait for operation to complete.
  const [response] = await operation.promise();

  console.log(`Dataset name: ${response.name}`);
  console.log(`
    Dataset id: ${
      response.name
        .split('/')
        [response.name.split('/').length - 1].split('\n')[0]
    }`);
}

createDataset();

请求

运行 create_dataset 函数以创建空数据集。您需要修改以下代码行:

  • project_id 设置为您的 PROJECT_ID
  • 为数据集 (happydb) 设置 display_name

Python

python language_text_classification_create_dataset.py

Java

mvn compile exec:java -Dexec.mainClass="com.example.automl.LanguageTextClassificationCreateDataset"

Node.js

node language_text_classification_create_dataset.js

响应

响应中包含新创建的数据集的详细信息,其中包括数据集 ID(用于在以后的请求中引用该数据集)。我们建议您将环境变量 DATASET_ID 设置为返回的数据集 ID 值。

Dataset name: projects/216065747626/locations/us-central1/datasets/TCN7372141011130533778
Dataset id: TCN7372141011130533778
Dataset display name: happydb
Text classification dataset specification:
       classification_type: MULTICLASS
Dataset example count: 0
Dataset create time:
       seconds: 1530251987
       nanos: 216586000

第 2 步:将训练项导入数据集

接下来通过使用目标类别标记的训练内容项列表填充数据集。

import_dataset 函数接口接受 .csv 文件作为输入,该文件列出了所有训练文档的位置以及每个训练文档的正确标签。(如需详细了解所需格式,请参阅准备训练数据。)在本教程中,我们将使用上一步中您上传到 Google Cloud Storage 的 happiness.csv

复制代码

Python

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"
# dataset_id = "YOUR_DATASET_ID"
# path = "gs://YOUR_BUCKET_ID/path/to/data.csv"

client = automl.AutoMlClient()
# Get the full path of the dataset.
dataset_full_id = client.dataset_path(
    project_id, "us-central1", dataset_id
)
# Get the multiple Google Cloud Storage URIs
input_uris = path.split(",")
gcs_source = automl.GcsSource(input_uris=input_uris)
input_config = automl.InputConfig(gcs_source=gcs_source)
# Import data from the input URI
response = client.import_data(name=dataset_full_id, input_config=input_config)

print("Processing import...")
print("Data imported. {}".format(response.result()))

Java

import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.automl.v1.AutoMlClient;
import com.google.cloud.automl.v1.DatasetName;
import com.google.cloud.automl.v1.GcsSource;
import com.google.cloud.automl.v1.InputConfig;
import com.google.cloud.automl.v1.OperationMetadata;
import com.google.protobuf.Empty;
import java.io.IOException;
import java.util.Arrays;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class ImportDataset {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    String datasetId = "YOUR_DATASET_ID";
    String path = "gs://BUCKET_ID/path_to_training_data.csv";
    importDataset(projectId, datasetId, path);
  }

  // Import a dataset
  static void importDataset(String projectId, String datasetId, String path)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (AutoMlClient client = AutoMlClient.create()) {
      // Get the complete path of the dataset.
      DatasetName datasetFullId = DatasetName.of(projectId, "us-central1", datasetId);

      // Get multiple Google Cloud Storage URIs to import data from
      GcsSource gcsSource =
          GcsSource.newBuilder().addAllInputUris(Arrays.asList(path.split(","))).build();

      // Import data from the input URI
      InputConfig inputConfig = InputConfig.newBuilder().setGcsSource(gcsSource).build();
      System.out.println("Processing import...");

      // Start the import job
      OperationFuture<Empty, OperationMetadata> operation =
          client.importDataAsync(datasetFullId, inputConfig);

      System.out.format("Operation name: %s%n", operation.getName());

      // If you want to wait for the operation to finish, adjust the timeout appropriately. The
      // operation will still run if you choose not to wait for it to complete. You can check the
      // status of your operation using the operation's name.
      Empty response = operation.get(45, TimeUnit.MINUTES);
      System.out.format("Dataset imported. %s%n", response);
    } catch (TimeoutException e) {
      System.out.println("The operation's polling period was not long enough.");
      System.out.println("You can use the Operation's name to get the current status.");
      System.out.println("The import job is still running and will complete as expected.");
      throw e;
    }
  }
}

Node.js

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const datasetId = 'YOUR_DISPLAY_ID';
// const path = 'gs://BUCKET_ID/path_to_training_data.csv';

// Imports the Google Cloud AutoML library
const {AutoMlClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new AutoMlClient();

async function importDataset() {
  // Construct request
  const request = {
    name: client.datasetPath(projectId, location, datasetId),
    inputConfig: {
      gcsSource: {
        inputUris: path.split(','),
      },
    },
  };

  // Import dataset
  console.log('Proccessing import');
  const [operation] = await client.importData(request);

  // Wait for operation to complete.
  const [response] = await operation.promise();
  console.log(`Dataset imported: ${response}`);
}

importDataset();

请求

运行 import_data 函数以导入训练内容。需要更改的第一部分代码是上一步中的数据集 ID,第二部分代码是 happiness.csv 的 URI。您需要修改以下代码行:

  • project_id 设置为您的 PROJECT_ID
  • 为数据集设置 dataset_id(来自上一步的输出)
  • 设置 path,这是 gs://YOUR_PROJECT_ID-lcm/csv/happiness.csv 的 URI

Python

python import_dataset.py

Java

mvn compile exec:java -Dexec.mainClass="com.example.automl.ImportDataset"

Node.js

node import_dataset.js

响应

Processing import...
Dataset imported.

第 3 步:创建(训练)模型

拥有已导入带标签训练文档的数据集后,您就可以训练新模型了。

复制代码

Python

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"
# dataset_id = "YOUR_DATASET_ID"
# display_name = "YOUR_MODEL_NAME"

client = automl.AutoMlClient()

# A resource that represents Google Cloud Platform location.
project_location = f"projects/{project_id}/locations/us-central1"
# Leave model unset to use the default base model provided by Google
metadata = automl.TextClassificationModelMetadata()
model = automl.Model(
    display_name=display_name,
    dataset_id=dataset_id,
    text_classification_model_metadata=metadata,
)

# Create a model with the model metadata in the region.
response = client.create_model(parent=project_location, model=model)

print(u"Training operation name: {}".format(response.operation.name))
print("Training started...")

Java

import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.automl.v1.AutoMlClient;
import com.google.cloud.automl.v1.LocationName;
import com.google.cloud.automl.v1.Model;
import com.google.cloud.automl.v1.OperationMetadata;
import com.google.cloud.automl.v1.TextClassificationModelMetadata;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

class LanguageTextClassificationCreateModel {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    String datasetId = "YOUR_DATASET_ID";
    String displayName = "YOUR_DATASET_NAME";
    createModel(projectId, datasetId, displayName);
  }

  // Create a model
  static void createModel(String projectId, String datasetId, String displayName)
      throws IOException, ExecutionException, InterruptedException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (AutoMlClient client = AutoMlClient.create()) {
      // A resource that represents Google Cloud Platform location.
      LocationName projectLocation = LocationName.of(projectId, "us-central1");
      // Set model metadata.
      TextClassificationModelMetadata metadata =
          TextClassificationModelMetadata.newBuilder().build();
      Model model =
          Model.newBuilder()
              .setDisplayName(displayName)
              .setDatasetId(datasetId)
              .setTextClassificationModelMetadata(metadata)
              .build();

      // Create a model with the model metadata in the region.
      OperationFuture<Model, OperationMetadata> future =
          client.createModelAsync(projectLocation, model);
      // OperationFuture.get() will block until the model is created, which may take several hours.
      // You can use OperationFuture.getInitialFuture to get a future representing the initial
      // response to the request, which contains information while the operation is in progress.
      System.out.format("Training operation name: %s\n", future.getInitialFuture().get().getName());
      System.out.println("Training started...");
    }
  }
}

Node.js

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const dataset_id = 'YOUR_DATASET_ID';
// const displayName = 'YOUR_DISPLAY_NAME';

// Imports the Google Cloud AutoML library
const {AutoMlClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new AutoMlClient();

async function createModel() {
  // Construct request
  const request = {
    parent: client.locationPath(projectId, location),
    model: {
      displayName: displayName,
      datasetId: datasetId,
      textClassificationModelMetadata: {}, // Leave unset, to use the default base model
    },
  };

  // Don't wait for the LRO
  const [operation] = await client.createModel(request);
  console.log(`Training started... ${operation}`);
  console.log(`Training operation name: ${operation.name}`);
}

createModel();

请求

调用 create_model 函数以创建模型。数据集 ID 来自之前的步骤。您需要修改以下代码行:

  • project_id 设置为您的 PROJECT_ID
  • 为数据集设置 dataset_id(来自上一步的输出)
  • 为您的模型设置 display_name (happydb_model)

Python

python language_text_classification_create_model.py

Java

mvn compile exec:java -Dexec.mainClass="com.example.automl.LanguageTextClassificationCreateModel"

Node.js

node language_text_classification_create_model.js

响应

create_model 函数开始训练操作并输出操作名称。训练是异步进行的,可能需要一段时间才能完成,因此您可以使用操作 ID 来检查训练状态。训练完成后,create_model 会返回模型 ID。与数据集 ID 一样,我们建议您将环境变量 MODEL_ID 设置为返回的模型 ID 值。

Training operation name: projects/216065747626/locations/us-central1/operations/TCN3007727620979824033
Training started...
Model name: projects/216065747626/locations/us-central1/models/TCN7683346839371803263
Model id: TCN7683346839371803263
Model display name: happydb_model
Model create time:
        seconds: 1529649600
        nanos: 966000000
Model deployment state: deployed

第 4 步:评估模型

训练结束后,您可以通过查看模型的精确率召回率F1 得分来评估其就绪情况。

display_evaluation 函数接受模型 ID 作为参数。

复制代码

Python

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"
# model_id = "YOUR_MODEL_ID"

client = automl.AutoMlClient()
# Get the full path of the model.
model_full_id = client.model_path(project_id, "us-central1", model_id)

print("List of model evaluations:")
for evaluation in client.list_model_evaluations(parent=model_full_id, filter=""):
    print("Model evaluation name: {}".format(evaluation.name))
    print(
        "Model annotation spec id: {}".format(
            evaluation.annotation_spec_id
        )
    )
    print("Create Time: {}".format(evaluation.create_time))
    print(
        "Evaluation example count: {}".format(
            evaluation.evaluated_example_count
        )
    )
    print(
        "Classification model evaluation metrics: {}".format(
            evaluation.classification_evaluation_metrics
        )
    )

Java


import com.google.cloud.automl.v1.AutoMlClient;
import com.google.cloud.automl.v1.ListModelEvaluationsRequest;
import com.google.cloud.automl.v1.ModelEvaluation;
import com.google.cloud.automl.v1.ModelName;
import java.io.IOException;

class ListModelEvaluations {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    String modelId = "YOUR_MODEL_ID";
    listModelEvaluations(projectId, modelId);
  }

  // List model evaluations
  static void listModelEvaluations(String projectId, String modelId) throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (AutoMlClient client = AutoMlClient.create()) {
      // Get the full path of the model.
      ModelName modelFullId = ModelName.of(projectId, "us-central1", modelId);
      ListModelEvaluationsRequest modelEvaluationsrequest =
          ListModelEvaluationsRequest.newBuilder().setParent(modelFullId.toString()).build();

      // List all the model evaluations in the model by applying filter.
      System.out.println("List of model evaluations:");
      for (ModelEvaluation modelEvaluation :
          client.listModelEvaluations(modelEvaluationsrequest).iterateAll()) {

        System.out.format("Model Evaluation Name: %s\n", modelEvaluation.getName());
        System.out.format("Model Annotation Spec Id: %s", modelEvaluation.getAnnotationSpecId());
        System.out.println("Create Time:");
        System.out.format("\tseconds: %s\n", modelEvaluation.getCreateTime().getSeconds());
        System.out.format("\tnanos: %s", modelEvaluation.getCreateTime().getNanos() / 1e9);
        System.out.format(
            "Evalution Example Count: %d\n", modelEvaluation.getEvaluatedExampleCount());
        System.out.format(
            "Classification Model Evaluation Metrics: %s\n",
            modelEvaluation.getClassificationEvaluationMetrics());
      }
    }
  }
}

Node.js

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const modelId = 'YOUR_MODEL_ID';

// Imports the Google Cloud AutoML library
const {AutoMlClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new AutoMlClient();

async function listModelEvaluations() {
  // Construct request
  const request = {
    parent: client.modelPath(projectId, location, modelId),
    filter: '',
  };

  const [response] = await client.listModelEvaluations(request);

  console.log('List of model evaluations:');
  for (const evaluation of response) {
    console.log(`Model evaluation name: ${evaluation.name}`);
    console.log(`Model annotation spec id: ${evaluation.annotationSpecId}`);
    console.log(`Model display name: ${evaluation.displayName}`);
    console.log('Model create time');
    console.log(`\tseconds ${evaluation.createTime.seconds}`);
    console.log(`\tnanos ${evaluation.createTime.nanos / 1e9}`);
    console.log(
      `Evaluation example count: ${evaluation.evaluatedExampleCount}`
    );
    console.log(
      `Classification model evaluation metrics: ${evaluation.classificationEvaluationMetrics}`
    );
  }
}

listModelEvaluations();

请求

通过执行以下请求来发出显示模型整体评估性能的请求。您需要修改以下代码行:

  • project_id 设置为您的 PROJECT_ID
  • model_id 设置为您的模型 ID

Python

python list_model_evaluations.py

Java

mvn compile exec:java -Dexec.mainClass="com.example.automl.ListModelEvaluations"

Node.js

node list_model_evaluations.js

响应

如果精确率和召回率分数过低,您可以强化训练数据集并重新训练模型。如需了解详情,请参阅评估模型

Precision and recall are based on a score threshold of 0.5
Model Precision: 96.3%
Model Recall: 95.7%
Model F1 score: 96.0%
Model Precision@1: 96.33%
Model Recall@1: 95.74%
Model F1 score@1: 96.04%

第 5 步:部署模型

当自定义模型达到您的质量标准时,就可以部署模型然后发出预测请求。

复制代码

Python

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"
# model_id = "YOUR_MODEL_ID"

client = automl.AutoMlClient()
# Get the full path of the model.
model_full_id = client.model_path(project_id, "us-central1", model_id)
response = client.deploy_model(name=model_full_id)

print(f"Model deployment finished. {response.result()}")

Java

import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.automl.v1.AutoMlClient;
import com.google.cloud.automl.v1.DeployModelRequest;
import com.google.cloud.automl.v1.ModelName;
import com.google.cloud.automl.v1.OperationMetadata;
import com.google.protobuf.Empty;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

class DeployModel {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    String modelId = "YOUR_MODEL_ID";
    deployModel(projectId, modelId);
  }

  // Deploy a model for prediction
  static void deployModel(String projectId, String modelId)
      throws IOException, ExecutionException, InterruptedException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (AutoMlClient client = AutoMlClient.create()) {
      // Get the full path of the model.
      ModelName modelFullId = ModelName.of(projectId, "us-central1", modelId);
      DeployModelRequest request =
          DeployModelRequest.newBuilder().setName(modelFullId.toString()).build();
      OperationFuture<Empty, OperationMetadata> future = client.deployModelAsync(request);

      future.get();
      System.out.println("Model deployment finished");
    }
  }
}

Node.js

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const modelId = 'YOUR_MODEL_ID';

// Imports the Google Cloud AutoML library
const {AutoMlClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new AutoMlClient();

async function deployModel() {
  // Construct request
  const request = {
    name: client.modelPath(projectId, location, modelId),
  };

  const [operation] = await client.deployModel(request);

  // Wait for operation to complete.
  const [response] = await operation.promise();
  console.log(`Model deployment finished. ${response}`);
}

deployModel();

请求

对于 deploy_model 函数,您需要修改以下代码行:

  • project_id 设置为您的 PROJECT_ID
  • model_id 设置为您的模型 ID

Python

python deploy_model.py

Java

mvn compile exec:java -Dexec.mainClass="com.example.automl.DeployModel.java"

Node.js

node deploy_model.js

响应

Model deployment finished.

第 6 步:使用模型进行预测

部署模型后,您就可以使用它来对新内容进行分类了。

复制代码

Python

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"
# model_id = "YOUR_MODEL_ID"
# content = "text to predict"

prediction_client = automl.PredictionServiceClient()

# Get the full path of the model.
model_full_id = automl.AutoMlClient.model_path(
    project_id, "us-central1", model_id
)

# Supported mime_types: 'text/plain', 'text/html'
# https://cloud.google.com/automl/docs/reference/rpc/google.cloud.automl.v1#textsnippet
text_snippet = automl.TextSnippet(
    content=content, mime_type="text/plain"
)
payload = automl.ExamplePayload(text_snippet=text_snippet)

response = prediction_client.predict(name=model_full_id, payload=payload)

for annotation_payload in response.payload:
    print(
        u"Predicted class name: {}".format(annotation_payload.display_name)
    )
    print(
        u"Predicted class score: {}".format(
            annotation_payload.classification.score
        )
    )

Java

import com.google.cloud.automl.v1.AnnotationPayload;
import com.google.cloud.automl.v1.ExamplePayload;
import com.google.cloud.automl.v1.ModelName;
import com.google.cloud.automl.v1.PredictRequest;
import com.google.cloud.automl.v1.PredictResponse;
import com.google.cloud.automl.v1.PredictionServiceClient;
import com.google.cloud.automl.v1.TextSnippet;
import java.io.IOException;

class LanguageTextClassificationPredict {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    String modelId = "YOUR_MODEL_ID";
    String content = "text to predict";
    predict(projectId, modelId, content);
  }

  static void predict(String projectId, String modelId, String content) throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (PredictionServiceClient client = PredictionServiceClient.create()) {
      // Get the full path of the model.
      ModelName name = ModelName.of(projectId, "us-central1", modelId);

      // For available mime types, see:
      // https://cloud.google.com/automl/docs/reference/rest/v1/projects.locations.models/predict#textsnippet
      TextSnippet textSnippet =
          TextSnippet.newBuilder()
              .setContent(content)
              .setMimeType("text/plain") // Types: text/plain, text/html
              .build();
      ExamplePayload payload = ExamplePayload.newBuilder().setTextSnippet(textSnippet).build();
      PredictRequest predictRequest =
          PredictRequest.newBuilder().setName(name.toString()).setPayload(payload).build();

      PredictResponse response = client.predict(predictRequest);

      for (AnnotationPayload annotationPayload : response.getPayloadList()) {
        System.out.format("Predicted class name: %s\n", annotationPayload.getDisplayName());
        System.out.format(
            "Predicted sentiment score: %.2f\n\n",
            annotationPayload.getClassification().getScore());
      }
    }
  }
}

Node.js

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const modelId = 'YOUR_MODEL_ID';
// const content = 'text to predict'

// Imports the Google Cloud AutoML library
const {PredictionServiceClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new PredictionServiceClient();

async function predict() {
  // Construct request
  const request = {
    name: client.modelPath(projectId, location, modelId),
    payload: {
      textSnippet: {
        content: content,
        mimeType: 'text/plain', // Types: 'test/plain', 'text/html'
      },
    },
  };

  const [response] = await client.predict(request);

  for (const annotationPayload of response.payload) {
    console.log(`Predicted class name: ${annotationPayload.displayName}`);
    console.log(
      `Predicted class score: ${annotationPayload.classification.score}`
    );
  }
}

predict();

请求

对于 predict 函数,您需要修改以下代码行:

  • project_id 设置为您的 PROJECT_ID
  • model_id 设置为您的模型 ID
  • 设置您要预测的 content

Python

python language_text_classification_predict.py

Java

mvn compile exec:java -Dexec.mainClass="com.example.automl.LanguageTextClassificationPredict"

Node.js

node language_text_classification_predict.js

响应

该函数返回分类得分,指示内容与各类别的匹配程度。

Prediction results:
Predicted class name: affection
Predicted class score: 0.9702693223953247

第 7 步:删除模型

使用完此示例模型后,您可以永久删除该模型,但之后您将无法再使用该模型进行预测。

复制代码

Python

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"
# model_id = "YOUR_MODEL_ID"

client = automl.AutoMlClient()
# Get the full path of the model.
model_full_id = client.model_path(project_id, "us-central1", model_id)
response = client.delete_model(name=model_full_id)

print("Model deleted. {}".format(response.result()))

Java

import com.google.cloud.automl.v1.AutoMlClient;
import com.google.cloud.automl.v1.ModelName;
import com.google.protobuf.Empty;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

class DeleteModel {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    String modelId = "YOUR_MODEL_ID";
    deleteModel(projectId, modelId);
  }

  // Delete a model
  static void deleteModel(String projectId, String modelId)
      throws IOException, ExecutionException, InterruptedException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (AutoMlClient client = AutoMlClient.create()) {
      // Get the full path of the model.
      ModelName modelFullId = ModelName.of(projectId, "us-central1", modelId);

      // Delete a model.
      Empty response = client.deleteModelAsync(modelFullId).get();

      System.out.println("Model deletion started...");
      System.out.println(String.format("Model deleted. %s", response));
    }
  }
}

Node.js

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const modelId = 'YOUR_MODEL_ID';

// Imports the Google Cloud AutoML library
const {AutoMlClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new AutoMlClient();

async function deleteModel() {
  // Construct request
  const request = {
    name: client.modelPath(projectId, location, modelId),
  };

  const [response] = await client.deleteModel(request);
  console.log(`Model deleted: ${response}`);
}

deleteModel();

请求

使用操作类型 delete_model 发送请求以删除您创建的模型。您需要修改以下代码行:

  • project_id 设置为您的 PROJECT_ID
  • model_id 设置为您的模型 ID

Python

python delete_model.py

Java

mvn compile exec:java -Dexec.mainClass="com.example.automl.DeleteModel"

Node.js

node delete_model.js

响应

Model deleted.