AutoML Natural Language API 教程

本教程演示了如何使用 AutoML Natural Language 创建对内容进行分类的自定义模型。该应用程序使用来自 Kaggle 开源数据集 HappyDB 的众包“快乐时刻”语料库训练自定义模型。由此产生的模型将幸福时刻分成反映幸福原因的不同类别。

该数据通过知识共享 CCO:公共领域许可提供。

本教程介绍如何训练自定义模型、评估其性能以及对新内容进行分类。

前提条件

配置项目环境

  1. 登录您的 Google Cloud 账号。如果您是 Google Cloud 新手,请创建一个账号来评估我们的产品在实际场景中的表现。新客户还可获享 $300 赠金,用于运行、测试和部署工作负载。
  2. 在 Google Cloud Console 中的项目选择器页面上,选择或创建一个 Google Cloud 项目

    转到“项目选择器”

  3. 确保您的 Google Cloud 项目已启用结算功能

  4. 启用 AutoML Natural Language API。

    启用 API

  5. 在 Google Cloud Console 中的项目选择器页面上,选择或创建一个 Google Cloud 项目

    转到“项目选择器”

  6. 确保您的 Google Cloud 项目已启用结算功能

  7. 启用 AutoML Natural Language API。

    启用 API

  8. 安装 Google Cloud CLI
  9. 按照说明创建服务账号并下载密钥文件
  10. GOOGLE_APPLICATION_CREDENTIALS 环境变量设置为如下路径:指向您在创建服务账号时所下载的服务账号密钥文件的路径。例如:
         export GOOGLE_APPLICATION_CREDENTIALS=key-file
  11. 使用以下命令将您的新服务账号添加到 IAM 角色 AutoML 编辑者。将 project-id 替换为 Google Cloud 项目的名称,并将 service-account-name 替换为新服务账号的名称,例如 service-account1@myproject.iam.gserviceaccount.com
         gcloud auth login
         gcloud config set project project-id
         gcloud projects add-iam-policy-binding project-id 
    --member=serviceAccount:service-account-name
    --role='roles/automl.editor'
  12. 允许 AutoML Natural Language 服务账号访问您的 Google Cloud 项目资源:
    gcloud projects add-iam-policy-binding project-id 
    --member="serviceAccount:custom-vision@appspot.gserviceaccount.com"
    --role="roles/storage.admin"
  13. 安装客户端库
  14. 设置 PROJECT_ID 和 REGION_NAME 环境变量。

    project-id 替换为您的 Google Cloud Platform 项目 ID。 AutoML Natural Language 目前要求将位置设为 us-central1
         export PROJECT_ID="project-id"
         export REGION_NAME="us-central1"
         
  15. 创建一个 Google Cloud Storage 存储桶以存储您将用于训练自定义模型的文档。

    存储桶名称必须采用以下格式: $PROJECT_ID-lcm. 以下命令将在 us-central1 区域中创建一个名为 $PROJECT_ID-lcm 的存储桶。
    gsutil mb -p $PROJECT_ID -c regional -l $REGION_NAME gs://$PROJECT_ID-lcm/
  16. happiness.csv 文件从公共存储桶复制到您的 Google Cloud Storage 存储桶。

    happiness.csv 文件位于公共存储桶 cloud-ml-dataNL-classification 文件夹中。

源代码文件位置

如果您需要源代码,可以在此处找到。您可以随时将源代码文件复制到您的 Google Cloud Platform 项目文件夹中。否则,我们建议您在执行每个步骤时直接从本页面复制代码。

Python

本教程包含以下 Python 程序

  • language_text_classification_create_dataset.py - 包含创建数据集的功能
  • import_dataset.py - 包含导入数据集的功能
  • language_text_classification_create_model.py - 包含创建模型的功能
  • list_model_evaluations.py - 包含列出模型评估的功能
  • language_text_classification_predict.py - 包含与预测相关的功能
  • delete_model.py - 包含删除模型的功能

Java

本教程包含以下 Java 文件

  • LanguageTextClassificationCreateDataset.java - 包含创建数据集的功能
  • ImportDataset.java - 包含导入数据集的功能
  • LanguageTextClassificationCreateModel.java - 包含创建模型的功能
  • ListModelEvaluations.java - 包含列出模型评估的功能
  • LanguageTextClassificationPredict.java - 包含与预测相关的功能
  • DeleteModel.java - 包含删除模型的功能

Node.js

本教程包含以下 Node.js 程序

  • language_text_classification_create_dataset.js - 包含创建数据集的功能
  • import_dataset.js - 包含导入数据集的功能
  • language_text_classification_create_model.js - 包含创建模型的功能
  • list_model_evaluations.js - 包含列出模型评估的功能
  • language_text_classification_predict.js - 包含与预测相关的功能
  • delete_model.js - 包含删除模型的功能

运行应用

第 1 步:创建数据集

如要创建自定义模型,首先需要创建一个空数据集,该数据集最终用于保存模型的训练数据。创建数据集时,您可以指定希望自定义模型执行的分类类型:

  • 多类别 (MULTICLASS) 分类为每个已分类文档分配一个标签
  • 多标签 (MULTILABEL) 允许为一个文档分配多个标签

本教程会创建一个名为“happydb”的数据集并使用多类别分类。

复制代码

Python

如需了解如何安装和使用 AutoML Natural Language 的客户端库,请参阅 AutoML Natural Language 客户端库。 如需了解详情,请参阅 AutoML Natural Language Python API 参考文档

如需向 AutoML Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"
# display_name = "YOUR_DATASET_NAME"

client = automl.AutoMlClient()

# A resource that represents Google Cloud Platform location.
project_location = f"projects/{project_id}/locations/us-central1"
# Specify the classification type
# Types:
# MultiLabel: Multiple labels are allowed for one example.
# MultiClass: At most one label is allowed per example.
metadata = automl.TextClassificationDatasetMetadata(
    classification_type=automl.ClassificationType.MULTICLASS
)
dataset = automl.Dataset(
    display_name=display_name,
    text_classification_dataset_metadata=metadata,
)

# Create a dataset with the dataset metadata in the region.
response = client.create_dataset(parent=project_location, dataset=dataset)

created_dataset = response.result()

# Display the dataset information
print(f"Dataset name: {created_dataset.name}")
print("Dataset id: {}".format(created_dataset.name.split("/")[-1]))

Java

如需了解如何安装和使用 AutoML Natural Language 的客户端库,请参阅 AutoML Natural Language 客户端库。 如需了解详情,请参阅 AutoML Natural Language Java API 参考文档

如需向 AutoML Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.automl.v1.AutoMlClient;
import com.google.cloud.automl.v1.ClassificationType;
import com.google.cloud.automl.v1.Dataset;
import com.google.cloud.automl.v1.LocationName;
import com.google.cloud.automl.v1.OperationMetadata;
import com.google.cloud.automl.v1.TextClassificationDatasetMetadata;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

class LanguageTextClassificationCreateDataset {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    String displayName = "YOUR_DATASET_NAME";
    createDataset(projectId, displayName);
  }

  // Create a dataset
  static void createDataset(String projectId, String displayName)
      throws IOException, ExecutionException, InterruptedException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (AutoMlClient client = AutoMlClient.create()) {
      // A resource that represents Google Cloud Platform location.
      LocationName projectLocation = LocationName.of(projectId, "us-central1");

      // Specify the classification type
      // Types:
      // MultiLabel: Multiple labels are allowed for one example.
      // MultiClass: At most one label is allowed per example.
      ClassificationType classificationType = ClassificationType.MULTILABEL;

      // Specify the text classification type for the dataset.
      TextClassificationDatasetMetadata metadata =
          TextClassificationDatasetMetadata.newBuilder()
              .setClassificationType(classificationType)
              .build();
      Dataset dataset =
          Dataset.newBuilder()
              .setDisplayName(displayName)
              .setTextClassificationDatasetMetadata(metadata)
              .build();
      OperationFuture<Dataset, OperationMetadata> future =
          client.createDatasetAsync(projectLocation, dataset);

      Dataset createdDataset = future.get();

      // Display the dataset information.
      System.out.format("Dataset name: %s\n", createdDataset.getName());
      // To get the dataset id, you have to parse it out of the `name` field. As dataset Ids are
      // required for other methods.
      // Name Form: `projects/{project_id}/locations/{location_id}/datasets/{dataset_id}`
      String[] names = createdDataset.getName().split("/");
      String datasetId = names[names.length - 1];
      System.out.format("Dataset id: %s\n", datasetId);
    }
  }
}

Node.js

如需了解如何安装和使用 AutoML Natural Language 的客户端库,请参阅 AutoML Natural Language 客户端库。 如需了解详情,请参阅 AutoML Natural Language Node.js API 参考文档

如需向 AutoML Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const displayName = 'YOUR_DISPLAY_NAME';

// Imports the Google Cloud AutoML library
const {AutoMlClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new AutoMlClient();

async function createDataset() {
  // Construct request
  const request = {
    parent: client.locationPath(projectId, location),
    dataset: {
      displayName: displayName,
      textClassificationDatasetMetadata: {
        classificationType: 'MULTICLASS',
      },
    },
  };

  // Create dataset
  const [operation] = await client.createDataset(request);

  // Wait for operation to complete.
  const [response] = await operation.promise();

  console.log(`Dataset name: ${response.name}`);
  console.log(`
    Dataset id: ${
      response.name
        .split('/')
        [response.name.split('/').length - 1].split('\n')[0]
    }`);
}

createDataset();

请求

运行 create_dataset 函数以创建空数据集。您需要修改以下代码行:

  • project_id 设置为您的 PROJECT_ID
  • 为数据集 (happydb) 设置 display_name

Python

python language_text_classification_create_dataset.py

Java

mvn compile exec:java -Dexec.mainClass="com.example.automl.LanguageTextClassificationCreateDataset"

Node.js

node language_text_classification_create_dataset.js

响应

响应中包含新创建的数据集的详细信息,其中包括数据集 ID(用于在以后的请求中引用该数据集)。我们建议您将环境变量 DATASET_ID 设置为返回的数据集 ID 值。

Dataset name: projects/216065747626/locations/us-central1/datasets/TCN7372141011130533778
Dataset id: TCN7372141011130533778
Dataset display name: happydb
Text classification dataset specification:
       classification_type: MULTICLASS
Dataset example count: 0
Dataset create time:
       seconds: 1530251987
       nanos: 216586000

第 2 步:将训练项导入数据集

接下来通过使用目标类别标记的训练内容项列表填充数据集。

import_dataset 函数接口接受 .csv 文件作为输入,该文件列出了所有训练文档的位置以及每个训练文档的正确标签。(如需详细了解所需格式,请参阅准备训练数据。)在本教程中,我们将使用上一步中您上传到 Google Cloud Storage 的 happiness.csv

复制代码

Python

如需了解如何安装和使用 AutoML Natural Language 的客户端库,请参阅 AutoML Natural Language 客户端库。 如需了解详情,请参阅 AutoML Natural Language Python API 参考文档

如需向 AutoML Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"
# dataset_id = "YOUR_DATASET_ID"
# path = "gs://YOUR_BUCKET_ID/path/to/data.csv"

client = automl.AutoMlClient()
# Get the full path of the dataset.
dataset_full_id = client.dataset_path(project_id, "us-central1", dataset_id)
# Get the multiple Google Cloud Storage URIs
input_uris = path.split(",")
gcs_source = automl.GcsSource(input_uris=input_uris)
input_config = automl.InputConfig(gcs_source=gcs_source)
# Import data from the input URI
response = client.import_data(name=dataset_full_id, input_config=input_config)

print("Processing import...")
print(f"Data imported. {response.result()}")

Java

如需了解如何安装和使用 AutoML Natural Language 的客户端库,请参阅 AutoML Natural Language 客户端库。 如需了解详情,请参阅 AutoML Natural Language Java API 参考文档

如需向 AutoML Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.automl.v1.AutoMlClient;
import com.google.cloud.automl.v1.DatasetName;
import com.google.cloud.automl.v1.GcsSource;
import com.google.cloud.automl.v1.InputConfig;
import com.google.cloud.automl.v1.OperationMetadata;
import com.google.protobuf.Empty;
import java.io.IOException;
import java.util.Arrays;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class ImportDataset {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    String datasetId = "YOUR_DATASET_ID";
    String path = "gs://BUCKET_ID/path_to_training_data.csv";
    importDataset(projectId, datasetId, path);
  }

  // Import a dataset
  static void importDataset(String projectId, String datasetId, String path)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (AutoMlClient client = AutoMlClient.create()) {
      // Get the complete path of the dataset.
      DatasetName datasetFullId = DatasetName.of(projectId, "us-central1", datasetId);

      // Get multiple Google Cloud Storage URIs to import data from
      GcsSource gcsSource =
          GcsSource.newBuilder().addAllInputUris(Arrays.asList(path.split(","))).build();

      // Import data from the input URI
      InputConfig inputConfig = InputConfig.newBuilder().setGcsSource(gcsSource).build();
      System.out.println("Processing import...");

      // Start the import job
      OperationFuture<Empty, OperationMetadata> operation =
          client.importDataAsync(datasetFullId, inputConfig);

      System.out.format("Operation name: %s%n", operation.getName());

      // If you want to wait for the operation to finish, adjust the timeout appropriately. The
      // operation will still run if you choose not to wait for it to complete. You can check the
      // status of your operation using the operation's name.
      Empty response = operation.get(45, TimeUnit.MINUTES);
      System.out.format("Dataset imported. %s%n", response);
    } catch (TimeoutException e) {
      System.out.println("The operation's polling period was not long enough.");
      System.out.println("You can use the Operation's name to get the current status.");
      System.out.println("The import job is still running and will complete as expected.");
      throw e;
    }
  }
}

Node.js

如需了解如何安装和使用 AutoML Natural Language 的客户端库,请参阅 AutoML Natural Language 客户端库。 如需了解详情,请参阅 AutoML Natural Language Node.js API 参考文档

如需向 AutoML Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const datasetId = 'YOUR_DISPLAY_ID';
// const path = 'gs://BUCKET_ID/path_to_training_data.csv';

// Imports the Google Cloud AutoML library
const {AutoMlClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new AutoMlClient();

async function importDataset() {
  // Construct request
  const request = {
    name: client.datasetPath(projectId, location, datasetId),
    inputConfig: {
      gcsSource: {
        inputUris: path.split(','),
      },
    },
  };

  // Import dataset
  console.log('Proccessing import');
  const [operation] = await client.importData(request);

  // Wait for operation to complete.
  const [response] = await operation.promise();
  console.log(`Dataset imported: ${response}`);
}

importDataset();

请求

运行 import_data 函数以导入训练内容。需要更改的第一部分代码是上一步中的数据集 ID,第二部分代码是 happiness.csv 的 URI。您需要修改以下代码行:

  • project_id 设置为您的 PROJECT_ID
  • 为数据集设置 dataset_id(来自上一步的输出)
  • 设置 path,这是 gs://YOUR_PROJECT_ID-lcm/csv/happiness.csv 的 URI

Python

python import_dataset.py

Java

mvn compile exec:java -Dexec.mainClass="com.example.automl.ImportDataset"

Node.js

node import_dataset.js

响应

Processing import...
Dataset imported.

第 3 步:创建(训练)模型

拥有已导入带标签训练文档的数据集后,您就可以训练新模型了。

复制代码

Python

如需了解如何安装和使用 AutoML Natural Language 的客户端库,请参阅 AutoML Natural Language 客户端库。 如需了解详情,请参阅 AutoML Natural Language Python API 参考文档

如需向 AutoML Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"
# dataset_id = "YOUR_DATASET_ID"
# display_name = "YOUR_MODEL_NAME"

client = automl.AutoMlClient()

# A resource that represents Google Cloud Platform location.
project_location = f"projects/{project_id}/locations/us-central1"
# Leave model unset to use the default base model provided by Google
metadata = automl.TextClassificationModelMetadata()
model = automl.Model(
    display_name=display_name,
    dataset_id=dataset_id,
    text_classification_model_metadata=metadata,
)

# Create a model with the model metadata in the region.
response = client.create_model(parent=project_location, model=model)

print(f"Training operation name: {response.operation.name}")
print("Training started...")

Java

如需了解如何安装和使用 AutoML Natural Language 的客户端库,请参阅 AutoML Natural Language 客户端库。 如需了解详情,请参阅 AutoML Natural Language Java API 参考文档

如需向 AutoML Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.automl.v1.AutoMlClient;
import com.google.cloud.automl.v1.LocationName;
import com.google.cloud.automl.v1.Model;
import com.google.cloud.automl.v1.OperationMetadata;
import com.google.cloud.automl.v1.TextClassificationModelMetadata;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

class LanguageTextClassificationCreateModel {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    String datasetId = "YOUR_DATASET_ID";
    String displayName = "YOUR_DATASET_NAME";
    createModel(projectId, datasetId, displayName);
  }

  // Create a model
  static void createModel(String projectId, String datasetId, String displayName)
      throws IOException, ExecutionException, InterruptedException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (AutoMlClient client = AutoMlClient.create()) {
      // A resource that represents Google Cloud Platform location.
      LocationName projectLocation = LocationName.of(projectId, "us-central1");
      // Set model metadata.
      TextClassificationModelMetadata metadata =
          TextClassificationModelMetadata.newBuilder().build();
      Model model =
          Model.newBuilder()
              .setDisplayName(displayName)
              .setDatasetId(datasetId)
              .setTextClassificationModelMetadata(metadata)
              .build();

      // Create a model with the model metadata in the region.
      OperationFuture<Model, OperationMetadata> future =
          client.createModelAsync(projectLocation, model);
      // OperationFuture.get() will block until the model is created, which may take several hours.
      // You can use OperationFuture.getInitialFuture to get a future representing the initial
      // response to the request, which contains information while the operation is in progress.
      System.out.format("Training operation name: %s\n", future.getInitialFuture().get().getName());
      System.out.println("Training started...");
    }
  }
}

Node.js

如需了解如何安装和使用 AutoML Natural Language 的客户端库,请参阅 AutoML Natural Language 客户端库。 如需了解详情,请参阅 AutoML Natural Language Node.js API 参考文档

如需向 AutoML Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const dataset_id = 'YOUR_DATASET_ID';
// const displayName = 'YOUR_DISPLAY_NAME';

// Imports the Google Cloud AutoML library
const {AutoMlClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new AutoMlClient();

async function createModel() {
  // Construct request
  const request = {
    parent: client.locationPath(projectId, location),
    model: {
      displayName: displayName,
      datasetId: datasetId,
      textClassificationModelMetadata: {}, // Leave unset, to use the default base model
    },
  };

  // Don't wait for the LRO
  const [operation] = await client.createModel(request);
  console.log(`Training started... ${operation}`);
  console.log(`Training operation name: ${operation.name}`);
}

createModel();

请求

调用 create_model 函数以创建模型。数据集 ID 来自之前的步骤。您需要修改以下代码行:

  • project_id 设置为您的 PROJECT_ID
  • 为数据集设置 dataset_id(来自上一步的输出)
  • 为您的模型设置 display_name (happydb_model)

Python

python language_text_classification_create_model.py

Java

mvn compile exec:java -Dexec.mainClass="com.example.automl.LanguageTextClassificationCreateModel"

Node.js

node language_text_classification_create_model.js

响应

create_model 函数开始训练操作并输出操作名称。训练是异步进行的,可能需要一段时间才能完成,因此您可以使用操作 ID 来检查训练状态。训练完成后,create_model 会返回模型 ID。与数据集 ID 一样,我们建议您将环境变量 MODEL_ID 设置为返回的模型 ID 值。

Training operation name: projects/216065747626/locations/us-central1/operations/TCN3007727620979824033
Training started...
Model name: projects/216065747626/locations/us-central1/models/TCN7683346839371803263
Model id: TCN7683346839371803263
Model display name: happydb_model
Model create time:
        seconds: 1529649600
        nanos: 966000000
Model deployment state: deployed

第 4 步:评估模型

训练结束后,您可以通过查看模型的精确率召回率F1 得分来评估其就绪情况。

display_evaluation 函数接受模型 ID 作为参数。

复制代码

Python

如需了解如何安装和使用 AutoML Natural Language 的客户端库,请参阅 AutoML Natural Language 客户端库。 如需了解详情,请参阅 AutoML Natural Language Python API 参考文档

如需向 AutoML Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"
# model_id = "YOUR_MODEL_ID"

client = automl.AutoMlClient()
# Get the full path of the model.
model_full_id = client.model_path(project_id, "us-central1", model_id)

print("List of model evaluations:")
for evaluation in client.list_model_evaluations(parent=model_full_id, filter=""):
    print(f"Model evaluation name: {evaluation.name}")
    print(f"Model annotation spec id: {evaluation.annotation_spec_id}")
    print(f"Create Time: {evaluation.create_time}")
    print(f"Evaluation example count: {evaluation.evaluated_example_count}")
    print(
        "Classification model evaluation metrics: {}".format(
            evaluation.classification_evaluation_metrics
        )
    )

Java

如需了解如何安装和使用 AutoML Natural Language 的客户端库,请参阅 AutoML Natural Language 客户端库。 如需了解详情,请参阅 AutoML Natural Language Java API 参考文档

如需向 AutoML Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证


import com.google.cloud.automl.v1.AutoMlClient;
import com.google.cloud.automl.v1.ListModelEvaluationsRequest;
import com.google.cloud.automl.v1.ModelEvaluation;
import com.google.cloud.automl.v1.ModelName;
import java.io.IOException;

class ListModelEvaluations {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    String modelId = "YOUR_MODEL_ID";
    listModelEvaluations(projectId, modelId);
  }

  // List model evaluations
  static void listModelEvaluations(String projectId, String modelId) throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (AutoMlClient client = AutoMlClient.create()) {
      // Get the full path of the model.
      ModelName modelFullId = ModelName.of(projectId, "us-central1", modelId);
      ListModelEvaluationsRequest modelEvaluationsrequest =
          ListModelEvaluationsRequest.newBuilder().setParent(modelFullId.toString()).build();

      // List all the model evaluations in the model by applying filter.
      System.out.println("List of model evaluations:");
      for (ModelEvaluation modelEvaluation :
          client.listModelEvaluations(modelEvaluationsrequest).iterateAll()) {

        System.out.format("Model Evaluation Name: %s\n", modelEvaluation.getName());
        System.out.format("Model Annotation Spec Id: %s", modelEvaluation.getAnnotationSpecId());
        System.out.println("Create Time:");
        System.out.format("\tseconds: %s\n", modelEvaluation.getCreateTime().getSeconds());
        System.out.format("\tnanos: %s", modelEvaluation.getCreateTime().getNanos() / 1e9);
        System.out.format(
            "Evalution Example Count: %d\n", modelEvaluation.getEvaluatedExampleCount());
        System.out.format(
            "Classification Model Evaluation Metrics: %s\n",
            modelEvaluation.getClassificationEvaluationMetrics());
      }
    }
  }
}

Node.js

如需了解如何安装和使用 AutoML Natural Language 的客户端库,请参阅 AutoML Natural Language 客户端库。 如需了解详情,请参阅 AutoML Natural Language Node.js API 参考文档

如需向 AutoML Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const modelId = 'YOUR_MODEL_ID';

// Imports the Google Cloud AutoML library
const {AutoMlClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new AutoMlClient();

async function listModelEvaluations() {
  // Construct request
  const request = {
    parent: client.modelPath(projectId, location, modelId),
    filter: '',
  };

  const [response] = await client.listModelEvaluations(request);

  console.log('List of model evaluations:');
  for (const evaluation of response) {
    console.log(`Model evaluation name: ${evaluation.name}`);
    console.log(`Model annotation spec id: ${evaluation.annotationSpecId}`);
    console.log(`Model display name: ${evaluation.displayName}`);
    console.log('Model create time');
    console.log(`\tseconds ${evaluation.createTime.seconds}`);
    console.log(`\tnanos ${evaluation.createTime.nanos / 1e9}`);
    console.log(
      `Evaluation example count: ${evaluation.evaluatedExampleCount}`
    );
    console.log(
      `Classification model evaluation metrics: ${evaluation.classificationEvaluationMetrics}`
    );
  }
}

listModelEvaluations();

请求

通过执行以下请求来发出显示模型整体评估性能的请求。您需要修改以下代码行:

  • project_id 设置为您的 PROJECT_ID
  • model_id 设置为您的模型 ID

Python

python list_model_evaluations.py

Java

mvn compile exec:java -Dexec.mainClass="com.example.automl.ListModelEvaluations"

Node.js

node list_model_evaluations.js

响应

如果精确率和召回率分数过低,您可以强化训练数据集并重新训练模型。如需了解详情,请参阅评估模型

Precision and recall are based on a score threshold of 0.5
Model Precision: 96.3%
Model Recall: 95.7%
Model F1 score: 96.0%
Model Precision@1: 96.33%
Model Recall@1: 95.74%
Model F1 score@1: 96.04%

第 5 步:部署模型

当自定义模型达到您的质量标准时,就可以部署模型然后发出预测请求。

复制代码

Python

如需了解如何安装和使用 AutoML Natural Language 的客户端库,请参阅 AutoML Natural Language 客户端库。 如需了解详情,请参阅 AutoML Natural Language Python API 参考文档

如需向 AutoML Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"
# model_id = "YOUR_MODEL_ID"

client = automl.AutoMlClient()
# Get the full path of the model.
model_full_id = client.model_path(project_id, "us-central1", model_id)
response = client.deploy_model(name=model_full_id)

print(f"Model deployment finished. {response.result()}")

Java

如需了解如何安装和使用 AutoML Natural Language 的客户端库,请参阅 AutoML Natural Language 客户端库。 如需了解详情,请参阅 AutoML Natural Language Java API 参考文档

如需向 AutoML Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.automl.v1.AutoMlClient;
import com.google.cloud.automl.v1.DeployModelRequest;
import com.google.cloud.automl.v1.ModelName;
import com.google.cloud.automl.v1.OperationMetadata;
import com.google.protobuf.Empty;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

class DeployModel {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    String modelId = "YOUR_MODEL_ID";
    deployModel(projectId, modelId);
  }

  // Deploy a model for prediction
  static void deployModel(String projectId, String modelId)
      throws IOException, ExecutionException, InterruptedException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (AutoMlClient client = AutoMlClient.create()) {
      // Get the full path of the model.
      ModelName modelFullId = ModelName.of(projectId, "us-central1", modelId);
      DeployModelRequest request =
          DeployModelRequest.newBuilder().setName(modelFullId.toString()).build();
      OperationFuture<Empty, OperationMetadata> future = client.deployModelAsync(request);

      future.get();
      System.out.println("Model deployment finished");
    }
  }
}

Node.js

如需了解如何安装和使用 AutoML Natural Language 的客户端库,请参阅 AutoML Natural Language 客户端库。 如需了解详情,请参阅 AutoML Natural Language Node.js API 参考文档

如需向 AutoML Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const modelId = 'YOUR_MODEL_ID';

// Imports the Google Cloud AutoML library
const {AutoMlClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new AutoMlClient();

async function deployModel() {
  // Construct request
  const request = {
    name: client.modelPath(projectId, location, modelId),
  };

  const [operation] = await client.deployModel(request);

  // Wait for operation to complete.
  const [response] = await operation.promise();
  console.log(`Model deployment finished. ${response}`);
}

deployModel();

请求

对于 deploy_model 函数,您需要修改以下代码行:

  • project_id 设置为您的 PROJECT_ID
  • model_id 设置为您的模型 ID

Python

python deploy_model.py

Java

mvn compile exec:java -Dexec.mainClass="com.example.automl.DeployModel.java"

Node.js

node deploy_model.js

响应

Model deployment finished.

第 6 步:使用模型进行预测

部署模型后,您就可以使用它来对新内容进行分类了。

复制代码

Python

如需了解如何安装和使用 AutoML Natural Language 的客户端库,请参阅 AutoML Natural Language 客户端库。 如需了解详情,请参阅 AutoML Natural Language Python API 参考文档

如需向 AutoML Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"
# model_id = "YOUR_MODEL_ID"
# content = "text to predict"

prediction_client = automl.PredictionServiceClient()

# Get the full path of the model.
model_full_id = automl.AutoMlClient.model_path(project_id, "us-central1", model_id)

# Supported mime_types: 'text/plain', 'text/html'
# https://cloud.google.com/automl/docs/reference/rpc/google.cloud.automl.v1#textsnippet
text_snippet = automl.TextSnippet(content=content, mime_type="text/plain")
payload = automl.ExamplePayload(text_snippet=text_snippet)

response = prediction_client.predict(name=model_full_id, payload=payload)

for annotation_payload in response.payload:
    print(f"Predicted class name: {annotation_payload.display_name}")
    print(f"Predicted class score: {annotation_payload.classification.score}")

Java

如需了解如何安装和使用 AutoML Natural Language 的客户端库,请参阅 AutoML Natural Language 客户端库。 如需了解详情,请参阅 AutoML Natural Language Java API 参考文档

如需向 AutoML Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

import com.google.cloud.automl.v1.AnnotationPayload;
import com.google.cloud.automl.v1.ExamplePayload;
import com.google.cloud.automl.v1.ModelName;
import com.google.cloud.automl.v1.PredictRequest;
import com.google.cloud.automl.v1.PredictResponse;
import com.google.cloud.automl.v1.PredictionServiceClient;
import com.google.cloud.automl.v1.TextSnippet;
import java.io.IOException;

class LanguageTextClassificationPredict {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    String modelId = "YOUR_MODEL_ID";
    String content = "text to predict";
    predict(projectId, modelId, content);
  }

  static void predict(String projectId, String modelId, String content) throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (PredictionServiceClient client = PredictionServiceClient.create()) {
      // Get the full path of the model.
      ModelName name = ModelName.of(projectId, "us-central1", modelId);

      // For available mime types, see:
      // https://cloud.google.com/automl/docs/reference/rest/v1/projects.locations.models/predict#textsnippet
      TextSnippet textSnippet =
          TextSnippet.newBuilder()
              .setContent(content)
              .setMimeType("text/plain") // Types: text/plain, text/html
              .build();
      ExamplePayload payload = ExamplePayload.newBuilder().setTextSnippet(textSnippet).build();
      PredictRequest predictRequest =
          PredictRequest.newBuilder().setName(name.toString()).setPayload(payload).build();

      PredictResponse response = client.predict(predictRequest);

      for (AnnotationPayload annotationPayload : response.getPayloadList()) {
        System.out.format("Predicted class name: %s\n", annotationPayload.getDisplayName());
        System.out.format(
            "Predicted sentiment score: %.2f\n\n",
            annotationPayload.getClassification().getScore());
      }
    }
  }
}

Node.js

如需了解如何安装和使用 AutoML Natural Language 的客户端库,请参阅 AutoML Natural Language 客户端库。 如需了解详情,请参阅 AutoML Natural Language Node.js API 参考文档

如需向 AutoML Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const modelId = 'YOUR_MODEL_ID';
// const content = 'text to predict'

// Imports the Google Cloud AutoML library
const {PredictionServiceClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new PredictionServiceClient();

async function predict() {
  // Construct request
  const request = {
    name: client.modelPath(projectId, location, modelId),
    payload: {
      textSnippet: {
        content: content,
        mimeType: 'text/plain', // Types: 'text/plain', 'text/html'
      },
    },
  };

  const [response] = await client.predict(request);

  for (const annotationPayload of response.payload) {
    console.log(`Predicted class name: ${annotationPayload.displayName}`);
    console.log(
      `Predicted class score: ${annotationPayload.classification.score}`
    );
  }
}

predict();

请求

对于 predict 函数,您需要修改以下代码行:

  • project_id 设置为您的 PROJECT_ID
  • model_id 设置为您的模型 ID
  • 设置您要预测的 content

Python

python language_text_classification_predict.py

Java

mvn compile exec:java -Dexec.mainClass="com.example.automl.LanguageTextClassificationPredict"

Node.js

node language_text_classification_predict.js

响应

该函数返回分类得分,指示内容与各类别的匹配程度。

Prediction results:
Predicted class name: affection
Predicted class score: 0.9702693223953247

第 7 步:删除模型

使用完此示例模型后,您可以永久删除该模型,但之后您将无法再使用该模型进行预测。

复制代码

Python

如需了解如何安装和使用 AutoML Natural Language 的客户端库,请参阅 AutoML Natural Language 客户端库。 如需了解详情,请参阅 AutoML Natural Language Python API 参考文档

如需向 AutoML Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"
# model_id = "YOUR_MODEL_ID"

client = automl.AutoMlClient()
# Get the full path of the model.
model_full_id = client.model_path(project_id, "us-central1", model_id)
response = client.delete_model(name=model_full_id)

print(f"Model deleted. {response.result()}")

Java

如需了解如何安装和使用 AutoML Natural Language 的客户端库,请参阅 AutoML Natural Language 客户端库。 如需了解详情,请参阅 AutoML Natural Language Java API 参考文档

如需向 AutoML Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

import com.google.cloud.automl.v1.AutoMlClient;
import com.google.cloud.automl.v1.ModelName;
import com.google.protobuf.Empty;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

class DeleteModel {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    String modelId = "YOUR_MODEL_ID";
    deleteModel(projectId, modelId);
  }

  // Delete a model
  static void deleteModel(String projectId, String modelId)
      throws IOException, ExecutionException, InterruptedException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (AutoMlClient client = AutoMlClient.create()) {
      // Get the full path of the model.
      ModelName modelFullId = ModelName.of(projectId, "us-central1", modelId);

      // Delete a model.
      Empty response = client.deleteModelAsync(modelFullId).get();

      System.out.println("Model deletion started...");
      System.out.println(String.format("Model deleted. %s", response));
    }
  }
}

Node.js

如需了解如何安装和使用 AutoML Natural Language 的客户端库,请参阅 AutoML Natural Language 客户端库。 如需了解详情,请参阅 AutoML Natural Language Node.js API 参考文档

如需向 AutoML Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const modelId = 'YOUR_MODEL_ID';

// Imports the Google Cloud AutoML library
const {AutoMlClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new AutoMlClient();

async function deleteModel() {
  // Construct request
  const request = {
    name: client.modelPath(projectId, location, modelId),
  };

  const [response] = await client.deleteModel(request);
  console.log(`Model deleted: ${response}`);
}

deleteModel();

请求

使用操作类型 delete_model 发送请求以删除您创建的模型。您需要修改以下代码行:

  • project_id 设置为您的 PROJECT_ID
  • model_id 设置为您的模型 ID

Python

python delete_model.py

Java

mvn compile exec:java -Dexec.mainClass="com.example.automl.DeleteModel"

Node.js

node delete_model.js

响应

Model deleted.