Training models

After you have a dataset with a solid set of labeled training documents, you are ready to create and train the custom model.

Training a model can take several hours to complete. The required training time depends on several factors, such as the size of the dataset, the nature of the training items, and the complexity of the models. AutoML Natural Language uses early stopping to produce the best possible model without overfitting.

For classification models, the average training time is around 6 hours, with a maximum of 24 hours. For entity extraction and sentiment analysis modes, the average training time is 5 hours, with a maximum of 6 hours.

After a model is successfully trained, you receive a message at the email address associated with your project.

The maximum lifespan for a custom model is 18 months. You must create and train a new model to continue making predictions after that amount of time.

Web UI

To train a model:

  1. Open the AutoML Natural Language UI and select Get started in the box corresponding to the type of model you plan to train.

    The Datasets page appears, showing the status of previously created datasets for the current project. To train using a dataset for a different project, select the project from the drop-down list in the upper right of the title bar.

  2. Select the dataset you want to use to train the custom model.

    The display name of the selected dataset appears in the title bar, and the page lists the individual documents in the dataset along with their labels.

    Text items page

  3. When you are done reviewing the dataset, click the Train tab just below the title bar.

    If you are about to train the first model from this dataset, the training page provides a basic analysis of the dataset and advises you about whether it is adequate for training. If AutoML Natural Language suggests changes, consider returning to the Text items page and adding documents or labels.

    If you have trained other models from this dataset, the training page displays the basic evaluation metrics for those models.

  4. Click Start Training.

  5. Enter a name for the model.

    The model name can be up to 32 characters and contain only letters, numbers and underscores. The first character must be a letter.

  6. (Optional): To train an entity extraction model for healthcare terminology, select Enable Healthcare Entity Extraction (beta). This option enables you to start with a healthcare-tuned model that is optimized for processing healthcare data. For more information, see AutoML Entity Extraction for Healthcare.

  7. Select the Deploy model after training finishes check box if you want to deploy the model automatically.

  8. Click Start Training.

Code samples

Classification

REST

Before using any of the request data, make the following replacements:

  • project-id: your project ID
  • location-id: the location for the resource, us-central1 for the Global location or eu for the European Union
  • dataset-id: your dataset ID

HTTP method and URL:

POST https://automl.googleapis.com/v1/projects/project-id/locations/location-id/models

Request JSON body:

{
  "displayName": "test_model",
  "dataset_id": "dataset-id",
  "textClassificationModelMetadata": {
   }
}

To send your request, expand one of these options:

You should see output similar to the following. You can use the operation ID to get the status of the task. For an example, see Getting the status of an operation.

{
  "name": "projects/434039606874/locations/us-central1/operations/1979469554520652445",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.automl.v1beta1.OperationMetadata",
    "createTime": "2018-04-27T01:28:41.338120Z",
    "updateTime": "2018-04-27T01:28:41.338120Z",
    "cancellable": true
  }
}

Python

To learn how to install and use the client library for AutoML Natural Language, see AutoML Natural Language client libraries. For more information, see the AutoML Natural Language Python API reference documentation.

To authenticate to AutoML Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"
# dataset_id = "YOUR_DATASET_ID"
# display_name = "YOUR_MODEL_NAME"

client = automl.AutoMlClient()

# A resource that represents Google Cloud Platform location.
project_location = f"projects/{project_id}/locations/us-central1"
# Leave model unset to use the default base model provided by Google
metadata = automl.TextClassificationModelMetadata()
model = automl.Model(
    display_name=display_name,
    dataset_id=dataset_id,
    text_classification_model_metadata=metadata,
)

# Create a model with the model metadata in the region.
response = client.create_model(parent=project_location, model=model)

print(f"Training operation name: {response.operation.name}")
print("Training started...")

Java

To learn how to install and use the client library for AutoML Natural Language, see AutoML Natural Language client libraries. For more information, see the AutoML Natural Language Java API reference documentation.

To authenticate to AutoML Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.automl.v1.AutoMlClient;
import com.google.cloud.automl.v1.LocationName;
import com.google.cloud.automl.v1.Model;
import com.google.cloud.automl.v1.OperationMetadata;
import com.google.cloud.automl.v1.TextClassificationModelMetadata;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

class LanguageTextClassificationCreateModel {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    String datasetId = "YOUR_DATASET_ID";
    String displayName = "YOUR_DATASET_NAME";
    createModel(projectId, datasetId, displayName);
  }

  // Create a model
  static void createModel(String projectId, String datasetId, String displayName)
      throws IOException, ExecutionException, InterruptedException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (AutoMlClient client = AutoMlClient.create()) {
      // A resource that represents Google Cloud Platform location.
      LocationName projectLocation = LocationName.of(projectId, "us-central1");
      // Set model metadata.
      TextClassificationModelMetadata metadata =
          TextClassificationModelMetadata.newBuilder().build();
      Model model =
          Model.newBuilder()
              .setDisplayName(displayName)
              .setDatasetId(datasetId)
              .setTextClassificationModelMetadata(metadata)
              .build();

      // Create a model with the model metadata in the region.
      OperationFuture<Model, OperationMetadata> future =
          client.createModelAsync(projectLocation, model);
      // OperationFuture.get() will block until the model is created, which may take several hours.
      // You can use OperationFuture.getInitialFuture to get a future representing the initial
      // response to the request, which contains information while the operation is in progress.
      System.out.format("Training operation name: %s\n", future.getInitialFuture().get().getName());
      System.out.println("Training started...");
    }
  }
}

Node.js

To learn how to install and use the client library for AutoML Natural Language, see AutoML Natural Language client libraries. For more information, see the AutoML Natural Language Node.js API reference documentation.

To authenticate to AutoML Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const dataset_id = 'YOUR_DATASET_ID';
// const displayName = 'YOUR_DISPLAY_NAME';

// Imports the Google Cloud AutoML library
const {AutoMlClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new AutoMlClient();

async function createModel() {
  // Construct request
  const request = {
    parent: client.locationPath(projectId, location),
    model: {
      displayName: displayName,
      datasetId: datasetId,
      textClassificationModelMetadata: {}, // Leave unset, to use the default base model
    },
  };

  // Don't wait for the LRO
  const [operation] = await client.createModel(request);
  console.log(`Training started... ${operation}`);
  console.log(`Training operation name: ${operation.name}`);
}

createModel();

Go

To learn how to install and use the client library for AutoML Natural Language, see AutoML Natural Language client libraries. For more information, see the AutoML Natural Language Go API reference documentation.

To authenticate to AutoML Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import (
	"context"
	"fmt"
	"io"

	automl "cloud.google.com/go/automl/apiv1"
	"cloud.google.com/go/automl/apiv1/automlpb"
)

// languageTextClassificationCreateModel creates a model for text classification.
func languageTextClassificationCreateModel(w io.Writer, projectID string, location string, datasetID string, modelName string) error {
	// projectID := "my-project-id"
	// location := "us-central1"
	// datasetID := "TCN123456789..."
	// modelName := "model_display_name"

	ctx := context.Background()
	client, err := automl.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("NewClient: %w", err)
	}
	defer client.Close()

	req := &automlpb.CreateModelRequest{
		Parent: fmt.Sprintf("projects/%s/locations/%s", projectID, location),
		Model: &automlpb.Model{
			DisplayName: modelName,
			DatasetId:   datasetID,
			ModelMetadata: &automlpb.Model_TextClassificationModelMetadata{
				TextClassificationModelMetadata: &automlpb.TextClassificationModelMetadata{},
			},
		},
	}

	op, err := client.CreateModel(ctx, req)
	if err != nil {
		return fmt.Errorf("CreateModel: %w", err)
	}
	fmt.Fprintf(w, "Processing operation name: %q\n", op.Name())
	fmt.Fprintf(w, "Training started...\n")

	return nil
}

Additional languages

C#: Please follow the C# setup instructions on the client libraries page and then visit the AutoML Natural Language reference documentation for .NET.

PHP: Please follow the PHP setup instructions on the client libraries page and then visit the AutoML Natural Language reference documentation for PHP.

Ruby: Please follow the Ruby setup instructions on the client libraries page and then visit the AutoML Natural Language reference documentation for Ruby.

Entity extraction

REST

Before using any of the request data, make the following replacements:

  • project-id: your project ID
  • location-id: the location for the resource, us-central1 for the Global location or eu for the European Union
  • dataset-id: your dataset ID
  • model-hint: the baseline model to use, such as default or healthcare (beta).

HTTP method and URL:

POST https://automl.googleapis.com/v1/projects/project-id/locations/location-id/models

Request JSON body:

{
  "displayName": "test_model",
  "dataset_id": "dataset-id",
  "textExtractionModelMetadata": {
    "model_hint": "model-hint"
  }
}

To send your request, expand one of these options:

You should see output similar to the following. You can use the operation ID to get the status of the task. For an example, see Getting the status of an operation.

{
  "name": "projects/434039606874/locations/us-central1/operations/1979469554520652445",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.automl.v1beta1.OperationMetadata",
    "createTime": "2018-04-27T01:28:41.338120Z",
    "updateTime": "2018-04-27T01:28:41.338120Z",
    "cancellable": true
  }
}

Python

To learn how to install and use the client library for AutoML Natural Language, see AutoML Natural Language client libraries. For more information, see the AutoML Natural Language Python API reference documentation.

To authenticate to AutoML Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"
# dataset_id = "YOUR_DATASET_ID"
# display_name = "YOUR_MODEL_NAME"

client = automl.AutoMlClient()

# A resource that represents Google Cloud Platform location.
project_location = f"projects/{project_id}/locations/us-central1"
# Leave model unset to use the default base model provided by Google
metadata = automl.TextExtractionModelMetadata()
model = automl.Model(
    display_name=display_name,
    dataset_id=dataset_id,
    text_extraction_model_metadata=metadata,
)

# Create a model with the model metadata in the region.
response = client.create_model(parent=project_location, model=model)

print(f"Training operation name: {response.operation.name}")
print("Training started...")

Java

To learn how to install and use the client library for AutoML Natural Language, see AutoML Natural Language client libraries. For more information, see the AutoML Natural Language Java API reference documentation.

To authenticate to AutoML Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.automl.v1.AutoMlClient;
import com.google.cloud.automl.v1.LocationName;
import com.google.cloud.automl.v1.Model;
import com.google.cloud.automl.v1.OperationMetadata;
import com.google.cloud.automl.v1.TextExtractionModelMetadata;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

class LanguageEntityExtractionCreateModel {

  static void createModel() throws IOException, ExecutionException, InterruptedException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    String datasetId = "YOUR_DATASET_ID";
    String displayName = "YOUR_DATASET_NAME";
    createModel(projectId, datasetId, displayName);
  }

  // Create a model
  static void createModel(String projectId, String datasetId, String displayName)
      throws IOException, ExecutionException, InterruptedException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (AutoMlClient client = AutoMlClient.create()) {
      // A resource that represents Google Cloud Platform location.
      LocationName projectLocation = LocationName.of(projectId, "us-central1");
      // Set model metadata.
      TextExtractionModelMetadata metadata = TextExtractionModelMetadata.newBuilder().build();
      Model model =
          Model.newBuilder()
              .setDisplayName(displayName)
              .setDatasetId(datasetId)
              .setTextExtractionModelMetadata(metadata)
              .build();

      // Create a model with the model metadata in the region.
      OperationFuture<Model, OperationMetadata> future =
          client.createModelAsync(projectLocation, model);
      // OperationFuture.get() will block until the model is created, which may take several hours.
      // You can use OperationFuture.getInitialFuture to get a future representing the initial
      // response to the request, which contains information while the operation is in progress.
      System.out.format("Training operation name: %s\n", future.getInitialFuture().get().getName());
      System.out.println("Training started...");
    }
  }
}

Node.js

To learn how to install and use the client library for AutoML Natural Language, see AutoML Natural Language client libraries. For more information, see the AutoML Natural Language Node.js API reference documentation.

To authenticate to AutoML Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const dataset_id = 'YOUR_DATASET_ID';
// const displayName = 'YOUR_DISPLAY_NAME';

// Imports the Google Cloud AutoML library
const {AutoMlClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new AutoMlClient();

async function createModel() {
  // Construct request
  const request = {
    parent: client.locationPath(projectId, location),
    model: {
      displayName: displayName,
      datasetId: datasetId,
      textExtractionModelMetadata: {}, // Leave unset, to use the default base model
    },
  };

  // Don't wait for the LRO
  const [operation] = await client.createModel(request);
  console.log(`Training started... ${operation}`);
  console.log(`Training operation name: ${operation.name}`);
}

createModel();

Go

To learn how to install and use the client library for AutoML Natural Language, see AutoML Natural Language client libraries. For more information, see the AutoML Natural Language Go API reference documentation.

To authenticate to AutoML Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import (
	"context"
	"fmt"
	"io"

	automl "cloud.google.com/go/automl/apiv1"
	"cloud.google.com/go/automl/apiv1/automlpb"
)

// languageEntityExtractionCreateModel creates a model for text entity extraction.
func languageEntityExtractionCreateModel(w io.Writer, projectID string, location string, datasetID string, modelName string) error {
	// projectID := "my-project-id"
	// location := "us-central1"
	// datasetID := "TEN123456789..."
	// modelName := "model_display_name"

	ctx := context.Background()
	client, err := automl.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("NewClient: %w", err)
	}
	defer client.Close()

	req := &automlpb.CreateModelRequest{
		Parent: fmt.Sprintf("projects/%s/locations/%s", projectID, location),
		Model: &automlpb.Model{
			DisplayName: modelName,
			DatasetId:   datasetID,
			ModelMetadata: &automlpb.Model_TextExtractionModelMetadata{
				TextExtractionModelMetadata: &automlpb.TextExtractionModelMetadata{},
			},
		},
	}

	op, err := client.CreateModel(ctx, req)
	if err != nil {
		return fmt.Errorf("CreateModel: %w", err)
	}
	fmt.Fprintf(w, "Processing operation name: %q\n", op.Name())
	fmt.Fprintf(w, "Training started...\n")

	return nil
}

Additional languages

C#: Please follow the C# setup instructions on the client libraries page and then visit the AutoML Natural Language reference documentation for .NET.

PHP: Please follow the PHP setup instructions on the client libraries page and then visit the AutoML Natural Language reference documentation for PHP.

Ruby: Please follow the Ruby setup instructions on the client libraries page and then visit the AutoML Natural Language reference documentation for Ruby.

Sentiment analysis

REST

Before using any of the request data, make the following replacements:

  • project-id: your project ID
  • location-id: the location for the resource, us-central1 for the Global location or eu for the European Union
  • dataset-id: your dataset ID

HTTP method and URL:

POST https://automl.googleapis.com/v1/projects/project-id/locations/location-id/models

Request JSON body:

{
  "displayName": "test_model",
  "dataset_id": "dataset-id",
  "textSentimentModelMetadata": {
  }
}

To send your request, expand one of these options:

You should see output similar to the following. You can use the operation ID to get the status of the task. For an example, see Getting the status of an operation.

{
  "name": "projects/434039606874/locations/us-central1/operations/1979469554520652445",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.automl.v1beta1.OperationMetadata",
    "createTime": "2018-04-27T01:28:41.338120Z",
    "updateTime": "2018-04-27T01:28:41.338120Z",
    "cancellable": true
  }
}

Python

To learn how to install and use the client library for AutoML Natural Language, see AutoML Natural Language client libraries. For more information, see the AutoML Natural Language Python API reference documentation.

To authenticate to AutoML Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"
# dataset_id = "YOUR_DATASET_ID"
# display_name = "YOUR_MODEL_NAME"

client = automl.AutoMlClient()

# A resource that represents Google Cloud Platform location.
project_location = f"projects/{project_id}/locations/us-central1"
# Leave model unset to use the default base model provided by Google
metadata = automl.TextSentimentModelMetadata()
model = automl.Model(
    display_name=display_name,
    dataset_id=dataset_id,
    text_sentiment_model_metadata=metadata,
)

# Create a model with the model metadata in the region.
response = client.create_model(parent=project_location, model=model)

print(f"Training operation name: {response.operation.name}")
print("Training started...")

Java

To learn how to install and use the client library for AutoML Natural Language, see AutoML Natural Language client libraries. For more information, see the AutoML Natural Language Java API reference documentation.

To authenticate to AutoML Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.automl.v1.AutoMlClient;
import com.google.cloud.automl.v1.LocationName;
import com.google.cloud.automl.v1.Model;
import com.google.cloud.automl.v1.OperationMetadata;
import com.google.cloud.automl.v1.TextSentimentModelMetadata;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

class LanguageSentimentAnalysisCreateModel {

  static void createModel() throws IOException, ExecutionException, InterruptedException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    String datasetId = "YOUR_DATASET_ID";
    String displayName = "YOUR_DATASET_NAME";
    createModel(projectId, datasetId, displayName);
  }

  // Create a model
  static void createModel(String projectId, String datasetId, String displayName)
      throws IOException, ExecutionException, InterruptedException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (AutoMlClient client = AutoMlClient.create()) {
      // A resource that represents Google Cloud Platform location.
      LocationName projectLocation = LocationName.of(projectId, "us-central1");
      // Set model metadata.
      System.out.println(datasetId);
      TextSentimentModelMetadata metadata = TextSentimentModelMetadata.newBuilder().build();
      Model model =
          Model.newBuilder()
              .setDisplayName(displayName)
              .setDatasetId(datasetId)
              .setTextSentimentModelMetadata(metadata)
              .build();

      // Create a model with the model metadata in the region.
      OperationFuture<Model, OperationMetadata> future =
          client.createModelAsync(projectLocation, model);
      // OperationFuture.get() will block until the model is created, which may take several hours.
      // You can use OperationFuture.getInitialFuture to get a future representing the initial
      // response to the request, which contains information while the operation is in progress.
      System.out.format("Training operation name: %s\n", future.getInitialFuture().get().getName());
      System.out.println("Training started...");
    }
  }
}

Node.js

To learn how to install and use the client library for AutoML Natural Language, see AutoML Natural Language client libraries. For more information, see the AutoML Natural Language Node.js API reference documentation.

To authenticate to AutoML Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const dataset_id = 'YOUR_DATASET_ID';
// const displayName = 'YOUR_DISPLAY_NAME';

// Imports the Google Cloud AutoML library
const {AutoMlClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new AutoMlClient();

async function createModel() {
  // Construct request
  const request = {
    parent: client.locationPath(projectId, location),
    model: {
      displayName: displayName,
      datasetId: datasetId,
      textSentimentModelMetadata: {}, // Leave unset, to use the default base model
    },
  };

  // Don't wait for the LRO
  const [operation] = await client.createModel(request);
  console.log(`Training started... ${operation}`);
  console.log(`Training operation name: ${operation.name}`);
}

createModel();

Go

To learn how to install and use the client library for AutoML Natural Language, see AutoML Natural Language client libraries. For more information, see the AutoML Natural Language Go API reference documentation.

To authenticate to AutoML Natural Language, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import (
	"context"
	"fmt"
	"io"

	automl "cloud.google.com/go/automl/apiv1"
	"cloud.google.com/go/automl/apiv1/automlpb"
)

// languageSentimentAnalysisCreateModel creates a model for text sentiment analysis.
func languageSentimentAnalysisCreateModel(w io.Writer, projectID string, location string, datasetID string, modelName string) error {
	// projectID := "my-project-id"
	// location := "us-central1"
	// datasetID := "TST123456789..."
	// modelName := "model_display_name"

	ctx := context.Background()
	client, err := automl.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("NewClient: %w", err)
	}
	defer client.Close()

	req := &automlpb.CreateModelRequest{
		Parent: fmt.Sprintf("projects/%s/locations/%s", projectID, location),
		Model: &automlpb.Model{
			DisplayName: modelName,
			DatasetId:   datasetID,
			ModelMetadata: &automlpb.Model_TextSentimentModelMetadata{
				TextSentimentModelMetadata: &automlpb.TextSentimentModelMetadata{},
			},
		},
	}

	op, err := client.CreateModel(ctx, req)
	if err != nil {
		return fmt.Errorf("CreateModel: %w", err)
	}
	fmt.Fprintf(w, "Processing operation name: %q\n", op.Name())
	fmt.Fprintf(w, "Training started...\n")

	return nil
}

Additional languages

C#: Please follow the C# setup instructions on the client libraries page and then visit the AutoML Natural Language reference documentation for .NET.

PHP: Please follow the PHP setup instructions on the client libraries page and then visit the AutoML Natural Language reference documentation for PHP.

Ruby: Please follow the Ruby setup instructions on the client libraries page and then visit the AutoML Natural Language reference documentation for Ruby.