Analyzing entities

After your model has been successfully trained and deployed, you can use it to analyze entities.

If you use the `batchPredict` method, the model does not need to be deployed.

Online prediction

If you would like to query the model via a synchronous, online API, you can use the `predict` method.

Web UI

  1. Open the AutoML Natural Language Entity Extraction UI and click the lightbulb icon in the left navigation bar to display the available models.

    To view the models for a different project, select the project from the drop-down list in the upper right of the title bar.

  2. Click the row for the model you want to use to analyze your content.

  3. Click the Test and Use tab just below the title bar.

  4. Enter the content you want to analyze into the text box and click Predict.

    AutoML Natural Language Entity Extraction analyzes the text using your model and displays the annotations.

The AutoML Natural Language Entity Extraction UI does not show low-confidence predictions.

Command-line

Replace model-name with the full name of your model, taken from the response when you created it. The full name has the format: projects/{project-id}/locations/us-central1/models/{model-id}

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json" \
  https://automl.googleapis.com/v1beta1/model-name:predict \
  -d '{
        "payload" : {
          "textSnippet": {
               "content": "The Wilms tumor -suppressor gene , WT1 , plays a key
               role in urogenital development , and WT1 dysfunction is implicated
               in both neoplastic and nonneoplastic ( glomerulosclerosis ) disease.
               The analysis of diseases linked specifically with WT1 mutations,
               such as Denys-Drash syndrome ( DDS ), can provide valuable insight
               concerning the role of WT1 in development and disease.  We report
               that heterozygosity for a targeted murine Wt1 allele, Wt1 ( tmT396 ),
               which truncates ZF3 at codon 396 , induces mesangial sclerosis
               characteristic of DDS in adult heterozygous and chimeric mice.
               Male genital defects also were evident and there was a single case
               of Wilms tumor in which the transcript of the nontargeted allele
               showed an exon 9 skipping event, implying a causal link between
               Wt1 dysfunction and Wilms tumorigenesis in mice. However, the
               mutant WT1 ( tmT396 ) protein accounted for only 5 % of WT1 in both
               heterozygous embryonic stem cells and the WT. This has implications
               regarding the mechanism by which the mutant allele exerts its effect.",
                "mime_type": "text/plain"
           }
        }
      }'
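
A successful request returns a JSON body containing a list of annotation payloads. The response below is illustrative (abridged, with a made-up entity type, offsets, and score), but its field names match the ones read by the client library samples that follow:

{
  "payload": [
    {
      "annotationSpecId": "...",
      "displayName": "DiseaseClass",
      "textExtraction": {
        "score": 0.87,
        "textSegment": {
          "startOffset": "179",
          "endOffset": "197",
          "content": "glomerulosclerosis"
        }
      }
    }
  ]
}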

Java


import com.google.cloud.automl.v1beta1.AnnotationPayload;
import com.google.cloud.automl.v1beta1.ExamplePayload;
import com.google.cloud.automl.v1beta1.ModelName;
import com.google.cloud.automl.v1beta1.PredictResponse;
import com.google.cloud.automl.v1beta1.PredictionServiceClient;
import com.google.cloud.automl.v1beta1.TextSnippet;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

class Predict {

  static void predict(String projectId, String computeRegion, String modelId, String filePath)
      throws IOException {
    // String projectId = "YOUR_PROJECT_ID";
    // String computeRegion = "us-central1";
    // String modelId = "YOUR_MODEL_ID";
    // String filePath = "LOCAL_PATH_FOR_TEXT_FILE";

    // Create client for prediction service.
    try (PredictionServiceClient predictionClient = PredictionServiceClient.create()) {

      // Get full path of model
      ModelName modelName = ModelName.of(projectId, computeRegion, modelId);

      // Read the file content for prediction.
      String content = new String(Files.readAllBytes(Paths.get(filePath)));

      // Set the payload by giving the content and type of the file.
      TextSnippet textSnippet =
          TextSnippet.newBuilder().setContent(content).setMimeType("text/plain").build();
      ExamplePayload payload = ExamplePayload.newBuilder().setTextSnippet(textSnippet).build();

      // params holds additional domain-specific parameters.
      // Currently no additional parameters are supported.
      Map<String, String> params = new HashMap<>();
      PredictResponse response = predictionClient.predict(modelName, payload, params);

      System.out.println("Prediction results:");
      for (AnnotationPayload annotationPayload : response.getPayloadList()) {
        System.out.println(
            "Predicted Text Extract Entity Type :" + annotationPayload.getDisplayName());
        System.out.println(
            "Predicted Text Extract Entity Content :"
                + annotationPayload.getTextExtraction().getTextSegment().getContent());
        System.out.println(
            "Predicted Text Start Offset :"
                + annotationPayload.getTextExtraction().getTextSegment().getStartOffset());
        System.out.println(
            "Predicted Text End Offset :"
                + annotationPayload.getTextExtraction().getTextSegment().getEndOffset());
        System.out.println(
            "Predicted Text Score :" + annotationPayload.getTextExtraction().getScore());
      }
    }
  }
}

Node.js

const automl = require(`@google-cloud/automl`);
const fs = require(`fs`);

// Create client for prediction service.
const client = new automl.v1beta1.PredictionServiceClient();

/**
 * Demonstrates using the AutoML client to extract entities from text content.
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// const projectId = '[PROJECT_ID]' e.g., "my-gcloud-project";
// const computeRegion = '[REGION_NAME]' e.g., "us-central1";
// const modelId = '[MODEL_ID]' e.g., "TEN5200971474357190656";
// const filePath = '[LOCAL_FILE_PATH]' e.g., "./resource/test.txt";
// (local text file path of content to be extracted)

// Get the full path of the model.
const modelFullId = client.modelPath(projectId, computeRegion, modelId);

// Read the file content for prediction.
const snippet = fs.readFileSync(filePath, `utf8`);

// Set the payload by giving the content and type of the file.
const payload = {
  textSnippet: {
    content: snippet,
    mimeType: `text/plain`,
  },
};

// params holds additional domain-specific parameters.
// Currently no additional parameters are supported.
client
  .predict({name: modelFullId, payload: payload, params: {}})
  .then(responses => {
    console.log(`Prediction results:`);
    for (const result of responses[0].payload) {
      console.log(
        `\tPredicted text extract entity type: ${result.displayName}`
      );
      console.log(
        `\tPredicted text extract entity content: ${
          result.textExtraction.textSegment.content
        }`
      );
      console.log(
        `\tPredicted text start offset: ${
          result.textExtraction.textSegment.startOffset
        }`
      );
      console.log(
        `\tPredicted text end offset: ${
          result.textExtraction.textSegment.endOffset
        }`
      );
      console.log(
        `\tPredicted text score: ${result.textExtraction.score} \n`
      );
    }
  })
  .catch(err => {
    console.error(err);
  });

Python

    # TODO(developer): Uncomment and set the following variables
    # project_id = '[PROJECT_ID]'
    # compute_region = '[COMPUTE_REGION]'
    # model_id = '[MODEL_ID]'
    # file_path = '/local/path/to/file'

    from google.cloud import automl_v1beta1 as automl

    automl_client = automl.AutoMlClient()

    # Create client for prediction service.
    prediction_client = automl.PredictionServiceClient()

    # Get the full path of the model.
    model_full_id = automl_client.model_path(
        project_id, compute_region, model_id
    )

    # Read the file content for prediction.
    with open(file_path, "rb") as content_file:
        snippet = content_file.read()

    # Set the payload by giving the content and type of the file.
    payload = {"text_snippet": {"content": snippet, "mime_type": "text/plain"}}

    # params holds additional domain-specific parameters.
    # Currently no additional parameters are supported.
    params = {}
    response = prediction_client.predict(model_full_id, payload, params)
    print("Prediction results:")
    for result in response.payload:
        print("Predicted entity label: {}".format(result.display_name))
        print("Predicted confidence score: {}".format(result.text_extraction.score))
        print("Predicted text segment: {}".format(result.text_extraction.text_segment.content))
        print("Predicted text segment start offset: {}".format(result.text_extraction.text_segment.start_offset))
        print("Predicted text segment end offset : {}".format(result.text_extraction.text_segment.end_offset))
        print("\n")

Batch prediction

If you would like to use your model for high-throughput asynchronous prediction on a corpus of text items, you can use the `batchPredict` method.

To use batch prediction, the input data must be prepared and formatted in a Google Cloud Storage bucket. As with training data preparation, you prepare a JSONL file that contains all the text items you want to analyze. Each text item must be linked to a unique ID.

{ "id": "0", "text_snippet": { "content": "First item content to be analyzed." } }
{ "id": "1", "text_snippet": { "content": "Second item content to be analyzed." } }
...
{ "id": "n", "text_snippet": { "content": "Last item content to be analyzed." } }

Command-line

Replace model-name with the full name of your model, from the response when you created it; the full name has the format: projects/{project-id}/locations/us-central1/models/{model-id}. Replace json-file-URI with the path to the JSONL input file you prepared in your Google Cloud Storage bucket, and dest-dir-URI with the directory for the analysis results (also in your Google Cloud Storage bucket).


curl \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json" \
  https://automl.googleapis.com/v1beta1/model-name:batchPredict \
  -d '{
        "input_config": { "gcs_source": { "input_uris": [ "json-file-URI"] } },
        "output_config": { "gcs_destination": { "output_uri_prefix": "dest-dir-URI" } }
      }'

You should see output similar to the following:

{
  "name": "projects/000000000000/locations/us-central1/operations/TEN8195786061721370625",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.automl.v1beta1.OperationMetadata",
    "createTime": "2019-03-13T15:37:49.972372Z",
    "updateTime": "2019-03-13T15:37:49.972372Z"
  }
}

You can query the status of the returned operation as shown below.

Replace operation-name with the operation name returned by the batchPredict method.

curl \
  -X GET \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json" \
  https://automl.googleapis.com/v1beta1/operation-name
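
While the operation is still running, the response resembles the metadata shown above. Once the operation completes, it includes "done": true. An illustrative, abridged example of a completed response:

{
  "name": "projects/000000000000/locations/us-central1/operations/TEN8195786061721370625",
  "metadata": { ... },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.automl.v1beta1.BatchPredictResult"
  }
}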

Once the operation has finished successfully, the dest-dir-URI directory will contain a JSONL file with the results.
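
Below is a minimal sketch of collecting those results in Python, assuming the google-cloud-storage client library and that the output is written as JSON Lines files under the destination prefix; the bucket and prefix names are placeholders:

import json

from google.cloud import storage

storage_client = storage.Client()
bucket = storage_client.bucket("your-bucket")

# Each results file holds one JSON object per line, keyed by the
# id you assigned to the corresponding input text item.
for blob in bucket.list_blobs(prefix="dest-dir/"):
    if not blob.name.endswith(".jsonl"):
        continue
    for line in blob.download_as_string().decode("utf-8").splitlines():
        result = json.loads(line)
        print(result["id"], result)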
