Track ML metadata

Vertex ML Metadata lets you track and analyze the metadata produced by your machine learning (ML) workflows. If you are new to Vertex ML Metadata, read the introduction to Vertex ML Metadata to learn more about tracking and analyzing your ML workflow's metadata.

This guide demonstrates how to log metadata using the following process:

  1. Create an execution representing a step in your ML workflow.
  2. Query to find any input artifacts that are already written to your metadata store. Artifacts represent data used or produced by your ML workflow, such as datasets and models.
  3. Create artifacts for any of your execution's inputs that are not already written to your metadata store, and for any outputs that this execution produces.
  4. Create events to represent the relationship between your execution and its input and output artifacts.
  5. Optionally, add your execution and artifacts to a context. You can use a context to group together sets of executions and artifacts. For example, if you are experimenting to find the best set of hyperparameters to train a model, each experiment may be a different execution with its own set of parameters and metrics. You can compare the executions within a context to find the experiment that produced the best model.

    Before you can add executions and artifacts to a context, you must create the context.

Before you begin

The first time that you use Vertex ML Metadata in a Google Cloud project, Vertex AI creates your project's metadata store.

If you want your metadata encrypted using a customer-managed encryption key (CMEK), you must create your metadata store using a CMEK before you use Vertex ML Metadata to track or analyze metadata. To configure your project's metadata store, follow the instructions for creating a metadata store that uses a CMEK.

Create an execution

An execution represents a step in your ML workflow. Use the following instructions to create an execution.

Before using any of the request data, make the following replacements:

  • LOCATION: Your region.
  • PROJECT: Your project ID or project number.
  • METADATA_STORE: The metadata store ID where the execution is created. The default metadata store is named default.
  • EXECUTION_ID: (Optional.) The ID of the execution record. If the execution ID is not specified, Vertex ML Metadata creates a unique identifier for this execution.
  • DISPLAY_NAME: The execution's display name. This field may contain up to 128 Unicode characters.
  • EXECUTION_STATE: (Optional.) A value from the State enumeration that represents the current state of the execution. This field is managed by client applications. Vertex ML Metadata does not check the validity of state transitions.
  • METADATA_SCHEMA_TITLE: The title of the schema that describes the metadata field.
  • METADATA_SCHEMA_VERSION: (Optional.) The version of the schema that describes the metadata field.
  • METADATA: (Optional.) Properties that describe the execution, such as the execution parameters.
  • DESCRIPTION: (Optional.) A description of the execution.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT/locations/LOCATION/metadataStores/METADATA_STORE/executions?executionId=EXECUTION_ID

Request JSON body:

{
  "displayName": "DISPLAY_NAME",
  "state": "EXECUTION_STATE",
  "schemaTitle": "METADATA_SCHEMA_TITLE",
  "schemaVersion": "METADATA_SCHEMA_VERSION",
  "metadata": {
    METADATA
  },
  "description": "DESCRIPTION"
}

You should receive a JSON response similar to the following:

{
  "name": "projects/12345/locations/us-central1/metadataStores/default/executions/example-execution",
  "displayName": "Example Execution",
  "etag": "67891011",
  "createTime": "2021-05-18T00:04:49.659Z",
  "updateTime": "2021-05-18T00:04:49.659Z",
  "schemaTitle": "system.Run",
  "schemaVersion": "0.0.1",
  "metadata": {},
  "description": "Description of the example execution."
}
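
The request above could be assembled from Python as in the following minimal sketch, which uses only the standard library. The function name `build_create_execution_request` is hypothetical, and the bearer-token `Authorization` header is an assumption based on how Google Cloud REST APIs are usually called; the sketch builds the request but does not send it, and it omits any optional field that you leave unset.

```python
import json
import urllib.parse
import urllib.request


def build_create_execution_request(project, location, display_name, schema_title,
                                   access_token, metadata_store="default",
                                   execution_id=None, state=None,
                                   schema_version=None, metadata=None,
                                   description=None):
    """Build, but do not send, the POST request that creates an execution."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1beta1/"
        f"projects/{project}/locations/{location}/"
        f"metadataStores/{metadata_store}/executions"
    )
    # executionId is optional; leave it off to let the service assign an ID.
    if execution_id:
        url += "?" + urllib.parse.urlencode({"executionId": execution_id})

    body = {"displayName": display_name, "schemaTitle": schema_title}
    optional = {
        "state": state,
        "schemaVersion": schema_version,
        "metadata": metadata,
        "description": description,
    }
    # Include only the optional fields that were actually provided.
    body.update({key: value for key, value in optional.items() if value is not None})

    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumption: a bearer token, as is usual for Google Cloud REST APIs.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


request = build_create_execution_request(
    project="12345",
    location="us-central1",
    display_name="Example Execution",
    schema_title="system.Run",
    access_token="ACCESS_TOKEN",  # placeholder, not a real token
    execution_id="example-execution",
    description="Description of the example execution.",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# With a real token, urllib.request.urlopen(request) would send the request.
```

Keeping request construction separate from sending makes it possible to inspect the URL and body before anything is written to your metadata store.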

Look up an existing artifact

Artifacts represent data used or produced by your ML workflow, such as datasets and models. Use the following instructions to look up an existing artifact.

Before using any of the request data, make the following replacements:

  • LOCATION: Your region.
  • PROJECT: Your project ID or project number.
  • METADATA_STORE: The metadata store ID where the artifact is stored. The default metadata store is named default.
  • PAGE_SIZE: (Optional.) The maximum number of artifacts to return. If this value is not specified, a maximum of 100 records are returned.
  • PAGE_TOKEN: (Optional.) A page token from a previous MetadataService.ListArtifacts call. Specify this token to get the next page of results.
  • FILTER: Specifies the conditions required to include an artifact in the result set.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT/locations/LOCATION/metadataStores/METADATA_STORE/artifacts?pageSize=PAGE_SIZE&pageToken=PAGE_TOKEN&filter=FILTER

You should receive a JSON response similar to the following:

{
  "artifacts": [
    {
      "name": "projects/12345/locations/us-central1/metadataStores/default/artifacts/example-artifact",
      "displayName": "Example artifact",
      "uri": "gs://your_bucket_name/artifacts/dataset.csv",
      "etag": "67891011",
      "createTime": "2021-05-18T00:33:13.833Z",
      "updateTime": "2021-05-18T00:33:13.833Z",
      "state": "LIVE",
      "schemaTitle": "system.Dataset",
      "schemaVersion": "0.0.1",
      "metadata": {
        "payload_format": "CSV"
      },
      "description": "Description of the example artifact."
    },
    {
      "name": "projects/12345/locations/us-central1/metadataStores/default/artifacts/example-artifact-2",
      "displayName": "Another example artifact",
      "uri": "gs://your_bucket_name/artifacts/dataset-2.csv",
      "etag": "67891012",
      "createTime": "2021-05-18T00:29:24.344Z",
      "updateTime": "2021-05-18T00:29:24.344Z",
      "state": "LIVE",
      "schemaTitle": "system.Dataset",
      "schemaVersion": "0.0.1",
      "metadata": {
        "payload_format": "CSV"
      },
      "description": "Description of the other example artifact."
    }
  ]
}
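
The lookup and its paging parameters can be sketched in Python as follows. This is an illustration under stated assumptions rather than a definitive client: `list_artifacts_url` and `iter_artifacts` are hypothetical helper names, the `fetch_json` callable is something you would supply to perform the authenticated GET, the `nextPageToken` response field is assumed from the usual list-method convention, and the filter string in the usage example is only illustrative.

```python
import urllib.parse


def list_artifacts_url(project, location, metadata_store="default",
                       page_size=None, page_token=None, filter_expr=None):
    """Return the GET URL for listing artifacts, omitting unset parameters."""
    base = (
        f"https://{location}-aiplatform.googleapis.com/v1beta1/"
        f"projects/{project}/locations/{location}/"
        f"metadataStores/{metadata_store}/artifacts"
    )
    params = {"pageSize": page_size, "pageToken": page_token, "filter": filter_expr}
    # urlencode percent-encodes the filter, including its quotes and '='.
    query = urllib.parse.urlencode(
        {key: value for key, value in params.items() if value is not None}
    )
    return f"{base}?{query}" if query else base


def iter_artifacts(fetch_json, project, location, metadata_store="default",
                   page_size=None, filter_expr=None):
    """Yield artifacts from every page; fetch_json(url) must return a dict."""
    page_token = None
    while True:
        url = list_artifacts_url(project, location, metadata_store,
                                 page_size, page_token, filter_expr)
        page = fetch_json(url)
        yield from page.get("artifacts", [])
        # Assumption: the next page token is returned as "nextPageToken".
        page_token = page.get("nextPageToken")
        if not page_token:
            break


def fake_fetch(url):
    """Stand-in for an authenticated GET, returning two canned pages."""
    if "pageToken=" not in url:
        return {"artifacts": [{"displayName": "Example artifact"}],
                "nextPageToken": "t1"}
    return {"artifacts": [{"displayName": "Another example artifact"}]}


names = [
    artifact["displayName"]
    for artifact in iter_artifacts(
        fake_fetch, "12345", "us-central1",
        filter_expr='schema_title="system.Dataset"',  # illustrative filter
    )
]
print(names)
```

Passing the fetch function in as a parameter keeps the paging logic testable without credentials or network access.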

Create an artifact

Use the following instructions to create an artifact.

Before using any of the request data, make the following replacements:

  • LOCATION: Your region.
  • PROJECT: Your project ID or project number.
  • METADATA_STORE: The metadata store ID where the artifact is created. The default metadata store is named default.
  • ARTIFACT_ID: (Optional.) The ID of the artifact record. If the artifact ID is not specified, Vertex ML Metadata creates a unique identifier for this artifact.
  • DISPLAY_NAME: The artifact's display name. This field may contain up to 128 Unicode characters.
  • URI: (Optional.) The location where the artifact is stored.
  • ARTIFACT_STATE: (Optional.) A value from the State enumeration that represents the current state of the artifact. This field is managed by client applications. Vertex ML Metadata does not check the validity of state transitions.
  • METADATA_SCHEMA_TITLE: The title of the schema that describes the metadata field.
  • METADATA_SCHEMA_VERSION: (Optional.) The version of the schema that describes the metadata field.
  • METADATA: (Optional.) Properties that describe the artifact, such as the type of dataset.
  • DESCRIPTION: (Optional.) A description of the artifact.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT/locations/LOCATION/metadataStores/METADATA_STORE/artifacts?artifactId=ARTIFACT_ID

Request JSON body:

{
  "displayName": "DISPLAY_NAME",
  "uri": "URI",
  "state": "ARTIFACT_STATE",
  "schemaTitle": "METADATA_SCHEMA_TITLE",
  "schemaVersion": "METADATA_SCHEMA_VERSION",
  "metadata": {
    METADATA
  },
  "description": "DESCRIPTION"
}

You should receive a JSON response similar to the following:

{
  "name": "projects/12345/locations/us-central1/metadataStores/default/artifacts/example-artifact",
  "displayName": "Example artifact",
  "uri": "gs://your_bucket_name/artifacts/dataset.csv",
  "etag": "67891011",
  "createTime": "2021-05-18T00:29:24.344Z",
  "updateTime": "2021-05-18T00:29:24.344Z",
  "state": "LIVE",
  "schemaTitle": "system.Dataset",
  "schemaVersion": "0.0.1",
  "metadata": {
    "payload_format": "CSV"
  },
  "description": "Description of the example artifact."
}
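
Because you only need to create artifacts that are not already in your metadata store, it can help to check the lookup results first. The following sketch assumes you already hold a list of artifacts returned by a lookup; `create_artifact_url`, `artifact_body`, and `find_by_uri` are hypothetical helper names, and matching on `uri` is one possible way to decide that an artifact already exists, not the only one.

```python
import json
import urllib.parse


def create_artifact_url(project, location, metadata_store="default", artifact_id=None):
    """Return the POST URL for creating an artifact."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1beta1/"
        f"projects/{project}/locations/{location}/"
        f"metadataStores/{metadata_store}/artifacts"
    )
    if artifact_id:
        url += "?" + urllib.parse.urlencode({"artifactId": artifact_id})
    return url


def artifact_body(display_name, schema_title, uri=None, state=None,
                  schema_version=None, metadata=None, description=None):
    """Return the request body, leaving out optional fields that are unset."""
    body = {"displayName": display_name, "schemaTitle": schema_title}
    optional = {
        "uri": uri,
        "state": state,
        "schemaVersion": schema_version,
        "metadata": metadata,
        "description": description,
    }
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


def find_by_uri(existing_artifacts, uri):
    """Return the first already-recorded artifact with this URI, or None."""
    return next((a for a in existing_artifacts if a.get("uri") == uri), None)


# Artifacts as they might come back from a lookup request.
existing = [{
    "name": "projects/12345/locations/us-central1/metadataStores/default/artifacts/example-artifact",
    "uri": "gs://your_bucket_name/artifacts/dataset.csv",
}]

for uri in ("gs://your_bucket_name/artifacts/dataset.csv",
            "gs://your_bucket_name/artifacts/dataset-2.csv"):
    found = find_by_uri(existing, uri)
    if found:
        print("reuse", found["name"])
    else:
        body = artifact_body("Another example artifact", "system.Dataset",
                             uri=uri, metadata={"payload_format": "CSV"})
        print("create via POST", create_artifact_url("12345", "us-central1"))
        print(json.dumps(body, indent=2))
```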

Create events to link artifacts to an execution

Events represent the relationship between an execution and its input and output artifacts. Use the following instructions to create events to link artifacts to an execution.

Before using any of the request data, make the following replacements:

  • LOCATION: Your region.
  • PROJECT: Your project ID or project number.
  • METADATA_STORE: The metadata store ID where the execution is created. The default metadata store is named default.
  • EXECUTION: The ID of the execution record.
  • ARTIFACT: The resource name of the artifact. The resource name is formatted like the following:

    projects/project/locations/location/metadataStores/metadata-store/artifacts/artifact
  • EVENT_TYPE: (Optional.) A value from the EventType enumeration that specifies if the artifact is an input or output of the execution.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT/locations/LOCATION/metadataStores/METADATA_STORE/executions/EXECUTION:addExecutionEvents

Request JSON body:

{
  "events": [
    {
      "artifact": "ARTIFACT",
      "type": "EVENT_TYPE"
    }
  ]
}

You should receive a successful status code (2xx) and an empty response.
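
A single request can carry several events, so one call can link all of an execution's inputs and outputs. The following sketch builds such a request body; `add_execution_events_url` and `build_events_body` are hypothetical helper names, and the `INPUT` and `OUTPUT` strings are assumed to be the relevant values of the EventType enumeration.

```python
import json


def add_execution_events_url(project, location, execution, metadata_store="default"):
    """Return the POST URL for adding events to an execution."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1beta1/"
        f"projects/{project}/locations/{location}/"
        f"metadataStores/{metadata_store}/executions/{execution}:addExecutionEvents"
    )


def build_events_body(input_artifacts=(), output_artifacts=()):
    """Return one event per artifact resource name, inputs first."""
    # Assumption: "INPUT" and "OUTPUT" are the EventType values to use here.
    events = [{"artifact": name, "type": "INPUT"} for name in input_artifacts]
    events += [{"artifact": name, "type": "OUTPUT"} for name in output_artifacts]
    return {"events": events}


store = "projects/12345/locations/us-central1/metadataStores/default"
body = build_events_body(
    input_artifacts=[f"{store}/artifacts/example-artifact"],
    output_artifacts=[f"{store}/artifacts/example-artifact-2"],
)
print(add_execution_events_url("12345", "us-central1", "example-execution"))
print(json.dumps(body, indent=2))
```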

Create a context

Contexts let you group sets of artifacts and executions together. Use the following instructions to create a context.

Before using any of the request data, make the following replacements:

  • LOCATION: Your region.
  • PROJECT: Your project ID or project number.
  • METADATA_STORE: The metadata store ID where the context is created. The default metadata store is named default.
  • CONTEXT_ID: (Optional.) The ID of the context record. If the context ID is not specified, Vertex ML Metadata creates a unique identifier for this context.
  • DISPLAY_NAME: The context's display name. This field may contain up to 128 Unicode characters.
  • PARENT_CONTEXT: (Optional.) The resource name of a parent context. Specify one for each parent context. A context cannot have more than 10 parent contexts.
  • METADATA_SCHEMA_TITLE: The title of the schema that describes the metadata field.
  • METADATA_SCHEMA_VERSION: (Optional.) The version of the schema that describes the metadata field.
  • METADATA: (Optional.) Properties that describe the context.
  • DESCRIPTION: (Optional.) A description of the context.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT/locations/LOCATION/metadataStores/METADATA_STORE/contexts?contextId=CONTEXT_ID

Request JSON body:

{
  "displayName": "DISPLAY_NAME",
  "parentContexts": [
    "PARENT_CONTEXT_1",
    "PARENT_CONTEXT_2"
  ],
  "schemaTitle": "METADATA_SCHEMA_TITLE",
  "schemaVersion": "METADATA_SCHEMA_VERSION",
  "metadata": {
    METADATA
  },
  "description": "DESCRIPTION"
}

You should receive a JSON response similar to the following:

{
  "name": "projects/12345/locations/us-central1/metadataStores/default/contexts/example-context",
  "displayName": "Example context",
  "etag": "67891011",
  "createTime": "2021-05-18T01:52:51.642Z",
  "updateTime": "2021-05-18T01:52:51.642Z",
  "schemaTitle": "system.Experiment",
  "schemaVersion": "0.0.1",
  "metadata": {},
  "description": "Description of the example context."
}
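
The limit of 10 parent contexts can be checked on the client before the request is sent. The following sketch shows one way to do that while building the request body; `create_context_url` and `context_body` are hypothetical helper names, and the early `ValueError` is a design choice of this example rather than documented service behavior.

```python
import json
import urllib.parse

MAX_PARENT_CONTEXTS = 10  # limit stated in the placeholder list above


def create_context_url(project, location, metadata_store="default", context_id=None):
    """Return the POST URL for creating a context."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1beta1/"
        f"projects/{project}/locations/{location}/"
        f"metadataStores/{metadata_store}/contexts"
    )
    if context_id:
        url += "?" + urllib.parse.urlencode({"contextId": context_id})
    return url


def context_body(display_name, schema_title, parent_contexts=(),
                 schema_version=None, metadata=None, description=None):
    """Return the request body, rejecting too many parent contexts up front."""
    parents = list(parent_contexts)
    if len(parents) > MAX_PARENT_CONTEXTS:
        raise ValueError(
            f"a context cannot have more than {MAX_PARENT_CONTEXTS} parent contexts"
        )
    body = {"displayName": display_name, "schemaTitle": schema_title}
    if parents:
        body["parentContexts"] = parents
    optional = {
        "schemaVersion": schema_version,
        "metadata": metadata,
        "description": description,
    }
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


body = context_body(
    "Example context",
    "system.Experiment",
    description="Description of the example context.",
)
print(create_context_url("12345", "us-central1", context_id="example-context"))
print(json.dumps(body, indent=2))
```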

Add artifacts and executions to a context

Use the following instructions to add artifacts and executions to a context.

Before using any of the request data, make the following replacements:

  • LOCATION: Your region.
  • PROJECT: Your project ID or project number.
  • METADATA_STORE: The metadata store ID where the context is stored. The default metadata store is named default.
  • CONTEXT: The ID of the context record.
  • Specify the ARTIFACT resource name for any artifacts that you want to add to this context. The resource name is formatted like the following:

    projects/project/locations/location/metadataStores/metadata-store/artifacts/artifact
  • Specify the EXECUTION resource name for any executions that you want to add to this context. The resource name is formatted like the following:

    projects/project/locations/location/metadataStores/metadata-store/executions/execution

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT/locations/LOCATION/metadataStores/METADATA_STORE/contexts/CONTEXT:addContextArtifactsAndExecutions

Request JSON body:

{
  "artifacts": [
    "ARTIFACT_1",
    "ARTIFACT_2"
  ],
  "executions": [
    "EXECUTION_1",
    "EXECUTION_2"
  ]
}

You should receive a successful status code (2xx) and an empty response.
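
This request takes full resource names rather than bare IDs, so a small helper that expands IDs into resource names can reduce mistakes. The following sketch follows the resource name format shown in the placeholder list; `resource_name`, `add_to_context_url`, and `add_to_context_body` are hypothetical helper names.

```python
import json


def resource_name(project, location, collection, resource_id, metadata_store="default"):
    """Expand an ID into a full resource name, for example for 'artifacts'."""
    return (
        f"projects/{project}/locations/{location}/"
        f"metadataStores/{metadata_store}/{collection}/{resource_id}"
    )


def add_to_context_url(project, location, context, metadata_store="default"):
    """Return the POST URL for adding artifacts and executions to a context."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1beta1/"
        f"{resource_name(project, location, 'contexts', context, metadata_store)}"
        ":addContextArtifactsAndExecutions"
    )


def add_to_context_body(project, location, artifact_ids=(), execution_ids=(),
                        metadata_store="default"):
    """Return the request body with every ID expanded to a resource name."""
    return {
        "artifacts": [
            resource_name(project, location, "artifacts", artifact_id, metadata_store)
            for artifact_id in artifact_ids
        ],
        "executions": [
            resource_name(project, location, "executions", execution_id, metadata_store)
            for execution_id in execution_ids
        ],
    }


body = add_to_context_body(
    "12345", "us-central1",
    artifact_ids=["example-artifact", "example-artifact-2"],
    execution_ids=["example-execution"],
)
print(add_to_context_url("12345", "us-central1", "example-context"))
print(json.dumps(body, indent=2))
```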

What's next