Analyze ML metadata

You can use Vertex ML Metadata to track and analyze the metadata produced by your machine learning (ML) systems. Tracking this metadata makes it easier to analyze the behavior of your ML system, which can help you understand changes in your system's performance or compare the artifacts that your ML system produces.

If you are new to Vertex ML Metadata, read the introduction to Vertex ML Metadata to learn more about tracking and analyzing your ML workflow's metadata.

This document describes how to query for the ML metadata that you want to analyze: artifacts, executions, and contexts; an execution's input and output artifacts; and a context's lineage subgraph.

Query for artifacts, executions, and contexts

You can use the REST API to query for artifact, execution, and context records, using filters to answer questions like the following:

  • Which versions of a trained model achieved a certain quality threshold?
  • Which pipeline runs used a certain dataset?

The following sections demonstrate how to create filters and how to query for artifacts, executions, and contexts.

Overview of filter syntax

The following sections describe how to use filters to query for artifacts, executions, and contexts.

Fields

The following fields are supported when filtering artifacts, executions, and contexts.

Artifact Execution Context
name
display_name
schema_title
create_time
update_time
metadata
state
uri

Your filter must be wrapped in quotes, and any quotes that are a part of your filter must be escaped with a backslash.
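
The quoting rule above can be sketched in Python. This is a minimal illustration, not part of the API: quote_filter is a hypothetical helper name, and escaping literal backslashes before escaping quotes is an assumption that the text above does not state.

```python
def quote_filter(expression: str) -> str:
    """Wrap a raw filter expression in double quotes and escape inner quotes.

    Assumption: literal backslashes are doubled first so that the escaping
    stays reversible. The source text only requires escaping quotes.
    """
    escaped = expression.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


# Raw expression as you would reason about it, before any quoting.
raw = 'display_name="my_artifact"'
print(quote_filter(raw))
```

The later examples in this document show filters in this already-quoted form.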

Comparison operators

You can use the following comparison operators in your filters: =, !=, <, >, >=, <=.

For example, the following filter finds all artifacts whose display name is my_artifact.

"display_name=\"my_artifact\""

For string fields, you can use wildcard filtering with the * character.

For timestamp fields such as create_time and update_time, you must format the date according to RFC 3339. For example:

"create_time=\"2021-05-11T12:30:00-08:00\""

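One way to produce an RFC 3339 timestamp for such a filter is Python's standard datetime module. The sketch below is illustrative only: timestamp_filter is a hypothetical helper, and it returns the raw expression before it is wrapped in quotes and escaped.

```python
from datetime import datetime, timedelta, timezone

# Comparison operators listed in this document.
_OPERATORS = {"=", "!=", "<", ">", ">=", "<="}


def timestamp_filter(field: str, operator: str, moment: datetime) -> str:
    """Build a raw filter such as create_time="2021-05-11T12:30:00-08:00"."""
    if operator not in _OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    if moment.tzinfo is None:
        # RFC 3339 timestamps carry a UTC offset, so naive datetimes are rejected.
        raise ValueError("use a timezone-aware datetime")
    return f'{field}{operator}"{moment.isoformat(timespec="seconds")}"'


utc_minus_8 = timezone(timedelta(hours=-8))
print(timestamp_filter("create_time", "=", datetime(2021, 5, 11, 12, 30, tzinfo=utc_minus_8)))
```
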
Filter on metadata using the traversal operator

The metadata field is an instance of google.protobuf.Struct whose format is defined in the schema specified in the schema_title field. google.protobuf.Struct is a data structure that maps keys to google.protobuf.Value instances. The google.protobuf.Value data structure stores values in different fields depending on their data type. For example: strings are stored as metadata.FIELD_NAME.string_value, numbers are stored as metadata.FIELD_NAME.number_value, and booleans are stored as metadata.FIELD_NAME.bool_value.

To filter on metadata, you must use the traversal operator to traverse to the field that you want to filter on. The traversal operator uses the following format.

"metadata.FIELD_NAME.TYPE_NAME=\"FILTER_VALUE\""

For example, consider a metadata structure like the following:

{
   "field_1": 5,
   "field_2": "example",
   "field_3": {
     ...
   },
   "field_4": [],
   "field_5": true
}

The following queries illustrate how to use the traversal operator to filter on this example metadata.

  • Filter for records that have metadata.field_1 with a value less than 5.

    "metadata.field_1.number_value<5"

  • Filter for records that have metadata.field_2 with a value equal to example.

    "metadata.field_2.string_value=\"example\""

  • Filter for records that have metadata.field_5 with a value equal to true.

    "metadata.field_5.bool_value=true"

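The mapping from a value's data type to the TYPE_NAME segment can be sketched as follows. The helper name metadata_filter is hypothetical, and the function returns the raw expression before it is wrapped in quotes and escaped.

```python
def metadata_filter(field_name: str, operator: str, value) -> str:
    """Build a raw traversal filter for a string, number, or boolean value."""
    # bool must be tested before int/float because bool is a subclass of int.
    if isinstance(value, bool):
        return f"metadata.{field_name}.bool_value{operator}{str(value).lower()}"
    if isinstance(value, (int, float)):
        return f"metadata.{field_name}.number_value{operator}{value}"
    if isinstance(value, str):
        return f'metadata.{field_name}.string_value{operator}"{value}"'
    raise TypeError(f"unsupported metadata value type: {type(value).__name__}")


# The three example queries from this section.
print(metadata_filter("field_1", "<", 5))
print(metadata_filter("field_2", "=", "example"))
print(metadata_filter("field_5", "=", True))
```
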
Filter contexts by their parent/child relationships

You can use the has operator to find contexts that are the parent or child of a specified context.

The has operator uses the following format:

  • "parent_contexts:\"CONTEXT_RESOURCE_NAME\""
  • "child_contexts:\"CONTEXT_RESOURCE_NAME\""

The context name must be the context's full resource name, like the following: projects/PROJECT/locations/LOCATION/metadataStores/METADATA-STORE/contexts/CONTEXT.

The following filters demonstrate how to use the has operator:

  • Filter for all contexts that are children of the specified pipeline.

    "parent_contexts:\"projects/12345/locations/us-central1/metadataStores/default/contexts/pipeline_1\""

  • Filter for all contexts that are parents of the specified pipeline.

    "child_contexts:\"projects/12345/locations/us-central1/metadataStores/default/contexts/pipeline_1\""

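As a rough sketch, the two has-operator filters can be generated from the parts of a context's resource name. The function names are hypothetical, and the resource name uses the projects/... form that appears in the response examples later in this document.

```python
def context_resource_name(project: str, location: str, metadata_store: str, context: str) -> str:
    """Assemble a context's full resource name from its parts."""
    return (
        f"projects/{project}/locations/{location}"
        f"/metadataStores/{metadata_store}/contexts/{context}"
    )


def children_of(context_name: str) -> str:
    """Raw filter for contexts whose parent is the given context."""
    return f'parent_contexts:"{context_name}"'


def parents_of(context_name: str) -> str:
    """Raw filter for contexts that have the given context as a child."""
    return f'child_contexts:"{context_name}"'


pipeline = context_resource_name("12345", "us-central1", "default", "pipeline_1")
print(children_of(pipeline))
print(parents_of(pipeline))
```
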
Filter for artifacts and executions in a context

You can use the in_context() function to filter for artifacts or executions that are associated with a context.

The in_context() function uses the following format.

"in_context(\"CONTEXT_RESOURCE_NAME\")"

The context name must be the context's full resource name, like the following.

projects/PROJECT/locations/LOCATION/metadataStores/METADATA-STORE/contexts/CONTEXT

The following example demonstrates how to filter for objects that are in the specified pipeline.

"in_context(\"projects/12345/locations/us-central1/metadataStores/default/contexts/pipeline_1\")"

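Because in_context() expects a full resource name, it can help to check the name's shape before building the filter. The sketch below is an assumption-laden illustration: in_context_filter is a hypothetical helper, and the regular expression only checks the path segments, not whether the context exists.

```python
import re

# Shape of a full context resource name: six fixed segments with IDs between them.
_CONTEXT_NAME = re.compile(
    r"^projects/[^/]+/locations/[^/]+/metadataStores/[^/]+/contexts/[^/]+$"
)


def in_context_filter(context_name: str) -> str:
    """Build a raw in_context() filter, rejecting names that are not fully qualified."""
    if not _CONTEXT_NAME.match(context_name):
        raise ValueError(f"not a full context resource name: {context_name}")
    return f'in_context("{context_name}")'


print(in_context_filter(
    "projects/12345/locations/us-central1/metadataStores/default/contexts/pipeline_1"
))
```
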
Logical operators

You can use AND and OR logical operators to combine filters to create a complex query.

The following example demonstrates how to query for artifacts of type ai_platform.Model that have a metadata field named precision with a numeric value greater than 0.9.

"schema_title=\"ai_platform.Model\" AND metadata.precision.number_value>0.9"

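Combining clauses can be sketched with two small helpers. The names all_of and any_of are hypothetical. This document does not describe operator precedence or grouping, so the sketch joins clauses with a single operator and does not try to mix AND with OR.

```python
def all_of(*clauses: str) -> str:
    """Join raw filter clauses so that every clause must match."""
    return " AND ".join(clauses)


def any_of(*clauses: str) -> str:
    """Join raw filter clauses so that at least one clause must match."""
    return " OR ".join(clauses)


# Raw (unquoted) form of the example above.
print(all_of('schema_title="ai_platform.Model"', "metadata.precision.number_value>0.9"))
```
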
Query for artifacts

Artifacts represent data used or produced by your ML workflow, such as datasets and models. Use the following instructions to query for artifacts.

Before using any of the request data, make the following replacements:

  • LOCATION: Your region.
  • PROJECT: Your project ID or project number.
  • METADATA_STORE: The metadata store ID where the artifact is created. The default metadata store is named default.
  • PAGE_SIZE: (Optional.) The maximum number of artifacts to return. If this value is not specified, a maximum of 100 records are returned.
  • PAGE_TOKEN: (Optional.) A page token from a previous MetadataService.ListArtifacts call. Specify this token to get the next page of results.
  • FILTER: Specifies the conditions required to include an artifact in the result set.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT/locations/LOCATION/metadataStores/METADATA_STORE/artifacts?pageSize=PAGE_SIZE&pageToken=PAGE_TOKEN&filter=FILTER
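
The placeholders in this URL can be filled in programmatically. The following sketch uses only Python's standard library; build_list_url is a hypothetical helper, and passing the raw filter expression as a URL-encoded query parameter is an assumption about how the filter should be transmitted.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_list_url(location, project, metadata_store, collection,
                   filter_expression=None, page_size=None, page_token=None):
    """Build the list URL for the artifacts, executions, or contexts collection."""
    if collection not in ("artifacts", "executions", "contexts"):
        raise ValueError(f"unknown collection: {collection}")
    base = (
        f"https://{location}-aiplatform.googleapis.com/v1beta1/"
        f"projects/{project}/locations/{location}/"
        f"metadataStores/{metadata_store}/{collection}"
    )
    # Optional parameters are left out entirely when they are not set.
    params = {}
    if page_size is not None:
        params["pageSize"] = page_size
    if page_token:
        params["pageToken"] = page_token
    if filter_expression:
        params["filter"] = filter_expression
    return f"{base}?{urlencode(params)}" if params else base


url = build_list_url("us-central1", "12345", "default", "artifacts",
                     filter_expression='schema_title="system.Dataset"', page_size=50)
# Decoding the query string recovers the original filter expression.
print(parse_qs(urlsplit(url).query)["filter"][0])
```

The same helper covers the executions and contexts requests described later, because only the final path segment differs.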

You should receive a JSON response similar to the following:

{
  "artifacts": [
    {
      "name": "projects/12345/locations/us-central1/metadataStores/default/artifacts/example-artifact",
      "displayName": "Example artifact",
      "uri": "gs://your_bucket_name/artifacts/dataset.csv",
      "etag": "67891011",
      "createTime": "2021-05-18T00:33:13.833Z",
      "updateTime": "2021-05-18T00:33:13.833Z",
      "state": "LIVE",
      "schemaTitle": "system.Dataset",
      "schemaVersion": "0.0.1",
      "metadata": {
        "payload_format": "CSV"
      },
      "description": "Description of the example artifact."
    },
    {
      "name": "projects/12345/locations/us-central1/metadataStores/default/artifacts/example-artifact-2",
      "displayName": "Another example artifact",
      "uri": "gs://your_bucket_name/artifacts/dataset-2.csv",
      "etag": "67891012",
      "createTime": "2021-05-18T00:29:24.344Z",
      "updateTime": "2021-05-18T00:29:24.344Z",
      "state": "LIVE",
      "schemaTitle": "system.Dataset",
      "schemaVersion": "0.0.1",
      "metadata": {
        "payload_format": "CSV"
      },
      "description": "Description of the other example artifact."
    }
  ]
}
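
When a result set spans several pages, the PAGE_TOKEN parameter is used to walk through them. The sketch below shows one way to do that. It assumes that each response carries a nextPageToken field when more results exist (the sample response above does not show one), and it takes the HTTP call as an injected function so that the loop can be demonstrated without network access. iter_records and the fake pages are hypothetical.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_records(fetch_page: Callable[[Optional[str]], Dict], key: str) -> Iterator[Dict]:
    """Yield every record under `key`, following page tokens until none is returned."""
    token = None
    while True:
        page = fetch_page(token)
        yield from page.get(key, [])
        token = page.get("nextPageToken")  # assumed field name
        if not token:
            break


# Stand-in for real HTTP requests: two canned pages keyed by the incoming token.
fake_pages = {
    None: {"artifacts": [{"displayName": "Example artifact"}], "nextPageToken": "page-2"},
    "page-2": {"artifacts": [{"displayName": "Another example artifact"}]},
}

for artifact in iter_records(lambda token: fake_pages[token], "artifacts"):
    print(artifact["displayName"])
```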

Query for executions

Executions represent a step in your ML workflow, such as preprocessing data or training a model. Use the following instructions to query for executions.

Before using any of the request data, make the following replacements:

  • LOCATION: Your region.
  • PROJECT: Your project ID or project number.
  • METADATA_STORE: The metadata store ID where the execution is created. The default metadata store is named default.
  • PAGE_SIZE: (Optional.) The maximum number of executions to return. If this value is not specified, a maximum of 100 records are returned.
  • PAGE_TOKEN: (Optional.) A page token from a previous MetadataService.ListExecutions call. Specify this token to get the next page of results.
  • FILTER: Specifies the conditions required to include an execution in the result set.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT/locations/LOCATION/metadataStores/METADATA_STORE/executions?pageSize=PAGE_SIZE&pageToken=PAGE_TOKEN&filter=FILTER

You should receive a JSON response similar to the following:

{
  "executions": [
    {
      "name": "projects/12345/locations/us-central1/metadataStores/default/executions/example-execution-1",
      "displayName": "Example execution 1",
      "etag": "67891011",
      "createTime": "2021-05-18T00:06:56.177Z",
      "updateTime": "2021-05-18T00:06:56.177Z",
      "schemaTitle": "system.Run",
      "schemaVersion": "0.0.1",
      "metadata": {},
      "description": "Description of the example execution."
    },
    {
      "name": "projects/12345/locations/us-central1/metadataStores/default/executions/example-execution-2",
      "displayName": "Example execution 2",
      "etag": "67891011",
      "createTime": "2021-05-18T00:04:49.659Z",
      "updateTime": "2021-05-18T00:04:49.659Z",
      "schemaTitle": "system.Run",
      "schemaVersion": "0.0.1",
      "metadata": {},
      "description": "Description of the example execution."
    }
  ]
}
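
Once a response like this has been parsed, the createTime values can be used to order executions. The sketch below is illustrative: the helper names are hypothetical, and the trailing Z is rewritten to +00:00 because datetime.fromisoformat does not accept Z in older Python versions.

```python
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp such as 2021-05-18T00:06:56.177Z into an aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def newest_first(executions):
    """Return executions sorted by createTime, most recent first."""
    return sorted(executions, key=lambda e: parse_timestamp(e["createTime"]), reverse=True)


# Trimmed copy of the sample response above.
response = {
    "executions": [
        {"displayName": "Example execution 2", "createTime": "2021-05-18T00:04:49.659Z"},
        {"displayName": "Example execution 1", "createTime": "2021-05-18T00:06:56.177Z"},
    ]
}

for execution in newest_first(response["executions"]):
    print(execution["displayName"], execution["createTime"])
```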

Query for contexts

Contexts let you group sets of executions, artifacts, and other contexts. Use the following instructions to query for contexts.

Before using any of the request data, make the following replacements:

  • LOCATION: Your region.
  • PROJECT: Your project ID or project number.
  • METADATA_STORE: The metadata store ID where the context is created. The default metadata store is named default.
  • PAGE_SIZE: (Optional.) The maximum number of contexts to return. If this value is not specified, a maximum of 100 records are returned.
  • PAGE_TOKEN: (Optional.) A page token from a previous MetadataService.ListContexts call. Specify this token to get the next page of results.
  • FILTER: Specifies the conditions required to include a context in the result set.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT/locations/LOCATION/metadataStores/METADATA_STORE/contexts?pageSize=PAGE_SIZE&pageToken=PAGE_TOKEN&filter=FILTER

You should receive a JSON response similar to the following:

{
  "contexts": [
    {
      "name": "projects/12345/locations/us-central1/metadataStores/default/contexts/experiment-1",
      "displayName": "Experiment 1",
      "etag": "67891011",
      "createTime": "2021-05-18T22:36:02.153Z",
      "updateTime": "2021-05-18T22:36:02.153Z",
      "parentContexts": [],
      "schemaTitle": "system.Experiment",
      "schemaVersion": "0.0.1",
      "metadata": {}
    },
    {
      "name": "projects/12345/locations/us-central1/metadataStores/default/contexts/pipeline-run-1",
      "displayName": "Pipeline run 1",
      "etag": "67891011",
      "createTime": "2021-05-18T22:35:02.600Z",
      "updateTime": "2021-05-18T22:35:02.600Z",
      "parentContexts": [],
      "schemaTitle": "system.PipelineRun",
      "schemaVersion": "0.0.1",
      "metadata": {}
    }
  ]
}
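
A parsed contexts response can be grouped by schema to separate, for example, experiments from pipeline runs. This is a small sketch under the assumption that every context record carries schemaTitle and displayName, as in the sample above. group_by_schema is a hypothetical helper.

```python
from collections import defaultdict


def group_by_schema(contexts):
    """Map each schemaTitle to the display names of the contexts that use it."""
    groups = defaultdict(list)
    for context in contexts:
        groups[context["schemaTitle"]].append(context["displayName"])
    return dict(groups)


# Trimmed copy of the sample response above.
response = {
    "contexts": [
        {"displayName": "Experiment 1", "schemaTitle": "system.Experiment"},
        {"displayName": "Pipeline run 1", "schemaTitle": "system.PipelineRun"},
    ]
}

print(group_by_schema(response["contexts"]))
```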

Query for an execution's input and output artifacts

Use the following instructions to query for the input and output artifacts of the specified execution, along with the events that connect those artifacts to the execution.

Before using any of the request data, make the following replacements:

  • LOCATION: Your region.
  • PROJECT: Your project ID or project number.
  • METADATA_STORE: The metadata store ID where the execution is created. The default metadata store is named default.
  • EXECUTION_ID: The ID of the execution record.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT/locations/LOCATION/metadataStores/METADATA_STORE/executions/EXECUTION_ID:queryExecutionInputsAndOutputs

You should receive a JSON response similar to the following:

{
  "artifacts": [
    {
      "name": "projects/12345/locations/us-central1/metadataStores/default/artifacts/example-artifact-1",
      "displayName": "Example artifact",
      "uri": "gs://your_bucket_name/artifacts/dataset.csv",
      "etag": "678901011",
      "createTime": "2021-05-18T00:29:24.344Z",
      "updateTime": "2021-05-18T00:29:24.344Z",
      "state": "LIVE",
      "schemaTitle": "system.Dataset",
      "schemaVersion": "0.0.1",
      "metadata": {
        "payload_format": "CSV"
      },
      "description": "Description of the example artifact."
    },
    {
      "name": "projects/12345/locations/us-central1/metadataStores/default/artifacts/example-artifact-2",
      "displayName": "Example artifact 2",
      "uri": "gs://your_bucket_name/artifacts/dataset.csv",
      "etag": "678901011",
      "createTime": "2021-05-18T00:33:13.833Z",
      "updateTime": "2021-05-18T00:33:13.833Z",
      "state": "LIVE",
      "schemaTitle": "system.Dataset",
      "schemaVersion": "0.0.1",
      "metadata": {
        "payload_format": "CSV"
      },
      "description": "Description of the example artifact."
    }
  ],
  "executions": [
    {
      "name": "projects/12345/locations/us-central1/metadataStores/default/executions/example-execution-1",
      "displayName": "Example execution 1",
      "etag": "678901011",
      "createTime": "2021-05-18T00:04:49.659Z",
      "updateTime": "2021-05-18T00:04:49.659Z",
      "schemaTitle": "system.Run",
      "schemaVersion": "0.0.1",
      "metadata": {},
      "description": "Description of the example execution."
    }
  ],
  "events": [
    {
      "artifact": "projects/12345/locations/us-central1/metadataStores/default/artifacts/example-artifact-1",
      "execution": "projects/12345/locations/us-central1/metadataStores/default/executions/example-execution-1",
      "eventTime": "2021-05-18T00:04:49.659Z",
      "type": "INPUT"
    },
    {
      "artifact": "projects/12345/locations/us-central1/metadataStores/default/artifacts/example-artifact-2",
      "execution": "projects/12345/locations/us-central1/metadataStores/default/executions/example-execution-1",
      "eventTime": "2021-05-18T00:04:49.659Z",
      "type": "OUTPUT"
    }
  ]
}
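
The events array is what ties each artifact to the execution as an input or an output. A minimal sketch of separating the two, using a trimmed copy of the response above, might look like the following. split_inputs_outputs is a hypothetical helper, and the short resource names are placeholders for the full names.

```python
def split_inputs_outputs(response, execution_name):
    """Return (input artifact names, output artifact names) for one execution."""
    inputs, outputs = [], []
    for event in response.get("events", []):
        if event["execution"] != execution_name:
            continue  # ignore events that belong to other executions
        if event["type"] == "INPUT":
            inputs.append(event["artifact"])
        elif event["type"] == "OUTPUT":
            outputs.append(event["artifact"])
    return inputs, outputs


# Shortened placeholder names stand in for full resource names.
response = {
    "events": [
        {"artifact": "artifacts/example-artifact-1", "execution": "executions/example-execution-1", "type": "INPUT"},
        {"artifact": "artifacts/example-artifact-2", "execution": "executions/example-execution-1", "type": "OUTPUT"},
    ]
}

inputs, outputs = split_inputs_outputs(response, "executions/example-execution-1")
print("inputs:", inputs)
print("outputs:", outputs)
```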

Query for a context's lineage subgraph

Use the following instructions to query for the artifacts and executions in the specified context, along with the events that connect artifacts to executions.

Before using any of the request data, make the following replacements:

  • LOCATION: Your region.
  • PROJECT: Your project ID or project number.
  • METADATA_STORE: The metadata store ID where the context is created. The default metadata store is named default.
  • CONTEXT_ID: The ID of the context record.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT/locations/LOCATION/metadataStores/METADATA_STORE/contexts/CONTEXT_ID:queryContextLineageSubgraph

You should receive a JSON response similar to the following:

{
  "artifacts": [
    {
      "name": "projects/12345/locations/us-central1/metadataStores/default/artifacts/example-artifact-1",
      "displayName": "Example artifact",
      "uri": "gs://your_bucket_name/artifacts/dataset.csv",
      "etag": "678901011",
      "createTime": "2021-05-18T00:29:24.344Z",
      "updateTime": "2021-05-18T00:29:24.344Z",
      "state": "LIVE",
      "schemaTitle": "system.Dataset",
      "schemaVersion": "0.0.1",
      "metadata": {
        "payload_format": "CSV"
      },
      "description": "Description of the example artifact."
    },
    {
      "name": "projects/12345/locations/us-central1/metadataStores/default/artifacts/example-artifact-2",
      "displayName": "Example artifact 2",
      "uri": "gs://your_bucket_name/artifacts/dataset.csv",
      "etag": "678901011",
      "createTime": "2021-05-18T00:33:13.833Z",
      "updateTime": "2021-05-18T00:33:13.833Z",
      "state": "LIVE",
      "schemaTitle": "system.Dataset",
      "schemaVersion": "0.0.1",
      "metadata": {
        "payload_format": "CSV"
      },
      "description": "Description of the example artifact."
    }
  ],
  "executions": [
    {
      "name": "projects/12345/locations/us-central1/metadataStores/default/executions/example-execution-1",
      "displayName": "Example execution 1",
      "etag": "678901011",
      "createTime": "2021-05-18T00:04:49.659Z",
      "updateTime": "2021-05-18T00:04:49.659Z",
      "schemaTitle": "system.Run",
      "schemaVersion": "0.0.1",
      "metadata": {},
      "description": "Description of the example execution."
    }
  ],
  "events": [
    {
      "artifact": "projects/12345/locations/us-central1/metadataStores/default/artifacts/example-artifact-1",
      "execution": "projects/12345/locations/us-central1/metadataStores/default/executions/example-execution-1",
      "eventTime": "2021-05-18T00:04:49.659Z",
      "type": "INPUT"
    },
    {
      "artifact": "projects/12345/locations/us-central1/metadataStores/default/artifacts/example-artifact-2",
      "execution": "projects/12345/locations/us-central1/metadataStores/default/executions/example-execution-1",
      "eventTime": "2021-05-18T00:04:49.659Z",
      "type": "OUTPUT"
    }
  ]
}
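
A lineage subgraph response can be turned into a directed graph for analysis. The sketch below treats an INPUT event as an edge from artifact to execution and an OUTPUT event as an edge from execution to artifact, then walks the graph to find everything downstream of a node. That edge direction is an interpretation for illustration, the function names are hypothetical, and the short names are placeholders for full resource names.

```python
from collections import defaultdict, deque


def lineage_edges(subgraph):
    """Convert events into (source, target) edges that follow the flow of data."""
    edges = []
    for event in subgraph.get("events", []):
        if event["type"] == "INPUT":
            edges.append((event["artifact"], event["execution"]))
        elif event["type"] == "OUTPUT":
            edges.append((event["execution"], event["artifact"]))
    return edges


def downstream(edges, start):
    """Return every node reachable from `start` using a breadth-first walk."""
    graph = defaultdict(list)
    for source, target in edges:
        graph[source].append(target)
    seen, queue = set(), deque([start])
    while queue:
        for target in graph[queue.popleft()]:
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


subgraph = {
    "events": [
        {"artifact": "example-artifact-1", "execution": "example-execution-1", "type": "INPUT"},
        {"artifact": "example-artifact-2", "execution": "example-execution-1", "type": "OUTPUT"},
    ]
}

print(sorted(downstream(lineage_edges(subgraph), "example-artifact-1")))
```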

What's next