Export feature values


Export feature values for all entities of a single entity type to a BigQuery table or a Cloud Storage bucket. You can choose either a snapshot or a full export. A snapshot returns a single value per feature, whereas a full export can return multiple values per feature. You cannot select particular entity IDs or include multiple entity types when exporting feature values.

Exporting feature values is useful for archiving or for performing ad hoc analysis on your data. For example, you can store regular snapshots of your featurestore to save its state at different points in time. If you need to get feature values for building a training dataset, use batch serving instead.

Snapshot and full export comparison

Both the snapshot and full export options let you query data by specifying a single timestamp (either the start time or the end time) or both timestamps. For snapshots, Vertex AI Feature Store returns the latest feature value within a given time range. In the output, the timestamp associated with each feature value is the snapshot timestamp (not the feature value timestamp).

For full exports, Vertex AI Feature Store returns all feature values within a given time range. In the output, the timestamp associated with each feature value is the feature timestamp (the timestamp that was specified when the feature value was ingested).

The following list summarizes what Vertex AI Feature Store returns based on the option that you choose and the timestamps that you provide. All time bounds are inclusive.

  • Snapshot, start time only: Starting with the current time (when the request was received), returns the latest value, looking back until the start time. The snapshot timestamp is set to the current time.
  • Snapshot, end time only: Starting with the end time, returns the latest value, looking back to the very first value for each feature. The snapshot timestamp is set to the specified end time.
  • Snapshot, start and end time: Returns the latest value within the specified time range. The snapshot timestamp is set to the specified end time.
  • Full export, start time only: Returns all values on and after the start time and up to the current time (when the request was sent).
  • Full export, end time only: Returns all values up to the end time, going all the way back to the very first value for each feature.
  • Full export, start and end time: Returns all values within the specified time range.

Null values

For snapshots, if the latest feature value is null at a given timestamp, Vertex AI Feature Store returns the previous non-null feature value. If there are no previous non-null values, Vertex AI Feature Store returns null.

For full exports, if a feature value is null at a given timestamp, Vertex AI Feature Store returns null for that timestamp.

Examples

As an example, assume you have the following values in a featurestore, where the values for Feature_A and Feature_B share the same timestamp:

Entity ID    Feature value timestamp    Feature_A    Feature_B
123          T1                         A_T1         B_T1
123          T2                         A_T2         NULL
123          T3                         A_T3         NULL
123          T4                         A_T4         B_T4
123          T5                         NULL         B_T5

Snapshot

For snapshots, Vertex AI Feature Store returns the following values based on the given timestamp values:

  • If only the start time is set to T3, the snapshot returns the following values:
    Entity ID    Snapshot timestamp    Feature_A    Feature_B
    123          CURRENT_TIME          A_T4         B_T5
  • If only the end time is set to T3, the snapshot returns the following values:
    Entity ID    Snapshot timestamp    Feature_A    Feature_B
    123          T3                    A_T3         B_T1
  • If the start and end times are set to T2 and T3, the snapshot returns the following values:
    Entity ID    Snapshot timestamp    Feature_A    Feature_B
    123          T3                    A_T3         NULL
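
The snapshot rules can be checked against the example data with a small local model. The following is a minimal sketch, not the service's implementation: it assumes integers 1 to 5 stand in for T1 to T5, that `None` stands in for NULL, and that a hypothetical `CURRENT_TIME` of 6 is later than every stored value. The `snapshot` function name is illustrative only.

```python
from typing import Optional

# Example data for entity 123: (timestamp, Feature_A, Feature_B).
# Integers 1..5 stand in for T1..T5; None stands in for NULL.
ROWS = [
    (1, "A_T1", "B_T1"),
    (2, "A_T2", None),
    (3, "A_T3", None),
    (4, "A_T4", "B_T4"),
    (5, None, "B_T5"),
]
CURRENT_TIME = 6  # hypothetical "now", later than every stored value


def snapshot(rows, start: Optional[int] = None, end: Optional[int] = None):
    """Return (snapshot timestamp, Feature_A, Feature_B) for one entity."""
    lower = float("-inf") if start is None else start
    upper = CURRENT_TIME if end is None else end
    # Keep rows inside the inclusive range, newest first.
    newest_first = sorted(
        (r for r in rows if lower <= r[0] <= upper),
        key=lambda r: r[0],
        reverse=True,
    )

    def latest_non_null(column: int):
        # Skip NULLs and fall back to the previous non-null value.
        for row in newest_first:
            if row[column] is not None:
                return row[column]
        return None

    # The snapshot timestamp is the end time if given, else the current time.
    return (upper, latest_non_null(1), latest_non_null(2))


print(snapshot(ROWS, start=3))         # start time only
print(snapshot(ROWS, end=3))           # end time only
print(snapshot(ROWS, start=2, end=3))  # start and end time
```

Each feature is resolved independently, which is why Feature_A and Feature_B in one snapshot row can come from different feature value timestamps.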

Full export

For full exports, Vertex AI Feature Store returns the following values based on the given timestamp values:

  • If only the start time is set to T3, the full export returns the following values:
    Entity ID    Feature value timestamp    Feature_A    Feature_B
    123          T3                         A_T3         NULL
    123          T4                         A_T4         B_T4
    123          T5                         NULL         B_T5
  • If only the end time is set to T3, the full export returns the following values:
    Entity ID    Feature value timestamp    Feature_A    Feature_B
    123          T1                         A_T1         B_T1
    123          T2                         A_T2         NULL
    123          T3                         A_T3         NULL
  • If the start and end times are set to T2 and T4, the full export returns the following values:
    Entity ID    Feature value timestamp    Feature_A    Feature_B
    123          T2                         A_T2         NULL
    123          T3                         A_T3         NULL
    123          T4                         A_T4         B_T4
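
The full export rules can be modeled the same way. This is a rough sketch under the same assumptions (integers for T1 to T5, `None` for NULL, a hypothetical `CURRENT_TIME` of 6); the `full_export` function is illustrative and not part of any Vertex AI library.

```python
from typing import Optional

# Example data for entity 123: (timestamp, Feature_A, Feature_B).
ROWS = [
    (1, "A_T1", "B_T1"),
    (2, "A_T2", None),
    (3, "A_T3", None),
    (4, "A_T4", "B_T4"),
    (5, None, "B_T5"),
]
CURRENT_TIME = 6  # hypothetical "now", later than every stored value


def full_export(rows, start: Optional[int] = None, end: Optional[int] = None):
    """Return every row whose timestamp is inside the inclusive range."""
    lower = float("-inf") if start is None else start
    upper = CURRENT_TIME if end is None else end
    # NULLs are kept as-is: a full export does not backfill earlier values.
    return [r for r in sorted(rows, key=lambda r: r[0]) if lower <= r[0] <= upper]


for row in full_export(ROWS, start=2, end=4):
    print(row)
```

Unlike the snapshot case, each returned row keeps its own feature value timestamp and its NULLs.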

Export feature values

When you export feature values, you choose which features to query and whether it is a snapshot or a full export. The following sections show a sample for each option.

For both options, the output destination must be in the same region as the source featurestore. For example, if your featurestore is in us-central1, then the destination Cloud Storage bucket or BigQuery table must also be in us-central1.

Snapshot

Export the latest feature values for a given time range.

Web UI

Use another method. You cannot export feature values from the Google Cloud console.

REST & CMD LINE

To export feature values, send a POST request by using the entityTypes.exportFeatureValues method.

The following sample outputs a BigQuery table, but you can also output to a Cloud Storage bucket. Each output destination might have some prerequisites before you can submit a request. For example, if you specify a table name for the bigqueryDestination field, you must have an existing dataset. These requirements are documented in the API reference.
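
For a Cloud Storage destination, the `destination` object takes a different shape. The following sketch builds one in Python and prints it as JSON. The field names `csvDestination`, `gcsDestination`, and `outputUriPrefix` are assumptions based on the v1 REST API and should be checked against the API reference; `BUCKET_NAME` and the path are placeholders, and the helper name is hypothetical.

```python
import json


def csv_destination(output_uri_prefix: str) -> dict:
    """Build a "destination" value that writes CSV files to Cloud Storage.

    Field names are assumptions to verify against the API reference.
    """
    if not output_uri_prefix.startswith("gs://"):
        raise ValueError("expected a Cloud Storage URI starting with gs://")
    return {
        "csvDestination": {
            "gcsDestination": {"outputUriPrefix": output_uri_prefix}
        }
    }


# BUCKET_NAME is a placeholder for a bucket in the featurestore's region.
print(json.dumps({"destination": csv_destination("gs://BUCKET_NAME/exports/")}, indent=2))
```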

Before using any of the request data, make the following replacements:

  • LOCATION: Region where the featurestore is located. For example, us-central1.
  • PROJECT: Your project ID.
  • FEATURESTORE_ID: ID of the featurestore.
  • ENTITY_TYPE_ID: ID of the entity type.
  • START_TIME and END_TIME: (Optional) If you specify the start time only, returns the latest value starting from the current time (when the request is sent) and looking back until the start time. If you specify the end time only, returns the latest value starting from the end time (inclusive) and looking back to the very first value. If you specify a start time and end time, returns the latest value within the specified time range (inclusive). If you specify neither, returns the latest values for each feature, starting from the current time and looking back to the very first value.
  • DATASET_NAME: Name of the destination BigQuery dataset.
  • TABLE_NAME: Name of the destination BigQuery table.
  • FEATURE_ID: ID of one or more features. Specify a single * (asterisk) to select all features.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID:exportFeatureValues

Request JSON body:

{
  "snapshotExport": {
    "start_time": "START_TIME",
    "snapshot_time": "END_TIME"
  },
  "destination" : {
    "bigqueryDestination": {
      "outputUri": "bq://PROJECT.DATASET_NAME.TABLE_NAME"
    }
  },
  "featureSelector": {
    "idMatcher": {
      "ids": ["FEATURE_ID", ...]
    }
  }
}
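
Because START_TIME and END_TIME are optional, it can help to generate the request body rather than edit it by hand. The sketch below assumes that leaving a field out of the JSON is how you leave it unspecified; `build_snapshot_body` is a hypothetical helper, and the field names are copied from the request body above.

```python
import json
from typing import Optional, Sequence


def build_snapshot_body(
    project: str,
    dataset: str,
    table: str,
    feature_ids: Sequence[str],
    start_time: Optional[str] = None,
    end_time: Optional[str] = None,
) -> dict:
    """Build a snapshot export body, omitting timestamps that are not set."""
    snapshot = {}
    if start_time is not None:
        snapshot["start_time"] = start_time
    if end_time is not None:
        # For snapshots, the end time is sent as the snapshot time.
        snapshot["snapshot_time"] = end_time
    return {
        "snapshotExport": snapshot,
        "destination": {
            "bigqueryDestination": {
                "outputUri": f"bq://{project}.{dataset}.{table}"
            }
        },
        "featureSelector": {"idMatcher": {"ids": list(feature_ids)}},
    }


# Write the body to request.json for use with the commands in this section.
body = build_snapshot_body("PROJECT", "DATASET_NAME", "TABLE_NAME", ["*"],
                           end_time="2021-12-03T00:00:00Z")
print(json.dumps(body, indent=2))
```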

To send your request, choose one of these options:

curl

Save the request body in a file called request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID:exportFeatureValues"

PowerShell

Save the request body in a file called request.json, and execute the following command:

$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID:exportFeatureValues" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.ExportFeatureValuesOperationMetadata",
    "genericMetadata": {
      "createTime": "2021-12-03T22:55:25.974976Z",
      "updateTime": "2021-12-03T22:55:25.974976Z"
    }
  }
}
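
The response describes a long-running operation, so the export is not necessarily finished when the request returns. The following sketch polls the operation until it reports completion. It assumes the standard long-running operation fields `done` and `error`, and that the operation can be read with a GET request on its `name`. The `fetch` callable is a hypothetical stand-in for whatever authenticated HTTP client you use.

```python
import time
from typing import Callable


def operation_url(location: str, operation_name: str) -> str:
    """Build the GET URL for an operation from the "name" in the response."""
    return f"https://{location}-aiplatform.googleapis.com/v1/{operation_name}"


def wait_for_operation(
    fetch: Callable[[str], dict],
    url: str,
    interval_s: float = 10.0,
    max_polls: int = 60,
) -> dict:
    """Poll until the operation is done; raise if it failed or timed out.

    fetch(url) must return the operation as a parsed JSON dict.
    """
    for _ in range(max_polls):
        operation = fetch(url)
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError(f"export failed: {operation['error']}")
            return operation
        time.sleep(interval_s)
    raise TimeoutError(f"operation not done after {max_polls} polls")
```

Passing the HTTP call in as `fetch` keeps the polling logic independent of any particular client library or authentication method.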

Additional languages

You can also install and use the Vertex AI client libraries to call the Vertex AI API. Cloud Client Libraries provide an optimized developer experience by using each supported language's natural conventions and styles.

Full export

Export all feature values within a given time range.

Web UI

Use another method. You cannot export feature values from the Google Cloud console.

REST & CMD LINE

To export feature values, send a POST request by using the entityTypes.exportFeatureValues method.

The following sample outputs a BigQuery table, but you can also output to a Cloud Storage bucket. Each output destination might have some prerequisites before you can submit a request. For example, if you specify a table name for the bigqueryDestination field, you must have an existing dataset. These requirements are documented in the API reference.

Before using any of the request data, make the following replacements:

  • LOCATION: Region where the featurestore is located. For example, us-central1.
  • PROJECT: Your project ID.
  • FEATURESTORE_ID: ID of the featurestore.
  • ENTITY_TYPE_ID: ID of the entity type.
  • START_TIME and END_TIME: (Optional) If you specify the start time only, returns all values between the current time (when the request is sent) and the start time (inclusive). If you specify the end time only, returns all values between the end time (inclusive) and the very first value timestamp (for each feature). If you specify a start time and end time, returns all values within the specified time range (inclusive). If you specify neither, returns all values between the current time and the very first value timestamp (for each feature).
  • DATASET_NAME: Name of the destination BigQuery dataset.
  • TABLE_NAME: Name of the destination BigQuery table.
  • FEATURE_ID: ID of one or more features. Specify a single * (asterisk) to select all features.
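
START_TIME and END_TIME are passed as strings. The sketch below formats a Python `datetime` for them, assuming they take the same RFC 3339 UTC format as the `createTime` values in the sample responses on this page. Truncating to milliseconds is a cautious choice in case the API rejects finer precision; confirm the accepted precision in the API reference. The helper name is hypothetical.

```python
from datetime import datetime, timezone


def to_rfc3339(dt: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 3339 UTC string."""
    if dt.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid ambiguity")
    # Convert to UTC, truncate to milliseconds, and use the "Z" suffix.
    text = dt.astimezone(timezone.utc).isoformat(timespec="milliseconds")
    return text.replace("+00:00", "Z")


print(to_rfc3339(datetime(2021, 12, 3, 22, 55, 25, 974976, tzinfo=timezone.utc)))
```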

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID:exportFeatureValues

Request JSON body:

{
  "fullExport": {
    "start_time": "START_TIME",
    "end_time": "END_TIME"
  },
  "destination" : {
    "bigqueryDestination": {
      "outputUri": "bq://PROJECT.DATASET_NAME.TABLE_NAME"
    }
  },
  "featureSelector": {
    "idMatcher": {
      "ids": ["FEATURE_ID", ...]
    }
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file called request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID:exportFeatureValues"

PowerShell

Save the request body in a file called request.json, and execute the following command:

$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID:exportFeatureValues" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.ExportFeatureValuesOperationMetadata",
    "genericMetadata": {
      "createTime": "2021-12-03T22:55:25.974976Z",
      "updateTime": "2021-12-03T22:55:25.974976Z"
    }
  }
}
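
The curl and PowerShell commands above can also be expressed with Python's standard library. The following sketch only builds the request object. It assumes you already have an access token (for example, the output of `gcloud auth application-default print-access-token`), and `build_export_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_export_request(
    location: str,
    project: str,
    featurestore_id: str,
    entity_type_id: str,
    body: dict,
    access_token: str,
) -> urllib.request.Request:
    """Build the POST request for entityTypes.exportFeatureValues."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project}"
        f"/locations/{location}/featurestores/{featurestore_id}"
        f"/entityTypes/{entity_type_id}:exportFeatureValues"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and parse the response body with `json.load`.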

Additional languages

You can also install and use the Vertex AI client libraries to call the Vertex AI API. Cloud Client Libraries provide an optimized developer experience by using each supported language's natural conventions and styles.

What's next