Export feature values for all entities of a single entity type to a BigQuery table or a Cloud Storage bucket. You can choose to get a snapshot or to fully export feature values. A snapshot returns a single value per feature, whereas a full export can return multiple values per feature. You can't select particular entity IDs or include multiple entity types when exporting feature values.
Exporting feature values is useful for archiving or for performing ad hoc analysis on your data. For example, you can store regular snapshots of your featurestore to save its state at different points in time. If you need to get feature values for building a training dataset, use batch serving instead.
Snapshot and full export comparison
Both the snapshot and full export options let you query data by specifying a single timestamp (either the start time or end time) or both timestamps. For snapshots, Vertex AI Feature Store (Legacy) returns the latest feature value within a given time range. In the output, the timestamp associated with each feature value is the snapshot timestamp (not the feature value timestamp).
For full exports, Vertex AI Feature Store (Legacy) returns all feature values within a given time range. In the output, the timestamp associated with each feature value is the feature timestamp (the timestamp specified when the feature value was ingested).
The following table summarizes what Vertex AI Feature Store (Legacy) returns based on the option that you choose and the timestamps that you provide.
Option | Start time only (inclusive) | End time only (inclusive) | Start and end time (inclusive) |
---|---|---|---|
Snapshot | Starting with the current time (when the request was received), returns the latest value, looking back until the start time. The snapshot timestamp is set to the current time. | Starting with the end time, returns the latest value, looking back to the very first value for each feature. The snapshot timestamp is set to the specified end time. | Returns the latest value within the specified time range. The snapshot timestamp is set to the specified end time. |
Full export | Returns all values on and after the start time and up to the current time (when the request was sent). | Returns all values up to the end time, going all the way back to the very first value for each feature. | Returns all values within the specified time range. |
Null values
For snapshots, if the latest feature value is null at a given timestamp, Vertex AI Feature Store (Legacy) returns the previous non-null feature value. If there are no previous non-null values, Vertex AI Feature Store (Legacy) returns null.
For full exports, if a feature value is null at a given timestamp, Vertex AI Feature Store (Legacy) returns null for that timestamp.
Examples
As an example, assume you had the following values in a featurestore, where the values for Feature_A and Feature_B share the same timestamp:
Entity ID | Feature value timestamp | Feature_A | Feature_B |
---|---|---|---|
123 | T1 | A_T1 | B_T1 |
123 | T2 | A_T2 | NULL |
123 | T3 | A_T3 | NULL |
123 | T4 | A_T4 | B_T4 |
123 | T5 | NULL | B_T5 |
Snapshot
For snapshots, Vertex AI Feature Store (Legacy) returns the following values based on the given timestamp values:
- If only the start time is set to T3, the snapshot returns the following values:
Entity ID | Snapshot timestamp | Feature_A | Feature_B |
---|---|---|---|
123 | CURRENT_TIME | A_T4 | B_T5 |
- If only the end time is set to T3, the snapshot returns the following values:
Entity ID | Snapshot timestamp | Feature_A | Feature_B |
---|---|---|---|
123 | T3 | A_T3 | B_T1 |
- If the start and end times are set to T2 and T3, the snapshot returns the following values:
Entity ID | Snapshot timestamp | Feature_A | Feature_B |
---|---|---|---|
123 | T3 | A_T3 | NULL |
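The snapshot rules above, including the null handling, can be reproduced with a small in-memory model. This is only an illustrative sketch, not how Vertex AI Feature Store (Legacy) is implemented: the integers 1 to 5 stand in for T1 to T5, `CURRENT_TIME` is a hypothetical "now" later than T5, and the `snapshot` function is a name made up for this example.

```python
from typing import Optional

# Example rows from the table above: (timestamp, Feature_A, Feature_B).
# Integers 1..5 stand in for T1..T5; None stands in for NULL.
ROWS = [
    (1, "A_T1", "B_T1"),
    (2, "A_T2", None),
    (3, "A_T3", None),
    (4, "A_T4", "B_T4"),
    (5, None, "B_T5"),
]
CURRENT_TIME = 6  # hypothetical "now", later than T5


def snapshot(rows, start: Optional[int] = None, end: Optional[int] = None):
    """Return (snapshot_timestamp, Feature_A, Feature_B) for one entity."""
    # Without an end time, the snapshot looks back from the current time.
    upper = CURRENT_TIME if end is None else end
    # Without a start time, the snapshot looks back to the very first value.
    lower = float("-inf") if start is None else start
    in_range = [r for r in rows if lower <= r[0] <= upper]
    newest_first = sorted(in_range, key=lambda r: r[0], reverse=True)
    latest = []
    for col in (1, 2):
        # Keep the most recent non-null value; fall back to None (NULL).
        value = next((r[col] for r in newest_first if r[col] is not None), None)
        latest.append(value)
    return (upper, *latest)


print(snapshot(ROWS, start=3))
print(snapshot(ROWS, end=3))
print(snapshot(ROWS, start=2, end=3))
```

The three calls correspond to the three cases listed above. In the last case, Feature_B has no non-null value between T2 and T3, so the model returns NULL for it.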
Full export
For full exports, Vertex AI Feature Store (Legacy) returns the following values based on the given timestamp values:
- If only the start time is set to T3, the full export returns the following values:
Entity ID | Feature value timestamp | Feature_A | Feature_B |
---|---|---|---|
123 | T3 | A_T3 | NULL |
123 | T4 | A_T4 | B_T4 |
123 | T5 | NULL | B_T5 |
- If only the end time is set to T3, the full export returns the following values:
Entity ID | Feature value timestamp | Feature_A | Feature_B |
---|---|---|---|
123 | T1 | A_T1 | B_T1 |
123 | T2 | A_T2 | NULL |
123 | T3 | A_T3 | NULL |
- If the start and end times are set to T2 and T4, the full export returns the following values:
Entity ID | Feature value timestamp | Feature_A | Feature_B |
---|---|---|---|
123 | T2 | A_T2 | NULL |
123 | T3 | A_T3 | NULL |
123 | T4 | A_T4 | B_T4 |
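The full export rules can be modeled in the same spirit: every row whose timestamp falls inside the inclusive range is returned unchanged, nulls included. This is an illustrative sketch only. The integers 1 to 5 stand in for T1 to T5, and `CURRENT_TIME` and `full_export` are hypothetical names introduced for this example.

```python
from typing import Optional

# Example rows from the Examples table: (timestamp, Feature_A, Feature_B).
# Integers 1..5 stand in for T1..T5; None stands in for NULL.
ROWS = [
    (1, "A_T1", "B_T1"),
    (2, "A_T2", None),
    (3, "A_T3", None),
    (4, "A_T4", "B_T4"),
    (5, None, "B_T5"),
]
CURRENT_TIME = 6  # hypothetical "now", later than T5


def full_export(rows, start: Optional[int] = None, end: Optional[int] = None):
    """Return every row whose timestamp is inside the inclusive range."""
    # Without an end time, the range extends to the current time.
    upper = CURRENT_TIME if end is None else end
    # Without a start time, the range extends back to the very first value.
    lower = float("-inf") if start is None else start
    ordered = sorted(rows, key=lambda r: r[0])
    # Null values are kept as-is; nothing is back-filled.
    return [r for r in ordered if lower <= r[0] <= upper]


for row in full_export(ROWS, start=2, end=4):
    print(row)
```

Unlike a snapshot, the timestamp on each returned row is the feature value timestamp, and a NULL stays NULL for that timestamp.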
Export feature values
When you export feature values, you choose which features to query and whether it is a snapshot or a full export. The following sections show a sample for each option.
For both options, the output destination must be in the same region as the source featurestore. For example, if your featurestore is in us-central1, then the destination Cloud Storage bucket or BigQuery table must also be in us-central1.
Snapshot
Export the latest feature values for a given time range.
Web UI
Use another method. You cannot export feature values from the Google Cloud console.
REST
To export feature values, send a POST request by using the entityTypes.exportFeatureValues method.
The following sample outputs a BigQuery table, but you can also output to a Cloud Storage bucket. Each output destination might have some prerequisites before you can submit a request. For example, if you specify a table name for the bigqueryDestination field, you must have an existing dataset. These requirements are documented in the API reference.
Before using any of the request data, make the following replacements:
- LOCATION_ID: Region where the featurestore is located. For example, us-central1.
- PROJECT_ID: Your project ID.
- FEATURESTORE_ID: ID of the featurestore.
- ENTITY_TYPE_ID: ID of the entity type.
- START_TIME and END_TIME: (Optional) If you specify the start time only, returns the latest value starting from the current time (when the request is sent) and looking back until the start time. If you specify the end time only, returns the latest value starting from the end time (inclusive) and looking back to the very first value. If you specify a start time and end time, returns the latest value within the specified time range (inclusive). If you specify neither, returns the latest values for each feature, starting from the current time and looking back to the very first value.
- DATASET_NAME: Name of the destination BigQuery dataset.
- TABLE_NAME: Name of the destination BigQuery table.
- FEATURE_ID: ID of one or more features. Specify a single * (asterisk) to select all features.
HTTP method and URL:
POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID:exportFeatureValues
Request JSON body:
{
  "snapshotExport": {
    "start_time": "START_TIME",
    "snapshot_time": "END_TIME"
  },
  "destination": {
    "bigqueryDestination": {
      "outputUri": "bq://PROJECT_ID.DATASET_NAME.TABLE_NAME"
    }
  },
  "featureSelector": {
    "idMatcher": {
      "ids": ["FEATURE_ID", ...]
    }
  }
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID:exportFeatureValues"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID:exportFeatureValues" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.ExportFeatureValuesOperationMetadata",
    "genericMetadata": {
      "createTime": "2021-12-03T22:55:25.974976Z",
      "updateTime": "2021-12-03T22:55:25.974976Z"
    }
  }
}
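As an alternative to curl or PowerShell, the same snapshot request could be assembled with Python's standard library. The following is a minimal sketch under stated assumptions: the function name `build_snapshot_export_request` and all argument values are hypothetical, the access token is assumed to come from `gcloud auth print-access-token`, and the timestamp is assumed to be an RFC 3339 string like the ones in the sample response. The sketch builds the request but does not send it.

```python
import json
import urllib.request
from typing import List, Optional


def build_snapshot_export_request(
    access_token: str,
    location_id: str,
    project_id: str,
    featurestore_id: str,
    entity_type_id: str,
    output_uri: str,
    feature_ids: List[str],
    start_time: Optional[str] = None,
    end_time: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) the POST request for a snapshot export."""
    url = (
        f"https://{location_id}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location_id}/"
        f"featurestores/{featurestore_id}/entityTypes/{entity_type_id}"
        ":exportFeatureValues"
    )
    # Both timestamps are optional, so include only the ones that were given.
    snapshot_export = {}
    if start_time is not None:
        snapshot_export["start_time"] = start_time
    if end_time is not None:
        snapshot_export["snapshot_time"] = end_time
    body = {
        "snapshotExport": snapshot_export,
        "destination": {"bigqueryDestination": {"outputUri": output_uri}},
        "featureSelector": {"idMatcher": {"ids": feature_ids}},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


# Hypothetical values for illustration only.
request = build_snapshot_export_request(
    access_token="ACCESS_TOKEN",
    location_id="us-central1",
    project_id="my-project",
    featurestore_id="my_featurestore",
    entity_type_id="users",
    output_uri="bq://my-project.my_dataset.my_table",
    feature_ids=["*"],
    end_time="2021-12-03T00:00:00Z",
)
print(request.full_url)
print(request.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send the request. Omitting both timestamps leaves `snapshotExport` empty, which corresponds to the "specify neither" case described in the replacements list.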
Java
Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Additional languages
To learn how to install and use the Vertex AI SDK for Python, see Use the Vertex AI SDK for Python. For more information, see the Vertex AI SDK for Python API reference documentation.
Full export
Export all feature values within a given time range.
Web UI
Use another method. You cannot export feature values from the Google Cloud console.
REST
To export feature values, send a POST request by using the entityTypes.exportFeatureValues method.
The following sample outputs a BigQuery table, but you can also output to a Cloud Storage bucket. Each output destination might have some prerequisites before you can submit a request. For example, if you specify a table name for the bigqueryDestination field, you must have an existing dataset. These requirements are documented in the API reference.
Before using any of the request data, make the following replacements:
- LOCATION_ID: Region where the featurestore is located. For example, us-central1.
- PROJECT_ID: Your project ID.
- FEATURESTORE_ID: ID of the featurestore.
- ENTITY_TYPE_ID: ID of the entity type.
- START_TIME and END_TIME: (Optional) If you specify the start time only, returns all values between the current time (when the request is sent) and the start time (inclusive). If you specify the end time only, returns all values between the end time (inclusive) and the very first value timestamp (for each feature). If you specify a start time and end time, returns all values within the specified time range (inclusive). If you specify neither, returns all values between the current time and the very first value timestamp (for each feature).
- DATASET_NAME: Name of the destination BigQuery dataset.
- TABLE_NAME: Name of the destination BigQuery table.
- FEATURE_ID: ID of one or more features. Specify a single * (asterisk) to select all features.
HTTP method and URL:
POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID:exportFeatureValues
Request JSON body:
{
  "fullExport": {
    "start_time": "START_TIME",
    "end_time": "END_TIME"
  },
  "destination": {
    "bigqueryDestination": {
      "outputUri": "bq://PROJECT_ID.DATASET_NAME.TABLE_NAME"
    }
  },
  "featureSelector": {
    "idMatcher": {
      "ids": ["FEATURE_ID", ...]
    }
  }
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID:exportFeatureValues"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID:exportFeatureValues" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.ExportFeatureValuesOperationMetadata",
    "genericMetadata": {
      "createTime": "2021-12-03T22:55:25.974976Z",
      "updateTime": "2021-12-03T22:55:25.974976Z"
    }
  }
}
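The response names an operation rather than containing the exported values, so a script that submits exports will usually want to record that name. The following sketch shows one way to pull the operation ID and creation time out of a response shaped like the sample above. The `parse_operation` helper is a hypothetical name for this example, and the response text reuses the placeholders from the sample rather than real IDs.

```python
import json

# Response text shaped like the sample above; the IDs are placeholders.
response_text = """
{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.ExportFeatureValuesOperationMetadata",
    "genericMetadata": {
      "createTime": "2021-12-03T22:55:25.974976Z",
      "updateTime": "2021-12-03T22:55:25.974976Z"
    }
  }
}
"""


def parse_operation(text: str) -> dict:
    """Extract the operation name, its ID, and its creation time."""
    response = json.loads(text)
    name = response["name"]
    # Everything before "/operations/" is the entity type resource path.
    entity_type, _, operation_id = name.rpartition("/operations/")
    return {
        "operation_name": name,
        "entity_type": entity_type,
        "operation_id": operation_id,
        "create_time": response["metadata"]["genericMetadata"]["createTime"],
    }


info = parse_operation(response_text)
print(info["operation_id"])
print(info["create_time"])
```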
Java
Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Additional languages
To learn how to install and use the Vertex AI SDK for Python, see Use the Vertex AI SDK for Python. For more information, see the Vertex AI SDK for Python API reference documentation.
What's next
- Learn how to batch ingest feature values.
- Learn how to serve features through online serving.
- View the Vertex AI Feature Store (Legacy) concurrent batch job quota.
- Troubleshoot common Vertex AI Feature Store (Legacy) issues.