Create a feature view instance

You can create a feature view within an existing online store instance. While creating a feature view, you can associate features with it in the following ways:

  • Add feature groups and features from Feature Registry: Associate the feature view with existing feature groups and features from Feature Registry. A feature group specifies the location of the BigQuery data source. A feature within the feature group points to a specific feature column within that data source. You can associate a feature view with multiple feature groups.

  • Add features from a BigQuery source: Directly associate a BigQuery data source, such as a BigQuery table or view, and specify the entity ID column.

After you create a feature view, Vertex AI Feature Store syncs the latest feature values from the BigQuery data source. If you set the query parameter run_sync_immediately=true, then Vertex AI Feature Store syncs the feature values when you create the feature view. Otherwise, Vertex AI Feature Store syncs the feature values according to the sync schedule specified for the feature view.
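As a minimal sketch of how the run_sync_immediately query parameter fits into the create request, the following Python helper assembles the request URL used by the REST samples in this guide. The helper name and the example project, store, and view names are hypothetical; only the URL layout and the two query parameters come from this page.

```python
from urllib.parse import urlencode


def feature_view_create_url(location_id, project_id, online_store, feature_view_id,
                            run_sync_immediately=False, api_version="v1"):
    """Build the featureViews.create URL (hypothetical helper, not a client library call)."""
    base = (
        f"https://{location_id}-aiplatform.googleapis.com/{api_version}"
        f"/projects/{project_id}/locations/{location_id}"
        f"/featureOnlineStores/{online_store}/featureViews"
    )
    params = {"feature_view_id": feature_view_id}
    if run_sync_immediately:
        # Ask for a sync at creation time instead of waiting for the schedule.
        params["run_sync_immediately"] = "true"
    return f"{base}?{urlencode(params)}"


# Example values are placeholders for illustration only.
print(feature_view_create_url("us-central1", "my-project", "my_store", "my_view",
                              run_sync_immediately=True))
```

Without the flag, the URL carries only feature_view_id and the first sync follows the schedule configured for the feature view.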

Sync feature data in a feature view

Vertex AI Feature Store periodically refreshes, or syncs, the feature values stored in the online store from BigQuery. When you create a feature view, you can specify the schedule or frequency for the data sync using the FeatureView.sync_config parameter.

You also have the option to manually trigger a data sync for a feature view. For more information about how to manually sync data for a feature view, see Sync feature data to online store.

Note that only one data sync operation can be active at a time for a feature view. If a data sync is in progress for a feature view, all scheduled data syncs for that feature view are skipped until the sync is completed.
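To make the skipping rule concrete, here is a toy Python model of it. This is only an illustration under simplifying assumptions (every sync takes the same fixed time, and times are plain minute offsets); it is not how the service itself is implemented.

```python
def simulate_syncs(scheduled_starts, sync_duration):
    """Split scheduled start times into those that run and those that are skipped.

    A scheduled sync is skipped when an earlier sync for the same
    feature view is still in progress at its start time.
    """
    run, skipped = [], []
    busy_until = None
    for start in sorted(scheduled_starts):
        if busy_until is not None and start < busy_until:
            skipped.append(start)
        else:
            run.append(start)
            busy_until = start + sync_duration
    return run, skipped


# Hourly schedule (minutes 0, 60, 120, 180) with syncs that each take 90 minutes.
run, skipped = simulate_syncs([0, 60, 120, 180], sync_duration=90)
print(run)      # [0, 120]
print(skipped)  # [60, 180]
```

In this model, a schedule that fires more often than a sync can finish loses every other run, which is one reason to match the schedule to the actual sync duration.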

Optimize costs during sync

A data sync operation might involve costs for BigQuery resource usage. Follow these guidelines to optimize these costs and improve performance during a data sync:

  • Don't configure the sync schedule to run more frequently than the frequency at which the data is expected to change in the BigQuery source.

  • Optimize the size of the feature data source in BigQuery. While creating the feature view, only include the data that you need for online serving.

  • Avoid running complex aggregations in BigQuery. Run a SELECT * query on the table or view to estimate the volume and duration of data processing.

  • While setting the scaling options for the online store, set max_node_count to a value that's high enough to cover high loads during a data sync.

  • Schedule the sync for different feature views at different times within the same online store.

  • If your BigQuery table contains extensive historical data, consider partitioning the table using timestamps and specify a time range for retrieving the feature data. This minimizes the retrieval of obsolete feature data during sync.

  • Bigtable utilization increases during data syncs. For feature views created within online stores for Bigtable online serving, schedule sync jobs during off-peak times for best performance.
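One way to follow the guideline about scheduling different feature views at different times is to spread hourly schedules across the hour. The sketch below is a hypothetical helper that produces one sync_config value per feature view; the feature view names are placeholders, and an hourly cadence is assumed purely for illustration.

```python
def staggered_hourly_crons(feature_view_ids):
    """Give each feature view an hourly cron expression at a different minute."""
    count = len(feature_view_ids)
    configs = {}
    for index, feature_view_id in enumerate(feature_view_ids):
        # Spread the start minutes evenly over the hour: 0, 20, 40 for three views.
        minute = (index * 60 // count) % 60
        configs[feature_view_id] = {"cron": f"{minute} * * * *"}
    return configs


for name, sync_config in staggered_hourly_crons(["view_a", "view_b", "view_c"]).items():
    print(name, sync_config)
```

With more than 60 feature views some minutes repeat, so a longer cadence or a different spreading rule would be needed in that case.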

Configure the service account for a feature view

Each feature view uses a service account to access the source data in BigQuery during sync. Vertex AI Feature Store assigns the BigQuery Data Viewer Identity and Access Management (IAM) role to this service account.

By default, a feature view uses the service account configured for your project. With this configuration, any user with permission to create a feature view in your project can access the feature data in BigQuery.

Alternatively, you can configure the feature view to use its own service account. Vertex AI Feature Store then sets up a dedicated service account for the feature view. With this configuration, you can restrict access to feature data in BigQuery or grant access to additional users. You can specify the service account configuration by using the FeatureView.service_agent_type parameter.
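As a small sketch, the following Python helper adds the service_agent_type field to a feature view request body and rejects values other than the two service agent types named in this guide. The helper itself is hypothetical and the cron value is only an example.

```python
SERVICE_AGENT_TYPES = {
    "SERVICE_AGENT_TYPE_PROJECT",       # project-level service account (default)
    "SERVICE_AGENT_TYPE_FEATURE_VIEW",  # dedicated service account for the feature view
}


def with_service_agent_type(body, service_agent_type=None):
    """Return a copy of the request body with service_agent_type set, if given."""
    new_body = dict(body)
    if service_agent_type is None:
        # Leaving the field out keeps the default project-level configuration.
        return new_body
    if service_agent_type not in SERVICE_AGENT_TYPES:
        raise ValueError(f"Unsupported service agent type: {service_agent_type}")
    new_body["service_agent_type"] = service_agent_type
    return new_body


body = with_service_agent_type({"sync_config": {"cron": "0 * * * *"}},
                               "SERVICE_AGENT_TYPE_FEATURE_VIEW")
print(body)
```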

Configure vector retrieval for a feature view

You can configure vector retrieval for a feature view created from a BigQuery source if the associated data source contains the embedding column and the online store is configured to support embedding management. You can specify the vector retrieval configuration by using the FeatureView.vector_search_config parameter.

Note that you can configure vector retrieval and manage embeddings only if the feature view is created by specifying a BigQuery source URI and not from feature groups and features from Feature Registry.

For information about how to set up the BigQuery data source to support embeddings by including the embedding column, see Data source preparation guidelines.
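The following Python sketch assembles a vector_search_config value from the fields that appear in the REST samples in this guide. The helper and the example column names are hypothetical, and the empty tree_ah_config simply mirrors the sample request body rather than recommending specific settings.

```python
import json


def vector_search_config(embedding_column, filter_columns=None,
                         crowding_column=None, embedding_dimension=None):
    """Build a vector_search_config dict, leaving out optional fields that aren't set."""
    config = {"embedding_column": embedding_column}
    if filter_columns:
        config["filter_columns"] = list(filter_columns)
    if crowding_column is not None:
        config["crowding_column"] = crowding_column
    if embedding_dimension is not None:
        # The dimension is a JSON number, not a string.
        config["embedding_dimension"] = int(embedding_dimension)
    config["tree_ah_config"] = {}
    return config


# Example column names are placeholders.
config = vector_search_config("embedding", filter_columns=["country", "category"],
                              embedding_dimension=768)
print(json.dumps(config, indent=2))
```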

Create a feature view from feature groups

You can create a feature view based on feature data registered using feature groups and features. To associate multiple BigQuery data sources with the same feature view, you can specify multiple feature groups.

If you create a feature view by specifying feature groups and features:

  • Your data source must have a feature_timestamp column and can contain historical data.

  • Vertex AI Feature Store serves only the latest feature values based on the feature timestamp.

  • You can't configure embedding management for the feature view.
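To illustrate what "only the latest feature values" means for a source that holds historical data, the sketch below picks, for each entity ID, the row with the most recent feature_timestamp. It runs on in-memory rows and only illustrates the selection rule; the entity_id and clicks column names are made up, and the service's actual processing is not shown here.

```python
def latest_feature_values(rows, entity_id_column="entity_id"):
    """Keep, for each entity ID, the row with the greatest feature_timestamp."""
    latest = {}
    for row in rows:
        key = row[entity_id_column]
        # ISO 8601 timestamps in the same format and time zone compare correctly as strings.
        if key not in latest or row["feature_timestamp"] > latest[key]["feature_timestamp"]:
            latest[key] = row
    return list(latest.values())


rows = [
    {"entity_id": "u1", "feature_timestamp": "2023-09-01T00:00:00Z", "clicks": 3},
    {"entity_id": "u1", "feature_timestamp": "2023-09-15T00:00:00Z", "clicks": 7},
    {"entity_id": "u2", "feature_timestamp": "2023-09-10T00:00:00Z", "clicks": 1},
]
for row in latest_feature_values(rows):
    print(row["entity_id"], row["clicks"])
```

Here the older u1 row with clicks 3 is ignored and only the row from 2023-09-15 is kept.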

Create a feature view with the default service account configuration

Use the following sample to create a feature view by associating multiple feature groups without specifying a service account configuration.

REST

To create a FeatureView resource, send a POST request by using the featureViews.create method.

Before using any of the request data, make the following replacements:

  • LOCATION_ID: Region where you want to create the feature view, such as us-central1.
  • PROJECT_ID: Your project ID.
  • FEATUREONLINESTORE_NAME: The name of the online store instance where you want to create the feature view.
  • FEATUREVIEW_NAME: The name of the new feature view instance that you want to create.
  • FEATUREGROUP_NAME_A and FEATUREGROUP_NAME_B: The names of the feature groups from which you want to add features to the feature view.
  • FEATURE_ID_A1 and FEATURE_ID_A2: Feature IDs from the feature group FEATUREGROUP_NAME_A that you want to add to the feature view.
  • FEATURE_ID_B1 and FEATURE_ID_B2: Feature IDs from the feature group FEATUREGROUP_NAME_B that you want to add to the feature view.
  • CRON: Cron schedule expression representing the frequency for syncing data to the feature view. For more information, see cron.

HTTP method and URL:

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featureOnlineStores/FEATUREONLINESTORE_NAME/featureViews?feature_view_id=FEATUREVIEW_NAME

Request JSON body:

{
  "feature_registry_source": {
    "feature_groups": [
      {
        "feature_group_id": "FEATUREGROUP_NAME_A",
        "feature_ids": [ "FEATURE_ID_A1", "FEATURE_ID_A2" ]
      },
      {
        "feature_group_id": "FEATUREGROUP_NAME_B",
        "feature_ids": [ "FEATURE_ID_B1", "FEATURE_ID_B2" ]
      }
    ]
  },
  "sync_config": {
    "cron": "CRON"
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featureOnlineStores/FEATUREONLINESTORE_NAME/featureViews?feature_view_id=FEATUREVIEW_NAME"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featureOnlineStores/FEATUREONLINESTORE_NAME/featureViews?feature_view_id=FEATUREVIEW_NAME" | Select-Object -Expand Content
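As a rough Python counterpart to the curl and PowerShell commands above, the following sketch builds the same POST request with the standard library. The function name and example values are hypothetical, the access token is assumed to come from gcloud auth print-access-token, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_create_request(location_id, project_id, online_store, feature_view_id,
                         body, access_token):
    """Construct (but don't send) the featureViews.create POST request."""
    url = (
        f"https://{location_id}-aiplatform.googleapis.com/v1"
        f"/projects/{project_id}/locations/{location_id}"
        f"/featureOnlineStores/{online_store}/featureViews"
        f"?feature_view_id={feature_view_id}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


# Placeholder values; replace them as described in the list of replacements.
example_body = {
    "feature_registry_source": {
        "feature_groups": [
            {"feature_group_id": "group_a", "feature_ids": ["f1", "f2"]},
        ]
    },
    "sync_config": {"cron": "0 * * * *"},
}
request = build_create_request("us-central1", "my-project", "my_store", "my_view",
                               example_body, access_token="ACCESS_TOKEN")
print(request.get_method(), request.full_url)
# To send the request: urllib.request.urlopen(request)
```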

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featureOnlineStores/FEATUREONLINESTORE_NAME/featureViews/FEATUREVIEW_NAME/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.CreateFeatureViewOperationMetadata",
    "genericMetadata": {
      "createTime": "2023-09-15T02:11:29.458820Z",
      "updateTime": "2023-09-15T02:11:29.458820Z"
    }
  }
}
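The name field in this response combines the feature view resource path with an operation ID. As a small sketch, the hypothetical helper below splits the two apart, for example so that the operation ID can be logged or looked up later; the sample values are placeholders.

```python
def parse_operation_name(name):
    """Split an operation name into the feature view path and the operation ID."""
    feature_view, separator, operation_id = name.rpartition("/operations/")
    if not separator or not operation_id:
        raise ValueError(f"Not an operation name: {name}")
    return {"feature_view": feature_view, "operation_id": operation_id}


sample_name = (
    "projects/123456/locations/us-central1/featureOnlineStores/my_store"
    "/featureViews/my_view/operations/987654321"
)
parsed = parse_operation_name(sample_name)
print(parsed["feature_view"])
print(parsed["operation_id"])
```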

Create a feature view by specifying a service account configuration

Use the following sample to create a feature view from feature groups by specifying a service account configuration.

REST

To create a FeatureView resource, send a POST request by using the featureViews.create method.

Before using any of the request data, make the following replacements:

  • LOCATION_ID: Region where you want to create the feature view, such as us-central1.
  • PROJECT_ID: Your project ID.
  • FEATUREONLINESTORE_NAME: The name of the online store instance where you want to create the feature view.
  • FEATUREVIEW_NAME: The name of the new feature view instance that you want to create.
  • FEATUREGROUP_NAME_A and FEATUREGROUP_NAME_B: The names of the feature groups from which you want to add features to the feature view.
  • FEATURE_ID_A1 and FEATURE_ID_A2: Feature IDs from the feature group FEATUREGROUP_NAME_A that you want to add to the feature view.
  • FEATURE_ID_B1 and FEATURE_ID_B2: Feature IDs from the feature group FEATUREGROUP_NAME_B that you want to add to the feature view.
  • CRON: Cron schedule expression representing the frequency for syncing data to the feature view. For more information, see cron.
  • SERVICE_AGENT_TYPE: Optional: Service account configuration for the feature view. Supported service agent types include the following:
    • SERVICE_AGENT_TYPE_PROJECT: Use the project-level service account for the feature view. This is the default configuration.
    • SERVICE_AGENT_TYPE_FEATURE_VIEW: Set up and use a dedicated service account for the feature view.

HTTP method and URL:

POST https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureOnlineStores/FEATUREONLINESTORE_NAME/featureViews?feature_view_id=FEATUREVIEW_NAME

Request JSON body:

{
  "feature_registry_source": {
    "feature_groups": [
      {
        "feature_group_id": "FEATUREGROUP_NAME_A",
        "feature_ids": [ "FEATURE_ID_A1", "FEATURE_ID_A2" ]
      },
      {
        "feature_group_id": "FEATUREGROUP_NAME_B",
        "feature_ids": [ "FEATURE_ID_B1", "FEATURE_ID_B2" ]
      }
    ]
  },
  "sync_config": {
    "cron": "CRON"
  },
  "service_agent_type": "SERVICE_AGENT_TYPE"
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureOnlineStores/FEATUREONLINESTORE_NAME/featureViews?feature_view_id=FEATUREVIEW_NAME"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureOnlineStores/FEATUREONLINESTORE_NAME/featureViews?feature_view_id=FEATUREVIEW_NAME" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featureOnlineStores/FEATUREONLINESTORE_NAME/featureViews/FEATUREVIEW_NAME/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1beta1.CreateFeatureViewOperationMetadata",
    "genericMetadata": {
      "createTime": "2023-09-15T02:11:29.458820Z",
      "updateTime": "2023-09-15T02:11:29.458820Z"
    }
  }
}

Create a feature view from a BigQuery source

If you want to serve features online without registering your BigQuery data source using feature groups and features, you can create a feature view by specifying the URI of the BigQuery data source.

If you create a feature view by specifying the data source:

  • You can't include a feature_timestamp column in the BigQuery table or view.

  • You can't include historical feature values in the data source. Every row must contain a unique entity ID.
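Before pointing a feature view at a table or view, it can help to check a sample of rows against these two constraints. The sketch below does that on in-memory rows; the helper and the id and score column names are hypothetical, and in practice the same checks would more likely be run as queries in BigQuery.

```python
def check_bigquery_source_rows(rows, entity_id_column):
    """Return a list of problems found in rows meant for a direct BigQuery source."""
    problems = []
    seen = set()
    for index, row in enumerate(rows):
        if "feature_timestamp" in row:
            problems.append(f"row {index}: contains a feature_timestamp column")
        entity_id = row[entity_id_column]
        if entity_id in seen:
            # A repeated entity ID means the source holds more than one value per entity.
            problems.append(f"row {index}: duplicate entity ID {entity_id!r}")
        seen.add(entity_id)
    return problems


sample_rows = [
    {"id": "a", "score": 0.1},
    {"id": "b", "score": 0.4},
    {"id": "a", "score": 0.9},
]
print(check_bigquery_source_rows(sample_rows, entity_id_column="id"))
```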

Create a feature view with the default service account configuration

Use the following sample to create a feature view with embedding support by directly associating a BigQuery data source and without specifying a service account configuration.

REST

To create a FeatureView resource, send a POST request by using the featureViews.create method.

Before using any of the request data, make the following replacements:

  • LOCATION_ID: Region where you want to create the feature view, such as us-central1.
  • PROJECT_ID: Your project ID.
  • FEATUREONLINESTORE_NAME: The name of the online store instance where you want to create the feature view.
  • FEATUREVIEW_NAME: The name of the new feature view that you want to create.
  • PROJECT_NAME: Your project name.
  • DATASET_NAME: Your BigQuery dataset name.
  • TABLE_NAME: The name of the table from your BigQuery dataset.
  • ENTITY_ID_COLUMN: The name of the column containing the entity IDs.
  • CRON: Cron schedule expression representing the frequency for syncing data to the feature view. For more information, see cron.
  • EMBEDDING_COLUMN: The name of the column containing the source data to create the index for vector search. This is required only if you want to manage embeddings with the feature view.
  • FILTER_COLUMN_1 and FILTER_COLUMN_2: Optional: The names of the columns used to filter the vector search results.
  • CROWDING_COLUMN: Optional: The name of the column containing the crowding attributes.
  • EMBEDDING_DIMENSION: Optional: The size, expressed as number of dimensions, of an embedding in the embedding column.

HTTP method and URL:

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featureOnlineStores/FEATUREONLINESTORE_NAME/featureViews?feature_view_id=FEATUREVIEW_NAME

Request JSON body:

{
  "big_query_source": {
    "uri": "bq://PROJECT_NAME.DATASET_NAME.TABLE_NAME",
    "entity_id_columns": "ENTITY_ID_COLUMN"
  },
  "sync_config": {
    "cron": "CRON"
  },
  "vector_search_config": {
    "embedding_column": "EMBEDDING_COLUMN",
    "filter_columns": ["FILTER_COLUMN_1", "FILTER_COLUMN_2"],
    "crowding_column": "CROWDING_COLUMN",
    "embedding_dimension": EMBEDDING_DIMENSION,
    "tree_ah_config": {}
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featureOnlineStores/FEATUREONLINESTORE_NAME/featureViews?feature_view_id=FEATUREVIEW_NAME"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featureOnlineStores/FEATUREONLINESTORE_NAME/featureViews?feature_view_id=FEATUREVIEW_NAME" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featureOnlineStores/FEATUREONLINESTORE_NAME/featureViews/FEATUREVIEW_NAME/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.CreateFeatureViewOperationMetadata",
    "genericMetadata": {
      "createTime": "2023-09-15T02:11:29.458820Z",
      "updateTime": "2023-09-15T02:11:29.458820Z"
    }
  }
}

Create a feature view with embedding management by specifying a service account configuration

Use the following sample to create a feature view with embedding support by directly associating a BigQuery data source and specifying a service account configuration.

REST

To create a FeatureView resource with embeddings support, send a POST request by using the featureViews.create method and specifying the vector search configuration.

Before using any of the request data, make the following replacements:

  • LOCATION_ID: Region where you want to create the feature view, such as us-central1.
  • PROJECT_ID: Your project ID.
  • FEATUREONLINESTORE_NAME: The name of the online store instance where you want to create the feature view.
  • FEATUREVIEW_NAME: The name of the new feature view that you want to create.
  • PROJECT_NAME: Your project name.
  • DATASET_NAME: Your BigQuery dataset name.
  • TABLE_NAME: The name of the table from your BigQuery dataset.
  • ENTITY_ID_COLUMNS: The names of the column(s) containing the entity IDs. You can specify either one column or multiple columns.
    • To specify only one entity ID column, specify the column name in the following format:
      "entity_id_column_name".
    • To specify multiple entity ID columns, specify the column names in the following format:
      ["entity_id_column_1_name", "entity_id_column_2_name", ...].
  • CRON: Cron schedule expression representing the frequency for syncing data to the feature view. For more information, see cron.
  • SERVICE_AGENT_TYPE: Service account configuration for the feature view. Supported service agent types include the following:
    • SERVICE_AGENT_TYPE_PROJECT: Use the project-level service account for the feature view. This is the default configuration.
    • SERVICE_AGENT_TYPE_FEATURE_VIEW: Set up and use a dedicated service account for the feature view.
  • EMBEDDING_COLUMN: The name of the column containing the source data to create the index for vector search. This is required only if you want to manage embeddings with the feature view.
  • FILTER_COLUMN_1 and FILTER_COLUMN_2: Optional: The names of the columns used to filter the vector search results.
  • CROWDING_COLUMN: Optional: The name of the column containing the crowding attributes.
  • EMBEDDING_DIMENSION: Optional: The size, expressed as number of dimensions, of an embedding in the embedding column.
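The two ENTITY_ID_COLUMNS formats described in this list can be sketched in Python as follows. The hypothetical helper accepts either a single column name or a sequence of names and produces a request body whose entity_id_columns value is a JSON string or a JSON array accordingly; the project, dataset, table, and column names are placeholders.

```python
import json


def big_query_feature_view_body(uri, entity_id_columns, cron):
    """Build a request body for a feature view backed directly by a BigQuery source."""
    if isinstance(entity_id_columns, str):
        columns = entity_id_columns          # one column: "entity_id_column_name"
    else:
        columns = list(entity_id_columns)    # several columns: ["col_1", "col_2", ...]
    return {
        "big_query_source": {"uri": uri, "entity_id_columns": columns},
        "sync_config": {"cron": cron},
    }


single = big_query_feature_view_body("bq://my-project.my_dataset.my_table",
                                     "user_id", "0 * * * *")
multiple = big_query_feature_view_body("bq://my-project.my_dataset.my_table",
                                       ("user_id", "device_id"), "0 * * * *")
print(json.dumps(single["big_query_source"]))
print(json.dumps(multiple["big_query_source"]))
```

Serializing with json.dumps rather than editing the JSON text by hand also avoids quoting and comma mistakes in the request body.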

HTTP method and URL:

POST https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureOnlineStores/FEATUREONLINESTORE_NAME/featureViews?feature_view_id=FEATUREVIEW_NAME

Request JSON body:

{
  "big_query_source": {
    "uri": "bq://PROJECT_NAME.DATASET_NAME.TABLE_NAME",
    "entity_id_columns": "ENTITY_ID_COLUMNS"
  },
  "sync_config": {
    "cron": "CRON"
  },
  "service_agent_type": "SERVICE_AGENT_TYPE",
  "vector_search_config": {
    "embedding_column": "EMBEDDING_COLUMN",
    "filter_columns": ["FILTER_COLUMN_1", "FILTER_COLUMN_2"],
    "crowding_column": "CROWDING_COLUMN",
    "embedding_dimension": EMBEDDING_DIMENSION,
    "tree_ah_config": {}
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureOnlineStores/FEATUREONLINESTORE_NAME/featureViews?feature_view_id=FEATUREVIEW_NAME"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION_ID/featureOnlineStores/FEATUREONLINESTORE_NAME/featureViews?feature_view_id=FEATUREVIEW_NAME" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featureOnlineStores/FEATUREONLINESTORE_NAME/featureViews/FEATUREVIEW_NAME/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1beta1.CreateFeatureViewOperationMetadata",
    "genericMetadata": {
      "createTime": "2023-09-15T02:11:29.458820Z",
      "updateTime": "2023-09-15T02:11:29.458820Z"
    }
  }
}

What's next