Create a feature group

You can create a feature group to register a BigQuery table or view that contains your feature data.

For any BigQuery table or view that you associate with a feature group, you need to ensure the following:

  • The schema of the data source conforms to the Data source preparation guidelines.

  • The data source contains the entity IDs as string values in a column named entity_id.

  • The data source contains the feature timestamps of type timestamp in a column called feature_timestamp.
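
The two column requirements above can be checked before you register a data source. The following is a minimal Python sketch, assuming the table schema is already available as a plain mapping of column name to BigQuery type name. The helper name check_feature_source_schema and the input shape are hypothetical and not part of any Google Cloud library, and the sketch doesn't cover the Data source preparation guidelines.

```python
from typing import Dict, List

# Required columns and their BigQuery types, taken from the list above.
REQUIRED_COLUMNS = {
    "entity_id": "STRING",
    "feature_timestamp": "TIMESTAMP",
}


def check_feature_source_schema(schema: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means both columns look fine.

    `schema` maps column names to BigQuery type names (a hypothetical input
    shape chosen for this illustration).
    """
    problems = []
    for column, expected_type in REQUIRED_COLUMNS.items():
        actual_type = schema.get(column)
        if actual_type is None:
            problems.append(f"missing column: {column}")
        elif actual_type.upper() != expected_type:
            problems.append(
                f"column {column} has type {actual_type}, expected {expected_type}"
            )
    return problems
```

An empty result means the entity_id and feature_timestamp columns are present with the expected types; any other result lists what to fix in the table or view.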

After you create a feature group and associate the BigQuery data source, you can create features and associate them with columns in the data source. Specifying a data source while creating the feature group is optional, but you must specify one before you create features.

Registering your data source using feature groups and features has the following advantages:

  • You can define a feature view for online serving by using specific feature columns from multiple BigQuery data sources.

  • You can format your data as a time series by including the feature_timestamp column. Vertex AI Feature Store serves only the latest feature values from the feature data and excludes historical values.
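
To illustrate what "only the latest feature values" means for time-series data, here is a conceptual Python sketch. It assumes rows shaped as (entity_id, feature_timestamp, value) tuples, and the function name latest_values is hypothetical. It shows the selection rule only and is not how Vertex AI Feature Store is implemented.

```python
from datetime import datetime
from typing import Dict, Iterable, Tuple

# Illustrative row shape: (entity_id, feature_timestamp, value).
Row = Tuple[str, datetime, float]


def latest_values(rows: Iterable[Row]) -> Dict[str, float]:
    """Keep, for each entity ID, the value with the newest timestamp."""
    newest: Dict[str, Tuple[datetime, float]] = {}
    for entity_id, timestamp, value in rows:
        if entity_id not in newest or timestamp > newest[entity_id][0]:
            newest[entity_id] = (timestamp, value)
    # Drop the timestamps; older (historical) values are already excluded.
    return {entity_id: value for entity_id, (_, value) in newest.items()}
```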

Use the following samples to create a feature group and associate a BigQuery data source.

Console

Use the following instructions to create a feature group using the Google Cloud console.

  1. In the Vertex AI section of the Google Cloud console, go to the Feature Store page.

  2. In the Feature groups section, click Create to open the Basic info pane on the Create Feature Group page.

  3. Specify the Feature group name.

  4. Optional: To add labels, click Add label, and specify the label name and value. You can add multiple labels to a feature group.

  5. In the BigQuery path field, click Browse to select the BigQuery source table or view to associate with the feature group.

  6. Optional: In the Entity ID column list, click the entity ID column from the BigQuery source table or view.

  7. Click Continue.

  8. In the Register pane, click one of the following options to indicate whether you want to add features to the new feature group:

    • Include all columns from the BigQuery table—Create features within the feature group for all the columns in the BigQuery source table or view.

    • Manually enter your features—Create features based on specific columns in the BigQuery source. For each feature, enter a Feature name and click the corresponding BigQuery source column name in the list.

      To add more features, click Add another feature.

    • Create an empty feature group—Create the feature group without adding features to it.

  9. Click Create.

REST

To create a FeatureGroup resource, send a POST request by using the featureGroups.create method.

Before using any of the request data, make the following replacements:

  • LOCATION_ID: Region where you want to create the feature group, such as us-central1.
  • PROJECT_ID: Your project ID.
  • FEATUREGROUP_NAME: The name of the new feature group that you want to create.
  • BIGQUERY_SOURCE_URI: URI of the BigQuery source table or view that you want to register for the feature group.

HTTP method and URL:

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups?feature_group_id=FEATUREGROUP_NAME

Request JSON body:

{
  "big_query": {
    "big_query_source": {
      "input_uri": "BIGQUERY_SOURCE_URI"
    }
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups?feature_group_id=FEATUREGROUP_NAME"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featureGroups?feature_group_id=FEATUREGROUP_NAME" | Select-Object -Expand Content
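
The same request can also be assembled in Python. The following is a minimal sketch that uses only the standard library and mirrors the URL, headers, and body shown above. The function name build_create_feature_group_request is hypothetical, and the access token is assumed to come from gcloud auth print-access-token.

```python
import json
import urllib.parse
import urllib.request


def build_create_feature_group_request(
    project_id: str,
    location_id: str,
    feature_group_name: str,
    bigquery_source_uri: str,
    access_token: str,
) -> urllib.request.Request:
    """Build (but do not send) the featureGroups.create POST request."""
    query = urllib.parse.urlencode({"feature_group_id": feature_group_name})
    url = (
        f"https://{location_id}-aiplatform.googleapis.com/v1"
        f"/projects/{project_id}/locations/{location_id}/featureGroups?{query}"
    )
    # Same JSON body as the request.json file used by the curl sample.
    body = {"big_query": {"big_query_source": {"input_uri": bigquery_source_uri}}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen and read the response body. Building the request separately from sending it makes the URL and body easy to inspect first.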

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featureGroups/FEATUREGROUP_NAME/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1.CreateFeatureGroupOperationMetadata",
    "genericMetadata": {
      "createTime": "2023-09-18T03:00:13.060636Z",
      "updateTime": "2023-09-18T03:00:13.060636Z"
    }
  }
}
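
The name field in this response identifies a long-running operation. As a minimal Python sketch, assuming the name follows the pattern shown in the sample response above, the following hypothetical helper parse_operation_name splits it into its parts so that you can log or track the operation ID.

```python
import json
import re
from typing import Dict

# Pattern taken from the "name" field of the sample response above.
OPERATION_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/featureGroups/(?P<feature_group>[^/]+)"
    r"/operations/(?P<operation>[^/]+)$"
)


def parse_operation_name(response_text: str) -> Dict[str, str]:
    """Extract the project, location, feature group, and operation ID."""
    name = json.loads(response_text)["name"]
    match = OPERATION_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected operation name: {name}")
    return match.groupdict()
```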

What's next