Imports feature values into the Featurestore from a source storage.
The progress of the import is tracked by the returned operation. The imported features are guaranteed to be visible to subsequent read operations after the operation is marked as successfully done.
If an import operation fails, the feature values returned from reads and exports may be inconsistent. If consistency is required, the caller must retry the same import request and wait until the newly returned operation is marked as successfully done.
The caller can also cause inconsistency in the following scenarios:
- Source data for import contains multiple distinct feature values for the same entity id and timestamp.
- Source is modified during an import. This includes adding, updating, or removing source data and/or metadata. Examples of updating metadata include but are not limited to changing storage location, storage class, or retention policy.
- Online serving cluster is under-provisioned.
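The first scenario above can be checked before submitting an import. The following is a minimal sketch, assuming the source rows have already been loaded as dictionaries; the function name `find_conflicting_rows` and the `feature_time` column name are hypothetical, while `entity_id` is the default entity ID column described in this reference.

```python
from collections import defaultdict


def find_conflicting_rows(rows, entity_id_field="entity_id", time_field="feature_time"):
    """Return (entity id, timestamp) keys that carry more than one distinct set of feature values."""
    variants_by_key = defaultdict(set)
    for row in rows:
        key = (row[entity_id_field], row[time_field])
        # Every column other than the two key columns is treated as a feature value.
        values = tuple(sorted(
            (name, repr(value))
            for name, value in row.items()
            if name not in (entity_id_field, time_field)
        ))
        variants_by_key[key].add(values)
    return sorted(key for key, variants in variants_by_key.items() if len(variants) > 1)


rows = [
    {"entity_id": "u1", "feature_time": "2014-10-02T15:01:23Z", "age": 30},
    {"entity_id": "u1", "feature_time": "2014-10-02T15:01:23Z", "age": 31},  # conflicts with the row above
    {"entity_id": "u2", "feature_time": "2014-10-02T15:01:23Z", "age": 40},
    {"entity_id": "u2", "feature_time": "2014-10-02T15:01:23Z", "age": 40},  # exact duplicate, not a conflict
]
conflicts = find_conflicting_rows(rows)
```

Exact duplicate rows are not reported, because they do not carry distinct feature values for the same entity ID and timestamp.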
Endpoint
POST https://{service-endpoint}/v1/{entityType}:importFeatureValues
Where {service-endpoint} is one of the supported service endpoints.
Path parameters
entityType
string
Required. The resource name of the EntityType grouping the Features for which values are being imported. Format: projects/{project}/locations/{location}/featurestores/{featurestore}/entityTypes/{entityType}
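As an illustration of how the path parameter fits into the endpoint, the sketch below validates an EntityType resource name against the documented format and builds the request URL. The helper name `import_feature_values_url` is hypothetical, and `service.example.com` is a placeholder, not a real service endpoint.

```python
import re

# Mirrors the documented format:
# projects/{project}/locations/{location}/featurestores/{featurestore}/entityTypes/{entityType}
ENTITY_TYPE_PATTERN = re.compile(
    r"^projects/[^/]+/locations/[^/]+/featurestores/[^/]+/entityTypes/[^/]+$"
)


def import_feature_values_url(service_endpoint, entity_type):
    """Build the importFeatureValues URL for a fully qualified EntityType resource name."""
    if not ENTITY_TYPE_PATTERN.match(entity_type):
        raise ValueError(f"not a valid EntityType resource name: {entity_type!r}")
    return f"https://{service_endpoint}/v1/{entity_type}:importFeatureValues"


url = import_feature_values_url(
    "service.example.com",  # placeholder endpoint, for illustration only
    "projects/p/locations/l/featurestores/fs/entityTypes/users",
)
```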
Request body
The request body contains data with the following structure:
entityIdField
string
Source column that holds entity IDs. If not provided, entity IDs are extracted from the column named entity_id.
featureSpecs[]
object (FeatureSpec)
Required. Specifications defining which feature values to import from the entity. The request fails if no featureSpecs are provided. Multiple featureSpecs for the same feature are not allowed.
disableOnlineServing
boolean
If set, data will not be imported for online serving. This is typically used for backfilling, where feature generation timestamps are not in the timestamp range needed for online serving.
workerCount
integer
Specifies the number of workers used to write data to the Featurestore. Consider the online serving capacity that you require to achieve the desired import throughput without interfering with online serving. The value must be positive and less than or equal to 100. If not set, defaults to 1 worker; this low count minimizes the impact on online serving performance.
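The workerCount constraints can be enforced on the client before the request is sent. This is a small sketch under that assumption; the helper name `resolve_worker_count` is hypothetical, and the service may apply its own validation regardless.

```python
def resolve_worker_count(worker_count=None):
    """Return a workerCount that satisfies the documented range, defaulting to 1 when unset."""
    if worker_count is None:
        return 1  # documented default
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(worker_count, bool) or not isinstance(worker_count, int):
        raise TypeError("workerCount must be an integer")
    if not 1 <= worker_count <= 100:
        raise ValueError("workerCount must be between 1 and 100 inclusive")
    return worker_count


default_workers = resolve_worker_count()
explicit_workers = resolve_worker_count(20)
```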
disableIngestionAnalysis
boolean
If true, the API does not start the ingestion analysis pipeline.
source
Details about the source data, including the location of the storage and the format. source can be only one of the supported source types.
feature_time_source
Source of Feature timestamp for all Feature values of each entity. Timestamps must be millisecond-aligned. feature_time_source can be only one of the following:
featureTimeField
string
Source column that holds the feature timestamp for all feature values in each entity.
featureTime
string (Timestamp format)
Single feature timestamp for all entities being imported. The timestamp must not have finer than millisecond precision.
A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".
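Because the timestamp format allows up to nine fractional digits while feature timestamps must be millisecond-aligned, it can help to normalize timestamps on the client. The sketch below is one possible approach, not part of the API; the helper names are hypothetical, and truncation (rather than rounding) is an arbitrary choice made here.

```python
from datetime import datetime, timezone


def is_millisecond_aligned(dt):
    """True when the datetime has no sub-millisecond component."""
    return dt.microsecond % 1000 == 0


def to_millisecond_rfc3339(dt):
    """Format an aware datetime as RFC 3339 UTC "Zulu", truncated to milliseconds."""
    if dt.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    dt = dt.astimezone(timezone.utc)
    millis = dt.microsecond // 1000  # drop anything finer than a millisecond
    return dt.strftime("%Y-%m-%dT%H:%M:%S") + f".{millis:03d}Z"


raw = datetime(2014, 10, 2, 15, 1, 23, 45123, tzinfo=timezone.utc)
aligned = is_millisecond_aligned(raw)
formatted = to_millisecond_rfc3339(raw)
```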
Response body
If successful, the response body contains an instance of Operation.
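The retry guidance at the top of this page (resubmit the same request and wait for the new operation to finish successfully) can be sketched as a loop. This is an illustration only: `submit_import` and `wait_for_operation` are hypothetical callables supplied by the caller, and the code assumes the finished operation is a dictionary with a `done` flag and an `error` entry on failure.

```python
import time


def import_until_done(submit_import, wait_for_operation,
                      max_attempts=3, backoff_seconds=1.0, sleep=time.sleep):
    """Resubmit the same import until an operation finishes without an error."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        operation = submit_import()             # sends the identical request each time
        result = wait_for_operation(operation)  # blocks until the operation is done
        if result.get("done") and "error" not in result:
            return result
        last_error = result.get("error")
        if attempt < max_attempts:
            sleep(backoff_seconds * attempt)    # simple linear backoff between retries
    raise RuntimeError(f"import failed after {max_attempts} attempts: {last_error}")


# Usage with stand-in callables: the first operation fails, the second succeeds.
calls = {"count": 0}


def fake_submit():
    calls["count"] += 1
    return {"name": f"op-{calls['count']}"}


def fake_wait(operation):
    if operation["name"] == "op-1":
        return {"name": operation["name"], "done": True, "error": {"code": 13}}
    return {"name": operation["name"], "done": True, "response": {}}


result = import_until_done(fake_submit, fake_wait, sleep=lambda seconds: None)
```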
AvroSource
FeatureSpec
Defines the feature value(s) to import.
id
string
Required. ID of the feature whose values are imported. This feature must exist in the target EntityType, or the request will fail.
sourceField
string
Source column to get the feature values from. If not set, uses the column with the same name as the feature id.
| JSON representation |
|---|
| { "id": string, "sourceField": string } |
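To show how FeatureSpec objects fit the rules for featureSpecs (at least one entry, no two entries for the same feature), the sketch below builds the list from (feature ID, source column) pairs. The helper name `build_feature_specs` and the example feature names are hypothetical.

```python
def build_feature_specs(pairs):
    """Build a featureSpecs list from (feature_id, source_field or None) pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one FeatureSpec is required")
    specs = []
    seen_ids = set()
    for feature_id, source_field in pairs:
        if feature_id in seen_ids:
            raise ValueError(f"multiple FeatureSpecs for feature {feature_id!r}")
        seen_ids.add(feature_id)
        spec = {"id": feature_id}
        # Omitting sourceField means the column named after the feature ID is used.
        if source_field is not None:
            spec["sourceField"] = source_field
        specs.append(spec)
    return specs


specs = build_feature_specs([("age", None), ("country", "country_code")])
```

Here `age` is read from the column of the same name, while `country` is read from the `country_code` column.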