Class Featurestore (1.24.0)

Featurestore(
    featurestore_name: str,
    project: Optional[str] = None,
    location: Optional[str] = None,
    credentials: Optional[google.auth.credentials.Credentials] = None,
)

Managed featurestore resource for Vertex AI.

Inheritance

builtins.object > google.cloud.aiplatform.base.VertexAiResourceNoun > builtins.object > google.cloud.aiplatform.base.FutureManager > google.cloud.aiplatform.base.VertexAiResourceNounWithFutureManager > Featurestore

Properties

create_time

Time this resource was created.

display_name

Display name of this resource.

encryption_spec

Customer-managed encryption key options for this Vertex AI resource.

If this is set, then all resources created by this Vertex AI resource will be encrypted with the provided encryption key.

gca_resource

The underlying resource proto representation.

labels

User-defined labels containing metadata about this resource.

Read more about labels at https://goo.gl/xmQnxf

name

Name of this resource.

resource_name

Fully qualified resource name.

update_time

Time this resource was last updated.

Methods

Featurestore

Featurestore(
    featurestore_name: str,
    project: Optional[str] = None,
    location: Optional[str] = None,
    credentials: Optional[google.auth.credentials.Credentials] = None,
)

Retrieves an existing managed featurestore given a featurestore resource name or a featurestore ID.

Example Usage:

my_featurestore = aiplatform.Featurestore(
    featurestore_name='projects/123/locations/us-central1/featurestores/my_featurestore_id'
)
or
my_featurestore = aiplatform.Featurestore(
    featurestore_name='my_featurestore_id'
)
Parameters
Name    Description
featurestore_name str

Required. A fully-qualified featurestore resource name or a featurestore ID. Example: "projects/123/locations/us-central1/featurestores/my_featurestore_id" or "my_featurestore_id" when project and location are initialized or passed.

project str

Optional. Project to retrieve featurestore from. If not set, project set in aiplatform.init will be used.

location str

Optional. Location to retrieve featurestore from. If not set, location set in aiplatform.init will be used.

credentials auth_credentials.Credentials

Optional. Custom credentials to use to retrieve this Featurestore. Overrides credentials set in aiplatform.init.
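
The two accepted forms of featurestore_name can be related with a small helper. This is only an illustrative sketch: the function to_featurestore_resource_name is hypothetical and not part of the SDK, and it simply assembles the resource-name pattern shown in the example above.

```python
import re

# Shape of a fully-qualified featurestore resource name, as in the example
# above. This helper is hypothetical and not an SDK function.
_FULL_NAME = re.compile(r"^projects/[^/]+/locations/[^/]+/featurestores/[^/]+$")


def to_featurestore_resource_name(featurestore_name, project=None, location=None):
    """Return a fully-qualified name from either accepted input form."""
    if _FULL_NAME.match(featurestore_name):
        return featurestore_name
    if "/" in featurestore_name:
        raise ValueError(f"Unrecognized resource name: {featurestore_name!r}")
    if not project or not location:
        raise ValueError("A bare featurestore ID needs both project and location.")
    return f"projects/{project}/locations/{location}/featurestores/{featurestore_name}"


print(to_featurestore_resource_name("my_featurestore_id", "123", "us-central1"))
```

In the SDK itself, the project and location would come from the constructor arguments or from aiplatform.init, as described in the parameter list above.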

batch_serve_to_bq

batch_serve_to_bq(
    bq_destination_output_uri: str,
    serving_feature_ids: Dict[str, List[str]],
    read_instances_uri: str,
    pass_through_fields: Optional[List[str]] = None,
    feature_destination_fields: Optional[Dict[str, str]] = None,
    start_time: Optional[google.protobuf.timestamp_pb2.Timestamp] = None,
    request_metadata: Optional[Sequence[Tuple[str, str]]] = (),
    serve_request_timeout: Optional[float] = None,
    sync: bool = True,
)

Batch serves feature values to a BigQuery destination.

Parameters
Name    Description
bq_destination_output_uri str

Required. BigQuery URI of the destination table, for example 'bq://project.dataset.table_name'. The destination BigQuery Dataset must already exist and must be in the same project as the Featurestore.

serving_feature_ids Dict[str, List[str]]

Required. A user-defined dictionary that specifies the entity types and their features for batch serve/read. The keys of the dictionary are the serving entity_type IDs and the values are lists of serving feature IDs in each entity_type. Example:

serving_feature_ids = {
    'my_entity_type_id_1': ['feature_id_1_1', 'feature_id_1_2'],
    'my_entity_type_id_2': ['feature_id_2_1', 'feature_id_2_2'],
}

read_instances_uri str

Required. Either a BigQuery URI of an input table or a Google Cloud Storage URI of a CSV file, for example 'bq://project.dataset.table_name' or "gs://my_bucket/my_file.csv".

Each read instance should consist of exactly one read timestamp and one or more entity IDs identifying entities of the corresponding EntityTypes whose Features are requested. Each output instance contains the Feature values of the requested entities, concatenated together as of the read time.

An example read instance may be foo_entity_id, bar_entity_id, 2020-01-01T10:00:00.123Z. An example output instance may be foo_entity_id, bar_entity_id, 2020-01-01T10:00:00.123Z, foo_entity_feature1_value, bar_entity_feature2_value.

The timestamp in each read instance must be millisecond-aligned. The columns can be in any order. Values in the timestamp column must use the RFC 3339 format, e.g. 2012-07-30T10:43:17.123Z.

pass_through_fields List[str]

Optional. When not empty, the specified fields in the read_instances source will be joined as-is in the output, in addition to those fields from the Featurestore Entity. For BigQuery source, the type of the pass-through values will be automatically inferred. For CSV source, the pass-through values will be passed as opaque bytes.

feature_destination_fields Dict[str, str]

Optional. A user-defined dictionary that maps a feature's fully qualified resource name to its destination field name. If the destination field name is not defined, the feature ID is used as its destination field name. Example:

feature_destination_fields = {
    'projects/123/locations/us-central1/featurestores/fs_id/entityTypes/et_id1/features/f_id11': 'foo',
    'projects/123/locations/us-central1/featurestores/fs_id/entityTypes/et_id2/features/f_id22': 'bar',
}

start_time timestamp_pb2.Timestamp

Optional. Excludes Feature values with a feature generation timestamp before this timestamp. If not set, the oldest values kept in the Featurestore are retrieved. The timestamp, if present, must not have higher than millisecond precision.

serve_request_timeout float

Optional. The timeout for the serve request in seconds.

Exceptions
Type    Description
NotFound    If the BigQuery destination Dataset does not exist.
FailedPrecondition    If the BigQuery destination Dataset/Table is in a different project.
Returns
Type    Description
Featurestore    The featurestore resource object that feature values were batch read from.
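
The keys of feature_destination_fields and the precision rule for start_time can be prepared with plain Python before calling batch_serve_to_bq. The sketch below is a non-authoritative illustration: the helper names feature_resource_name and truncate_to_millis are hypothetical, and the resource-name layout is copied from the feature_destination_fields example above.

```python
from datetime import datetime, timezone


def feature_resource_name(featurestore_resource_name, entity_type_id, feature_id):
    """Build the fully qualified feature name used as a dictionary key."""
    return (
        f"{featurestore_resource_name}/entityTypes/{entity_type_id}"
        f"/features/{feature_id}"
    )


def truncate_to_millis(dt):
    """Drop sub-millisecond digits so the value has millisecond precision."""
    return dt.replace(microsecond=(dt.microsecond // 1000) * 1000)


fs_name = "projects/123/locations/us-central1/featurestores/fs_id"
feature_destination_fields = {
    feature_resource_name(fs_name, "et_id1", "f_id11"): "foo",
    feature_resource_name(fs_name, "et_id2", "f_id22"): "bar",
}
start = truncate_to_millis(
    datetime(2020, 1, 1, 10, 0, 0, 123456, tzinfo=timezone.utc)
)
```

The truncated datetime could then be converted to a timestamp_pb2.Timestamp (for instance with its FromDatetime method) before being passed as start_time.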

batch_serve_to_df

batch_serve_to_df(
    serving_feature_ids: Dict[str, List[str]],
    read_instances_df: pd.DataFrame,
    pass_through_fields: Optional[List[str]] = None,
    feature_destination_fields: Optional[Dict[str, str]] = None,
    start_time: Optional[google.protobuf.timestamp_pb2.Timestamp] = None,
    request_metadata: Optional[Sequence[Tuple[str, str]]] = (),
    serve_request_timeout: Optional[float] = None,
    bq_dataset_id: Optional[str] = None,
)

Batch serves feature values to a pandas DataFrame.

Parameters
Name    Description
serving_feature_ids Dict[str, List[str]]

Required. A user-defined dictionary that specifies the entity types and their features for batch serve/read. The keys of the dictionary are the serving entity_type IDs and the values are lists of serving feature IDs in each entity_type. Example:

serving_feature_ids = {
    'my_entity_type_id_1': ['feature_id_1_1', 'feature_id_1_2'],
    'my_entity_type_id_2': ['feature_id_2_1', 'feature_id_2_2'],
}

pass_through_fields List[str]

Optional. When not empty, the specified fields in the read_instances source will be joined as-is in the output, in addition to those fields from the Featurestore Entity. For BigQuery source, the type of the pass-through values will be automatically inferred. For CSV source, the pass-through values will be passed as opaque bytes.

feature_destination_fields Dict[str, str]

Optional. A user-defined dictionary that maps a feature's fully qualified resource name to its destination field name. If the destination field name is not defined, the feature ID is used as its destination field name. Example:

feature_destination_fields = {
    'projects/123/locations/us-central1/featurestores/fs_id/entityTypes/et_id1/features/f_id11': 'foo',
    'projects/123/locations/us-central1/featurestores/fs_id/entityTypes/et_id2/features/f_id22': 'bar',
}

start_time timestamp_pb2.Timestamp

Optional. Excludes Feature values with a feature generation timestamp before this timestamp. If not set, the oldest values kept in the Featurestore are retrieved. The timestamp, if present, must not have higher than millisecond precision.

serve_request_timeout float

Optional. The timeout for the serve request in seconds.

bq_dataset_id str

Optional. The full dataset ID for the BigQuery dataset to use for temporarily staging data. If specified, caller must have bigquery.tables.create permissions for Dataset.

read_instances_df pd.DataFrame

Required. A pandas DataFrame containing the read instances. Each read instance should consist of exactly one read timestamp and one or more entity IDs identifying entities of the corresponding EntityTypes whose Features are requested. Each output instance contains the Feature values of the requested entities, concatenated together as of the read time.

An example read_instances_df may be:

pd.DataFrame(
    data=[
        {
            "my_entity_type_id_1": "my_entity_type_id_1_entity_1",
            "my_entity_type_id_2": "my_entity_type_id_2_entity_1",
            "timestamp": "2020-01-01T10:00:00.123Z"
        }
    ],
)

An example batch_serve_output_df may be:

pd.DataFrame(
    data=[
        {
            "my_entity_type_id_1": "my_entity_type_id_1_entity_1",
            "my_entity_type_id_2": "my_entity_type_id_2_entity_1",
            "foo": "feature_id_1_1_feature_value",
            "feature_id_1_2": "feature_id_1_2_feature_value",
            "feature_id_2_1": "feature_id_2_1_feature_value",
            "bar": "feature_id_2_2_feature_value",
            "timestamp": "2020-01-01T10:00:00.123Z"
        }
    ],
)

The timestamp in each read instance must be millisecond-aligned. The columns can be in any order. Values in the timestamp column must use the RFC 3339 format, e.g. 2012-07-30T10:43:17.123Z.

Returns
Type    Description
pd.DataFrame    The pandas DataFrame containing feature values from batch serving.
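
The timestamp rules for read instances (RFC 3339, millisecond-aligned) can be met by formatting the read time explicitly. The following is a minimal sketch using only the standard library; rfc3339_millis and make_read_instance are hypothetical helper names, and the "timestamp" column name follows the read_instances_df example above.

```python
from datetime import datetime, timezone


def rfc3339_millis(dt):
    """Format an aware datetime as RFC 3339 in UTC with millisecond precision."""
    utc = dt.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S") + f".{utc.microsecond // 1000:03d}Z"


def make_read_instance(entity_ids, read_time):
    """Combine entity IDs (keyed by entity_type ID) with one read timestamp."""
    row = dict(entity_ids)
    row["timestamp"] = rfc3339_millis(read_time)
    return row


rows = [
    make_read_instance(
        {
            "my_entity_type_id_1": "my_entity_type_id_1_entity_1",
            "my_entity_type_id_2": "my_entity_type_id_2_entity_1",
        },
        datetime(2020, 1, 1, 10, 0, 0, 123000, tzinfo=timezone.utc),
    )
]
# With pandas installed, pd.DataFrame(data=rows) would give a frame shaped
# like the read_instances_df example above.
```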

batch_serve_to_gcs

batch_serve_to_gcs(
    gcs_destination_output_uri_prefix: str,
    gcs_destination_type: str,
    serving_feature_ids: Dict[str, List[str]],
    read_instances_uri: str,
    pass_through_fields: Optional[List[str]] = None,
    feature_destination_fields: Optional[Dict[str, str]] = None,
    start_time: Optional[google.protobuf.timestamp_pb2.Timestamp] = None,
    request_metadata: Optional[Sequence[Tuple[str, str]]] = (),
    sync: bool = True,
    serve_request_timeout: Optional[float] = None,
)

Batch serves feature values to a GCS destination.

Parameters
Name    Description
gcs_destination_output_uri_prefix str

Required. Google Cloud Storage URI of the output directory, for example "gs://bucket/path/to/prefix". If the URI doesn't end with '/', a '/' is automatically appended. The directory is created if it doesn't exist.

gcs_destination_type str

Required. The type of the destination file(s). The value of gcs_destination_type can only be either csv or tfrecord.

For CSV format: Array Feature value types are not allowed.

For TFRecord format: the mapping from Feature value type in Featurestore to Feature value type in TFRecord is:

Value type in Featurestore        Value type in TFRecord
DOUBLE, DOUBLE_ARRAY              FLOAT_LIST
INT64, INT64_ARRAY                INT64_LIST
STRING, STRING_ARRAY, BYTES       BYTES_LIST
BOOL, BOOL_ARRAY (true, false)    BYTES_LIST, with true -> byte_string("true"), false -> byte_string("false")

serving_feature_ids Dict[str, List[str]]

Required. A user-defined dictionary that specifies the entity types and their features for batch serve/read. The keys of the dictionary are the serving entity_type IDs and the values are lists of serving feature IDs in each entity_type. Example:

serving_feature_ids = {
    'my_entity_type_id_1': ['feature_id_1_1', 'feature_id_1_2'],
    'my_entity_type_id_2': ['feature_id_2_1', 'feature_id_2_2'],
}

read_instances_uri str

Required. Either a BigQuery URI of an input table or a Google Cloud Storage URI of a CSV file, for example 'bq://project.dataset.table_name' or "gs://my_bucket/my_file.csv".

Each read instance should consist of exactly one read timestamp and one or more entity IDs identifying entities of the corresponding EntityTypes whose Features are requested. Each output instance contains the Feature values of the requested entities, concatenated together as of the read time.

An example read instance may be foo_entity_id, bar_entity_id, 2020-01-01T10:00:00.123Z. An example output instance may be foo_entity_id, bar_entity_id, 2020-01-01T10:00:00.123Z, foo_entity_feature1_value, bar_entity_feature2_value.

The timestamp in each read instance must be millisecond-aligned. The columns can be in any order. Values in the timestamp column must use the RFC 3339 format, e.g. 2012-07-30T10:43:17.123Z.

pass_through_fields List[str]

Optional. When not empty, the specified fields in the read_instances source will be joined as-is in the output, in addition to those fields from the Featurestore Entity. For BigQuery source, the type of the pass-through values will be automatically inferred. For CSV source, the pass-through values will be passed as opaque bytes.

feature_destination_fields Dict[str, str]

Optional. A user-defined dictionary that maps a feature's fully qualified resource name to its destination field name. If the destination field name is not defined, the feature ID is used as its destination field name. Example:

feature_destination_fields = {
    'projects/123/locations/us-central1/featurestores/fs_id/entityTypes/et_id1/features/f_id11': 'foo',
    'projects/123/locations/us-central1/featurestores/fs_id/entityTypes/et_id2/features/f_id22': 'bar',
}

start_time timestamp_pb2.Timestamp

Optional. Excludes Feature values with a feature generation timestamp before this timestamp. If not set, the oldest values kept in the Featurestore are retrieved. The timestamp, if present, must not have higher than millisecond precision.

serve_request_timeout float

Optional. The timeout for the serve request in seconds.

Exceptions
Type    Description
ValueError    If gcs_destination_type is not supported.
Returns
Type    Description
Featurestore    The featurestore resource object that feature values were batch read from.
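
The constraints on the GCS destination can be checked locally before a request is sent. The sketch below is illustrative only: check_gcs_destination and the two constants are hypothetical names, the type mapping is transcribed from the gcs_destination_type description above, and the service remains the authority on what is accepted.

```python
# Mapping transcribed from the gcs_destination_type description above.
TFRECORD_VALUE_TYPES = {
    "DOUBLE": "FLOAT_LIST",
    "DOUBLE_ARRAY": "FLOAT_LIST",
    "INT64": "INT64_LIST",
    "INT64_ARRAY": "INT64_LIST",
    "STRING": "BYTES_LIST",
    "STRING_ARRAY": "BYTES_LIST",
    "BYTES": "BYTES_LIST",
    "BOOL": "BYTES_LIST",
    "BOOL_ARRAY": "BYTES_LIST",
}
SUPPORTED_GCS_DESTINATION_TYPES = ("csv", "tfrecord")


def check_gcs_destination(uri_prefix, destination_type, value_types=()):
    """Validate the destination type and return the prefix ending in '/'."""
    if destination_type not in SUPPORTED_GCS_DESTINATION_TYPES:
        raise ValueError(f"Unsupported gcs_destination_type: {destination_type!r}")
    if destination_type == "csv":
        # Array Feature value types are not allowed in CSV format.
        arrays = [t for t in value_types if t.endswith("_ARRAY")]
        if arrays:
            raise ValueError(f"Array value types not allowed in CSV: {arrays}")
    return uri_prefix if uri_prefix.endswith("/") else uri_prefix + "/"


prefix = check_gcs_destination("gs://bucket/path/to/prefix", "tfrecord", ["DOUBLE_ARRAY"])
```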

create

create(
    featurestore_id: str,
    online_store_fixed_node_count: Optional[int] = None,
    labels: Optional[Dict[str, str]] = None,
    project: Optional[str] = None,
    location: Optional[str] = None,
    credentials: Optional[google.auth.credentials.Credentials] = None,
    request_metadata: Optional[Sequence[Tuple[str, str]]] = (),
    encryption_spec_key_name: Optional[str] = None,
    sync: bool = True,
    create_request_timeout: Optional[float] = None,
)

Creates a Featurestore resource.

Example Usage:

my_featurestore = aiplatform.Featurestore.create(
    featurestore_id='my_featurestore_id',
)
Parameters
Name    Description
featurestore_id str

Required. The ID to use for this Featurestore, which will become the final component of the Featurestore's resource name. This value may be up to 60 characters, and valid characters are [a-z0-9_]. The first character cannot be a number. The value must be unique within the project and location.

online_store_fixed_node_count int

Optional. Config for online serving resources. When not specified, no fixed node count is set for online serving. The number of nodes does not scale automatically, but it can be scaled manually by providing a different value when updating.

labels Dict[str, str]

Optional. The labels with user-defined metadata to organize your Featurestore. Label keys and values can be no longer than 64 characters (Unicode codepoints) and can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. See https://goo.gl/xmQnxf for more information on and examples of labels. No more than 64 user labels can be associated with one Featurestore (System labels are excluded). System reserved label keys are prefixed with "aiplatform.googleapis.com/" and are immutable.

project str

Optional. Project to create the Featurestore in. If not set, project set in aiplatform.init will be used.

location str

Optional. Location to create the Featurestore in. If not set, location set in aiplatform.init will be used.

credentials auth_credentials.Credentials

Optional. Custom credentials to use to create the Featurestore. Overrides credentials set in aiplatform.init.

request_metadata Sequence[Tuple[str, str]]

Optional. Strings which should be sent along with the request as metadata.

encryption_spec_key_name str

Optional. Customer-managed encryption key spec for data storage. If set, both the online and offline data storage will be secured by this key.

sync bool

Optional. Whether to execute this creation synchronously. If False, this method will be executed in concurrent Future and any downstream object will be immediately returned and synced when the Future has completed.

create_request_timeout float

Optional. The timeout for the create request in seconds.
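
The naming rules stated for featurestore_id (up to 60 characters, characters from [a-z0-9_], first character not a number) can be expressed as a regular expression. This is a client-side sketch under those stated rules only; is_valid_resource_id is a hypothetical helper and the service performs its own validation.

```python
import re

# First character: lowercase letter or underscore (not a digit).
# Remaining characters: up to 59 more from [a-z0-9_], 60 in total.
_ID_PATTERN = re.compile(r"[a-z_][a-z0-9_]{0,59}")


def is_valid_resource_id(resource_id):
    """Return True if the ID satisfies the documented naming rules."""
    return _ID_PATTERN.fullmatch(resource_id) is not None


for candidate in ("my_featurestore_id", "1st_store", "My-Store"):
    print(candidate, is_valid_resource_id(candidate))
```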

create_entity_type

create_entity_type(
    entity_type_id: str,
    description: Optional[str] = None,
    labels: Optional[Dict[str, str]] = None,
    request_metadata: Optional[Sequence[Tuple[str, str]]] = (),
    sync: bool = True,
    create_request_timeout: Optional[float] = None,
)

Creates an EntityType resource in this Featurestore.

Example Usage:

my_featurestore = aiplatform.Featurestore.create(
    featurestore_id='my_featurestore_id'
)
my_entity_type = my_featurestore.create_entity_type(
    entity_type_id='my_entity_type_id',
)
Parameters
Name    Description
entity_type_id str

Required. The ID to use for the EntityType, which will become the final component of the EntityType's resource name. This value may be up to 60 characters, and valid characters are [a-z0-9_]. The first character cannot be a number. The value must be unique within a featurestore.

description str

Optional. Description of the EntityType.

labels Dict[str, str]

Optional. The labels with user-defined metadata to organize your EntityTypes. Label keys and values can be no longer than 64 characters (Unicode codepoints) and can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. See https://goo.gl/xmQnxf for more information on and examples of labels. No more than 64 user labels can be associated with one EntityType (System labels are excluded). System reserved label keys are prefixed with "aiplatform.googleapis.com/" and are immutable.

request_metadata Sequence[Tuple[str, str]]

Optional. Strings which should be sent along with the request as metadata.

sync bool

Optional. Whether to execute this creation synchronously. If False, this method will be executed in concurrent Future and any downstream object will be immediately returned and synced when the Future has completed.

create_request_timeout float

Optional. The timeout for the create request in seconds.

delete

delete(sync: bool = True, force: bool = False)

Deletes this Featurestore resource. If force is set to True, all entityTypes in this Featurestore will be deleted prior to featurestore deletion, and all features in each entityType will be deleted prior to each entityType deletion.

WARNING: This deletion is permanent.

Parameters
Name    Description
force bool

If set to true, any EntityTypes and Features for this Featurestore will also be deleted. (Otherwise, the request will only work if the Featurestore has no EntityTypes.)

sync bool

Whether to execute this deletion synchronously. If False, this method will be executed in concurrent Future and any downstream object will be immediately returned and synced when the Future has completed.

delete_entity_types

delete_entity_types(
    entity_type_ids: List[str], sync: bool = True, force: bool = False
)

Deletes entity_type resources in this Featurestore given their entity_type IDs. WARNING: This deletion is permanent.

Parameters
Name    Description
entity_type_ids List[str]

Required. The list of entity_type IDs to be deleted.

sync bool

Optional. Whether to execute this deletion synchronously. If False, this method will be executed in concurrent Future and any downstream object will be immediately returned and synced when the Future has completed.

force bool

Optional. If force is set to True, all features in each entityType will be deleted prior to entityType deletion. Default is False.

get_entity_type

get_entity_type(entity_type_id: str)

Retrieves an existing managed entityType in this Featurestore.

Parameter
Name    Description
entity_type_id str

Required. The managed entityType resource ID in this Featurestore.

list

list(
    filter: Optional[str] = None,
    order_by: Optional[str] = None,
    project: Optional[str] = None,
    location: Optional[str] = None,
    credentials: Optional[google.auth.credentials.Credentials] = None,
    parent: Optional[str] = None,
)

List all instances of this Vertex AI Resource.

Example Usage:

aiplatform.BatchPredictionJob.list(
    filter='state="JOB_STATE_SUCCEEDED" AND display_name="my_job"',
)

aiplatform.Model.list(order_by="create_time desc, display_name")

Parameters
Name    Description
filter str

Optional. An expression for filtering the results of the request. For field names both snake_case and camelCase are supported.

order_by str

Optional. A comma-separated list of fields to order by, sorted in ascending order. Use "desc" after a field name for descending. Supported fields: display_name, create_time, update_time

project str

Optional. Project to retrieve list from. If not set, project set in aiplatform.init will be used.

location str

Optional. Location to retrieve list from. If not set, location set in aiplatform.init will be used.

credentials auth_credentials.Credentials

Optional. Custom credentials to use to retrieve list. Overrides credentials set in aiplatform.init.

parent str

Optional. The parent resource name if any to retrieve list from.

list_entity_types

list_entity_types(filter: Optional[str] = None, order_by: Optional[str] = None)

Lists existing managed entityType resources in this Featurestore.

Example Usage:

my_featurestore = aiplatform.Featurestore(
    featurestore_name='my_featurestore_id',
)
my_featurestore.list_entity_types()
Parameters
Name    Description
filter str

Optional. Lists the EntityTypes that match the filter expression. The following filters are supported:
- create_time: Supports =, !=, <, >, >=, and <= comparisons. Values must be in RFC 3339 format.
- update_time: Supports =, !=, <, >, >=, and <= comparisons. Values must be in RFC 3339 format.
- labels: Supports key-value equality as well as key presence.

Examples:
- create_time > "2020-01-31T15:30:00.000000Z" OR update_time > "2020-01-31T15:30:00.000000Z" --> EntityTypes created or updated after 2020-01-31T15:30:00.000000Z.
- labels.active = yes AND labels.env = prod --> EntityTypes having both (active: yes) and (env: prod) labels.
- labels.env: * --> Any EntityType which has a label with 'env' as the key.

order_by str

Optional. A comma-separated list of fields to order by, sorted in ascending order. Use "desc" after a field name for descending. Supported fields:
- entity_type_id
- create_time
- update_time
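
Filter expressions like the examples above are plain strings, so they can be assembled programmatically. The helpers below are a hypothetical sketch (labels_filter and changed_after are not SDK functions) that only reproduces the syntax shown in the filter description.

```python
def labels_filter(labels):
    """Build a key-value equality filter that requires every given label."""
    return " AND ".join(f"labels.{key} = {value}" for key, value in labels.items())


def changed_after(timestamp):
    """Build a filter for EntityTypes created or updated after an RFC 3339 time."""
    return f'create_time > "{timestamp}" OR update_time > "{timestamp}"'


label_expr = labels_filter({"active": "yes", "env": "prod"})
time_expr = changed_after("2020-01-31T15:30:00.000000Z")
print(label_expr)
print(time_expr)
```

Either string could then be passed as the filter argument of list_entity_types, optionally together with an order_by value such as "create_time desc".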

to_dict

to_dict()

Returns the resource proto as a dictionary.

update

update(
    labels: Optional[Dict[str, str]] = None,
    request_metadata: Optional[Sequence[Tuple[str, str]]] = (),
    update_request_timeout: Optional[float] = None,
)

Updates an existing managed featurestore resource.

Example Usage:

my_featurestore = aiplatform.Featurestore(
    featurestore_name='my_featurestore_id',
)
my_featurestore.update(
    labels={'my_key': 'my_updated_value'},
)
Parameters
Name    Description
labels Dict[str, str]

Optional. The labels with user-defined metadata to organize your Featurestores. Label keys and values can be no longer than 64 characters (Unicode codepoints) and can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. See https://goo.gl/xmQnxf for more information on and examples of labels. No more than 64 user labels can be associated with one Featurestore (System labels are excluded). System reserved label keys are prefixed with "aiplatform.googleapis.com/" and are immutable.

request_metadata Sequence[Tuple[str, str]]

Optional. Strings which should be sent along with the request as metadata.

update_request_timeout float

Optional. The timeout for the update request in seconds.
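
The label constraints described for the labels parameter can be approximated with a local check before calling update. This is a rough sketch, not the service's validation logic: find_label_problems is a hypothetical helper, and treating "international characters" as any non-uppercase alphanumeric Unicode character is an assumption.

```python
MAX_LABELS = 64
MAX_LABEL_LENGTH = 64  # measured in Unicode code points, as len() does
SYSTEM_PREFIX = "aiplatform.googleapis.com/"


def _is_valid_label_text(text):
    """Allow letters and digits that are not uppercase, plus '_' and '-'."""
    if len(text) > MAX_LABEL_LENGTH:
        return False
    return all((ch.isalnum() or ch in "_-") and not ch.isupper() for ch in text)


def find_label_problems(labels):
    """Return a list of human-readable problems; an empty list means none found."""
    # System labels are excluded from the user-label limit, so skip them here.
    user_labels = {k: v for k, v in labels.items() if not k.startswith(SYSTEM_PREFIX)}
    problems = []
    if len(user_labels) > MAX_LABELS:
        problems.append(f"too many user labels: {len(user_labels)}")
    for key, value in user_labels.items():
        if not _is_valid_label_text(key):
            problems.append(f"invalid key: {key!r}")
        if not _is_valid_label_text(value):
            problems.append(f"invalid value: {value!r}")
    return problems


print(find_label_problems({"env": "prod", "Team": "search"}))
```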

update_online_store

update_online_store(
    fixed_node_count: int,
    request_metadata: Optional[Sequence[Tuple[str, str]]] = (),
    update_request_timeout: Optional[float] = None,
)

Updates the online store of an existing managed featurestore resource.

Example Usage:

my_featurestore = aiplatform.Featurestore(
    featurestore_name='my_featurestore_id',
)
my_featurestore.update_online_store(
    fixed_node_count=2,
)
Parameters
Name    Description
fixed_node_count int

Required. Config for online serving resources, can only update the node count to >= 1.

request_metadata Sequence[Tuple[str, str]]

Optional. Strings which should be sent along with the request as metadata.

update_request_timeout float

Optional. The timeout for the update request in seconds.

wait

wait()

Helper method that blocks until all futures are complete.
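
The relationship between sync=False and wait() resembles the usual submit-then-wait pattern for futures. The snippet below is only an analogy built on Python's standard concurrent.futures module; it does not show the SDK's internal implementation, and slow_operation is a stand-in for a long-running request.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def slow_operation(name):
    """Stand-in for a long-running request such as a create or update."""
    time.sleep(0.05)
    return f"{name} done"


with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() returns immediately, much like a call made with sync=False.
    futures = [pool.submit(slow_operation, name) for name in ("create", "update")]
    # wait() blocks until every submitted future has completed.
    done, not_done = wait(futures)

results = sorted(future.result() for future in done)
print(results)
```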