This page shows you the steps to create and manage AML AI datasets. A dataset is used as an input for the engine configuration, training, backtest, and prediction pipelines. An AML AI dataset contains references to BigQuery tables matching the AML AI input data model in a Google Cloud project.
Prerequisites
- To get the permissions that you need to create and manage datasets, ask your administrator to grant you the Financial Services Admin (financialservices.admin) IAM role on your project. For more information about granting roles, see Manage access to projects, folders, and organizations. You might also be able to get the required permissions through custom roles or other predefined roles.
- Create an instance
- Some API methods return a long-running operation (LRO). These methods are asynchronous and return an Operation object; for details, see REST Reference. The operation might not be completed when the method returns a response. For these methods, send the request and then check for the result. In general, all POST, PUT, UPDATE, and DELETE operations are long-running.
Create a dataset
To create a dataset, send the create request and then check for the result of the LRO.
Send the request
To create a dataset, use the projects.locations.instances.datasets.create method.
Permissions required for this task
To perform this task, you must have been granted the following permissions:
Permissions
financialservices.v1datasets.create
Before using any of the request data, make the following replacements:
- PROJECT_ID: your Google Cloud project ID listed in the IAM Settings
- LOCATION: the location of the instance; use one of the supported regions:
  - us-central1
  - us-east1
  - asia-south1
  - europe-west1
  - europe-west2
  - europe-west4
  - northamerica-northeast1
  - southamerica-east1
  - australia-southeast1
- INSTANCE_ID: the user-defined identifier for the instance
- DATASET_ID: a user-defined identifier for the AML AI dataset; use only lowercase letters, numbers, dashes, and underscores (for example, train_jan2018_apr2020)
- BQ_INPUT_DATASET_NAME: the BigQuery input dataset name
- PARTY_TABLE: the Party table in the BigQuery input dataset
- ACCOUNT_PARTY_LINK_TABLE: the AccountPartyLink table in the BigQuery input dataset
- TRANSACTION_TABLE: the Transaction table in the BigQuery input dataset
- RISK_CASE_EVENT_TABLE: the RiskCaseEvent table in the BigQuery input dataset
- PARTY_SUPPLEMENTARY_DATA: the PartySupplementaryData table in the BigQuery input dataset; this table is optional and can be removed from the request JSON
- DATA_START_DATE: the start date and time of the data to use in the dataset; use RFC3339 UTC "Zulu" format (for example, 2014-10-02T15:01:23Z)
- DATA_END_DATE: the end date and time of the data to use in the dataset; use RFC3339 UTC "Zulu" format (for example, 2014-10-02T15:01:23Z)
Request JSON body:
{
  "tableSpecs": {
    "party": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.PARTY_TABLE",
    "account_party_link": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.ACCOUNT_PARTY_LINK_TABLE",
    "transaction": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.TRANSACTION_TABLE",
    "risk_case_event": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.RISK_CASE_EVENT_TABLE",
    "party_supplementary_data": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.PARTY_SUPPLEMENTARY_DATA"
  },
  "dateRange": {
    "startTime": "DATA_START_DATE",
    "endTime": "DATA_END_DATE"
  },
  "timeZone": {
    "id": "UTC"
  }
}
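The request body can also be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper names (validate_dataset_id, validate_zulu_timestamp, build_dataset_body) and the sample project, dataset, and table names are hypothetical. The ID check encodes only the character rule stated above and does not model any other limit the API might enforce, and the timestamp check accepts only the shape shown in the examples (no fractional seconds).

```python
import json
import re
from datetime import datetime

# Character rule taken from the DATASET_ID description above; any additional
# constraints enforced by the API (for example, length) are not modeled here.
_DATASET_ID_RE = re.compile(r"[a-z0-9_-]+")


def validate_dataset_id(dataset_id: str) -> str:
    """Return dataset_id unchanged, or raise ValueError on other characters."""
    if not _DATASET_ID_RE.fullmatch(dataset_id):
        raise ValueError(f"invalid dataset ID: {dataset_id!r}")
    return dataset_id


def validate_zulu_timestamp(value: str) -> str:
    """Accept timestamps shaped like 2014-10-02T15:01:23Z."""
    datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")  # raises ValueError on mismatch
    return value


def build_dataset_body(project_id, bq_dataset, tables, start, end):
    """Build the create-request body; `tables` maps tableSpecs keys to table names."""
    return {
        "tableSpecs": {
            key: f"bq://{project_id}.{bq_dataset}.{table}"
            for key, table in tables.items()
        },
        "dateRange": {
            "startTime": validate_zulu_timestamp(start),
            "endTime": validate_zulu_timestamp(end),
        },
        "timeZone": {"id": "UTC"},
    }


body = build_dataset_body(
    "my-project",  # hypothetical project ID
    "aml_input",   # hypothetical BigQuery dataset name
    {
        "party": "party",
        "account_party_link": "account_party_link",
        "transaction": "transaction",
        "risk_case_event": "risk_case_event",
        # "party_supplementary_data" is optional and omitted here.
    },
    "2018-01-01T00:00:00Z",
    "2020-04-30T00:00:00Z",
)
print(json.dumps(body, indent=2))
```

Writing the printed JSON to request.json gives a file that can be used with the commands in this section.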
To send your request, choose one of these options:
curl:
Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:
cat > request.json << 'EOF'
{
  "tableSpecs": {
    "party": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.PARTY_TABLE",
    "account_party_link": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.ACCOUNT_PARTY_LINK_TABLE",
    "transaction": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.TRANSACTION_TABLE",
    "risk_case_event": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.RISK_CASE_EVENT_TABLE",
    "party_supplementary_data": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.PARTY_SUPPLEMENTARY_DATA"
  },
  "dateRange": {
    "startTime": "DATA_START_DATE",
    "endTime": "DATA_END_DATE"
  },
  "timeZone": {
    "id": "UTC"
  }
}
EOF
Then execute the following command to send your REST request:
curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://financialservices.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID/datasets?dataset_id=DATASET_ID"
PowerShell:
Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:
@'
{
  "tableSpecs": {
    "party": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.PARTY_TABLE",
    "account_party_link": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.ACCOUNT_PARTY_LINK_TABLE",
    "transaction": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.TRANSACTION_TABLE",
    "risk_case_event": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.RISK_CASE_EVENT_TABLE",
    "party_supplementary_data": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.PARTY_SUPPLEMENTARY_DATA"
  },
  "dateRange": {
    "startTime": "DATA_START_DATE",
    "endTime": "DATA_END_DATE"
  },
  "timeZone": {
    "id": "UTC"
  }
}
'@ | Out-File -FilePath request.json -Encoding utf8
Then execute the following command to send your REST request:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
  -Method POST `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -InFile request.json `
  -Uri "https://financialservices.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID/datasets?dataset_id=DATASET_ID" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.financialservices.v1.OperationMetadata",
    "createTime": CREATE_TIME,
    "target": "projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID/datasets/DATASET_ID",
    "verb": "create",
    "requestedCancellation": false,
    "apiVersion": "v1"
  },
  "done": false
}
Copy the returned OPERATION_ID to use in the next section.
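If you parse the response in a script, the operation ID is the last segment of the returned name field. The following Python sketch shows one way to extract it; the function name and the sample values are hypothetical, and the check assumes the name has exactly the projects/.../locations/.../operations/... shape shown in the response above.

```python
def operation_id_from_name(operation_name: str) -> str:
    """Return the OPERATION_ID segment of an operation resource name."""
    parts = operation_name.strip("/").split("/")
    # Expected shape: projects/<project>/locations/<location>/operations/<id>
    expected = ("projects", "locations", "operations")
    if len(parts) != 6 or tuple(parts[0::2]) != expected:
        raise ValueError(f"unexpected operation name: {operation_name!r}")
    return parts[5]


# Hypothetical parsed response from the create request.
response = {
    "name": "projects/my-project/locations/us-central1/operations/1234567890",
    "done": False,
}
print(operation_id_from_name(response["name"]))  # prints 1234567890
```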
Check for the result
Use the projects.locations.operations.get method to check if the dataset has been created. If the response contains "done": false, repeat the command until the response contains "done": true.
These operations can take a few minutes to several hours to complete.
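Because an operation can run for hours, repeating the command by hand is impractical in automation. The following Python sketch polls with an increasing delay until "done" is true. It is an illustration only: wait_for_operation and its parameters are hypothetical names, the delays and timeout are arbitrary defaults rather than recommended values, and get_operation stands for whatever code performs the operations.get call and returns the parsed JSON.

```python
import time
from typing import Callable


def wait_for_operation(
    get_operation: Callable[[], dict],
    initial_delay: float = 30.0,
    max_delay: float = 600.0,
    timeout: float = 6 * 60 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call get_operation() until the result has "done": true, then return it."""
    delay = initial_delay
    waited = 0.0
    while True:
        operation = get_operation()
        if operation.get("done"):
            return operation
        if waited >= timeout:
            raise TimeoutError(f"operation not done after {waited:.0f} s")
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay)  # double the delay, up to max_delay


# Demonstration with canned responses instead of real API calls.
responses = iter([{"done": False}, {"done": False}, {"done": True}])
slept = []
result = wait_for_operation(lambda: next(responses), sleep=slept.append)
print(result, slept)  # prints {'done': True} [30.0, 60.0]
```

Passing sleep as a parameter keeps the loop testable without waiting; in real use, leave it at the default.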
Permissions required for this task
To perform this task, you must have been granted the following permissions:
Permissions
financialservices.operations.get
Before using any of the request data, make the following replacements:
- PROJECT_ID: your Google Cloud project ID listed in the IAM Settings
- LOCATION: the location of the instance; use one of the supported regions:
  - us-central1
  - us-east1
  - asia-south1
  - europe-west1
  - europe-west2
  - europe-west4
  - northamerica-northeast1
  - southamerica-east1
  - australia-southeast1
- OPERATION_ID: the identifier for the operation
To send your request, choose one of these options:
curl:
Execute the following command:
curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://financialservices.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID"
PowerShell:
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
  -Method GET `
  -Headers $headers `
  -Uri "https://financialservices.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.financialservices.v1.OperationMetadata",
    "createTime": "2023-03-14T15:52:55.358979323Z",
    "endTime": "2023-03-14T16:52:55.358979323Z",
    "target": "projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID",
    "verb": "create",
    "requestedCancellation": false,
    "apiVersion": "v1"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.financialservices.v1.Dataset",
    "name": "projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID/datasets/DATASET_ID",
    "createTime": CREATE_TIME,
    "updateTime": UPDATE_TIME,
    "tableSpecs": {
      "party": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.PARTY_TABLE",
      "account_party_link": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.ACCOUNT_PARTY_LINK_TABLE",
      "transaction": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.TRANSACTION_TABLE",
      "risk_case_event": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.RISK_CASE_EVENT_TABLE",
      "party_supplementary_data": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.PARTY_SUPPLEMENTARY_DATA"
    },
    "state": "ACTIVE",
    "dateRange": {
      "start_time": "DATA_START_DATE",
      "end_time": "DATA_END_DATE"
    },
    "timeZone": {
      "id": "UTC"
    }
  }
}
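A finished operation does not necessarily mean a successful one. The sample above shows only the success case, where the dataset appears under response. The following Python sketch assumes the standard long-running Operation shape, in which a completed operation carries either a response field or an error field with code and message. The function name unwrap_operation and the sample dictionaries are hypothetical.

```python
def unwrap_operation(operation: dict) -> dict:
    """Return the `response` of a finished operation, or raise if it failed."""
    if not operation.get("done"):
        raise RuntimeError("operation is still running")
    if "error" in operation:
        # Assumed shape: {"code": <int>, "message": <str>, ...}
        error = operation["error"]
        raise RuntimeError(
            f"operation failed: code={error.get('code')} message={error.get('message')}"
        )
    return operation["response"]


# Hypothetical finished operations, trimmed to the fields used here.
succeeded = {
    "done": True,
    "response": {
        "name": "projects/my-project/locations/us-central1/instances/my-instance/datasets/my-dataset",
        "state": "ACTIVE",
    },
}
failed = {"done": True, "error": {"code": 3, "message": "example failure"}}

print(unwrap_operation(succeeded)["state"])  # prints ACTIVE
```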
Get a dataset
To get a dataset, use the projects.locations.instances.datasets.get method.
Permissions required for this task
To perform this task, you must have been granted the following permissions:
Permissions
financialservices.v1datasets.get
Before using any of the request data, make the following replacements:
- PROJECT_ID: your Google Cloud project ID listed in the IAM Settings
- LOCATION: the location of the instance; use one of the supported regions:
  - us-central1
  - us-east1
  - asia-south1
  - europe-west1
  - europe-west2
  - europe-west4
  - northamerica-northeast1
  - southamerica-east1
  - australia-southeast1
- INSTANCE_ID: the user-defined identifier for the instance
- DATASET_ID: the user-defined identifier for the dataset
To send your request, choose one of these options:
curl:
Execute the following command:
curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://financialservices.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID/datasets/DATASET_ID"
PowerShell:
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
  -Method GET `
  -Headers $headers `
  -Uri "https://financialservices.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID/datasets/DATASET_ID" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID/datasets/DATASET_ID",
  "createTime": CREATE_TIME,
  "updateTime": UPDATE_TIME,
  "tableSpecs": {
    "party": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.PARTY_TABLE",
    "account_party_link": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.ACCOUNT_PARTY_LINK_TABLE",
    "transaction": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.TRANSACTION_TABLE",
    "risk_case_event": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.RISK_CASE_EVENT_TABLE",
    "party_supplementary_data": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.PARTY_SUPPLEMENTARY_DATA"
  },
  "state": "ACTIVE",
  "dateRange": {
    "start_time": "DATA_START_DATE",
    "end_time": "DATA_END_DATE"
  },
  "timeZone": {
    "id": "UTC"
  }
}
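The same GET request can be prepared from a script. The following Python sketch uses only the standard library and builds the request without sending it. The helper names and sample identifiers are hypothetical, and the access token is assumed to come from elsewhere, for example the output of gcloud auth print-access-token.

```python
import urllib.request

API_ROOT = "https://financialservices.googleapis.com/v1"


def dataset_name(project_id, location, instance_id, dataset_id):
    """Return the dataset resource name used in the request URL."""
    return (
        f"projects/{project_id}/locations/{location}"
        f"/instances/{instance_id}/datasets/{dataset_id}"
    )


def get_dataset_request(name: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for one dataset."""
    return urllib.request.Request(
        f"{API_ROOT}/{name}",
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


request = get_dataset_request(
    dataset_name("my-project", "us-central1", "my-instance", "my-dataset"),
    "ACCESS_TOKEN",  # placeholder, not a real token
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to urllib.request.urlopen and parse the body with json.load.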
Update a dataset
To update a dataset, use the projects.locations.instances.datasets.patch method.
In AML AI, labels are the only dataset fields that can be updated. The following example updates the key-value pair user labels associated with the dataset.
Permissions required for this task
To perform this task, you must have been granted the following permissions:
Permissions
financialservices.v1datasets.update
Before using any of the request data, make the following replacements:
- PROJECT_ID: your Google Cloud project ID listed in the IAM Settings
- LOCATION: the location of the instance; use one of the supported regions:
  - us-central1
  - us-east1
  - asia-south1
  - europe-west1
  - europe-west2
  - europe-west4
  - northamerica-northeast1
  - southamerica-east1
  - australia-southeast1
- INSTANCE_ID: the user-defined identifier for the instance
- DATASET_ID: the user-defined identifier for the dataset
- KEY: the key in a key-value pair used to organize datasets. See labels for more information.
- VALUE: the value in a key-value pair used to organize datasets. See labels for more information.
Request JSON body:
{
  "labels": {
    "KEY": "VALUE"
  }
}
To send your request, choose one of these options:
curl:
Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:
cat > request.json << 'EOF'
{
  "labels": {
    "KEY": "VALUE"
  }
}
EOF
Then execute the following command to send your REST request:
curl -X PATCH \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://financialservices.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID/datasets/DATASET_ID?updateMask=labels"
PowerShell:
Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:
@'
{
  "labels": {
    "KEY": "VALUE"
  }
}
'@ | Out-File -FilePath request.json -Encoding utf8
Then execute the following command to send your REST request:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
  -Method PATCH `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -InFile request.json `
  -Uri "https://financialservices.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID/datasets/DATASET_ID?updateMask=labels" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.financialservices.v1.OperationMetadata",
    "createTime": CREATE_TIME,
    "target": "projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID/datasets/DATASET_ID",
    "verb": "update",
    "requestedCancellation": false,
    "apiVersion": "v1"
  },
  "done": false
}
For more information on how to get the result of the long-running operation (LRO), see Check for the result.
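The label update pairs a JSON body with the updateMask=labels query parameter; both parts are needed in the request shown above. The following Python sketch builds the equivalent PATCH request with the standard library without sending it. The function name, the sample URL values, and the env/test label are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def patch_labels_request(dataset_url: str, labels: dict, access_token: str):
    """Build (but do not send) a PATCH request that updates only the labels."""
    query = urllib.parse.urlencode({"updateMask": "labels"})
    return urllib.request.Request(
        f"{dataset_url}?{query}",
        data=json.dumps({"labels": labels}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="PATCH",
    )


request = patch_labels_request(
    "https://financialservices.googleapis.com/v1/projects/my-project"
    "/locations/us-central1/instances/my-instance/datasets/my-dataset",
    {"env": "test"},  # hypothetical KEY and VALUE
    "ACCESS_TOKEN",   # placeholder, not a real token
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

The response to this request is an operation, so the result still has to be checked as described in Check for the result.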
List the datasets
To list the datasets for a given instance, use the projects.locations.instances.datasets.list method.
Permissions required for this task
To perform this task, you must have been granted the following permissions:
Permissions
financialservices.v1datasets.list
Before using any of the request data, make the following replacements:
- PROJECT_ID: your Google Cloud project ID listed in the IAM Settings
- LOCATION: the location of the instance; use one of the supported regions:
  - us-central1
  - us-east1
  - asia-south1
  - europe-west1
  - europe-west2
  - europe-west4
  - northamerica-northeast1
  - southamerica-east1
  - australia-southeast1
- INSTANCE_ID: the user-defined identifier for the instance
To send your request, choose one of these options:
curl:
Execute the following command:
curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://financialservices.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID/datasets"
PowerShell:
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
  -Method GET `
  -Headers $headers `
  -Uri "https://financialservices.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID/datasets" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{
  "datasets": [
    {
      "name": "projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID/datasets/DATASET_ID",
      "createTime": CREATE_TIME,
      "updateTime": UPDATE_TIME,
      "tableSpecs": {
        "party": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.PARTY_TABLE",
        "account_party_link": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.ACCOUNT_PARTY_LINK_TABLE",
        "transaction": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.TRANSACTION_TABLE",
        "risk_case_event": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.RISK_CASE_EVENT_TABLE",
        "party_supplementary_data": "bq://PROJECT_ID.BQ_INPUT_DATASET_NAME.PARTY_SUPPLEMENTARY_DATA"
      },
      "state": "ACTIVE",
      "dateRange": {
        "start_time": "DATA_START_DATE",
        "end_time": "DATA_END_DATE"
      },
      "timeZone": {
        "id": "UTC"
      }
    }
  ]
}
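An instance with many datasets might return its list in several pages. The sample response above shows no paging fields, so the following Python sketch rests on an assumption: that the list method follows the common Google API convention of returning a nextPageToken that is passed back as a page token on the next call. Check the REST reference before relying on it. The names iter_datasets and fetch_page and the canned pages are hypothetical; fetch_page stands for the code that performs the list request.

```python
from typing import Callable, Iterator, Optional


def iter_datasets(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every dataset, following nextPageToken (assumed field) across pages."""
    token = None
    while True:
        page = fetch_page(token)
        yield from page.get("datasets", [])
        token = page.get("nextPageToken")  # assumed paging field
        if not token:
            return


# Demonstration with canned pages instead of real API calls.
pages = {
    None: {
        "datasets": [{"name": "datasets/a", "state": "ACTIVE"}],
        "nextPageToken": "t1",
    },
    "t1": {"datasets": [{"name": "datasets/b", "state": "ACTIVE"}]},
}
names = [dataset["name"] for dataset in iter_datasets(lambda token: pages[token])]
print(names)  # prints ['datasets/a', 'datasets/b']
```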
Delete a dataset
To delete a dataset, use the projects.locations.instances.datasets.delete method.
Permissions required for this task
To perform this task, you must have been granted the following permissions:
Permissions
financialservices.v1datasets.delete
Before using any of the request data, make the following replacements:
- PROJECT_ID: your Google Cloud project ID listed in the IAM Settings
- LOCATION: the location of the instance; use one of the supported regions:
  - us-central1
  - us-east1
  - asia-south1
  - europe-west1
  - europe-west2
  - europe-west4
  - northamerica-northeast1
  - southamerica-east1
  - australia-southeast1
- INSTANCE_ID: the user-defined identifier for the instance
- DATASET_ID: the user-defined identifier for the dataset
To send your request, choose one of these options:
curl:
Execute the following command:
curl -X DELETE \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://financialservices.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID/datasets/DATASET_ID"
PowerShell:
Execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
  -Method DELETE `
  -Headers $headers `
  -Uri "https://financialservices.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID/datasets/DATASET_ID" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.financialservices.v1.OperationMetadata",
    "createTime": CREATE_TIME,
    "target": "projects/PROJECT_ID/locations/LOCATION/instances/INSTANCE_ID/datasets/DATASET_ID",
    "verb": "delete",
    "requestedCancellation": false,
    "apiVersion": "v1"
  },
  "done": false
}
For more information on how to get the result of the long-running operation (LRO), see Check for the result.