AML output data model

This page describes the AML output data model. AML outputs are sent to BigQuery.

Prediction outputs

Prediction outputs include risk scores and explainability and are generated when you create a PredictionResult resource. For more information, see Understand prediction outputs.

Risk scores

Risk scores are written to the BigQuery table specified in the outputs.predictionDestination field.

Column Type Description
party_id STRING Unique party ID string
risk_period_end_time TIMESTAMP The end of the target period, in the timezone of the dataset
risk_score FLOAT64 Prediction value between 0 and 1. A higher score means higher risk.
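
As an illustration of how rows with this schema might be consumed, the following sketch models a risk score row in Python and selects the highest-scoring parties for one risk period. The RiskScoreRow class, the top_risk_parties helper, and the sample values are hypothetical and not part of AML AI. In practice the rows would come from a query against the table named in outputs.predictionDestination.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RiskScoreRow:
    """One row of the risk score table (hypothetical in-memory model)."""
    party_id: str
    risk_period_end_time: datetime
    risk_score: float  # between 0 and 1; higher means higher risk


def top_risk_parties(rows, period_end, limit):
    """Return up to `limit` rows for `period_end`, highest risk_score first."""
    in_period = [r for r in rows if r.risk_period_end_time == period_end]
    return sorted(in_period, key=lambda r: r.risk_score, reverse=True)[:limit]


# Made-up sample data for illustration only.
period = datetime(2024, 1, 1, tzinfo=timezone.utc)
rows = [
    RiskScoreRow("party-a", period, 0.12),
    RiskScoreRow("party-b", period, 0.91),
    RiskScoreRow("party-c", period, 0.47),
]
for row in top_risk_parties(rows, period, limit=2):
    print(row.party_id, row.risk_score)
```

Sorting in application code keeps the sketch self-contained. With real data, the same ordering could be expressed in the query itself.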

Explainability

Explainability is written to the BigQuery table specified in the outputs.explainabilityDestination field.

Column Type Description
party_id STRING Unique party ID string
risk_period_end_time TIMESTAMP The end of the target period, in the timezone of the dataset
attributions STRUCT (repeated) Record of feature families and their attribution value
attributions.feature STRING Name of feature family
attributions.attribution FLOAT64 Feature family's attribution score
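
Because attributions is a repeated STRUCT, each row carries a list of feature family and attribution pairs. The sketch below shows one way to rank them for a single party. It assumes a row has already been loaded as a Python dictionary. The top_attributions helper, the example_feature_family name, and all attribution values are made up for illustration, and ranking by absolute value is a choice made for this sketch rather than documented behavior.

```python
def top_attributions(row, k):
    """Return the k feature families with the largest absolute attribution."""
    ranked = sorted(
        row["attributions"],
        key=lambda a: abs(a["attribution"]),
        reverse=True,
    )
    return [(a["feature"], a["attribution"]) for a in ranked[:k]]


# Hypothetical explainability row; the values are not real model output.
explain_row = {
    "party_id": "party-b",
    "risk_period_end_time": "2024-01-01T00:00:00Z",
    "attributions": [
        {"feature": "unusual_wire_credit_activity", "attribution": 0.31},
        {"feature": "party_supplementary_data_id_3", "attribution": 0.05},
        {"feature": "example_feature_family", "attribution": 0.12},
    ],
}

for feature, attribution in top_attributions(explain_row, k=2):
    print(feature, attribution)
```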

Exported registered parties

The following information about registered parties is exported from an instance to the BigQuery table specified in the dataset field.

Column Type Description
party_id STRING Unique identifier of the party in the instance's datasets
party_size STRING Specifies the tier for commercial customers (large versus small). This field does not apply to retail customers.
  • NULL for all retail customers
  • SMALL for small commercial parties with fewer than 500 average monthly transactions
  • LARGE for large commercial parties with 500 or more average monthly transactions

All values are case sensitive.

earliest_remove_time STRING The earliest time at which the party can be removed
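
The party_size tiers described above can be sketched as a small function. This is only an illustration of the documented boundary at 500 average monthly transactions. The party_size function name and its is_retail and avg_monthly_transactions parameters are hypothetical and not part of the AML AI API.

```python
from typing import Optional

# Boundary between SMALL and LARGE, taken from the tier descriptions above.
SMALL_LARGE_BOUNDARY = 500  # average monthly transactions


def party_size(is_retail: bool, avg_monthly_transactions: float) -> Optional[str]:
    """Return the party_size value for a party, or None (NULL) for retail."""
    if is_retail:
        return None
    if avg_monthly_transactions < SMALL_LARGE_BOUNDARY:
        return "SMALL"
    return "LARGE"


print(party_size(is_retail=False, avg_monthly_transactions=120))
print(party_size(is_retail=False, avg_monthly_transactions=500))
```

The returned strings are uppercase because the values are case sensitive.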

Exported metadata

Exported metadata varies based on the AML AI resource.

Engine config

The following metadata is output from an engine config.

Column Type Description
resource_type STRING Type of AML AI resource, such as an engine config or prediction results
resource_id STRING Name of the resource
name STRING Name of the metadata entry, such as a metric (see the following table)
value JSON Value of the metadata entry

Metric name Metric description Example metric value
ExpectedRecallPreTuning Recall metric measured on a test set when using default hyperparameters of the engine version.

This recall measurement assumes the number of investigations per month specified in partyInvestigationsPerPeriodHint.

{
  "recallValues": [
    {
      "partyInvestigationsPerPeriod": 5000,
      "recallValue": 0.72,
      "scoreThreshold": 0.42
    }
  ]
}
ExpectedRecallPostTuning Recall metric measured on a test set when using tuned hyperparameters.

This recall measurement assumes the number of investigations per month specified in partyInvestigationsPerPeriodHint.

{
  "recallValues": [
    {
      "partyInvestigationsPerPeriod": 5000,
      "recallValue": 0.80,
      "scoreThreshold": 0.43
    }
  ]
}
Missingness

Share of missing values across all features in each feature family.

Ideally, all AML AI feature families should have a Missingness value near 0. Exceptions may occur where the data underlying those feature families is unavailable for integration.

A significant change in this value for any feature family between tuning, training, evaluation, and prediction can indicate inconsistency in the datasets used.

{
  "featureFamilies": [
    {
      "featureFamily": "unusual_wire_credit_activity",
      "missingnessValue": 0.00
    },
    ...
    ...
    {
      "featureFamily": "party_supplementary_data_id_3",
      "missingnessValue": 0.45
    }
  ]
}
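
To show how the two recall metrics might be compared, the following sketch parses the value column for ExpectedRecallPreTuning and ExpectedRecallPostTuning and reports the difference in recall. It assumes the value column arrives as a JSON string and keeps only the name and value columns. The metadata_rows list and the first_recall_entry helper are hypothetical, and the numbers repeat the example metric values, written as strict JSON so that json.loads accepts them.

```python
import json

# Hypothetical metadata rows, reduced to the name and value columns.
metadata_rows = [
    {
        "name": "ExpectedRecallPreTuning",
        "value": '{"recallValues": [{"partyInvestigationsPerPeriod": 5000, '
                 '"recallValue": 0.72, "scoreThreshold": 0.42}]}',
    },
    {
        "name": "ExpectedRecallPostTuning",
        "value": '{"recallValues": [{"partyInvestigationsPerPeriod": 5000, '
                 '"recallValue": 0.80, "scoreThreshold": 0.43}]}',
    },
]


def first_recall_entry(rows, metric_name):
    """Return the first recallValues entry of the named metric."""
    for row in rows:
        if row["name"] == metric_name:
            return json.loads(row["value"])["recallValues"][0]
    raise KeyError(metric_name)


pre = first_recall_entry(metadata_rows, "ExpectedRecallPreTuning")
post = first_recall_entry(metadata_rows, "ExpectedRecallPostTuning")
recall_change = post["recallValue"] - pre["recallValue"]
print(f"Recall change after tuning: {recall_change:+.2f}")
```

Both entries refer to the same partyInvestigationsPerPeriod, so the two recall values are comparable.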

Model

The following metadata is output from a model.

Column Type Description
resource_type STRING Type of AML AI resource, such as an engine config or prediction results
resource_id STRING Name of the resource
name STRING Name of the metadata entry, such as a metric (see the following table)
value JSON Value of the metadata entry

Metric name Metric description Example metric value
Missingness

Share of missing values across all features in each feature family.

Ideally, all AML AI feature families should have a Missingness value near 0. Exceptions may occur where the data underlying those feature families is unavailable for integration.

A significant change in this value for any feature family between tuning, training, evaluation, and prediction can indicate inconsistency in the datasets used.

{
  "featureFamilies": [
    {
      "featureFamily": "unusual_wire_credit_activity",
      "missingnessValue": 0.00
    },
    ...
    ...
    {
      "featureFamily": "party_supplementary_data_id_3",
      "missingnessValue": 0.45
    }
  ]
}

Backtest results

The following metadata is output from backtest results.

Column Type Description
resource_type STRING Type of AML AI resource, such as an engine config or prediction results
resource_id STRING Name of the resource
name STRING Name of the metadata entry, such as a metric (see the following table)
value JSON Value of the metadata entry

Metric name Metric description Example metric value
ObservedRecallValues Recall metric measured on the dataset specified for backtesting. The API includes 20 of these measurements at different operating points, evenly distributed from 0 (exclusive) to 2 * partyInvestigationsPerPeriodHint. The API adds a final recall measurement at partyInvestigationsPerPeriodHint.
{
  "recallValues": [
    {
      "partyInvestigationsPerPeriod": 5000,
      "recallValue": 0.80,
      "scoreThreshold": 0.42
    },
    ...
    ...
    {
      "partyInvestigationsPerPeriod": 8000,
      "recallValue": 0.85,
      "scoreThreshold": 0.30
    }
  ]
}
Missingness

Share of missing values across all features in each feature family.

Ideally, all AML AI feature families should have a Missingness value near 0. Exceptions may occur where the data underlying those feature families is unavailable for integration.

A significant change in this value for any feature family between tuning, training, evaluation, and prediction can indicate inconsistency in the datasets used.

{
  "featureFamilies": [
    {
      "featureFamily": "unusual_wire_credit_activity",
      "missingnessValue": 0.00
    },
    ...
    ...
    {
      "featureFamily": "party_supplementary_data_id_3",
      "missingnessValue": 0.45
    }
  ]
}
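
Since ObservedRecallValues lists recall at several operating points, one possible use is to pick the point that fits an investigation budget. The sketch below is a hypothetical helper, not an AML AI function. It assumes the value column is available as a JSON string and uses only the two operating points shown in the example value; the elided points are left out.

```python
import json

# The two operating points from the example value, as strict JSON.
observed_recall_json = """
{
  "recallValues": [
    {"partyInvestigationsPerPeriod": 5000, "recallValue": 0.80, "scoreThreshold": 0.42},
    {"partyInvestigationsPerPeriod": 8000, "recallValue": 0.85, "scoreThreshold": 0.30}
  ]
}
"""


def best_point_within_budget(value_json, max_investigations):
    """Return the highest-recall operating point within the budget, or None."""
    points = json.loads(value_json)["recallValues"]
    affordable = [
        p for p in points
        if p["partyInvestigationsPerPeriod"] <= max_investigations
    ]
    if not affordable:
        return None
    return max(affordable, key=lambda p: p["recallValue"])


point = best_point_within_budget(observed_recall_json, max_investigations=6000)
print(point)
```

The scoreThreshold of the selected point is the risk score cutoff associated with that number of investigations in the backtest.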

Prediction results

The following metadata is output from prediction results.

Column Type Description
resource_type STRING Type of AML AI resource, such as an engine config or prediction results
resource_id STRING Name of the resource
name STRING Name of the metadata entry, such as a metric (see the following table)
value JSON Value of the metadata entry

Metric name Metric description Example metric value
Missingness

Share of missing values across all features in each feature family.

Ideally, all AML AI feature families should have a Missingness value near 0. Exceptions may occur where the data underlying those feature families is unavailable for integration.

A significant change in this value for any feature family between tuning, training, evaluation, and prediction can indicate inconsistency in the datasets used.

{
  "featureFamilies": [
    {
      "featureFamily": "unusual_wire_credit_activity",
      "missingnessValue": 0.00
    },
    ...
    ...
    {
      "featureFamily": "party_supplementary_data_id_3",
      "missingnessValue": 0.45
    }
  ]
}
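
Because the Missingness metric is exported for engine configs, models, backtest results, and prediction results, the values can be compared across resources to look for dataset inconsistencies. The following sketch flags feature families whose missingness differs between two exports. The helper names, the 0.1 tolerance, and the prediction-time values are assumptions made for illustration; this page does not define what counts as a significant change.

```python
import json


def missingness_by_family(value_json):
    """Map each feature family name to its missingnessValue."""
    families = json.loads(value_json)["featureFamilies"]
    return {f["featureFamily"]: f["missingnessValue"] for f in families}


def missingness_changes(reference_json, current_json, tolerance):
    """Return {family: delta} where |delta| exceeds the tolerance."""
    reference = missingness_by_family(reference_json)
    current = missingness_by_family(current_json)
    changes = {}
    # Compare only families present in both exports.
    for family in sorted(reference.keys() & current.keys()):
        delta = current[family] - reference[family]
        if abs(delta) > tolerance:
            changes[family] = delta
    return changes


# Model values repeat the example above; prediction values are made up.
model_value = json.dumps({"featureFamilies": [
    {"featureFamily": "unusual_wire_credit_activity", "missingnessValue": 0.00},
    {"featureFamily": "party_supplementary_data_id_3", "missingnessValue": 0.45},
]})
prediction_value = json.dumps({"featureFamilies": [
    {"featureFamily": "unusual_wire_credit_activity", "missingnessValue": 0.20},
    {"featureFamily": "party_supplementary_data_id_3", "missingnessValue": 0.45},
]})

changes = missingness_changes(model_value, prediction_value, tolerance=0.1)
print(changes)
```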