This page briefly covers concepts behind configuring an engine.
Supported sources for hyperparameters
When configuring an engine, you can select the source of the hyperparameters that are used to create a model. The following sources are supported:
- Automatic tuning: AML AI tunes hyperparameters when you create an EngineConfig resource (default behavior)
- Inherit: Inherit hyperparameters from a previous engine config that was created with an earlier engine version within the same tuning version. This setting lets you avoid re-tuning each time you adopt a new model engine version.
When to tune or inherit
The following sections outline when you should select automatic tuning and when you should inherit hyperparameters from a previous engine config.
When to tune
You can tune each new engine config and, when in doubt, you should tune: it gives the best performance outcomes. For more information, see the section How to tune an engine.
For best performance, you should consider engine tuning when any of the following occur:
- You make significant changes to dataset logic. For example, when any of the following change:
  - The logic by which fields are populated
  - The selection of RECOMMENDED fields that are populated
  - The logic or selection of data provided in the PartySupplementaryData table
- You're about to have an engine train a model for a new region.
When to inherit hyperparameters
To save time and costs when adopting a new engine version, you can inherit hyperparameters from a previous engine using the same tuning version. See section How to adopt an engine version without re-tuning.
Engine versions with tuning version v003, and engine versions released before 2024-02-22, don't support inheriting hyperparameters, but these versions can be used as a source of hyperparameters.
How to tune an engine
To trigger tuning, see Create and manage engine configs.
In particular, you need to select the following:
- The data to use for engine tuning: Specify a dataset and an end time within the date range of the dataset. Engine tuning uses labels and features based on complete calendar months up to, but not including, the month of the selected end time. For more information, see Dataset time ranges.
- The engine version to use for engine tuning: Select an engine version that matches the line of business (retail or commercial) that you will use the associated models for.
- The volume of investigations you expect based on the models: Specify `partyInvestigationsPerPeriodHint`. This value is used by engine tuning, training, and backtesting to ensure that AML AI delivers performance at your monthly investigation volume.
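Putting these selections together, a request body for a tuned engine config might look like the following. This is a hedged sketch: the exact JSON nesting is an assumption, and only the field names `primaryDataset`, `endTime`, and `partyInvestigationsPerPeriodHint` come from this page; `PROJECT_ID`, `LOCATION`, and the resource IDs are placeholders.

```python
import json

# Hypothetical request body for creating an EngineConfig via tuning.
engine_config = {
    "engineVersion": (
        "projects/PROJECT_ID/locations/LOCATION/"
        "engineVersions/ENGINE_VERSION_ID"
    ),
    "tuning": {
        # Dataset and end time: complete calendar months before the month
        # of endTime (here, months up to and including February 2024) are
        # used for labels and features.
        "primaryDataset": (
            "projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID"
        ),
        "endTime": "2024-03-01T00:00:00Z",
    },
    "performanceTarget": {
        # Expected monthly investigation volume; used by engine tuning,
        # training, and backtesting.
        "partyInvestigationsPerPeriodHint": 5000,
    },
}

request_body = json.dumps(engine_config, indent=2)
```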
Engine tuning output
Engine tuning generates an EngineConfig resource, which can be used to create a Model resource.
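The resulting engine config is then referenced by name when creating a model. A minimal sketch follows; the Model field names are assumptions inferred from the tuning inputs described on this page, and the resource names are placeholders.

```python
# Hypothetical Model creation body referencing the tuned EngineConfig.
model = {
    # The EngineConfig produced by engine tuning, by full resource name.
    "engineConfig": (
        "projects/PROJECT_ID/locations/LOCATION/"
        "engineConfigs/ENGINE_CONFIG_ID"
    ),
    # Training data selection, analogous to the tuning inputs.
    "primaryDataset": (
        "projects/PROJECT_ID/locations/LOCATION/datasets/DATASET_ID"
    ),
    "endTime": "2024-03-01T00:00:00Z",
}
```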
The engine config metadata contains the following metrics, which show you:
- The expected performance gain from engine tuning versus using the default hyperparameters
- Measurements that can be used to assess dataset consistency (for example, by comparing the missingness values of feature families from different operations)
Metric name | Metric description | Example metric value
---|---|---
ExpectedRecallPreTuning | Recall metric measured on a test set when using the default hyperparameters of the engine version. This recall measurement assumes the number of investigations per month specified in `partyInvestigationsPerPeriodHint`. | `{"recallValues": [{"partyInvestigationsPerPeriod": 5000, "recallValue": 0.72, "scoreThreshold": 0.42}]}`
ExpectedRecallPostTuning | Recall metric measured on a test set when using tuned hyperparameters. This recall measurement assumes the number of investigations per month specified in `partyInvestigationsPerPeriodHint`. | `{"recallValues": [{"partyInvestigationsPerPeriod": 5000, "recallValue": 0.80, "scoreThreshold": 0.43}]}`
Missingness | Share of missing values across all features in each feature family. Ideally, all AML AI feature families should have a missingness near 0. Exceptions may occur where the data underlying those feature families is unavailable for integration. A significant change in this value for any feature family between tuning, training, evaluation, and prediction can indicate inconsistency in the datasets used. | `{"featureFamilies": [{"featureFamily": "unusual_wire_credit_activity", "missingnessValue": 0.00}, ..., {"featureFamily": "party_supplementary_data_id_3", "missingnessValue": 0.45}]}`
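These metrics can be checked programmatically once the engine config metadata is retrieved. An informal sketch, assuming only the JSON shapes shown in the table above: it computes the recall gain from tuning and flags feature families with high missingness (the 0.2 threshold is an illustrative choice, not a product recommendation).

```python
# Example metric payloads in the shape shown in the table above.
pre_tuning = {"recallValues": [
    {"partyInvestigationsPerPeriod": 5000,
     "recallValue": 0.72, "scoreThreshold": 0.42}]}
post_tuning = {"recallValues": [
    {"partyInvestigationsPerPeriod": 5000,
     "recallValue": 0.80, "scoreThreshold": 0.43}]}
missingness = {"featureFamilies": [
    {"featureFamily": "unusual_wire_credit_activity",
     "missingnessValue": 0.00},
    {"featureFamily": "party_supplementary_data_id_3",
     "missingnessValue": 0.45}]}

def tuning_gain(pre: dict, post: dict) -> float:
    """Expected recall improvement from tuning over default hyperparameters."""
    return (post["recallValues"][0]["recallValue"]
            - pre["recallValues"][0]["recallValue"])

def high_missingness(metric: dict, threshold: float = 0.2) -> list[str]:
    """Feature families whose share of missing values exceeds threshold."""
    return [f["featureFamily"] for f in metric["featureFamilies"]
            if f["missingnessValue"] > threshold]

gain = tuning_gain(pre_tuning, post_tuning)   # positive: tuning helped
flagged = high_missingness(missingness)
```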
How to adopt an engine version without re-tuning
To reuse hyperparameters from a previous engine config, see the section Create an engine config that inherits hyperparameters (on the Create and manage engine configs page). In particular, you need to select the following:
- Hyperparameter source type: Select `INHERITED` as the `hyperparameterSourceType`. If you don't specify the source type, the hyperparameter source type is set to `TUNING` to allow for backwards compatibility.
- Hyperparameter source: Specify the full resource name of the source engine config in the `hyperparameterSource` object. The outputs of the source engine config are used for the new engine config. The source engine config must have been created with an earlier engine version within the same tuning version that you are now using.
- Engine version to use for the engine config: Select an engine version that matches the line of business (retail or commercial) for the models you want to use. If inheriting hyperparameters, the line of business must match the line of business used for the hyperparameter source.
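Combining these selections, a request body for an inherited engine config might look like the following sketch. `hyperparameterSourceType` and `hyperparameterSource` come from this page; the nested field name `sourceEngineConfig` is an assumption, and the resource names are placeholders.

```python
# Hypothetical request body for an EngineConfig that inherits
# hyperparameters from a previously tuned engine config.
inherited_config = {
    "engineVersion": (
        "projects/PROJECT_ID/locations/LOCATION/"
        "engineVersions/NEW_ENGINE_VERSION_ID"
    ),
    "hyperparameterSourceType": "INHERITED",
    "hyperparameterSource": {
        # Full resource name of the earlier engine config whose tuned
        # hyperparameters are reused; it must share the same tuning
        # version and line of business as the new engine version.
        # The field name sourceEngineConfig is assumed for illustration.
        "sourceEngineConfig": (
            "projects/PROJECT_ID/locations/LOCATION/"
            "engineConfigs/SOURCE_ENGINE_CONFIG_ID"
        ),
    },
}
```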
Output and lineage when inheriting
Inheriting hyperparameters from another engine version creates an EngineConfig resource which can be used to create a Model resource using the hyperparameters from the source engine config.
For lineage, when inheriting hyperparameters from another engine config, the following fields in the EngineConfig resource are set:
- `hyperparameterSourceType`: `INHERITED`
- `hyperparameterSource`: The engine config used as the hyperparameter source
- `tuning`: Original `tuning` object, including the reference to the dataset used for the original engine tuning (`primaryDataset`) and the latest time from which data was used to generate features for training (`endTime`)
- `performanceTarget`: Original `performanceTarget` object, including the volume of investigations expected based on the specified models (`partyInvestigationsPerPeriodHint`)
- Engine config metadata from the original engine tuning
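These lineage fields make it straightforward to check where a config's hyperparameters came from. A small sketch, assuming the fields described above appear at the top level of the returned EngineConfig JSON:

```python
def describe_hyperparameter_origin(engine_config: dict) -> str:
    """Summarize where an EngineConfig's hyperparameters came from.

    An unset hyperparameterSourceType defaults to TUNING for
    backwards compatibility, as described above.
    """
    source_type = engine_config.get("hyperparameterSourceType", "TUNING")
    if source_type == "INHERITED":
        return f"inherited from {engine_config['hyperparameterSource']}"
    return "tuned for this engine config"

# Example: a config whose lineage points at a source engine config.
origin = describe_hyperparameter_origin({
    "hyperparameterSourceType": "INHERITED",
    "hyperparameterSource": (
        "projects/PROJECT_ID/locations/LOCATION/"
        "engineConfigs/SOURCE_ENGINE_CONFIG_ID"
    ),
})
```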