Class Transformation (1.1.1)

Transformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Inheritance

builtins.object > proto.message.Message > Transformation

Classes

AutoTransformation

AutoTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

The training pipeline infers the proper transformation based on the statistics of the dataset.

CategoricalTransformation

CategoricalTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

The training pipeline performs the following transformation functions.

  • The categorical string is used as is, with no change to case, punctuation, spelling, tense, and so on.

  • Convert the category name to a dictionary lookup index and generate an embedding for each index.

  • Categories that appear fewer than 5 times in the training dataset are treated as the "unknown" category. The "unknown" category gets its own special lookup index and resulting embedding.
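
The lookup-index step and the "unknown" threshold described above can be sketched in plain Python. This is only an illustration, not the training pipeline's implementation: the names `UNKNOWN`, `build_vocabulary`, and `lookup_index` are hypothetical, index 0 is arbitrarily reserved for the "unknown" category, and the embedding step is omitted.

```python
from collections import Counter

UNKNOWN = "<unknown>"  # hypothetical placeholder for the "unknown" category
MIN_COUNT = 5          # threshold stated in the reference text


def build_vocabulary(values, min_count=MIN_COUNT):
    """Map each sufficiently frequent category to a lookup index."""
    counts = Counter(values)
    vocab = {UNKNOWN: 0}  # assumption: index 0 is reserved for "unknown"
    for category in sorted(counts):
        if counts[category] >= min_count:
            vocab[category] = len(vocab)
    return vocab


def lookup_index(vocab, category):
    """Return the index for a category, falling back to "unknown"."""
    return vocab.get(category, vocab[UNKNOWN])


training_column = ["red"] * 6 + ["blue"] * 5 + ["green"] * 2
vocab = build_vocabulary(training_column)
```

Because strings are used as is, `"Red"` and `"red"` are distinct categories in this sketch, so `lookup_index(vocab, "Red")` falls back to the "unknown" index.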

NumericTransformation

NumericTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

The training pipeline performs the following transformation functions.

  • The value converted to float32.

  • The z_score of the value.

  • log(value+1) when the value is greater than or equal to 0. Otherwise, this transformation is not applied and the value is considered a missing value.

  • z_score of log(value+1) when the value is greater than or equal to 0. Otherwise, this transformation is not applied and the value is considered a missing value.

  • A boolean value that indicates whether the value is valid.
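
As a rough illustration of the five numeric features listed above, the following sketch derives them for a small column using only the Python standard library. It is not the pipeline's implementation. Several details are assumptions that the reference does not specify: "valid" is taken to mean a finite real number, the z-score uses the population standard deviation, and the helper names (`to_float32`, `is_valid`, `z_score`, `numeric_features`) are hypothetical.

```python
import math
import statistics
import struct


def to_float32(value):
    """Round a Python float to the nearest float32-representable value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def is_valid(value):
    # Assumption: "valid" means a finite real number.
    return isinstance(value, (int, float)) and math.isfinite(value)


def z_score(value, mean, std):
    # Guard against a constant column, where the standard deviation is 0.
    return (value - mean) / std if std else 0.0


def numeric_features(column):
    """Compute the listed features for each value in a column.

    Assumes the column holds at least one valid, non-negative value.
    """
    valid = [float(v) for v in column if is_valid(v)]
    logs = [math.log1p(v) for v in valid if v >= 0]
    mean, std = statistics.fmean(valid), statistics.pstdev(valid)
    log_mean, log_std = statistics.fmean(logs), statistics.pstdev(logs)
    rows = []
    for v in column:
        if not is_valid(v):
            rows.append({"valid": False})
            continue
        # log(value + 1) is applied only to non-negative values.
        log_value = math.log1p(v) if v >= 0 else None
        rows.append({
            "valid": True,
            "float32": to_float32(v),
            "z_score": z_score(v, mean, std),
            "log1p": log_value,
            "z_score_log1p": (
                None if log_value is None
                else z_score(log_value, log_mean, log_std)
            ),
        })
    return rows


rows = numeric_features([0.0, 1.0, 2.0, -1.0, None])
```

In this example the negative value keeps its plain z-score but has no log-based features, and the `None` entry is only flagged as invalid.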

TextTransformation

TextTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

The training pipeline performs the following transformation functions.

  • The text is used as is, with no change to case, punctuation, spelling, tense, and so on.

  • Convert the category name to a dictionary lookup index and generate an embedding for each index.

TimestampTransformation

TimestampTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

The training pipeline performs the following transformation functions.

  • Apply the transformation functions for Numerical columns.

  • Determine the year, month, day, and weekday. Treat each value from the timestamp as a Categorical column.

  • Invalid numerical values (for example, values that fall outside of a typical timestamp range, or are extreme values) receive no special treatment and are not removed.
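
The year, month, day, and weekday extraction can be sketched as follows. This is a minimal illustration rather than the pipeline's code. It assumes the timestamp arrives as Unix epoch seconds and is interpreted in UTC, neither of which the reference specifies, and the names `WEEKDAYS` and `timestamp_categories` are hypothetical. The parts are returned as strings to reflect that each one is treated as a categorical value.

```python
from datetime import datetime, timezone

WEEKDAYS = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def timestamp_categories(epoch_seconds):
    """Split a timestamp into parts that can be treated as categories."""
    # Assumption: input is Unix epoch seconds, interpreted in UTC.
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return {
        "year": str(moment.year),
        "month": str(moment.month),
        "day": str(moment.day),
        "weekday": WEEKDAYS[moment.weekday()],  # Monday is index 0
    }


parts = timestamp_categories(0)
```

Here `parts` describes 1 January 1970, which was a Thursday.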