Transformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Inheritance
builtins.object > proto.message.Message > Transformation
Classes
AutoTransformation
AutoTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
The training pipeline infers the proper transformation based on the statistics of the dataset.
CategoricalArrayTransformation
CategoricalArrayTransformation(
mapping=None, *, ignore_unknown_fields=False, **kwargs
)
Treats the column as a categorical array and performs the following transformation functions.
- For each element in the array, convert the category name to a dictionary lookup index and generate an embedding for each index. Combine the embeddings of all elements into a single embedding using the mean.
- Empty arrays are treated as an embedding of zeroes.
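The mean-pooling step described above can be sketched in plain Python. This is only an illustration, not the service's implementation: `VOCAB`, `EMBEDDINGS`, `EMBED_DIM`, and `embed_categorical_array` are hypothetical names with toy values, and the sketch assumes every category is already in the vocabulary.

```python
from typing import Dict, List, Sequence

# Hypothetical toy lookup table and embedding matrix (illustrative values only).
VOCAB: Dict[str, int] = {"red": 0, "green": 1, "blue": 2}
EMBEDDINGS: List[List[float]] = [
    [1.0, 0.0],  # embedding for index 0 ("red")
    [0.0, 1.0],  # embedding for index 1 ("green")
    [1.0, 1.0],  # embedding for index 2 ("blue")
]
EMBED_DIM = 2


def embed_categorical_array(values: Sequence[str]) -> List[float]:
    """Map each category to its embedding and average them element-wise."""
    if not values:
        # Empty arrays are treated as an embedding of zeroes.
        return [0.0] * EMBED_DIM
    vectors = [EMBEDDINGS[VOCAB[value]] for value in values]
    # zip(*vectors) groups the values of each embedding dimension together.
    return [sum(column) / len(vectors) for column in zip(*vectors)]
```

For example, `embed_categorical_array(["red", "green"])` averages the two toy vectors dimension by dimension.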
CategoricalTransformation
CategoricalTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
The training pipeline will perform the following transformation functions.
- The categorical string is used as is, with no change to case, punctuation, spelling, tense, and so on.
- Convert the category name to a dictionary lookup index and generate an embedding for each index.
- Categories that appear fewer than 5 times in the training dataset are treated as the "unknown" category. The "unknown" category gets its own special lookup index and resulting embedding.
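A minimal sketch of the lookup-index step, assuming the vocabulary is built by counting values in the training data. The names `build_vocab`, `lookup`, and `UNKNOWN_INDEX`, and the choice of index 0 for the "unknown" category, are assumptions made for illustration; only the threshold of 5 comes from the description above.

```python
from collections import Counter
from typing import Dict, Iterable

MIN_COUNT = 5       # threshold stated in the description above
UNKNOWN_INDEX = 0   # assumption: index reserved for the "unknown" category


def build_vocab(training_values: Iterable[str]) -> Dict[str, int]:
    """Assign a lookup index to every category seen at least MIN_COUNT times."""
    counts = Counter(training_values)
    # Strings are used as is, so "a" and "A" are distinct categories.
    frequent = sorted(value for value, count in counts.items() if count >= MIN_COUNT)
    return {value: index + 1 for index, value in enumerate(frequent)}


def lookup(vocab: Dict[str, int], value: str) -> int:
    """Return the category's index, or UNKNOWN_INDEX for rare or unseen values."""
    return vocab.get(value, UNKNOWN_INDEX)
```

Each index would then select one row of an embedding table, with one extra row for the "unknown" index.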
NumericArrayTransformation
NumericArrayTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Treats the column as a numerical array and performs the following transformation functions.
- All transformations for Numerical types are applied to the average of all the elements.
- The average of empty arrays is treated as zero.
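The reduction from array to scalar can be sketched as follows. The function name `numeric_array_to_scalar` is hypothetical, and the Numerical transformations that would then be applied to the result are not shown.

```python
from typing import Sequence


def numeric_array_to_scalar(values: Sequence[float]) -> float:
    """Collapse a numeric array to its mean; an empty array averages to zero."""
    if not values:
        return 0.0
    return sum(values) / len(values)
```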
NumericTransformation
NumericTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
The training pipeline will perform the following transformation functions.
- The value is converted to float32.
- The z_score of the value.
- log(value+1) when the value is greater than or equal to 0. Otherwise, this transformation is not applied and the value is considered a missing value.
- z_score of log(value+1) when the value is greater than or equal to 0. Otherwise, this transformation is not applied and the value is considered a missing value.
- A boolean value that indicates whether the value is valid.
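The five derived features listed above might be computed along these lines. This is a sketch under stated assumptions, not the pipeline's code: the statistics (`mean`, `std`, `log_mean`, `log_std`) are assumed to have been computed over the training data beforehand, "valid" is assumed to mean "present and finite", and the function names are hypothetical.

```python
import math
import struct
from typing import Dict, Optional


def to_float32(value: float) -> float:
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def numeric_features(
    value: Optional[float],
    mean: float,
    std: float,
    log_mean: float,
    log_std: float,
) -> Dict[str, object]:
    """Derive the features listed above for a single value."""
    # Assumption: a value is valid when it is present and finite.
    is_valid = value is not None and math.isfinite(value)
    if not is_valid:
        return {"value": None, "z_score": None, "log": None,
                "log_z_score": None, "is_valid": False}
    x = to_float32(value)
    # log(value+1) is applied only when the value is >= 0; otherwise it is missing.
    log_value = math.log(x + 1.0) if x >= 0 else None
    log_z = None if log_value is None else (log_value - log_mean) / log_std
    return {
        "value": x,
        "z_score": (x - mean) / std,
        "log": log_value,
        "log_z_score": log_z,
        "is_valid": True,
    }
```

Negative inputs keep their plain value and z-score, while the two log-based features are reported as missing (`None`).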
TextArrayTransformation
TextArrayTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Treats the column as a text array and performs the following transformation functions.
- Concatenate all text values in the array into a single text value using a space (" ") as a delimiter, and then treat the result as a single text value. Apply the transformations for Text columns.
- Empty arrays are treated as an empty text.
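The concatenation step is small enough to sketch in one function. The name `join_text_array` is hypothetical, and the Text transformations applied afterwards are not shown.

```python
from typing import Sequence


def join_text_array(values: Sequence[str]) -> str:
    """Join all text values with a single space; an empty array gives ""."""
    return " ".join(values)
```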
TextTransformation
TextTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
The training pipeline will perform the following transformation functions.
- The text is used as is, with no change to case, punctuation, spelling, tense, and so on.
- Tokenize text to words. Convert each word to a dictionary lookup index and generate an embedding for each index. Combine the embeddings of all elements into a single embedding using the mean.
- Tokenization is based on unicode script boundaries.
- Missing values get their own lookup index and resulting embedding.
- Stop-words receive no special treatment and are not removed.
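The token-to-index step can be illustrated with a simplified sketch. It does not reproduce the tokenizer described above: a `\w+` regular expression stands in for tokenization on unicode script boundaries. The names `tokenize`, `build_vocab`, `text_to_indices`, and `MISSING_INDEX` are hypothetical, and dropping tokens that were not seen in training is an assumption, since the description does not say how they are handled. The mean pooling of embeddings is omitted.

```python
import re
from typing import Dict, Iterable, List, Optional

MISSING_INDEX = 0  # assumption: index reserved for missing text values


def tokenize(text: str) -> List[str]:
    """Stand-in tokenizer: runs of word characters, case preserved."""
    return re.findall(r"\w+", text)


def build_vocab(texts: Iterable[Optional[str]]) -> Dict[str, int]:
    """Assign an index to every token in order of first appearance."""
    vocab: Dict[str, int] = {}
    for text in texts:
        if text is None:
            continue
        for token in tokenize(text):
            # Stop-words are kept, and "The" and "the" get separate indices.
            vocab.setdefault(token, len(vocab) + 1)
    return vocab


def text_to_indices(text: Optional[str], vocab: Dict[str, int]) -> List[int]:
    """Convert text to lookup indices; a missing value gets its own index."""
    if text is None:
        return [MISSING_INDEX]
    # Assumption: tokens outside the vocabulary are skipped in this sketch.
    return [vocab[token] for token in tokenize(text) if token in vocab]
```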
TimestampTransformation
TimestampTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
The training pipeline will perform the following transformation functions.
- Apply the transformation functions for Numerical columns.
- Determine the year, month, day, and weekday. Treat each value from the timestamp as a Categorical column.
- Invalid numerical values (for example, values that fall outside of a typical timestamp range, or are extreme values) receive no special treatment and are not removed.
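Extracting the calendar parts might look like the sketch below. It assumes the timestamp arrives as seconds since the Unix epoch and is interpreted in UTC; neither is stated above. The function name `timestamp_parts` is hypothetical. Each part is returned as a string so it can be handled like a categorical value.

```python
from datetime import datetime, timezone
from typing import Dict

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def timestamp_parts(epoch_seconds: float) -> Dict[str, str]:
    """Split a timestamp into year, month, day, and weekday categories."""
    # Assumption: input is seconds since the Unix epoch, interpreted in UTC.
    dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return {
        "year": str(dt.year),
        "month": str(dt.month),
        "day": str(dt.day),
        "weekday": WEEKDAYS[dt.weekday()],  # weekday() returns 0 for Monday
    }
```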