Class TextTransformation (1.48.0)

TextTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

The training pipeline performs the following transformation functions.

The text as is--no change to case, punctuation, spelling, tense, and so on.
Tokenize the text into words. Convert each word to a dictionary lookup index and generate an embedding for each index. Combine the embeddings of all elements into a single embedding by taking their mean.
Tokenization is based on Unicode script boundaries.
Missing values get their own lookup index and resulting embedding.
Stop-words receive no special treatment and are not removed.
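
The steps above can be illustrated with a small, self-contained sketch. This is not the library's implementation: the regex tokenizer is only a rough stand-in for Unicode-script-boundary tokenization, the embedding table is random rather than learned, and the names `MISSING`, `tokenize`, `build_vocab`, `make_embeddings`, and `embed` are hypothetical. How unseen tokens and empty strings are handled here is also an assumption.

```python
import random
import re

MISSING = "<missing>"  # hypothetical sentinel; missing values get their own index


def tokenize(text):
    # Rough stand-in for Unicode-script-boundary tokenization (assumption):
    # runs of word characters, or single punctuation symbols.
    # Case and punctuation are left untouched, and stop-words are kept.
    return re.findall(r"\w+|[^\w\s]", text)


def build_vocab(texts):
    # Dictionary lookup: each distinct token maps to an integer index.
    vocab = {MISSING: 0}
    for text in texts:
        if text is None:
            continue
        for tok in tokenize(text):
            vocab.setdefault(tok, len(vocab))
    return vocab


def make_embeddings(vocab_size, dim, seed=0):
    # Random table for illustration only; a real pipeline would learn these.
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)]


def embed(text, vocab, table):
    # Missing values resolve to the dedicated missing-value embedding.
    if text is None:
        return list(table[vocab[MISSING]])
    # Assumption: tokens not seen when building the vocabulary are skipped.
    ids = [vocab[tok] for tok in tokenize(text) if tok in vocab]
    if not ids:
        # Assumption: text with no known tokens is treated like a missing value.
        return list(table[vocab[MISSING]])
    dim = len(table[0])
    # Combine the per-token embeddings into one vector using the mean.
    return [sum(table[i][d] for i in ids) / len(ids) for d in range(dim)]
```

For example, after `vocab = build_vocab(["The cat", "the cat!", None])`, the tokens `The` and `the` receive different indices because case is preserved, and `embed(None, vocab, table)` returns the row reserved for missing values.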

Methods

TextTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

The training pipeline performs the following transformation functions.

The text as is--no change to case, punctuation, spelling, tense, and so on.
Tokenize the text into words. Convert each word to a dictionary lookup index and generate an embedding for each index. Combine the embeddings of all elements into a single embedding by taking their mean.
Tokenization is based on Unicode script boundaries.
Missing values get their own lookup index and resulting embedding.
Stop-words receive no special treatment and are not removed.
