Class TextTransformation (1.36.4)

TextTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

The training pipeline will perform the following transformation functions.

  • The text is used as is: no change to case, punctuation, spelling, tense, and so on.
  • Tokenize the text into words. Convert each word to a dictionary lookup index and generate an embedding for each index. Combine the embeddings of all elements into a single embedding using the mean.
  • Tokenization is based on unicode script boundaries.
  • Missing values get their own lookup index and resulting embedding.
  • Stop-words receive no special treatment and are not removed.
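
The steps above can be illustrated with a small, self-contained sketch. This is not the service's implementation: the helper names (`tokenize`, `build_vocab`, `embed`), the reserved `MISSING_INDEX`, and the toy embedding table are all hypothetical, and whitespace splitting stands in for the Unicode-script-boundary tokenizer described above. Falling back to the missing-value embedding for text with no known tokens is likewise an assumption made for this example.

```python
# Illustrative sketch only; names and values here are hypothetical.
MISSING_INDEX = 0  # assumed reserved lookup index for missing values


def tokenize(text):
    # Simplification: split on whitespace. The documented pipeline
    # tokenizes on Unicode script boundaries instead.
    return text.split()


def build_vocab(texts):
    # Assign each distinct token its own lookup index, starting at 1.
    # Case is preserved and stop-words are kept, as the list above states.
    vocab = {}
    for text in texts:
        if text is None:
            continue
        for token in tokenize(text):
            vocab.setdefault(token, len(vocab) + 1)
    return vocab


def embed(text, vocab, table):
    # Missing values map to their own index and embedding.
    if text is None:
        return list(table[MISSING_INDEX])
    vectors = [table[vocab[t]] for t in tokenize(text) if t in vocab]
    if not vectors:
        # Assumption for this sketch: no known tokens behaves like missing.
        return list(table[MISSING_INDEX])
    dim = len(vectors[0])
    # Combine the per-token embeddings with an element-wise mean.
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


texts = ["The cat", "the Cat sat", None]
vocab = build_vocab(texts)
# Toy embedding table: index i maps to the vector [i, 2 * i].
table = {i: [float(i), 2.0 * i] for i in range(len(vocab) + 1)}

print(vocab)
print(embed("The cat", vocab, table))
print(embed(None, vocab, table))
```

Because case is left unchanged, "The" and "the" receive different lookup indices in this sketch, and the stop-word "the" contributes to the mean like any other token.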

Methods

TextTransformation

TextTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

The training pipeline will perform the following transformation functions.

  • The text is used as is: no change to case, punctuation, spelling, tense, and so on.
  • Tokenize the text into words. Convert each word to a dictionary lookup index and generate an embedding for each index. Combine the embeddings of all elements into a single embedding using the mean.
  • Tokenization is based on unicode script boundaries.
  • Missing values get their own lookup index and resulting embedding.
  • Stop-words receive no special treatment and are not removed.
