Class SQLScalarColumnTransformer (1.30.0)

SQLScalarColumnTransformer(sql: str, target_column: str = "transformed_{0}")

Wrapper for plain SQL code contained in a ColumnTransformer.

Creates a single-column transformer in plain SQL. This transformer can only be used inside a ColumnTransformer.

When creating an instance, '{0}' can be used as a placeholder for the column to transform:

SQLScalarColumnTransformer("{0}+1")

By default, the target column name gets the prefix 'transformed_', but this can be changed when creating an instance:

SQLScalarColumnTransformer("{0}+1", "inc_{0}")
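
The placeholder behaviour described above can be pictured with plain Python string formatting. The following is only an illustrative sketch: `render_sql` is a hypothetical helper, not part of bigframes, and the library's actual substitution (for example, any quoting of column names) may differ.

```python
def render_sql(sql_template: str, target_template: str, column: str) -> tuple[str, str]:
    """Hypothetical helper: fill '{0}' with a column name in both templates.

    This mimics the documented placeholder semantics only; it is not how
    bigframes is confirmed to implement SQLScalarColumnTransformer.
    """
    sql_expr = sql_template.format(column)          # e.g. "{0}+1" -> "col2+1"
    target_name = target_template.format(column)    # e.g. "inc_{0}" -> "inc_col2"
    return sql_expr, target_name


# Custom target column pattern.
print(render_sql("{0}+1", "inc_{0}", "col2"))
# Default target column pattern, as in the class signature.
print(render_sql("{0}+1", "transformed_{0}", "col2"))
```

Under this reading, one transformer applied to several columns yields one SQL expression and one target column name per input column.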

Examples:

>>> from bigframes.ml.compose import ColumnTransformer, SQLScalarColumnTransformer
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None

>>> df = bpd.DataFrame({'name': ["James", None, "Mary"], 'city': ["New York", "Boston", None]})
>>> col_trans = ColumnTransformer([
...     ("strlen",
...      SQLScalarColumnTransformer("CASE WHEN {0} IS NULL THEN 15 ELSE LENGTH({0}) END"),
...      ['name', 'city']),
... ])
>>> col_trans = col_trans.fit(df)
>>> df_transformed = col_trans.transform(df)
>>> df_transformed
   transformed_name  transformed_city
0                 5                 8
1                15                 6
2                 4                15
<BLANKLINE>
[3 rows x 2 columns]

SQLScalarColumnTransformer can be combined with other transformers, like StandardScaler:

>>> from bigframes.ml import preprocessing
>>> col_trans = ColumnTransformer([
...     ("identity", SQLScalarColumnTransformer("{0}", target_column="{0}"), ["col1", "col5"]),
...     ("increment", SQLScalarColumnTransformer("{0}+1", target_column="inc_{0}"), "col2"),
...     ("stdscale", preprocessing.StandardScaler(), "col3"),
...     # ...
... ])