SQLScalarColumnTransformer(sql: str, target_column: str = "transformed_{0}")
Wrapper for plain SQL code used inside a ColumnTransformer.
Creates a transformer for a single column, written in plain SQL. It can only be used inside a ColumnTransformer.
When creating an instance, use '{0}' as a placeholder for the column to transform:
SQLScalarColumnTransformer("{0}+1")
By default, the target column is named after the source column with the prefix 'transformed_'. Pass a different target_column when creating an instance to change this:
SQLScalarColumnTransformer("{0}+1", "inc_{0}")
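The effect of the '{0}' placeholder can be imitated locally with Python's `str.format`. The following is a minimal sketch and not the library's implementation: the helper name `render_sql` is hypothetical, and it assumes that both the SQL template and the target column template are filled in with the name of the source column.

```python
def render_sql(sql: str, column: str, target_column: str = "transformed_{0}") -> str:
    """Hypothetical helper: expand '{0}' in both templates for one column."""
    expression = sql.format(column)        # "{0}+1" with "col2" gives "col2+1"
    target = target_column.format(column)  # "inc_{0}" with "col2" gives "inc_col2"
    return f"{expression} AS {target}"


print(render_sql("{0}+1", "col2"))             # col2+1 AS transformed_col2
print(render_sql("{0}+1", "col2", "inc_{0}"))  # col2+1 AS inc_col2
```

The placeholder may appear more than once in a template; every occurrence is replaced by the same column name.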
Examples:
>>> from bigframes.ml.compose import ColumnTransformer, SQLScalarColumnTransformer
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({'name': ["James", None, "Mary"], 'city': ["New York", "Boston", None]})
>>> col_trans = ColumnTransformer([
...     ("strlen",
...      SQLScalarColumnTransformer("CASE WHEN {0} IS NULL THEN 15 ELSE LENGTH({0}) END"),
...      ['name', 'city']),
... ])
>>> col_trans = col_trans.fit(df)
>>> df_transformed = col_trans.transform(df)
>>> df_transformed
   transformed_name  transformed_city
0                 5                 8
1                15                 6
2                 4                15
<BLANKLINE>
[3 rows x 2 columns]
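The values in this output can be checked by hand. A rough plain-Python equivalent of the CASE expression is sketched below as an illustration only: the function `length_or_default` is hypothetical and runs locally rather than in BigQuery.

```python
def length_or_default(value, default=15):
    """Mirror CASE WHEN x IS NULL THEN 15 ELSE LENGTH(x) END for one value."""
    return default if value is None else len(value)


names = ["James", None, "Mary"]
cities = ["New York", "Boston", None]

transformed_name = [length_or_default(v) for v in names]
transformed_city = [length_or_default(v) for v in cities]

print(transformed_name)  # [5, 15, 4]
print(transformed_city)  # [8, 6, 15]
```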
SQLScalarColumnTransformer can be combined with other transformers, like StandardScaler:
>>> from bigframes.ml import preprocessing
>>> col_trans = ColumnTransformer([
...     ("identity", SQLScalarColumnTransformer("{0}", target_column="{0}"), ["col1", "col5"]),
...     ("increment", SQLScalarColumnTransformer("{0}+1", target_column="inc_{0}"), "col2"),
...     ("stdscale", preprocessing.StandardScaler(), "col3"),
...     # ...
... ])