SimpleImputer(strategy: typing.Literal["mean", "median", "most_frequent"] = "mean")
Univariate imputer for completing missing values with simple strategies.
Replace missing values using a descriptive statistic (e.g. mean, median, or most frequent) along each column.
Examples:
>>> import bigframes.pandas as bpd
>>> from bigframes.ml.impute import SimpleImputer
>>> bpd.options.display.progress_bar = None
>>> X_train = bpd.DataFrame({"feat0": [7.0, 4.0, 10.0], "feat1": [2.0, None, 5.0], "feat2": [3.0, 6.0, 9.0]})
>>> imp_mean = SimpleImputer().fit(X_train)
>>> X_test = bpd.DataFrame({"feat0": [None, 4.0, 10.0], "feat1": [2.0, None, None], "feat2": [3.0, 6.0, 9.0]})
>>> imp_mean.transform(X_test)
   imputer_feat0  imputer_feat1  imputer_feat2
0            7.0            2.0            3.0
1            4.0            3.5            6.0
2           10.0            3.5            9.0
<BLANKLINE>
[3 rows x 3 columns]
Parameter | |
---|---|
Name | Description |
strategy |
{'mean', 'median', 'most_frequent'}, default='mean'
The imputation strategy. 'mean': replace missing values using the mean along each column. 'median': replace missing values using the median along each column. 'most_frequent': replace missing values using the most frequent value along each column. |
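To make the three strategies concrete, here is a minimal pure-Python sketch of the statistic each one would use to fill a single column. It is an illustration only, not the bigframes implementation: `fill_value` is a hypothetical helper, and the standard-library `statistics` module stands in for the computation that bigframes delegates to BigQuery.

```python
import statistics
from typing import Optional, Sequence


def fill_value(column: Sequence[Optional[float]], strategy: str = "mean") -> float:
    """Hypothetical helper: return the statistic used to fill gaps in one column."""
    # Missing entries (None) are excluded before computing the statistic.
    observed = [v for v in column if v is not None]
    if strategy == "mean":
        return statistics.mean(observed)
    if strategy == "median":
        return statistics.median(observed)
    if strategy == "most_frequent":
        return statistics.mode(observed)
    raise ValueError(f"unknown strategy: {strategy!r}")


column = [1.0, 1.0, None, 4.0, 10.0]
fills = {s: fill_value(column, s) for s in ("mean", "median", "most_frequent")}
print(fills)  # {'mean': 4.0, 'median': 2.5, 'most_frequent': 1.0}
```

The same column yields three different fill values, which is why the choice of strategy matters for skewed data or data with repeated values. How ties are broken for 'most_frequent' is not specified in this reference, so the sketch's tie behavior should not be read as that of bigframes.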
Methods
__repr__
__repr__()
Print the estimator's constructor with all non-default parameter values.
fit
fit(
X: typing.Union[
bigframes.dataframe.DataFrame,
bigframes.series.Series,
pandas.core.frame.DataFrame,
pandas.core.series.Series,
],
y=None,
) -> bigframes.ml.impute.SimpleImputer
Fit the imputer on X.
Parameters | |
---|---|
Name | Description |
X |
bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series
The DataFrame or Series with training data. |
y |
default None
Ignored. |
Returns | |
---|---|
Type | Description |
SimpleImputer |
Fitted imputer. |
fit_transform
fit_transform(
X: typing.Union[
bigframes.dataframe.DataFrame,
bigframes.series.Series,
pandas.core.frame.DataFrame,
pandas.core.series.Series,
],
y: typing.Optional[
typing.Union[
bigframes.dataframe.DataFrame,
bigframes.series.Series,
pandas.core.frame.DataFrame,
pandas.core.series.Series,
]
] = None,
) -> bigframes.dataframe.DataFrame
Fit to data, then transform it.
Parameters | |
---|---|
Name | Description |
X |
bigframes.dataframe.DataFrame or bigframes.series.Series
Series or DataFrame of shape (n_samples, n_features). Input samples. |
y |
bigframes.dataframe.DataFrame or bigframes.series.Series
Series or DataFrame of shape (n_samples,) or (n_samples, n_outputs). Default None. Target values (None for unsupervised transformations). |
Returns | |
---|---|
Type | Description |
bigframes.dataframe.DataFrame |
DataFrame of shape (n_samples, n_features_new). Transformed DataFrame. |
get_params
get_params(deep: bool = True) -> typing.Dict[str, typing.Any]
Get parameters for this estimator.
Parameter | |
---|---|
Name | Description |
deep |
bool, default True
Defaults to True. |
Returns | |
---|---|
Type | Description |
Dictionary |
A dictionary of parameter names mapped to their values. |
to_gbq
to_gbq(model_name: str, replace: bool = False) -> bigframes.ml.base._T
Save the transformer as a BigQuery model.
Parameters | |
---|---|
Name | Description |
model_name |
str
The name of the model. |
replace |
bool, default False
Whether to replace the model if it already exists. Defaults to False. |
transform
transform(
X: typing.Union[
bigframes.dataframe.DataFrame,
bigframes.series.Series,
pandas.core.frame.DataFrame,
pandas.core.series.Series,
]
) -> bigframes.dataframe.DataFrame
Impute all missing values in X.
Parameter | |
---|---|
Name | Description |
X |
bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series
The DataFrame or Series to be transformed. |
Returns | |
---|---|
Type | Description |
bigframes.dataframe.DataFrame |
Transformed result. |
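The split between `fit` and `transform` can be sketched in plain Python: `fit` learns one fill value per column from the training data, and `transform` reuses those values on any later input without recomputing them. The class below, `TinyMeanImputer`, is a hypothetical stand-in written for illustration; it operates on dictionaries of lists rather than DataFrames, supports only the 'mean' strategy, and does not reproduce the `imputer_` column prefix of the real output.

```python
from statistics import mean
from typing import Dict, List, Optional

Column = List[Optional[float]]


class TinyMeanImputer:
    """Hypothetical stand-in for illustration; not the bigframes class."""

    def fit(self, X: Dict[str, Column]) -> "TinyMeanImputer":
        # Learn one fill value per column from the non-missing training entries.
        self.fill_ = {
            name: mean([v for v in col if v is not None]) for name, col in X.items()
        }
        return self

    def transform(self, X: Dict[str, Column]) -> Dict[str, List[float]]:
        # Fill gaps with the values learned in fit(); nothing is recomputed from X.
        return {
            name: [self.fill_[name] if v is None else v for v in col]
            for name, col in X.items()
        }

    def fit_transform(self, X: Dict[str, Column]) -> Dict[str, List[float]]:
        # Fit to the data, then transform that same data.
        return self.fit(X).transform(X)


X_train = {"feat0": [7.0, 4.0, 10.0], "feat1": [2.0, None, 5.0], "feat2": [3.0, 6.0, 9.0]}
X_test = {"feat0": [None, 4.0, 10.0], "feat1": [2.0, None, None], "feat2": [3.0, 6.0, 9.0]}

imputed = TinyMeanImputer().fit(X_train).transform(X_test)
print(imputed)
```

With the data from the class example, the missing `feat0` entry in `X_test` becomes 7.0 and the missing `feat1` entries become 3.5, both means of the training columns, which matches the output shown in the Examples section.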