RandomForestClassifier(
n_estimators: int = 100,
*,
tree_method: typing.Literal["auto", "exact", "approx", "hist"] = "auto",
min_tree_child_weight: int = 1,
colsample_bytree: float = 1.0,
colsample_bylevel: float = 1.0,
colsample_bynode: float = 0.8,
gamma: float = 0.0,
max_depth: int = 15,
subsample: float = 0.8,
reg_alpha: float = 0.0,
reg_lambda: float = 1.0,
tol: float = 0.01,
enable_global_explain: bool = False,
xgboost_version: typing.Literal["0.9", "1.1"] = "0.9"
)
A random forest classifier.
A random forest is a meta estimator that fits a number of decision tree classifiers on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting.
Parameters | |
---|---|
Name | Description |
n_estimators |
Optional[int]
Number of parallel trees constructed during each iteration. Defaults to 100. Minimum value is 2. |
tree_method |
Optional[str]
Specifies which tree method to use. Defaults to "auto", in which case XGBoost chooses the most conservative option available. Possible values: "auto", "exact", "approx", "hist". |
min_tree_child_weight |
Optional[float]
Minimum sum of instance weight (hessian) needed in a child. Defaults to 1. |
colsample_bytree |
Optional[float]
Subsample ratio of columns when constructing each tree. Defaults to 1.0. The value should be between 0 and 1. |
colsample_bylevel |
Optional[float]
Subsample ratio of columns for each level. Defaults to 1.0. The value should be between 0 and 1. |
colsample_bynode |
Optional[float]
Subsample ratio of columns for each split. Defaults to 0.8. The value should be between 0 and 1. |
gamma |
Optional[float]
(min_split_loss) Minimum loss reduction required to make a further partition on a leaf node of the tree. Defaults to 0.0. |
max_depth |
Optional[int]
Maximum tree depth for base learners. Defaults to 15. The value should be greater than 0. |
subsample |
Optional[float]
Subsample ratio of the training instances. Defaults to 0.8. The value should be greater than 0 and less than 1. |
reg_alpha |
Optional[float]
L1 regularization term on weights (XGBoost's alpha). Defaults to 0.0. |
reg_lambda |
Optional[float]
L2 regularization term on weights (XGBoost's lambda). Defaults to 1.0. |
tol |
Optional[float]
Minimum relative loss improvement necessary to continue training. Defaults to 0.01. |
enable_global_explain |
Optional[bool]
Whether to compute global explanations, using explainable AI, to evaluate the global importance of each feature to the model. Defaults to False. |
xgboost_version |
Optional[str]
Specifies the XGBoost version for model training. Defaults to "0.9". Possible values: "0.9", "1.1". |
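The documented defaults and value ranges can be checked before a model is created. The following is a minimal sketch, not part of the library: `DEFAULTS`, `checked_params`, and `make_classifier` are hypothetical helper names, keyword names follow the constructor signature above, and the range checks are one reading of the parameter descriptions (for example, "between 0 and 1" is treated as inclusive).

```python
from typing import Any, Dict

# Defaults copied from the constructor signature documented above.
DEFAULTS: Dict[str, Any] = {
    "n_estimators": 100,
    "tree_method": "auto",
    "min_tree_child_weight": 1,
    "colsample_bytree": 1.0,
    "colsample_bylevel": 1.0,
    "colsample_bynode": 0.8,
    "gamma": 0.0,
    "max_depth": 15,
    "subsample": 0.8,
    "reg_alpha": 0.0,
    "reg_lambda": 1.0,
    "tol": 0.01,
    "enable_global_explain": False,
    "xgboost_version": "0.9",
}


def checked_params(**overrides: Any) -> Dict[str, Any]:
    """Merge overrides into the documented defaults and check documented ranges."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    params = {**DEFAULTS, **overrides}

    if params["n_estimators"] < 2:
        raise ValueError("n_estimators must be at least 2")
    if params["tree_method"] not in ("auto", "exact", "approx", "hist"):
        raise ValueError("tree_method must be auto, exact, approx, or hist")
    for name in ("colsample_bytree", "colsample_bylevel", "colsample_bynode"):
        # Assumption: "between 0 and 1" is read as an inclusive range.
        if not 0 <= params[name] <= 1:
            raise ValueError(f"{name} must be between 0 and 1")
    if params["max_depth"] <= 0:
        raise ValueError("max_depth must be a positive integer")
    if not 0 < params["subsample"] < 1:
        raise ValueError("subsample must be greater than 0 and less than 1")
    if params["xgboost_version"] not in ("0.9", "1.1"):
        raise ValueError('xgboost_version must be "0.9" or "1.1"')
    return params


def make_classifier(**overrides: Any):
    """Build the estimator from checked parameters (requires bigframes)."""
    # Imported lazily so the checks above can be used without bigframes installed.
    from bigframes.ml.ensemble import RandomForestClassifier

    return RandomForestClassifier(**checked_params(**overrides))
```

For example, `make_classifier(n_estimators=50, max_depth=8)` would build a classifier with two overridden values and the remaining defaults, while `checked_params(n_estimators=1)` raises `ValueError`.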
Methods
__repr__
__repr__()
Print the estimator's constructor with all non-default parameter values.
fit
fit(
X: typing.Union[
bigframes.dataframe.DataFrame,
bigframes.series.Series,
pandas.core.frame.DataFrame,
pandas.core.series.Series,
],
y: typing.Union[
bigframes.dataframe.DataFrame,
bigframes.series.Series,
pandas.core.frame.DataFrame,
pandas.core.series.Series,
],
X_eval: typing.Optional[
typing.Union[
bigframes.dataframe.DataFrame,
bigframes.series.Series,
pandas.core.frame.DataFrame,
pandas.core.series.Series,
]
] = None,
y_eval: typing.Optional[
typing.Union[
bigframes.dataframe.DataFrame,
bigframes.series.Series,
pandas.core.frame.DataFrame,
pandas.core.series.Series,
]
] = None,
) -> bigframes.ml.base._T
Build a forest of trees from the training set (X, y).
Parameters | |
---|---|
Name | Description |
X |
bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series
Series or DataFrame of shape (n_samples, n_features). Training data. |
y |
bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series
Series or DataFrame of shape (n_samples,) or (n_samples, n_targets). Target values. Will be cast to X's dtype if necessary. |
X_eval |
bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series
Series or DataFrame of shape (n_samples, n_features). Evaluation data. |
y_eval |
bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series
Series or DataFrame of shape (n_samples,) or (n_samples, n_targets). Evaluation target values. Will be cast to X_eval's dtype if necessary. |
Returns | |
---|---|
Type | Description |
RandomForestClassifier |
Fitted estimator. |
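Because `X_eval` and `y_eval` are both optional, a thin wrapper can make the intent explicit. This sketch assumes the evaluation features and labels are only meaningful as a pair (that pairing rule is an assumption here, not something the signature states), and `fit_with_optional_eval` is a hypothetical helper name. `model` is any object exposing the `fit` signature shown above.

```python
from typing import Any, Optional


def fit_with_optional_eval(
    model: Any,
    X: Any,
    y: Any,
    X_eval: Optional[Any] = None,
    y_eval: Optional[Any] = None,
) -> Any:
    """Call model.fit, passing the evaluation data only when both parts are given."""
    # Assumption: evaluation features without labels (or the reverse) are a mistake.
    if (X_eval is None) != (y_eval is None):
        raise ValueError("pass both X_eval and y_eval, or neither")
    if X_eval is None:
        return model.fit(X, y)
    # fit returns the fitted estimator, so the result can be chained.
    return model.fit(X, y, X_eval=X_eval, y_eval=y_eval)
```

The return value is whatever `fit` returns, so a call such as `fit_with_optional_eval(model, X, y).predict(X_new)` chains directly.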
get_params
get_params(deep: bool = True) -> typing.Dict[str, typing.Any]
Get parameters for this estimator.
Parameter | |
---|---|
Name | Description |
deep |
bool, default True
Default True. |
Returns | |
---|---|
Type | Description |
Dictionary |
A dictionary of parameter names mapped to their values. |
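One use of the returned dictionary is to list which settings differ from the documented defaults, similar in spirit to what `__repr__` prints. The sketch below assumes `get_params()` returns a flat mapping keyed by constructor argument names. `non_default_params` and `DOCUMENTED_DEFAULTS` are hypothetical names, and only a few defaults from the signature are listed to keep the example short.

```python
from typing import Any, Dict

# A subset of the defaults from the constructor signature, for illustration.
DOCUMENTED_DEFAULTS: Dict[str, Any] = {
    "n_estimators": 100,
    "max_depth": 15,
    "subsample": 0.8,
}


def non_default_params(model: Any, defaults: Dict[str, Any]) -> Dict[str, Any]:
    """Return the parameters whose values differ from the given defaults."""
    params = model.get_params()
    return {
        name: value
        for name, value in params.items()
        # Parameters without a listed default are skipped rather than guessed at.
        if name in defaults and defaults[name] != value
    }
```

Given a model created with `n_estimators=50`, `non_default_params(model, DOCUMENTED_DEFAULTS)` would contain only the `n_estimators` entry, provided the assumption about the key names holds.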
predict
predict(
X: typing.Union[
bigframes.dataframe.DataFrame,
bigframes.series.Series,
pandas.core.frame.DataFrame,
pandas.core.series.Series,
]
) -> bigframes.dataframe.DataFrame
Predict the class for X.
The predicted class of an input sample is obtained by combining the predictions of the trees in the forest.
Returns | |
---|---|
Type | Description |
bigframes.dataframe.DataFrame |
The predicted values. |
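To illustrate how a forest can combine its trees' outputs into a single class, the sketch below averages per-tree class probabilities and picks the most probable class. This is a generic illustration of the averaging idea, not BigQuery ML's or XGBoost's actual implementation, and `forest_vote` is a hypothetical helper name.

```python
from typing import Sequence


def forest_vote(
    tree_probabilities: Sequence[Sequence[float]],
    classes: Sequence[str],
) -> str:
    """Average the trees' class probabilities and return the most probable class."""
    n_trees = len(tree_probabilities)
    if n_trees == 0:
        raise ValueError("at least one tree is required")
    # Mean probability of each class across all trees.
    mean = [
        sum(tree[i] for tree in tree_probabilities) / n_trees
        for i in range(len(classes))
    ]
    # Index of the class with the highest mean probability.
    best = max(range(len(classes)), key=mean.__getitem__)
    return classes[best]


# Three trees, two classes: the mean probabilities are about 0.67 and 0.33.
print(forest_vote([[0.9, 0.1], [0.4, 0.6], [0.7, 0.3]], ["cat", "dog"]))  # cat
```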
register
register(vertex_ai_model_id: typing.Optional[str] = None) -> bigframes.ml.base._T
Register the model to Vertex AI.
After registering, go to the Google Cloud console (https://console.cloud.google.com/vertex-ai/models) to manage the model registry. Refer to https://cloud.google.com/vertex-ai/docs/model-registry/introduction for more options.
Parameter | |
---|---|
Name | Description |
vertex_ai_model_id |
Optional[str], default None
Optional string to use as the model ID in Vertex AI. If not set, defaults to 'bigframes_{bq_model_id}'. The Vertex AI model ID is truncated to 63 characters because of a Vertex AI limit. |
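The naming rule described for `vertex_ai_model_id` can be previewed locally. This sketch only mirrors the documented behavior (default prefix, 63-character truncation). `effective_vertex_model_id` is a hypothetical helper, and applying the truncation to custom IDs as well as default ones is an assumption.

```python
from typing import Optional

VERTEX_ID_MAX_LEN = 63  # limit stated in the parameter description


def effective_vertex_model_id(
    bq_model_id: str, vertex_ai_model_id: Optional[str] = None
) -> str:
    """Preview the Vertex AI model ID that registration would end up with."""
    if vertex_ai_model_id is None:
        # Documented default when no ID is supplied.
        candidate = f"bigframes_{bq_model_id}"
    else:
        candidate = vertex_ai_model_id
    # Assumption: truncation applies to custom IDs as well as the default.
    return candidate[:VERTEX_ID_MAX_LEN]


print(effective_vertex_model_id("my_model"))  # bigframes_my_model
```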
score
score(
X: typing.Union[
bigframes.dataframe.DataFrame,
bigframes.series.Series,
pandas.core.frame.DataFrame,
pandas.core.series.Series,
],
y: typing.Union[
bigframes.dataframe.DataFrame,
bigframes.series.Series,
pandas.core.frame.DataFrame,
pandas.core.series.Series,
],
)
Calculate evaluation metrics of the model.
Parameters | |
---|---|
Name | Description |
X |
bigframes.dataframe.DataFrame or bigframes.series.Series
A BigQuery DataFrame as evaluation data. |
y |
bigframes.dataframe.DataFrame or bigframes.series.Series
A BigQuery DataFrame as evaluation labels. |
Returns | |
---|---|
Type | Description |
bigframes.dataframe.DataFrame |
A DataFrame containing the evaluation result. |
to_gbq
to_gbq(
model_name: str, replace: bool = False
) -> bigframes.ml.ensemble.RandomForestClassifier
Save the model to BigQuery.
Parameters | |
---|---|
Name | Description |
model_name |
str
The name of the model. |
replace |
bool, default False
Whether to replace the model if it already exists. Defaults to False. |
Returns | |
---|---|
Type | Description |
RandomForestClassifier |
Saved model. |
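The methods above are typically used together: fit, then score, then save. The following sketch chains them under the signatures documented on this page. `train_evaluate_save` is a hypothetical helper, and the table, column, and model names in the usage comment are placeholders rather than real resources.

```python
from typing import Any, Tuple


def train_evaluate_save(
    model: Any,
    X_train: Any,
    y_train: Any,
    X_test: Any,
    y_test: Any,
    model_name: str,
    replace: bool = False,
) -> Tuple[Any, Any]:
    """Fit the model, compute evaluation metrics, and save it to BigQuery."""
    fitted = model.fit(X_train, y_train)  # fit returns the fitted estimator
    metrics = fitted.score(X_test, y_test)  # DataFrame of evaluation metrics
    saved = fitted.to_gbq(model_name, replace=replace)  # returns the saved model
    return metrics, saved


# Usage sketch (requires bigframes and BigQuery access; all names are placeholders):
#   import bigframes.pandas as bpd
#   from bigframes.ml.ensemble import RandomForestClassifier
#
#   df = bpd.read_gbq("my_project.my_dataset.my_table")
#   X, y = df[["feature_a", "feature_b"]], df[["label"]]
#   metrics, saved = train_evaluate_save(
#       RandomForestClassifier(n_estimators=50), X, y, X, y,
#       "my_dataset.my_model", replace=True,
#   )
```

The usage comment scores on the training data only to stay short; a held-out split would normally be used.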