Class XGBClassifier (2.24.0)

XGBClassifier(
    n_estimators: int = 1,
    *,
    booster: typing.Literal["gbtree", "dart"] = "gbtree",
    dart_normalized_type: typing.Literal["tree", "forest"] = "tree",
    tree_method: typing.Literal["auto", "exact", "approx", "hist"] = "auto",
    min_tree_child_weight: int = 1,
    colsample_bytree: float = 1.0,
    colsample_bylevel: float = 1.0,
    colsample_bynode: float = 1.0,
    gamma: float = 0.0,
    max_depth: int = 6,
    subsample: float = 1.0,
    reg_alpha: float = 0.0,
    reg_lambda: float = 1.0,
    learning_rate: float = 0.3,
    max_iterations: int = 20,
    tol: float = 0.01,
    enable_global_explain: bool = False,
    xgboost_version: typing.Literal["0.9", "1.1"] = "0.9"
)

XGBoost classifier model.

Parameters
Name	Description
`n_estimators`	`Optional[int]` Number of parallel trees constructed during each iteration. Default to 1.
`booster`	`Optional[str]` Specify which booster to use: gbtree or dart. Default to "gbtree".
`dart_normalized_type`	`Optional[str]` Type of normalization algorithm for DART booster. Possible values: "tree", "forest". Default to "tree".
`tree_method`	`Optional[str]` Specify which tree method to use. Possible values: "auto", "exact", "approx", "hist". Default to "auto", in which case XGBoost chooses the most conservative option available.
`min_tree_child_weight`	`Optional[int]` Minimum sum of instance weight (hessian) needed in a child. Default to 1.
`colsample_bytree`	`Optional[float]` Subsample ratio of columns when constructing each tree. Default to 1.0.
`colsample_bylevel`	`Optional[float]` Subsample ratio of columns for each level. Default to 1.0.
`colsample_bynode`	`Optional[float]` Subsample ratio of columns for each split. Default to 1.0.
`gamma`	`Optional[float]` (min_split_loss) Minimum loss reduction required to make a further partition on a leaf node of the tree. Default to 0.0.
`max_depth`	`Optional[int]` Maximum tree depth for base learners. Default to 6.
`subsample`	`Optional[float]` Subsample ratio of the training instance. Default to 1.0.
`reg_alpha`	`Optional[float]` L1 regularization term on weights (xgb's alpha). Default to 0.0.
`reg_lambda`	`Optional[float]` L2 regularization term on weights (xgb's lambda). Default to 1.0.
`learning_rate`	`Optional[float]` Boosting learning rate (xgb's "eta"). Default to 0.3.
`max_iterations`	`Optional[int]` Maximum number of rounds for boosting. Default to 20.
`tol`	`Optional[float]` Minimum relative loss improvement necessary to continue training. Default to 0.01.
`enable_global_explain`	`Optional[bool]` Whether to compute global explanations using explainable AI to evaluate global feature importance to the model. Default to False.
`xgboost_version`	`Optional[str]` Specifies the XGBoost version for model training. Possible values: "0.9", "1.1". Default to "0.9".
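
As a minimal sketch of how the constructor might be called, the snippet below collects a few of the hyperparameters from the table above into keyword arguments. The import path `bigframes.ml.ensemble` is the one shown in the `to_gbq` return annotation; the helper name `build_classifier` and the specific values are illustrative assumptions, not recommendations.

```python
# Illustrative values only; each key matches a constructor parameter listed above.
HYPERPARAMS = {
    "booster": "dart",                 # "gbtree" or "dart"
    "dart_normalized_type": "forest",  # normalization algorithm for the DART booster
    "tree_method": "hist",             # "auto", "exact", "approx" or "hist"
    "max_depth": 4,
    "learning_rate": 0.1,
    "max_iterations": 50,
    "tol": 0.005,
}


def build_classifier(hyperparams=HYPERPARAMS):
    """Hypothetical helper: construct an XGBClassifier from keyword arguments."""
    # Imported lazily so this module can be loaded without bigframes installed.
    from bigframes.ml.ensemble import XGBClassifier

    # Everything after the `*` in the signature must be passed by keyword.
    return XGBClassifier(n_estimators=1, **hyperparams)
```

Parameters left out of the dictionary keep the defaults shown in the signature.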

Methods

__repr__

__repr__()

Return a string representation of the estimator: its constructor call with all non-default parameter values.

fit

fit(
    X: typing.Union[
        bigframes.dataframe.DataFrame,
        bigframes.series.Series,
        pandas.core.frame.DataFrame,
        pandas.core.series.Series,
    ],
    y: typing.Union[
        bigframes.dataframe.DataFrame,
        bigframes.series.Series,
        pandas.core.frame.DataFrame,
        pandas.core.series.Series,
    ],
    X_eval: typing.Optional[
        typing.Union[
            bigframes.dataframe.DataFrame,
            bigframes.series.Series,
            pandas.core.frame.DataFrame,
            pandas.core.series.Series,
        ]
    ] = None,
    y_eval: typing.Optional[
        typing.Union[
            bigframes.dataframe.DataFrame,
            bigframes.series.Series,
            pandas.core.frame.DataFrame,
            pandas.core.series.Series,
        ]
    ] = None,
) -> bigframes.ml.base._T

Fit gradient boosting model.

Note that calling fit() multiple times re-fits the model object from scratch. To resume training from a previous checkpoint, explicitly pass the xgb_model argument.

Parameters
Name	Description
`X`	`bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series` Series or DataFrame of shape (n_samples, n_features). Training data.
`y`	`bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series` Series or DataFrame of shape (n_samples,) or (n_samples, n_targets). Target values. Will be cast to X's dtype if necessary.
`X_eval`	`bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series, default None` Series or DataFrame of shape (n_samples, n_features). Evaluation data.
`y_eval`	`bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series, default None` Series or DataFrame of shape (n_samples,) or (n_samples, n_targets). Evaluation target values. Will be cast to X_eval's dtype if necessary.

Returns
Type	Description
`XGBClassifier`	Fitted estimator.
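
The call pattern for `fit` with and without evaluation data can be sketched as follows. The wrapper name `fit_classifier` is hypothetical, and the rule that `X_eval` and `y_eval` must be supplied together is an assumption of this sketch rather than documented behavior.

```python
def fit_classifier(model, X_train, y_train, X_eval=None, y_eval=None):
    """Hypothetical wrapper around fit() that keeps evaluation data paired."""
    # Assumption of this sketch: evaluation features and labels are supplied
    # together or not at all.
    if (X_eval is None) != (y_eval is None):
        raise ValueError("Pass both X_eval and y_eval, or neither.")
    if X_eval is None:
        return model.fit(X_train, y_train)
    # fit() returns the fitted estimator, so the result can be used directly.
    return model.fit(X_train, y_train, X_eval=X_eval, y_eval=y_eval)
```

Because `fit` returns the estimator, a call such as `fit_classifier(model, X, y).predict(X_new)` can be chained.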

get_params

get_params(deep: bool = True) -> typing.Dict[str, typing.Any]

Get parameters for this estimator.

Parameter
Name	Description
`deep`	`bool, default True` If True, returns the parameters for this estimator and for contained subobjects that are estimators.

Returns
Type	Description
`Dict[str, Any]`	A dictionary of parameter names mapped to their values.
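
One common use of `get_params` is to build a fresh, unfitted estimator that differs in only a few settings. The sketch below assumes that the returned keys match the constructor's keyword names, as in scikit-learn-style estimators; `clone_with` is a hypothetical helper, not part of the library.

```python
def clone_with(model, **overrides):
    """Hypothetical helper: unfitted copy of `model` with some parameters changed."""
    # deep=False is enough here: only the estimator's own parameters are needed.
    params = model.get_params(deep=False)
    params.update(overrides)
    # Assumes every key returned by get_params() is a valid constructor keyword.
    return type(model)(**params)
```

For example, `clone_with(model, max_depth=3)` would leave `model` untouched and return a new estimator with a shallower tree limit.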

predict

predict(
    X: typing.Union[
        bigframes.dataframe.DataFrame,
        bigframes.series.Series,
        pandas.core.frame.DataFrame,
        pandas.core.series.Series,
    ],
) -> bigframes.dataframe.DataFrame

Predict using the XGB model.

Parameter
Name	Description
`X`	`bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series` Series or DataFrame of shape (n_samples, n_features). Samples.

Returns
Type	Description
`bigframes.dataframe.DataFrame`	DataFrame of shape (n_samples, n_input_columns + n_prediction_columns). Returns predicted values.
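
Since the returned DataFrame contains the input columns followed by the prediction columns, the added columns can be found by comparing column names. This is a minimal sketch that assumes `X` is a DataFrame (not a Series) and that both objects expose a `columns` attribute; `prediction_columns` is a hypothetical helper.

```python
def prediction_columns(X, predictions):
    """Hypothetical helper: names of the columns that predict() added to X."""
    input_columns = set(X.columns)
    # Keep the original order of the prediction output.
    return [col for col in predictions.columns if col not in input_columns]
```

A typical use would be `predictions = model.predict(X)` followed by `predictions[prediction_columns(X, predictions)]` to keep only the model output.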

register

register(vertex_ai_model_id: typing.Optional[str] = None) -> bigframes.ml.base._T

Register the model to Vertex AI.

After registering, go to the Google Cloud console (https://console.cloud.google.com/vertex-ai/models) to manage the model registry. Refer to https://cloud.google.com/vertex-ai/docs/model-registry/introduction for more options.

Parameter
Name	Description
`vertex_ai_model_id`	`Optional[str], default None` Optional string to use as the model ID in Vertex AI. If not set, defaults to 'bigframes_{bq_model_id}'. The Vertex AI model ID is truncated to 63 characters because of the Vertex AI length limit.
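
The default naming rule described above can be illustrated in plain Python. This is only a sketch of the documented rule: the function name is hypothetical, and keeping the leading characters when truncating is an assumption.

```python
VERTEX_AI_MODEL_ID_LIMIT = 63  # length limit stated in the parameter description


def default_vertex_ai_model_id(bq_model_id: str) -> str:
    """Hypothetical helper mirroring the documented default 'bigframes_{bq_model_id}'."""
    candidate = f"bigframes_{bq_model_id}"
    # Assumption: truncation keeps the first 63 characters.
    return candidate[:VERTEX_AI_MODEL_ID_LIMIT]
```

Checking the length up front makes it easier to spot two long BigQuery model IDs that would collide after truncation.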

score

score(
    X: typing.Union[
        bigframes.dataframe.DataFrame,
        bigframes.series.Series,
        pandas.core.frame.DataFrame,
        pandas.core.series.Series,
    ],
    y: typing.Union[
        bigframes.dataframe.DataFrame,
        bigframes.series.Series,
        pandas.core.frame.DataFrame,
        pandas.core.series.Series,
    ],
)

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy, a harsh metric because each sample's entire label set must be predicted correctly.

Parameters
Name	Description
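`X`	`bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series` Series or DataFrame of shape (n_samples, n_features). Test samples.
`y`	`bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series` Series or DataFrame of shape (n_samples,) or (n_samples, n_outputs). True labels for `X`.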

Returns
Type	Description
`bigframes.dataframe.DataFrame`	A DataFrame of the evaluation result.
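
Because `score` returns a DataFrame rather than a single number, it can be convenient to pull the result into a local dictionary. The sketch below assumes the evaluation result is a single row of metrics and uses `to_pandas()` to download it; the helper name and the single-row layout are assumptions, and the metric names are not specified here.

```python
def evaluation_as_dict(model, X_test, y_test):
    """Hypothetical helper: return the evaluation result as {metric_name: value}."""
    result = model.score(X_test, y_test)
    # Download the (small) evaluation result to a local pandas DataFrame.
    local = result.to_pandas()
    # Assumption: the result has exactly one row, with one column per metric.
    return local.iloc[0].to_dict()
```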

to_gbq

to_gbq(
    model_name: str, replace: bool = False
) -> bigframes.ml.ensemble.XGBClassifier

Save the model to BigQuery.

Parameters
Name	Description
`model_name`	`str` The name of the model.
`replace`	`bool, default False` Whether to replace the model if it already exists.

Returns
Type	Description
`XGBClassifier`	Saved model.
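
Putting `fit` and `to_gbq` together, a train-then-save step might look like the sketch below. The helper name `train_and_save` is hypothetical, and the model name format in the usage note is a placeholder, not a documented requirement.

```python
def train_and_save(model, X, y, model_name, replace=False):
    """Hypothetical helper: fit the estimator, then save it to BigQuery."""
    fitted = model.fit(X, y)
    # to_gbq() returns the saved model, which can be used for later predictions.
    return fitted.to_gbq(model_name, replace=replace)
```

A call could look like `saved = train_and_save(model, X, y, "my_dataset.my_model", replace=True)`, where `my_dataset.my_model` stands in for a real model name.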