Class KFold (1.20.0)

KFold(n_splits: int = 5, *, random_state: typing.Optional[int] = None)

K-Fold cross-validator.

Splits data into train/test sets by dividing the dataset into k consecutive folds.

Each fold is then used once as the validation set while the remaining k - 1 folds form the training set.
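The fold rotation described above can be sketched in plain Python. This is only an illustration of the general k-fold idea on integer row positions, not the BigFrames implementation. The helper name `consecutive_folds` is hypothetical, and giving the first folds one extra row when the data does not divide evenly is an assumption borrowed from common practice.

```python
def consecutive_folds(n_samples, n_splits=5):
    """Yield (train_indices, test_indices) for k consecutive folds.

    Hypothetical helper for illustration; not part of BigFrames.
    """
    if n_splits < 2:
        raise ValueError("n_splits must be at least 2")
    if n_splits > n_samples:
        raise ValueError("n_splits cannot exceed n_samples")
    indices = list(range(n_samples))
    # Assumption: when n_samples is not divisible by n_splits,
    # the first `extra` folds receive one additional row.
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        stop = start + base + (1 if fold < extra else 0)
        test = indices[start:stop]                 # this fold validates
        train = indices[:start] + indices[stop:]   # the other k - 1 folds train
        yield train, test
        start = stop


folds = list(consecutive_folds(10, n_splits=5))
for train, test in folds:
    print("train:", train, "test:", test)
```

Every row lands in the validation fold exactly once across the k iterations, which is the property that makes k-fold estimates use all of the data.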

Parameters

Name Description
n_splits int

Number of folds. Must be at least 2. Defaults to 5.

random_state Optional[int]

A seed used when randomly choosing the rows of each split. If not set, a different random split is generated each time. Defaults to None.
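The effect of a fixed seed can be illustrated with the standard library alone. The sketch below is not how BigFrames selects rows. It only shows, under the assumption of a simple seeded shuffle, why a fixed `random_state` gives a repeatable row order while `None` does not. The helper name `shuffled_rows` is hypothetical.

```python
import random


def shuffled_rows(n_rows, random_state=None):
    """Return row positions 0..n_rows-1 in a shuffled order.

    Hypothetical helper for illustration; not part of BigFrames.
    """
    # A seeded generator reproduces the same order on every call;
    # a seed of None draws from system entropy, so the order varies.
    rng = random.Random(random_state)
    order = list(range(n_rows))
    rng.shuffle(order)
    return order


first = shuffled_rows(8, random_state=42)
second = shuffled_rows(8, random_state=42)
print(first == second)  # the same seed reproduces the same order
```

Fixing the seed is useful when cross-validation scores need to be compared across runs, since each run then evaluates on identical folds.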

Methods

get_n_splits

get_n_splits() -> int

Returns the number of splitting iterations in the cross-validator.

Returns
Type Description
int the number of splitting iterations in the cross-validator.

split

split(
    X: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y: typing.Optional[
        typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series]
    ] = None,
) -> typing.Generator[
    tuple[
        typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series, NoneType]
    ],
    None,
    None,
]

Generate the training and test data for each split.

Parameters
Name Description
X bigframes.dataframe.DataFrame or bigframes.series.Series

BigFrames DataFrame or Series of shape (n_samples, n_features). Training data, where n_samples is the number of samples and n_features is the number of features.

y bigframes.dataframe.DataFrame, bigframes.series.Series or None

BigFrames DataFrame, Series of shape (n_samples,) or None. The target variable for supervised learning problems. Defaults to None.

Yields
Name Description
X_train bigframes.dataframe.DataFrame or bigframes.series.Series

The training data for that split.

X_test bigframes.dataframe.DataFrame or bigframes.series.Series

The testing data for that split.

y_train bigframes.dataframe.DataFrame, bigframes.series.Series or None

The training label for that split.

y_test bigframes.dataframe.DataFrame, bigframes.series.Series or None

The testing label for that split.
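A minimal sketch of consuming the generator follows. The function `fold_sizes` accepts any object whose `split(X, y)` yields four items per fold in the order X_train, X_test, y_train, y_test. `ListSplitter` is a hypothetical stand-in that works on plain lists so the example runs without a BigQuery session. It is not the BigFrames class and its fold sizing is an assumption.

```python
def fold_sizes(cv, X, y=None):
    """Return a (train_rows, test_rows) pair for every fold."""
    sizes = []
    # Each iteration yields four objects: X_train, X_test, y_train, y_test.
    for X_train, X_test, y_train, y_test in cv.split(X, y):
        sizes.append((len(X_train), len(X_test)))
    return sizes


class ListSplitter:
    """Hypothetical list-based stand-in with a KFold-like interface."""

    def __init__(self, n_splits=5):
        self.n_splits = n_splits

    def get_n_splits(self):
        return self.n_splits

    def split(self, X, y=None):
        # Assumption: the first `extra` folds get one additional row.
        base, extra = divmod(len(X), self.n_splits)
        start = 0
        for fold in range(self.n_splits):
            stop = start + base + (1 if fold < extra else 0)
            y_train = None if y is None else y[:start] + y[stop:]
            y_test = None if y is None else y[start:stop]
            yield X[:start] + X[stop:], X[start:stop], y_train, y_test
            start = stop


cv = ListSplitter(n_splits=3)
sizes = fold_sizes(cv, list(range(7)), list(range(7)))
print(cv.get_n_splits(), sizes)
```

With BigFrames, the same loop would presumably receive a `KFold(n_splits=3)` instance and DataFrame or Series inputs in place of the stand-in. When `y` is omitted, the last two yielded items are None.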