Class SamplingOptions (0.17.0)

SamplingOptions(
    max_download_size: typing.Optional[int] = 500,
    enable_downsampling: bool = False,
    sampling_method: typing.Literal["head", "uniform"] = "uniform",
    random_state: typing.Optional[int] = None,
)

Encapsulates configuration for data sampling.

Attributes

Name  Description
max_download_size int, default 500
Download size threshold in MB. If set to None, the download size is not checked.
enable_downsampling bool, default False
Whether to enable downsampling. If max_download_size is exceeded when downloading data (e.g., to_pandas()), the data is downsampled if enable_downsampling is True; otherwise, an error is raised.
sampling_method str, default "uniform"
The downsampling algorithm to use. The choices are "head", which returns a portion of the data from the beginning and is fast because it requires minimal computation, and "uniform", which returns uniform random samples of the data.
random_state int, default None
The seed for the uniform downsampling algorithm. If provided, the uniform method may take longer to execute and require more computation.
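
The way these attributes interact can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: `SamplingSketch`, `rows_to_download`, the `size_mb` argument, the choice of `RuntimeError`, and the rule for how many rows to keep are all hypothetical stand-ins chosen only to mirror the behavior described above.

```python
import random
from dataclasses import dataclass
from typing import List, Literal, Optional, Sequence


@dataclass
class SamplingSketch:
    """Hypothetical stand-in with the same attributes and defaults as documented."""

    max_download_size: Optional[int] = 500  # threshold in MB; None disables the check
    enable_downsampling: bool = False
    sampling_method: Literal["head", "uniform"] = "uniform"
    random_state: Optional[int] = None


def rows_to_download(rows: Sequence[int], size_mb: float, opts: SamplingSketch) -> List[int]:
    """Return the rows that would be downloaded under the given options."""
    # No threshold, or threshold not exceeded: download everything.
    if opts.max_download_size is None or size_mb <= opts.max_download_size:
        return list(rows)

    # Threshold exceeded and downsampling disabled: raise an error.
    if not opts.enable_downsampling:
        raise RuntimeError(
            f"Download size {size_mb} MB exceeds max_download_size "
            f"{opts.max_download_size} MB and downsampling is disabled."
        )

    # Assumption: keep roughly the fraction of rows that fits the threshold.
    fraction = opts.max_download_size / size_mb
    n = min(len(rows), max(1, int(len(rows) * fraction)))

    if opts.sampling_method == "head":
        # "head": take rows from the beginning, no randomness involved.
        return list(rows[:n])

    # "uniform": random sample without replacement; random_state seeds it.
    rng = random.Random(opts.random_state)
    return rng.sample(list(rows), n)


rows = list(range(1000))
head_opts = SamplingSketch(enable_downsampling=True, sampling_method="head")
seeded_opts = SamplingSketch(enable_downsampling=True, random_state=42)

head_rows = rows_to_download(rows, size_mb=1000, opts=head_opts)
seeded_a = rows_to_download(rows, size_mb=1000, opts=seeded_opts)
seeded_b = rows_to_download(rows, size_mb=1000, opts=seeded_opts)
```

In this sketch, a fixed `random_state` makes the "uniform" result repeatable across calls, while "head" is deterministic regardless of the seed.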