Class SeriesGroupBy (1.4.0)

SeriesGroupBy(
    block: bigframes.core.blocks.Block,
    value_column: str,
    by_col_ids: typing.Sequence[str],
    value_name: typing.Hashable = None,
    dropna=True,
)

Class for grouping and aggregating relational data.

Methods

agg

agg(
    func=None,
) -> typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series]

Aggregate using one or more operations.
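
To make the per-group semantics concrete, here is a minimal plain-Python sketch of what an aggregation with one or more operations computes. It models the behavior only and does not call bigframes; the helper `group_agg` and the sample data are hypothetical.

```python
from collections import defaultdict

def group_agg(keys, values, funcs):
    """Apply each reducer in `funcs` to the values of every group (hypothetical helper)."""
    groups = defaultdict(list)
    for key, value in zip(keys, values):
        groups[key].append(value)
    # One result row per group, one entry per requested operation.
    return {
        key: {name: fn(vals) for name, fn in funcs.items()}
        for key, vals in sorted(groups.items())
    }

result = group_agg(["a", "a", "b"], [1, 2, 5], {"sum": sum, "max": max})
print(result)
```

A single operation yields one value per group, which corresponds to a Series; several operations yield one column per operation, which is presumably why the signature allows either return type.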

aggregate

aggregate(
    func=None,
) -> typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series]

Alias of `agg`. Aggregate using one or more operations.

all

all() -> bigframes.series.Series

Return True if all values in the group are true, else False.

Returns
Type	Description
`Series or DataFrame`	DataFrame or Series of boolean values, where a value is True if all elements are True within its respective group; otherwise False.

any

any() -> bigframes.series.Series

Return True if any value in the group is true, else False.

Returns
Type	Description
`Series or DataFrame`	DataFrame or Series of boolean values, where a value is True if any element is True within its respective group; otherwise False.

count

count() -> bigframes.series.Series

Compute count of group, excluding missing values.

Returns
Type	Description
`Series or DataFrame`	Count of values within each group.

cumcount

cumcount(*args, **kwargs) -> bigframes.series.Series

Number each item in each group from 0 to the length of that group - 1.

Parameter
Name	Description
`ascending`	`bool, default True` If False, number in reverse, from length of group - 1 to 0.

Returns
Type	Description
`Series`	Sequence number of each element within each group.
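
The numbering rule can be sketched in plain Python as follows. This only illustrates the documented behavior of `ascending`; the helper name `cumcount` and the sample keys are illustrative and the code does not use bigframes.

```python
from collections import defaultdict

def cumcount(keys, ascending=True):
    """Number each row within its group, starting at 0 (illustrative helper)."""
    sizes = defaultdict(int)
    for key in keys:
        sizes[key] += 1
    seen = defaultdict(int)
    out = []
    for key in keys:
        position = seen[key]
        seen[key] += 1
        # Reverse numbering counts down from len(group) - 1 to 0.
        out.append(position if ascending else sizes[key] - 1 - position)
    return out

keys = ["a", "a", "b", "a", "b"]
forward = cumcount(keys)
backward = cumcount(keys, ascending=False)
print(forward, backward)
```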

cummax

cummax(*args, **kwargs) -> bigframes.series.Series

Cumulative max for each group.

Returns
Type	Description
`Series or DataFrame`	Cumulative max for each group.

cummin

cummin(*args, **kwargs) -> bigframes.series.Series

Cumulative min for each group.

Returns
Type	Description
`Series or DataFrame`	Cumulative min for each group.

cumprod

cumprod(*args, **kwargs) -> bigframes.series.Series

Cumulative product for each group.

Returns
Type	Description
`Series or DataFrame`	Cumulative product for each group.

cumsum

cumsum(*args, **kwargs) -> bigframes.series.Series

Cumulative sum for each group.

Returns
Type	Description
`Series or DataFrame`	Cumulative sum for each group.
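
The cumulative methods (`cummax`, `cummin`, `cumprod`, `cumsum`) share one pattern: a running reduction that restarts for every group while the output keeps the original row order. A minimal plain-Python sketch, assuming no missing values; `group_cumulative` is a hypothetical helper, not a bigframes function.

```python
import operator

def group_cumulative(keys, values, combine):
    """Running reduction of `values`, restarted for each key (hypothetical helper)."""
    running = {}
    out = []
    for key, value in zip(keys, values):
        # The first value of a group starts the run; later values are combined into it.
        running[key] = value if key not in running else combine(running[key], value)
        out.append(running[key])
    return out

keys = ["a", "b", "a", "b", "a"]
values = [1, 10, 2, 20, 3]
cumsum = group_cumulative(keys, values, operator.add)
cumprod = group_cumulative(keys, values, operator.mul)
cummax = group_cumulative(keys, [3, 1, 2, 5, 4], max)
print(cumsum, cumprod, cummax)
```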

diff

diff(periods=1) -> bigframes.series.Series

First discrete difference of element. Calculates the difference of each element compared with another element in the group (default is element in previous row).

Returns
Type	Description
`Series or DataFrame`	First differences.
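
A rough plain-Python sketch of the per-group difference, assuming a positive `periods` and using `None` for rows that have no earlier element in their group. `group_diff` is a hypothetical helper for illustration only.

```python
def group_diff(keys, values, periods=1):
    """Difference from the element `periods` rows earlier in the same group."""
    history = {}
    out = []
    for key, value in zip(keys, values):
        previous = history.setdefault(key, [])
        # The first `periods` rows of a group have nothing to compare against.
        out.append(value - previous[-periods] if len(previous) >= periods else None)
        previous.append(value)
    return out

keys = ["a", "a", "b", "a", "b"]
values = [1, 4, 10, 9, 7]
first_diff = group_diff(keys, values)
second_lag = group_diff(keys, values, periods=2)
print(first_diff, second_lag)
```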

expanding

expanding(min_periods: int = 1) -> bigframes.core.window.Window

Provides expanding functionality.

Returns
Type	Description
`bigframes.core.window.Window`	An expanding grouper, providing expanding functionality per group.
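
As an illustration of what an expanding window means per group, the sketch below computes an expanding sum in plain Python. It assumes `min_periods` is the minimum number of observations needed before a value is produced; `group_expanding_sum` is a hypothetical helper and `None` stands in for a missing result.

```python
def group_expanding_sum(keys, values, min_periods=1):
    """Sum of all values seen so far in each group (hypothetical helper)."""
    seen = {}
    out = []
    for key, value in zip(keys, values):
        window = seen.setdefault(key, [])
        window.append(value)
        # The window grows with every row of the group and never drops old rows.
        out.append(sum(window) if len(window) >= min_periods else None)
    return out

keys = ["a", "a", "b", "a"]
values = [1, 2, 10, 3]
default_result = group_expanding_sum(keys, values)
strict_result = group_expanding_sum(keys, values, min_periods=2)
print(default_result, strict_result)
```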

kurt

kurt(*args, **kwargs) -> bigframes.series.Series

Return unbiased kurtosis within groups.

Kurtosis obtained using Fisher's definition of kurtosis (kurtosis of normal == 0.0). Normalized by N-1.

Parameter
Name	Description
`numeric_only`	`bool, default False` Include only `float`, `int` or `boolean` data.

kurtosis

kurtosis(*args, **kwargs) -> bigframes.series.Series

Alias of `kurt`. Return unbiased kurtosis within groups.

max

max(*args) -> bigframes.series.Series

Compute max of group values.

Parameters
Name	Description
`numeric_only`	`bool, default False` Include only float, int, boolean columns.
`min_count`	`int, default 0` The required number of valid values to perform the operation. If fewer than `min_count` non-NA values are present, the result will be NA.

Returns
Type	Description
`Series or DataFrame`	Computed max of values within each group.
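
The `min_count` rule described in the parameter table can be sketched for a single group as follows. This is a plain-Python model in which `None` plays the role of NA; `max_with_min_count` is a hypothetical helper, and whether a given bigframes version accepts `min_count` should be checked against its signature.

```python
def max_with_min_count(values, min_count=0):
    """Max of the non-missing values, or None if too few are present."""
    valid = [v for v in values if v is not None]
    # Too few valid values (or none at all) gives a missing result.
    if len(valid) < min_count or not valid:
        return None
    return max(valid)

group = [1, None, 3]
relaxed = max_with_min_count(group)               # two valid values, no threshold
satisfied = max_with_min_count(group, min_count=2)
too_strict = max_with_min_count(group, min_count=3)
print(relaxed, satisfied, too_strict)
```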

mean

mean(*args) -> bigframes.series.Series

Compute mean of groups, excluding missing values.

Parameter
Name	Description
`numeric_only`	`bool, default False` Include only float, int, boolean columns.

Returns
Type	Description
`Series or DataFrame`	Mean of groups.

median

median(*args, exact: bool = True, **kwargs) -> bigframes.series.Series

Compute median of groups, excluding missing values.

Parameters
Name	Description
`numeric_only`	`bool, default False` Include only float, int, boolean columns.
`exact`	`bool, default True` Calculate the exact median instead of an approximation.

Returns
Type	Description
`Series or DataFrame`	Median of groups.

min

min(*args) -> bigframes.series.Series

Compute min of group values.

Parameters
Name	Description
`numeric_only`	`bool, default False` Include only float, int, boolean columns.
`min_count`	`int, default 0` The required number of valid values to perform the operation. If fewer than `min_count` non-NA values are present, the result will be NA.

Returns
Type	Description
`Series or DataFrame`	Computed min of values within each group.

nunique

nunique() -> bigframes.series.Series

Return number of unique elements in the group.

Returns
Type	Description
`Series`	Number of unique values within each group.

prod

prod(*args) -> bigframes.series.Series

Compute product of group values.

Parameters
Name	Description
`numeric_only`	`bool, default False` Include only float, int, boolean columns.
`min_count`	`int, default 0` The required number of valid values to perform the operation. If fewer than `min_count` non-NA values are present, the result will be NA.

Returns
Type	Description
`Series or DataFrame`	Computed product of values within each group.

quantile

quantile(
    q: typing.Union[float, typing.Sequence[float]] = 0.5, *, numeric_only: bool = False
) -> bigframes.series.Series

Return group values at the given quantile, a la numpy.percentile.

Examples:

>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame([
...     ['a', 1], ['a', 2], ['a', 3],
...     ['b', 1], ['b', 3], ['b', 5]
... ], columns=['key', 'val'])
>>> df.groupby('key').quantile()
     val
key
a    2.0
b    3.0
<BLANKLINE>
[2 rows x 1 columns]

Parameters
Name	Description
`q`	`float or array-like, default 0.5 (50% quantile)` Value(s) between 0 and 1 providing the quantile(s) to compute.
`numeric_only`	`bool, default False` Include only `float`, `int` or `boolean` data.

Returns
Type	Description
`Series or DataFrame`	Return type determined by caller of GroupBy object.
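
For readers who want to see how a quantile between 0 and 1 maps to a value, the following plain-Python sketch uses linear interpolation between the two nearest order statistics, which is the `numpy.percentile` default. That interpolation choice is an assumption; the source does not state which method bigframes applies. `quantile_linear` is a hypothetical helper.

```python
def quantile_linear(values, q=0.5):
    """Quantile of `values` with linear interpolation (assumed method)."""
    ordered = sorted(values)
    # Fractional position of the quantile within the sorted values.
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction

groups = {"a": [1, 2, 3], "b": [1, 3, 5]}
medians = {key: quantile_linear(vals) for key, vals in groups.items()}
lower_quartile = quantile_linear([1, 2, 3, 4], q=0.25)
print(medians, lower_quartile)
```

With the same data as the example above, the sketch gives 2.0 for group `a` and 3.0 for group `b`.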

rolling

rolling(window: int, min_periods=None) -> bigframes.core.window.Window

Returns a rolling grouper, providing rolling functionality per group.

Parameter
Name	Description
`min_periods`	`int, default None` Minimum number of observations in window required to have a value; otherwise, result is `np.nan`. For a window that is specified by an offset, `min_periods` will default to 1. For a window that is specified by an integer, `min_periods` will default to the size of the window.

Returns
Type	Description
`bigframes.core.window.Window`	A rolling grouper, providing rolling functionality per group.
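
The interaction between an integer `window` and `min_periods` can be illustrated with a per-group rolling sum. This is a plain-Python sketch, not the bigframes implementation; `group_rolling_sum` is a hypothetical helper and `None` stands in for `np.nan`.

```python
def group_rolling_sum(keys, values, window, min_periods=None):
    """Sum over the last `window` rows of each group (hypothetical helper)."""
    if min_periods is None:
        # For an integer window, min_periods defaults to the window size.
        min_periods = window
    history = {}
    out = []
    for key, value in zip(keys, values):
        seen = history.setdefault(key, [])
        seen.append(value)
        recent = seen[-window:]
        out.append(sum(recent) if len(recent) >= min_periods else None)
    return out

keys = ["a", "a", "b", "a", "b"]
values = [1, 2, 10, 3, 20]
full_windows_only = group_rolling_sum(keys, values, window=2)
partial_allowed = group_rolling_sum(keys, values, window=2, min_periods=1)
print(full_windows_only, partial_allowed)
```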

shift

shift(periods=1) -> bigframes.series.Series

Shift the values of each group by the desired number of periods.
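
Assuming the shift is applied within each group (as pandas' groupby `shift` does), the effect can be sketched in plain Python. Only positive `periods` are handled here, `None` marks rows with no earlier value in their group, and `group_shift` is a hypothetical helper.

```python
def group_shift(keys, values, periods=1):
    """Value found `periods` rows earlier in the same group, else None."""
    history = {}
    out = []
    for key, value in zip(keys, values):
        seen = history.setdefault(key, [])
        # Look back only within the current group's own history.
        out.append(seen[-periods] if len(seen) >= periods else None)
        seen.append(value)
    return out

keys = ["a", "a", "b", "a"]
values = [1, 2, 10, 3]
shifted = group_shift(keys, values)
print(shifted)
```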

skew

skew(*args, **kwargs) -> bigframes.series.Series

Return unbiased skew within groups.

Normalized by N-1.

Parameter
Name	Description
`numeric_only`	`bool, default False` Include only `float`, `int` or `boolean` data.
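
One common bias-adjusted estimator is the adjusted Fisher-Pearson coefficient, which pandas uses for `skew`. The sketch below applies it to a single group; that bigframes uses exactly this formula is an assumption, and `sample_skew` is a hypothetical helper.

```python
import math

def sample_skew(values):
    """Adjusted Fisher-Pearson skewness of one group (assumed estimator)."""
    n = len(values)
    if n < 3:
        return None  # the adjustment is undefined for fewer than three values
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0  # constant group: treated here as having no skew
    # Bias correction factor applied to the population skewness m3 / m2**1.5.
    return math.sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5

symmetric = sample_skew([1, 2, 3])
right_tailed = sample_skew([1, 2, 10])
print(symmetric, right_tailed)
```

A symmetric group has zero skew, and a group with a long right tail has positive skew.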

std

std(*args, **kwargs) -> bigframes.series.Series

Compute standard deviation of groups, excluding missing values.

For multiple groupings, the result index will be a MultiIndex.

Parameter
Name	Description
`numeric_only`	`bool, default False` Include only `float`, `int` or `boolean` data.

Returns
Type	Description
`Series or DataFrame`	Standard deviation of values within each group.

sum

sum(*args) -> bigframes.series.Series

Compute sum of group values.

Parameters
Name	Description
`numeric_only`	`bool, default False` Include only float, int, boolean columns.
`min_count`	`int, default 0` The required number of valid values to perform the operation. If fewer than `min_count` non-NA values are present, the result will be NA.

Returns
Type	Description
`Series or DataFrame`	Computed sum of values within each group.

var

var(*args, **kwargs) -> bigframes.series.Series

Compute variance of groups, excluding missing values.

For multiple groupings, the result index will be a MultiIndex.

Parameter
Name	Description
`numeric_only`	`bool, default False` Include only `float`, `int` or `boolean` data.