Class SeriesGroupBy (1.3.0)

SeriesGroupBy(
    block: bigframes.core.blocks.Block,
    value_column: str,
    by_col_ids: typing.Sequence[str],
    value_name: typing.Hashable = None,
    dropna=True,
)

Class for grouping and aggregating relational data.

Methods

agg

agg(
    func=None,
) -> typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series]

Aggregate using one or more operations.

aggregate

aggregate(
    func=None,
) -> typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series]

Alias of agg: aggregate using one or more operations.

all

all() -> bigframes.series.Series

Return True if all values in the group are true, else False.

Returns
Type: Series or DataFrame
Description: DataFrame or Series of boolean values, where a value is True if all elements are True within its respective group; otherwise False.

any

any() -> bigframes.series.Series

Return True if any value in the group is true, else False.

Returns
Type: Series or DataFrame
Description: DataFrame or Series of boolean values, where a value is True if any element is True within its respective group; otherwise False.
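The per-group semantics of all and any can be illustrated without a BigQuery session. The following is a minimal pure-Python sketch of what the two reductions compute, not the bigframes implementation; the helper name `group_all_any` is hypothetical.

```python
from collections import defaultdict


def group_all_any(keys, values):
    """Return ({key: all(...)}, {key: any(...)}) computed per group."""
    groups = defaultdict(list)
    for key, value in zip(keys, values):
        groups[key].append(bool(value))
    all_result = {key: all(vals) for key, vals in groups.items()}
    any_result = {key: any(vals) for key, vals in groups.items()}
    return all_result, any_result


keys = ["a", "a", "b", "b"]
values = [True, False, False, False]
all_result, any_result = group_all_any(keys, values)
print(all_result)  # {'a': False, 'b': False}
print(any_result)  # {'a': True, 'b': False}
```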

count

count() -> bigframes.series.Series

Compute count of group, excluding missing values.

Returns
Type: Series or DataFrame
Description: Count of values within each group.
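To make "excluding missing values" concrete, here is a small pure-Python sketch of the counting rule. It assumes missing values are represented by `None`; the helper name `group_count` is hypothetical and this is not the bigframes implementation.

```python
def group_count(keys, values):
    """Count the non-missing values in each group."""
    counts = {key: 0 for key in keys}  # groups with only missing values report 0
    for key, value in zip(keys, values):
        if value is not None:
            counts[key] += 1
    return counts


keys = ["a", "a", "b", "b", "b"]
values = [1, None, 2, 3, None]
counts = group_count(keys, values)
print(counts)  # {'a': 1, 'b': 2}
```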

cumcount

cumcount(*args, **kwargs) -> bigframes.series.Series

Number each item in each group from 0 to the length of that group - 1.

Parameter
Name: ascending
Type: bool, default True
Description: If False, number in reverse, from length of group - 1 to 0.

Returns
Type: Series
Description: Sequence number of each element within each group.
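The numbering rule, including the effect of ascending=False, can be sketched in plain Python. This is an illustration of the documented semantics only; the helper name `group_cumcount` is hypothetical.

```python
from collections import defaultdict


def group_cumcount(keys, ascending=True):
    """Number each row within its group, in row order."""
    sizes = defaultdict(int)
    for key in keys:
        sizes[key] += 1

    seen = defaultdict(int)
    result = []
    for key in keys:
        position = seen[key]
        seen[key] += 1
        # Reverse numbering counts down from (group length - 1) to 0.
        result.append(position if ascending else sizes[key] - 1 - position)
    return result


keys = ["a", "a", "a", "b", "b", "a"]
forward = group_cumcount(keys)
reverse = group_cumcount(keys, ascending=False)
print(forward)  # [0, 1, 2, 0, 1, 3]
print(reverse)  # [3, 2, 1, 1, 0, 0]
```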

cummax

cummax(*args, **kwargs) -> bigframes.series.Series

Cumulative max for each group.

Returns
Type: Series or DataFrame
Description: Cumulative max for each group.

cummin

cummin(*args, **kwargs) -> bigframes.series.Series

Cumulative min for each group.

Returns
Type: Series or DataFrame
Description: Cumulative min for each group.

cumprod

cumprod(*args, **kwargs) -> bigframes.series.Series

Cumulative product for each group.

Returns
Type: Series or DataFrame
Description: Cumulative product for each group.

cumsum

cumsum(*args, **kwargs) -> bigframes.series.Series

Cumulative sum for each group.

Returns
Type: Series or DataFrame
Description: Cumulative sum for each group.
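The cumulative methods (cummax, cummin, cumprod, cumsum) share one pattern: a running value is kept per group and emitted for every row, so the result has the same length as the input. A minimal pure-Python sketch for the sum case follows; the helper name `group_cumsum` is hypothetical and the code is not the bigframes implementation.

```python
def group_cumsum(keys, values):
    """Running sum per group, returned in the original row order."""
    running = {}
    result = []
    for key, value in zip(keys, values):
        running[key] = running.get(key, 0) + value
        result.append(running[key])
    return result


keys = ["a", "b", "a", "b", "a"]
values = [1, 10, 2, 20, 3]
cumulative = group_cumsum(keys, values)
print(cumulative)  # [1, 10, 3, 30, 6]
```

Swapping the addition for `max`, `min`, or multiplication gives the corresponding sketch for the other three methods.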

diff

diff(periods=1) -> bigframes.series.Series

First discrete difference of element. Calculates the difference of each element compared with another element in the group (default is element in previous row).

Returns
Type: Series or DataFrame
Description: First differences.
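As an illustration of how periods selects the comparison row within a group, here is a pure-Python sketch. It assumes a positive periods value and uses `None` where no earlier row exists in the group; the helper name `group_diff` is hypothetical.

```python
from collections import defaultdict


def group_diff(keys, values, periods=1):
    """Difference between each value and the value `periods` rows earlier in its group."""
    if periods < 1:
        raise ValueError("this sketch only handles periods >= 1")
    history = defaultdict(list)
    result = []
    for key, value in zip(keys, values):
        previous = history[key]
        if len(previous) >= periods:
            result.append(value - previous[-periods])
        else:
            result.append(None)  # not enough earlier rows in this group
        previous.append(value)
    return result


keys = ["a", "a", "b", "a", "b"]
values = [1, 4, 10, 9, 15]
diff_1 = group_diff(keys, values)
diff_2 = group_diff(keys, values, periods=2)
print(diff_1)  # [None, 3, None, 5, 5]
print(diff_2)  # [None, None, None, 8, None]
```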

expanding

expanding(min_periods: int = 1) -> bigframes.core.window.Window

Provides expanding functionality.

Returns
Type: bigframes.core.window.Window
Description: An expanding grouper, providing expanding functionality per group.
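An expanding window grows from the first row of each group to the current row, and min_periods sets how many rows the window must contain before a value is produced. The sketch below pairs the window with a sum, which is an assumed follow-up aggregation chosen for illustration; the helper name `group_expanding_sum` is hypothetical and the code is not the bigframes implementation.

```python
def group_expanding_sum(keys, values, min_periods=1):
    """Expanding sum per group; None until the window holds min_periods rows."""
    totals = {}
    counts = {}
    result = []
    for key, value in zip(keys, values):
        totals[key] = totals.get(key, 0) + value
        counts[key] = counts.get(key, 0) + 1
        result.append(totals[key] if counts[key] >= min_periods else None)
    return result


keys = ["a", "a", "a", "b", "b"]
values = [1, 2, 3, 10, 20]
expanding_default = group_expanding_sum(keys, values)
expanding_min2 = group_expanding_sum(keys, values, min_periods=2)
print(expanding_default)  # [1, 3, 6, 10, 30]
print(expanding_min2)     # [None, 3, 6, None, 30]
```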

kurt

kurt(*args, **kwargs) -> bigframes.series.Series

Return unbiased kurtosis over requested axis.

Kurtosis obtained using Fisher's definition of kurtosis (kurtosis of normal == 0.0). Normalized by N-1.

Parameter
Name: numeric_only
Type: bool, default False
Description: Include only float, int or boolean data.

kurtosis

kurtosis(*args, **kwargs) -> bigframes.series.Series

Alias of kurt: return unbiased kurtosis within groups.

max

max(*args) -> bigframes.series.Series

Compute max of group values.

Parameters
Name: numeric_only
Type: bool, default False
Description: Include only float, int, boolean columns.

Name: min_count
Type: int, default 0
Description: The required number of valid values to perform the operation. If fewer than min_count non-NA values are present, the result will be NA.

Returns
Type: Series or DataFrame
Description: Computed max of values within each group.
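The min_count rule described above applies in the same way to max, min, prod, and sum. The following pure-Python sketch shows the rule for max only. It assumes `None` stands for a missing value and for an NA result; the helper name `group_max` is hypothetical, and the code illustrates the documented parameter semantics rather than the bigframes implementation.

```python
def group_max(keys, values, min_count=0):
    """Max of the non-missing values per group, or None when too few are present."""
    valid = {}
    for key, value in zip(keys, values):
        bucket = valid.setdefault(key, [])
        if value is not None:
            bucket.append(value)
    return {
        key: max(vals) if vals and len(vals) >= min_count else None
        for key, vals in valid.items()
    }


keys = ["a", "a", "b", "b"]
values = [1, 5, None, 7]
max_default = group_max(keys, values)
max_min2 = group_max(keys, values, min_count=2)
print(max_default)  # {'a': 5, 'b': 7}
print(max_min2)     # {'a': 5, 'b': None}
```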

mean

mean(*args) -> bigframes.series.Series

Compute mean of groups, excluding missing values.

Parameter
Name: numeric_only
Type: bool, default False
Description: Include only float, int, boolean columns.

Returns
Type: Series or DataFrame
Description: Mean of groups.

median

median(*args, exact: bool = True, **kwargs) -> bigframes.series.Series

Compute median of groups, excluding missing values.

Parameters
Name: numeric_only
Type: bool, default False
Description: Include only float, int, boolean columns.

Name: exact
Type: bool, default True
Description: Calculate the exact median instead of an approximation.

Returns
Type: Series or DataFrame
Description: Median of groups.

min

min(*args) -> bigframes.series.Series

Compute min of group values.

Parameters
Name: numeric_only
Type: bool, default False
Description: Include only float, int, boolean columns.

Name: min_count
Type: int, default 0
Description: The required number of valid values to perform the operation. If fewer than min_count non-NA values are present, the result will be NA.

Returns
Type: Series or DataFrame
Description: Computed min of values within each group.

nunique

nunique() -> bigframes.series.Series

Return number of unique elements in the group.

Returns
Type: Series
Description: Number of unique values within each group.
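A short pure-Python sketch of the per-group distinct count follows. It assumes that missing values (represented here by `None`) are not counted, which this page does not state explicitly; the helper name `group_nunique` is hypothetical.

```python
def group_nunique(keys, values):
    """Number of distinct non-missing values in each group."""
    distinct = {}
    for key, value in zip(keys, values):
        bucket = distinct.setdefault(key, set())
        if value is not None:  # assumption: missing values are ignored
            bucket.add(value)
    return {key: len(vals) for key, vals in distinct.items()}


keys = ["a", "a", "a", "b", "b"]
values = [1, 1, 2, 3, None]
unique_counts = group_nunique(keys, values)
print(unique_counts)  # {'a': 2, 'b': 1}
```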

prod

prod(*args) -> bigframes.series.Series

Compute prod of group values.

Parameters
Name: numeric_only
Type: bool, default False
Description: Include only float, int, boolean columns.

Name: min_count
Type: int, default 0
Description: The required number of valid values to perform the operation. If fewer than min_count non-NA values are present, the result will be NA.

Returns
Type: Series or DataFrame
Description: Computed prod of values within each group.

quantile

quantile(
    q: typing.Union[float, typing.Sequence[float]] = 0.5, *, numeric_only: bool = False
) -> bigframes.series.Series

Return group values at the given quantile, a la numpy.percentile.

Examples:

>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame([
...     ['a', 1], ['a', 2], ['a', 3],
...     ['b', 1], ['b', 3], ['b', 5]
... ], columns=['key', 'val'])
>>> df.groupby('key').quantile()
     val
key
a    2.0
b    3.0
<BLANKLINE>
[2 rows x 1 columns]

Parameters
Name: q
Type: float or array-like, default 0.5 (50% quantile)
Description: Value(s) between 0 and 1 providing the quantile(s) to compute.

Name: numeric_only
Type: bool, default False
Description: Include only float, int or boolean data.

Returns
Type: Series or DataFrame
Description: Return type determined by caller of GroupBy object.
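To show how a quantile value can fall between two observations, the sketch below reproduces the example data in plain Python. It assumes linear interpolation between neighbouring sorted values (the default behaviour of numpy.percentile); whether bigframes interpolates the same way in every case is not stated on this page. The helper names `linear_quantile` and `group_quantile` are hypothetical.

```python
import math
from collections import defaultdict


def linear_quantile(sorted_values, q):
    """Quantile of a sorted, non-empty list using linear interpolation."""
    position = q * (len(sorted_values) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    low_value = sorted_values[lower]
    high_value = sorted_values[upper]
    return float(low_value + fraction * (high_value - low_value))


def group_quantile(keys, values, q=0.5):
    """Apply linear_quantile to the values of each group."""
    groups = defaultdict(list)
    for key, value in zip(keys, values):
        groups[key].append(value)
    return {key: linear_quantile(sorted(vals), q) for key, vals in groups.items()}


keys = ["a", "a", "a", "b", "b", "b"]
values = [1, 2, 3, 1, 3, 5]
median_like = group_quantile(keys, values)
first_quartile = group_quantile(keys, values, q=0.25)
print(median_like)     # {'a': 2.0, 'b': 3.0}
print(first_quartile)  # {'a': 1.5, 'b': 2.0}
```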

rolling

rolling(window: int, min_periods=None) -> bigframes.core.window.Window

Returns a rolling grouper, providing rolling functionality per group.

Parameter
Name: min_periods
Type: int, default None
Description: Minimum number of observations in window required to have a value; otherwise, result is np.nan. For a window that is specified by an offset, min_periods will default to 1. For a window that is specified by an integer, min_periods will default to the size of the window.

Returns
Type: bigframes.core.window.Window
Description: Return a new grouper with our rolling appended.
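The interaction between an integer window and min_periods can be sketched in plain Python. The example pairs the rolling window with a sum, an assumed follow-up aggregation chosen for illustration, and uses `None` where the page says np.nan; the helper name `group_rolling_sum` is hypothetical.

```python
from collections import defaultdict, deque


def group_rolling_sum(keys, values, window, min_periods=None):
    """Rolling sum over the last `window` rows of each group."""
    if min_periods is None:
        min_periods = window  # integer windows default to the window size
    recent = defaultdict(lambda: deque(maxlen=window))
    result = []
    for key, value in zip(keys, values):
        buffer = recent[key]
        buffer.append(value)  # deque drops the oldest row once full
        result.append(sum(buffer) if len(buffer) >= min_periods else None)
    return result


keys = ["a", "a", "a", "b", "b"]
values = [1, 2, 3, 10, 20]
rolling_default = group_rolling_sum(keys, values, window=2)
rolling_min1 = group_rolling_sum(keys, values, window=2, min_periods=1)
print(rolling_default)  # [None, 3, 5, None, 30]
print(rolling_min1)     # [1, 3, 5, 10, 30]
```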

shift

shift(periods=1) -> bigframes.series.Series

Shift the values within each group by the desired number of periods.
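A per-group shift moves values only among rows of the same group, so the first rows of every group have nothing to receive. The following pure-Python sketch assumes a positive periods value and uses `None` for the vacated positions; the helper name `group_shift` is hypothetical.

```python
from collections import defaultdict


def group_shift(keys, values, periods=1):
    """Replace each value with the one `periods` rows earlier in the same group."""
    if periods < 1:
        raise ValueError("this sketch only handles periods >= 1")
    history = defaultdict(list)
    result = []
    for key, value in zip(keys, values):
        previous = history[key]
        result.append(previous[-periods] if len(previous) >= periods else None)
        previous.append(value)
    return result


keys = ["a", "a", "b", "a", "b"]
values = [1, 2, 10, 3, 20]
shifted = group_shift(keys, values)
print(shifted)  # [None, 1, None, 2, 10]
```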

skew

skew(*args, **kwargs) -> bigframes.series.Series

Return unbiased skew within groups.

Normalized by N-1.

Parameter
Name: numeric_only
Type: bool, default False
Description: Include only float, int or boolean data.

std

std(*args, **kwargs) -> bigframes.series.Series

Compute standard deviation of groups, excluding missing values.

For multiple groupings, the result index will be a MultiIndex.

Parameter
Name: numeric_only
Type: bool, default False
Description: Include only float, int or boolean data.

Returns
Type: Series or DataFrame
Description: Standard deviation of values within each group.
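The sketch below illustrates a per-group standard deviation that skips missing values. It assumes the sample standard deviation (normalized by N-1), which this page states for skew and kurt but not for std, and it treats `None` as missing; the helper name `group_std` is hypothetical.

```python
import statistics
from collections import defaultdict


def group_std(keys, values):
    """Sample standard deviation of the non-missing values in each group."""
    groups = defaultdict(list)
    for key, value in zip(keys, values):
        if value is not None:
            groups[key].append(value)
    # statistics.stdev needs at least two data points.
    return {
        key: statistics.stdev(vals) if len(vals) > 1 else None
        for key, vals in groups.items()
    }


keys = ["a", "a", "a", "b", "b", "b", "b"]
values = [1, 2, 3, 1, 3, None, 5]
std_result = group_std(keys, values)
print(std_result)
```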

sum

sum(*args) -> bigframes.series.Series

Compute sum of group values.

Parameters
Name: numeric_only
Type: bool, default False
Description: Include only float, int, boolean columns.

Name: min_count
Type: int, default 0
Description: The required number of valid values to perform the operation. If fewer than min_count non-NA values are present, the result will be NA.

Returns
Type: Series or DataFrame
Description: Computed sum of values within each group.

var

var(*args, **kwargs) -> bigframes.series.Series

Compute variance of groups, excluding missing values.

For multiple groupings, the result index will be a MultiIndex.

Parameter
Name: numeric_only
Type: bool, default False
Description: Include only float, int or boolean data.