Changelog

PyPI History

0.8.0 (2023-10-12)

⚠ BREAKING CHANGES

The default behavior of to_parquet is changing from no compression to 'snappy' compression.
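
As a minimal sketch of what this breaking change means for callers: code that relied on uncompressed output can request it explicitly. The helper name `export_uncompressed` is hypothetical, and the `compression=None` keyword is an assumption modeled on the pandas `to_parquet` signature; check the bigframes API reference for the values your version accepts.

```python
def export_uncompressed(df, path):
    """Write ``df`` to Parquet at ``path`` with compression disabled.

    ``df`` is any DataFrame-like object exposing ``to_parquet``.
    Assumption: the method accepts ``compression=None`` to turn
    compression off, as pandas does.
    """
    # From 0.8.0 on, omitting the argument yields 'snappy' output,
    # so the previous uncompressed behavior must be requested explicitly.
    df.to_parquet(path, compression=None)
```

Callers that are happy with the new 'snappy' default need no changes.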

Features

Support compression in to_parquet (a8c286f)

Bug Fixes

Create session dataset for remote functions only when needed (#94) (1d385be)

0.7.0 (2023-10-11)

Features

Add aliases for several series properties (#80) (c0efec8)
Add equals methods to series/dataframe (#76) (636a209)
Add iat and iloc accessing by tuples of integers (#90) (228aeba)
Add level param to DataFrame.stack (#88) (97b8bec)
Allow df.drop to take an index object (#68) (740c451)
Use default session connection (#87) (4ae4ef9)

Bug Fixes

Change the invalid url in docs (#93) (969800d)

Documentation

Add more preprocessing models into the docs menu. (#97) (1592315)

0.6.0 (2023-10-04)

Features

Add df.unstack (#63) (4a84714)
Add idxmin, idxmax to series, dataframe (#74) (781307e)
Add ml.preprocessing.KBinsDiscretizer (#81) (24c6256)
Add multi-column dataframe merge (#73) (c9fa85c)
Add update and align methods to dataframe (#57) (bf050cf)
Support STRUCT data type with Series.struct.field to extract child fields (#71) (17afac9)

Bug Fixes

Avoid "403 response too large to return" error with read_gbq and large query results (#77) (8f3b5b2)
Change return type of Series.loc[scalar] (#40) (fff3d45)
Fix df/series.iloc by list with multiindex (#79) (971d091)

0.5.0 (2023-09-28)

Features

Add DataFrame.kurtosis / DF.kurt method (c1900c2)
Add DataFrame.rolling and DataFrame.expanding methods (c1900c2)
Add items, apply methods to DataFrame. (#43) (3adc1b3)
Add axis param to simple df aggregations (#52) (9cf9972)
Add index dtype, astype, drop, fillna, aggregate attributes. (#38) (1a254a4)
Add ml.preprocessing.LabelEncoder (#50) (2510461)
Add ml.preprocessing.MaxAbsScaler (#56) (14b262b)
Add ml.preprocessing.MinMaxScaler (#64) (392113b)
Add more index methods (#54) (a6e32aa)
Support calculate_p_values parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support class_weights="balanced" in LogisticRegression model (c1900c2)
Support df[column_name] = df_only_one_column (c1900c2)
Support early_stop parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support enable_global_explain parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support l2_reg parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support learn_rate_strategy parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support ls_init_learn_rate parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support max_iterations parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support min_rel_progress parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support optimize_strategy parameter in bigframes.ml.linear_model.LinearRegression (c1900c2)
Support casting string to integer or float (#59) (3502f83)

Bug Fixes

Fix header skipping logic in read_csv (#49) (d56258c)
Generate unique ids on join to avoid id collisions (#65) (7ab65e8)
LabelEncoder params consistent with Sklearn (#60) (632caec)
Loosen filter items tests to accommodate shifting pandas impl (#41) (edabdbb)

Performance Improvements

Add ability to cache dataframe and series to session table (#51) (416d7cb)
Inline small Series and DataFrames in query text (#45) (5e199ec)
Reimplement unpivot to use cross join rather than union (#47) (f9a93ce)
Simplify join order to use multiple order keys instead of string. (#36) (5056da6)

Documentation

Link to Remote Functions code samples from README and API reference (c1900c2)

0.4.0 (2023-09-16)

Features

Add axis parameter to droplevel and reorder_levels (7c6b0dd)
Add bfill and ffill to DataFrame and Series (7c6b0dd)
Add DataFrame.combine and DataFrame.combine_first (#27) (7c6b0dd)
Add DataFrame.nlargest, nsmallest (7c6b0dd)
Add DataFrame.pct_change and Series.pct_change (7c6b0dd)
Add DataFrame.skew and GroupBy.skew (7c6b0dd)
Add DataFrame.to_dict, to_excel, to_latex, to_records, to_string, to_markdown, to_pickle, to_orc (7c6b0dd)
Add diff method to DataFrame and GroupBy (7c6b0dd)
Add filter and reindex to Series and DataFrame (7c6b0dd)
Add reindex_like to DataFrame and Series (7c6b0dd)
Add swaplevel to DataFrame and Series (7c6b0dd)
Add partial support for Series.replace (7c6b0dd)
Support DataFrame.loc[bool_series, column] = scalar (7c6b0dd)
Support a persistent name in remote_function (7c6b0dd)

Bug Fixes

remote_function uses same credentials as other APIs (7c6b0dd)
Add type hints to models (7c6b0dd)
Raise error when ARIMAPlus is used with Pipeline (7c6b0dd)
Remove transforms parameter in model.fit (breaking change) (7c6b0dd)
Support column joins with “None indexer” (7c6b0dd)
Use Int64Dtype for literals in cut (7c6b0dd)
Use lowercase strings for parameter literals in bigframes.ml (breaking change) (7c6b0dd)

Performance Improvements

Add bigframes-api label to I/O query jobs (7c6b0dd)

Documentation

Document possible parameter values for PaLM2TextGenerator (7c6b0dd)
Document region logic in README (7c6b0dd)
Fix OneHotEncoder sample (7c6b0dd)

0.3.2 (2023-09-06)

Bug Fixes

Make release.sh script for PyPI upload executable (#20) (9951610)

0.3.1 (2023-09-05)

Bug Fixes

release: Use correct directory name for release build config (#17) (3dd25b3)

0.3.0 (2023-09-02)

Features

Add bigframes.get_global_session() and bigframes.reset_session() aliases (a32b747)
Add bigframes.pandas.read_pickle function (a32b747)
Add components_, explained_variance_, and explained_variance_ratio_ properties to bigframes.ml.decomposition.PCA (89b9503)
Add fit_transform to bigquery.ml transformers (a32b747)
Add Series.dropna and DataFrame.fillna (8fab755)
Add Series.str methods isalpha, isdigit, isdecimal, isalnum, isspace, islower, isupper, zfill, center (a32b747)
Support bigframes.pandas.merge() (8fab755)
Support DataFrame.isin with list and dict inputs (8fab755)
Support DataFrame.pivot (a32b747)
Support DataFrame.stack (89b9503)
Support DataFrame-DataFrame binary operations (8fab755)
Support df[my_column] = [a python list] (89b9503)
Support Index.is_monotonic (8fab755)
Support np.arcsin, np.arccos, np.arctan, np.sinh, np.cosh, np.tanh, np.arcsinh, np.arccosh, np.arctanh, np.exp with Series argument (89b9503)
Support np.sin, np.cos, np.tan, np.log, np.log10, np.sqrt, np.abs with Series argument (89b9503)
Support pow() and power operator in DataFrame and Series (8fab755)
Support read_json with engine=bigquery for newline-delimited JSON files (89b9503)
Support Series.corr (89b9503)
Support Series.map (8fab755)
Support for np.add, np.subtract, np.multiply, np.divide, np.power (8fab755)
Support MultiIndex for DataFrame columns (a32b747)
Use pandas.Index for column labels (a32b747)
Use default session and connection in ml.llm and ml.imported (8fab755)

Bug Fixes

Add error message to set_index (a32b747)
Align column names with pandas in DataFrame.agg results (89b9503)
Allow ORDER BY in read_gbq input when an index_col is defined (still not recommended) (89b9503)
Check for IAM role on the BigQuery connection when initializing a remote_function (89b9503)
Check that types are specified in read_gbq_function (a32b747)
Don’t use query cache for Session construction (a32b747)
Include survey link in abstract NotImplementedError exception messages (89b9503)
Label temp table creation jobs with source=bigquery-dataframes-temp label (89b9503)
Make X_train argument names consistent across methods (8fab755)
Raise AttributeError for unimplemented pandas methods (89b9503)
Raise exception for invalid function in read_gbq_function (a32b747)
Support spaces in column names in DataFrame initializer (89b9503)

Performance Improvements

Add local cache for __repr_*__ methods (a32b747)
Lazily instantiate client library objects (89b9503)
Use row_number() filter for head / tail (8fab755)

Documentation

Add ML section under Overview (a32b747)
Add release status to table of contents (a32b747)
Add samples and best practices to read_gbq docs (a32b747)
Correct the return types of DataFrame and Series (a32b747)
Create subfolders for notebooks (a32b747)
Fix link to GitHub (89b9503)
Highlight bigframes is open-source (a32b747)
Sample ML Drug Name Generation notebook (a32b747)
Set options.bigquery.project in sample code (89b9503)
Transform remote function user guide into sample code (a32b747)
Update remote function notebook with read_gbq_function usage (8fab755)

0.2.0 (2023-08-17)

Features

Add KMeans.cluster_centers_.
Allow column labels to be any type handled by bq df; column labels can now be integers.
Add dataframegroupby.agg().
Add Series properties is_monotonic_increasing and is_monotonic_decreasing.
Add match, fullmatch, get, pad str methods.
Add series isin function.

Bug Fixes

Update ML package to use sessions for queries.
Optimize read_gbq with index_col set to cluster by index_col.
Raise ValueError if the location is mismatched.
read_gbq no longer uses ‘time travel’ with query inputs.

Documentation

Add docstring to _uniform_sampling to discourage users from calling it.

0.1.1 (2023-08-14)

Documentation

Correct link to code repository in setup.py and use correct terminology for console.cloud.google.com links.

0.1.0 (2023-08-11)

Features

Add bigframes.pandas package with an API compatible with pandas. Supported data sources include: BigQuery SQL queries, BigQuery tables, CSV (local and GCS), Parquet (local and Cloud Storage), and more.
Add bigframes.ml package with an API inspired by scikit-learn. Train machine learning models and run batch prediction, powered by BigQuery ML.

0.0.0 (2023-02-22)

Empty package to reserve package name.