Changelog
1.30.0 (2024-12-30)
Features
Add `LinearRegression.predict_explain()` to generate `ML.EXPLAIN_PREDICT` columns (#1190) (e13eca2)
Add `LogisticRegression.predict_explain()` to generate `ML.EXPLAIN_PREDICT` columns (#1222) (bcbc732)
Add `write_engine` parameter to `read_FORMATNAME` methods to control how data is written to BigQuery (#371) (ed47ef1)
Add client side retry to GeminiTextGenerator (#1242) (8193abe)
Add Gemini-pro-1.5 to GeminiTextGenerator Tuning and Support score() method in Gemini-pro-1.5 (#1208) (298fc73)
Add support for `LinearRegression.predict_explain` and `LogisticRegression.predict_explain` parameter, `top_k_features` (#1228) (3068e19)
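The changelog does not describe how the client-side retry in GeminiTextGenerator (#1242) works. As an illustration of the general technique only, the sketch below retries a failing call with exponential backoff. The names `call_with_retry` and `flaky_predict`, and the `max_retries` and `base_delay` parameters, are hypothetical and are not part of the bigframes API.

```python
import time


def call_with_retry(func, max_retries=3, base_delay=0.01):
    """Call func(); on failure, wait and retry up to max_retries more times."""
    attempt = 0
    while True:
        try:
            return func()
        except RuntimeError:
            if attempt >= max_retries:
                raise  # out of retries: surface the last error
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff
            attempt += 1


# Hypothetical stand-in for a remote call that fails twice, then succeeds.
calls = {"count": 0}


def flaky_predict():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


result = call_with_retry(flaky_predict)
```

Doubling the delay on each attempt gives a briefly unavailable service time to recover without retrying in a tight loop.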
Bug Fixes
Throw an error message when setting is_row_processor=True to read a multi param function (#1160) (b2816a5)
Documentation
Add an “open in BQ Studio” link to all BigFrames sample notebooks (#1223) (e0a8288)
Add bq studio link for a new ipynb file called “bq_dataframes_template.ipynb” (#1239) (840aaff)
Add python snippet for “Create the time series model” section of the Forecast a single time series with a univariate model tutorial (#1227) (20f3190)
1.29.0 (2024-12-12)
Features
Documentation
1.28.0 (2024-12-11)
Features
`bigframes.bigquery.vector_search` supports `use_brute_force` and `fraction_lists_to_search` parameters (#1158) (131edc3)
Add `ARIMAPlus.predict_explain()` to generate forecasts with explanation columns (#1177) (05f8b4d)
Add client_endpoints_override to bq options (#1167) (be74b99)
Add support for temporal types in dataframe’s describe() method (#1189) (2d564a6)
Allow join-free alignment of analytic expressions (#1168) (daef4f0)
Bug Fixes
Performance Improvements
Documentation
Add a code sample using `bpd.options.bigquery.ordering_mode = "partial"` (#909) (f80d705)
Add snippet for creating boosted tree model (#1142) (a972668)
Add snippet for evaluating a boosted tree model (#1154) (9d8970a)
Add snippet for predicting classifications using a boosted tree model (#1156) (e7b83f1)
Add third party `pandas.Index methods` and docstrings (#1171) (a970294)
Fix Bigframes.Pandas.General_Function missing docs (#1164) (de923d0)
1.27.0 (2024-11-16)
Features
Bug Fixes
Documentation
1.26.0 (2024-11-12)
Features
Bug Fixes
Fix Series.to_frame generating string label instead of int where name is None (#1118) (14e32b5)
Update the API documentation with newly added rep (#1120) (72c228b)
Performance Improvements
Documentation
Add file for Classification with a Boosted Tree Model and snippet for preparing sample data (#1135) (7ac6639)
Add snippet for Linear Regression tutorial Predict Outcomes section (#1101) (108f4a9)
Update `DataFrame` docstrings to include the errors section (#1127) (a38d4c4)
Update Session docstrings to include exceptions (#1130) (a870421)
1.25.0 (2024-10-29)
Features
Add the `ground_with_google_search` option for GeminiTextGenerator predict (#1119) (ca02cd4)
Add warning when user tries to access struct series fields with `__getitem__` (#1082) (20e5c58)
Allow `fit` to take additional eval data in linear and ensemble models (#1096) (254875c)
Support context manager for bigframes session (#1107) (5f7b8b1)
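Context-manager support (#1107) means a session can be used in a `with` block so that cleanup runs when the block exits. The sketch below shows the underlying Python protocol with a hypothetical `SketchSession` class. It is not the bigframes implementation, and the real session's cleanup behaviour is not shown here.

```python
class SketchSession:
    """Hypothetical stand-in for a session that owns temporary resources."""

    def __init__(self):
        self.closed = False

    def close(self):
        # A real session would release its temporary resources here.
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not suppress exceptions raised in the block


with SketchSession() as session:
    inside_block = session.closed  # still open inside the block

after_block = session.closed  # closed once the block exits
```

Because `__exit__` also runs when the block raises, cleanup does not depend on the caller remembering to call `close()`.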
Performance Improvements
1.24.0 (2024-10-24)
Features
Documentation
1.23.0 (2024-10-23)
Features
Add `bigframes.bigquery.create_vector_index` to assist in creating vector index on `ARRAY<FLOAT64>` columns (#1024) (863d694)
Add gemini-1.5-pro-002 and gemini-1.5-flash-002 to known Gemini model list. (#1105) (7094c85)
Add support for pandas series & data frames as inputs for ml models. (#1088) (30c8883)
Cleanup temp resources with session deletion (#1068) (1d5373d)
Show possible correct key(s) in `.__getitem__` KeyError message (#1097) (32fab96)
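To illustrate the idea behind suggesting correct keys in a `KeyError` message (#1097), here is a minimal sketch built on the standard library's `difflib`. The `SuggestingMapping` class and the column names are hypothetical, and `difflib` is a stand-in chosen for this example. The library's actual matching logic is not shown in this changelog.

```python
import difflib


class SuggestingMapping(dict):
    """Hypothetical dict subclass whose KeyError lists close matches."""

    def __missing__(self, key):
        candidates = [k for k in self if isinstance(k, str)]
        matches = difflib.get_close_matches(str(key), candidates, n=3)
        hint = f" Possible key(s): {matches}" if matches else ""
        raise KeyError(f"{key!r} not found.{hint}")


columns = SuggestingMapping(species=1, island=2, body_mass_g=3)
try:
    columns["specie"]  # misspelled on purpose
except KeyError as exc:
    message = exc.args[0]
```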
Bug Fixes
Performance Improvements
Speed up tree transforms during sql compile (#1071) (d73fe9d)
Utilize ORDER BY LIMIT over ROW_NUMBER where possible (#1077) (7003d1a)
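The `ORDER BY ... LIMIT` rewrite (#1077) relies on the fact that a top-k query can be written either by numbering every row and filtering, or by sorting and stopping early. The sketch below checks that equivalence on a toy table. It uses SQLite through Python's `sqlite3` module as a stand-in for BigQuery and assumes SQLite 3.25 or later for window functions. The table `t` and its values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, v INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 50), (2, 10), (3, 40), (4, 20), (5, 30)],
)

# Window-function form: number every row, then filter on the row number.
row_number_sql = """
    SELECT id, v FROM (
        SELECT id, v, ROW_NUMBER() OVER (ORDER BY v) AS rn FROM t
    ) WHERE rn <= 3 ORDER BY rn
"""
# Equivalent top-k form: sort and stop after three rows.
order_limit_sql = "SELECT id, v FROM t ORDER BY v LIMIT 3"

via_row_number = conn.execute(row_number_sql).fetchall()
via_order_limit = conn.execute(order_limit_sql).fetchall()
conn.close()
```

The two forms agree here because the sort key has no ties. With duplicate values, either form may pick any of the tied rows.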
Documentation
Show best practice of closing the session to cleanup resources in sample notebooks (#1095) (62a88e8)
Update docstrings of Session and related files (#1087) (bf93e80)
1.22.0 (2024-10-09)
Features
Support regional endpoints for more bigquery locations (#1061) (45b672a)
Update LLM generators to warn user about model name instead of raising error. (#1048) (650d80d)
Bug Fixes
Correct zero row count in DataFrame from table view (#1062) (b536070)
Fix generic error message when entering an incorrect column name (#1031) (5ac217d)
Make invalid location warning case-insensitive (#1044) (b6cd55a)
Show warning for unknown location set through .ctor (#1052) (02c2da7)
Performance Improvements
Documentation
1.21.0 (2024-10-02)
Features
Add deprecation warning to PaLM2TextGenerator model (#1035) (1183b0f)
Add DeprecationWarning for PaLM2TextEmbeddingGenerator (#1018) (4af5bbb)
Add ml.model_selection.cross_validate support (#1020) (1a38063)
Allow access of struct fields with dot operators on `Series` (#1019) (ef76f13)
Bug Fixes
Documentation
1.20.0 (2024-09-25)
Features
Add bigframes.ml.compose.SQLScalarColumnTransformer to create custom SQL-based transformations (#955) (1930b4e)
Allow multiple columns input for llm models (#998) (2fe5e48)
Bug Fixes
Documentation
Limit pypi notebook to 7 days and add more info about differences with partial ordering mode (#1013) (3c54399)
Move and edit existing linear-regression tutorial snippet (#991) (4cb62fd)
1.19.0 (2024-09-24)
Features
Support bool and bytes types in `describe(include='all')` (#994) (cc48f58)
Support ingress settings in `remote_function` (#1011) (8e9919b)
Bug Fixes
Performance Improvements
Dependencies
1.18.0 (2024-09-18)
Features
Add “include” param to describe for string types (#973) (deac6d2)
Add `subset` parameter to `DataFrame.dropna` to select which columns to consider (#981) (f7c03dc)
Bug Fixes
DataFrameGroupby.agg now works with unnamed tuples (#985) (0f047b4)
Fix a bug that raises exception when re-indexing columns with their original order (#988) (596b03b)
Make the `Series.apply` outcome assignable to the original dataframe in partial ordering mode (#874) (c94ead9)
Dependencies
1.17.0 (2024-09-11)
Features
Include the bigframes package version alongside the feedback link in error messages (#936) (7b59b6d)
Bug Fixes
Make `read_gbq_function` work for multi-param functions (#947) (c750be6)
Support `read_gbq_function` for axis=1 application (#950) (86e54b1)
Documentation
1.16.0 (2024-09-04)
Features
Add `DataFrame.struct.explode` to add struct subfields to a DataFrame (#916) (ad2f75e)
Implement `bigframes.bigquery.json_extract_array` (#910) (575a29e)
Bug Fixes
Fix issue with iterating on >10gb dataframes (#949) (2b0f0fa)
Unordered mode errors in ml train_test_split (#925) (85d7c21)
Performance Improvements
Dependencies
Documentation
Create sample notebook to manipulate struct and array data (#883) (3031903)
Use unstack() from BigQuery DataFrames instead of pandas in the PyPI sample notebook (#890) (d1883cc)
1.15.0 (2024-08-20)
Features
Documentation
Add columns for “requires ordering/index” to supported APIs summary (#892) (d2fc51a)
Remove duplicate description for `kms_key_name` (#898) (1053d56)
1.14.0 (2024-08-14)
Features
Bug Fixes
Performance Improvements
Documentation
1.13.0 (2024-08-05)
Features
`df.apply(axis=1)` to support remote function with multiple params (#851) (2158818)
Create a separate OrderingModePartialPreviewWarning for more fine-grained warning filters (#879) (8753bdd)
Bug Fixes
Documentation
1.12.0 (2024-07-31)
Features
Add config option to set partial ordering mode (#855) (823c0ce)
Add stratify param support to ml.model_selection.train_test_split method (#815) (27f8631)
Allow DataFrame.join for self-join on Null index (#860) (e950533)
Support remote function cleanup with `session.close` (#818) (ed06436)
Support to_csv/parquet/json to local files/objects (#858) (d0ab9cc)
Bug Fixes
Fewer relation joins from df self-operations (#823) (0d24f73)
Fix unordered mode using ordered path to print frame (#839) (93785cb)
Reduce redundant `remote_function` deployments (#856) (cbf2d42)
Documentation
Add partner attribution steps to integrations sample notebook (#835) (d7b333f)
Make `get_global_session`/`close_session`/`reset_session` appear in the docs (#847) (01d6bbb)
1.11.1 (2024-07-08)
Documentation
Remove session and connection in llm notebook (#821) (74170da)
Remove the experimental flask icon from the public docs (#820) (067ff17)
1.11.0 (2024-07-01)
Features
Add `bigframes.streaming.to_pubsub` method to create continuous query that writes to Pub/Sub (#801) (b47f32d)
Add `DataFrame.to_arrow` to create Arrow Table from DataFrame (#807) (1e3feda)
Add `PolynomialFeatures` support to `to_gbq` and pipelines (#805) (57d98b9)
Add Series.peek to preview data efficiently (#727) (580e1b9)
More informative error when query plan too complex (#811) (136dc24)
Bug Fixes
Documentation
1.10.0 (2024-06-21)
Features
Add ml.preprocessing.PolynomialFeatures class (#793) (b4fbb51)
Bigframes.streaming module for continuous queries (#703) (0433a1c)
Include index columns in DataFrame.sql if they are named (#788) (c8d16c0)
Bug Fixes
Allow `__repr__` to work with uninitialized DataFrame/Series/Index (#778) (e14c7a9)
Df.loc with the 2nd input as bigframes boolean Series (#789) (a4ac82e)
Ensure numpy version matches in `remote_function` deployment (#798) (324d93c)
Fix temp table creation retries by now throwing if table already exists. (#787) (0e57d1f)
Self-join optimization doesn’t needlessly invalidate caching (#797) (1b96b80)
1.9.0 (2024-06-10)
Features
Bug Fixes
Improve to_pandas_batches for large results (#746) (61f18cb)
Resolve issue with unset thread-local options (#741) (d93dbaf)
Documentation
1.8.0 (2024-05-31)
Features
`merge` only generates a default index if both inputs already have an index (#733) (25d049c)
Add `GroupBy.size()` to get number of rows in each group (#479) (1fca588)
Add slot_millis and add stats to session object (#725) (72e9583)
Adds bigframes.bigquery.array_to_string to convert array elements to delimited strings (#731) (f12c906)
Allow functions decorated with `bpd.remote_function()` to execute locally (#704) (d850da6)
Ensure `"bigframes-api"` label is always set on jobs, even if the API is unknown (#722) (1832778)
Support type annotations to supply input and output types to `bpd.remote_function()` decorator (#717) (4a12e3c)
Support type annotations with `bpd.remote_function()` and `axis=1` (a preview feature) (#730) (e5a2992)
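The entries on type annotations (#717) and local execution (#704) can be pictured with plain Python: a decorator reads a function's annotations to learn its input and output types, while leaving the function callable as usual. The decorator `typed_function_sketch` and the function `add_tax` below are hypothetical. This is a sketch of the general mechanism, not of how `bpd.remote_function()` is implemented.

```python
import typing


def typed_function_sketch(func):
    """Hypothetical decorator: record input/output types from annotations."""
    hints = typing.get_type_hints(func)
    func.output_type = hints.pop("return")
    func.input_types = list(hints.values())
    return func  # the function itself stays callable locally


@typed_function_sketch
def add_tax(price: float, rate: float) -> float:
    return price * (1 + rate)
```

With the types taken from the signature, separate `input_types` and `output_types` arguments become redundant, which matches the 1.8.0 bug-fix entry that makes them optional.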
Bug Fixes
Correct index labels in multiple aggregations for DataFrameGroupBy (#723) (6a78c89)
Set `bpd.remote_function()`'s `input_types` and `output_types` default to `None` to allow omitting them when type annotations are present (#729) (0e25a3b)
Warn and disable time travel for linked datasets (#712) (085fa9d)
Performance Improvements
Documentation
1.7.0 (2024-05-20)
Features
`read_gbq_query` supports `filters` (9386373)
`read_gbq` suggests a correct column name when one is not found (9386373)
Add `DefaultIndexKind.NULL` to use as `index_col` in `read_gbq*`, creating an indexless DataFrame/Series (#662) (29e4886)
Bigframes.bigquery.array_agg(SeriesGroupBy|DataFrameGroupby) (#663) (412f28b)
To_datetime supports utc=False for string inputs (#579) (adf9889)
Bug Fixes
`read_gbq_table` respects primary keys even when `filters` are set (#689) (9386373)
Improve escaping of literals and identifiers (#682) (da9b136)
Properly identify non-unique index in tables without primary keys (#699) (6e0f4d8)
Remove a usage of the `resource` package when not available, such as on Windows (#681) (96243f2)
Performance Improvements
Don’t run query immediately from `read_gbq_table` if `filters` is set (9386373)
Use a `LIMIT` clause when `max_results` is set (9386373)
Documentation
Add code snippets for imported onnx tutorials (#684) (cb36e46)
Add code snippets for imported tensorflow model (#679) (b02c401)
Use `class_weight="balanced"` in the logistic regression prediction tutorial (#678) (b951549)
1.6.0 (2024-05-13)
Features
Add `strategy="quantile"` in KBinsDiscretizer (#654) (c6c487f)
Suggest correct options in bpd.options.bigquery.location (#666) (57ccabc)
Support `axis=1` in `df.apply` for scalar outputs (#629) (f6bdc4a)
Support gcf vpc connector in `remote_function` (#677) (9ca92d0)
Warn with a more specific `DefaultLocationWarning` category when no location can be detected (#648) (e084e54)
Bug Fixes
Dependencies
- Add jellyfish as a dependency for spelling correction (57ccabc)
Documentation
1.5.0 (2024-05-07)
Features
`bigframes.options` and `bigframes.option_context` now use thread-local variables to prevent context managers in separate threads from affecting each other (#652) (651fd7d)
Add `ARIMAPlus.coef_` property exposing `ML.ARIMA_COEFFICIENTS` functionality (#585) (81d1262)
Add a unique session_id to Session and allow cleaning up sessions (#553) (c8d4e23)
Add the `bigframes.bigquery` sub-package with a `bigframes.bigquery.array_length` function (#630) (9963f85)
Always do a query dry run when `option.repr_mode == "deferred"` (#652) (651fd7d)
Warn with `DefaultIndexWarning` from `read_gbq` on clustered/partitioned tables with no `index_col` or `filters` set (#631, #658) (2715d2b, 73064dd)
Support `index_col=False` in `read_csv` and `engine="bigquery"` (73064dd)
Support gcf max instance count in `remote_function` (#657) (36578ab)
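The thread-local options change (#652) can be illustrated with the standard library alone. In the sketch below, `option_context_sketch` and `get_option` are hypothetical helpers, and the `"head"` default is a placeholder chosen for the example. It shows why an override made inside a context manager in one thread is invisible to another thread. It does not reproduce the bigframes implementation.

```python
import contextlib
import threading

_local = threading.local()  # each thread sees its own attribute values


def get_option(name, default=None):
    return getattr(_local, name, default)


@contextlib.contextmanager
def option_context_sketch(name, value):
    """Hypothetical context manager: set an option for this thread only."""
    missing = object()
    previous = getattr(_local, name, missing)
    setattr(_local, name, value)
    try:
        yield
    finally:
        # Restore whatever this thread had before entering the block.
        if previous is missing:
            delattr(_local, name)
        else:
            setattr(_local, name, previous)


seen_in_other_thread = []


def worker():
    # Runs while the main thread is inside its context manager.
    seen_in_other_thread.append(get_option("repr_mode", "head"))


with option_context_sketch("repr_mode", "deferred"):
    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    seen_in_main_thread = get_option("repr_mode", "head")

after_context = get_option("repr_mode", "head")
```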
Bug Fixes
Don’t raise UnknownLocationWarning for US or EU multi-regions (#653) (8e4616b)
Fix bug with na in the column labels in stack (#659) (4a34293)
Documentation
Add python code sample for multiple forecasting time series (#531) (16866d2)
Fix the Palm2TextGenerator output token size (#649) (c67e501)
1.4.0 (2024-04-29)
Features
Add .cache() method to persist intermediate dataframe (#626) (a5c94ec)
Add transpose support for small homogeneously typed DataFrames. (#621) (054075d)
Series binary ops compatible with more types (#618) (518d315)
Support the `score` method for `PaLM2TextGenerator` (#634) (3ffc1d2)
Bug Fixes
Performance Improvements
Automatically condense internal expression representation (#516) (03c1b0d)
Cache transpose to allow performant retranspose (#635) (44b738d)
Documentation
Add the first sample for the Single time-series forecasting from Google Analytics data tutorial (#623) (2b84c4f)
1.3.0 (2024-04-22)
Features
Add fine tuning `fit()` for Palm2TextGenerator (#616) (9c106bd)
Expose `max_batching_rows` in `remote_function` (#622) (240a1ac)
Support primary key(s) in `read_gbq` by using as the `index_col` by default (#625) (75bb240)
Warn if location is set to unknown location (#609) (3706b4f)
Bug Fixes
Documentation
Fix rendering of examples for multiple apis (#620) (9665e39)
Set `index_cols` in `read_gbq` as a best practice (#624) (70015b7)
1.2.0 (2024-04-15)
Features
Bug Fixes
Documentation
1.1.0 (2024-04-04)
Features
Add support for numpy expm1, log1p, floor, ceil, arctan2 ops (#505) (e8e66cf)
Allow DataFrame binary ops to align on either axis and with loc… (#544) (6d8f3af)
Expose `DataFrame.bqclient` to assist in integrations (#519) (0be8911)
Read_pandas accepts pandas Series and Index objects (#573) (f8821fe)
Support `ML.GENERATE_EMBEDDING` in `PaLM2TextEmbeddingGenerator` (#539) (1156c1e)
Support max_columns in repr and make repr more efficient (#515) (54e49cf)
Bug Fixes
Don’t download 100gb onto local python machine in load test (#537) (082c58b)
Exclude list-like s parameter in plot.scatter (#568) (1caac27)
Fix case where df.peek would fail to execute even with force=True (#511) (8eca99a)
Plot.scatter s parameter cannot accept float-like column (#563) (8d39187)
Product operation produces float result for all input types (#501) (6873b30)
Rename PaLM2TextEmbeddingGenerator.predict output columns to be backward compatible (#561) (4995c00)
Respect hard stack size limit and swallow limit change exception. (#558) (4833908)
Use bytes limit on frame inlining rather than element count (#576) (659a161)
Performance Improvements
Dependencies
Documentation
`bigframes.options.bigquery.project` and `location` are optional in some circumstances (#548) (90bcec5)
Add “Supported pandas APIs” reference to the documentation (#542) (74c3915)
Add the code samples for metrics{auc, roc_auc_score, roc_curve} (#520) (5f37b09)
Address more comments from technical writers to meet legal purposes (#571) (9084df3)
Migrate the overview page to Bigframes official landing page (#536) (a0fb8bb)
1.0.0 (2024-03-25)
⚠ BREAKING CHANGES
rename model parameter `min_rel_progress` to `tol`
`early_stop` setting no longer supported, always uses `True`
rename model parameter `n_parallell_trees` to `n_estimators`
rename `class_weights` to `class_weight`
rename `learn_rate` to `learning_rate`
PCA `n_components` supports float value and `None`, default to `None`
rename various ml model parameters for consistency with sklearn (https://github.com/googleapis/python-bigquery-dataframes/pull/491)
Features
Allow assigning directly to Series.name property (#495) (ad0e99e)
Ensure `Series.str.len()` can get length of array columns (#497) (10c0446)
PCA `n_components` supports float value and `None`, default to `None` (65c6f47)
Rename `class_weights` to `class_weight` (65c6f47)
Rename `learn_rate` to `learning_rate` (65c6f47)
Rename model parameter `min_rel_progress` to `tol` (65c6f47)
Rename model parameter `n_parallell_trees` to `n_estimators` (65c6f47)
Rename various ml model parameters for consistency with sklearn (https://github.com/googleapis/python-bigquery-dataframes/pull/491) (65c6f47)
Support BQ regional endpoints for europe-west9, europe-west3, us-east4, and us-west1 (#504) (fbada4a)
Bug Fixes
`early_stop` setting no longer supported, always uses `True` (65c6f47)
Properly support format param for numerical input. (#486) (ae20c35)
Sampling plot cannot preserve ordering if index is not ordered (#475) (a5345fe)
Use actual BigQuery types rather than ibis types in to_pandas (#500) (82b4f91)
Dependencies
Documentation
Add code samples for metrics.{accuracy_score, confusion_matrix} (#478) (3e3329a)
Add code samples for metrics.{recall_score, precision_score, f1_score} (#502) (370fe90)
Update LLM + K-means notebook to handle partial failures (#496) (97afad9)
0.26.0 (2024-03-20)
⚠ BREAKING CHANGES
- exclude remote models for .register() (#465)
Features
`read_gbq_table` supports `LIKE` as an operator in `filters` (#454) (d2d425a)
Set `force=True` by default in `DataFrame.peek()` (#469) (4e8e97d)
Support datetime related casting in (Series|DataFrame|Index).astype (#442) (fde339b)
Bug Fixes
Any() on empty set now correctly returns False (#471) (f55680c)
Fix grouping series on multiple other series (#455) (3971bd2)
Groupby aggregates no longer check if grouping keys are numeric (#472) (4fbf938)
Raise `ValueError` when `read_pandas()` receives a bigframes `DataFrame` (#447) (b28f9fd)
Series.(to_csv|to_json) leverages bq export (#452) (718a00c)
Warn when `read_gbq`/`read_gbq_table` uses the snapshot time cache (#441) (e16a8c0)
Documentation
0.25.0 (2024-03-14)
Features
(Series|DataFrame).plot.(line|area|scatter) (#431) (0772510)
Support CMEK for `remote_function` cloud functions (#430) (2fd69f4)
0.24.0 (2024-03-12)
⚠ BREAKING CHANGES
`read_parquet` uses a “pandas” engine to parse files by default. Use `engine="bigquery"` for the previous behavior
Features
Bug Fixes
Move `third_party.bigframes_vendored` to `bigframes_vendored` (#424) (763edeb)
Only do row identity based joins when joining by index (#356) (76b252f)
Documentation
Add predict sample to samples/snippets/bqml_getting_started_test.py (#388) (6a3b0cc)
Fix the note rendering for DataFrames methods: nlargest, nsmallest (#417) (38bd2ba)
0.23.0 (2024-03-05)
Features
Bug Fixes
Dependencies
Documentation
0.22.0 (2024-02-27)
⚠ BREAKING CHANGES
rename cosine_similarity to paired_cosine_distances (#393)
move model optional args to kwargs (#381)
Features
Bug Fixes
Avoid ibis warning for “database” table() method argument (#390) (a0490a4)
Rename cosine_similarity to paired_cosine_distances (#393) (81ece46)
Performance Improvements
Dependencies
Documentation
Miscellaneous Chores
Code Refactoring
0.21.0 (2024-02-13)
Features
Add ml.metrics.pairwise.cosine_similarity function (#374) (126f566)
Support bigframes.pandas.to_datetime for scalars, iterables and series. (#372) (ffb0d15)
Bug Fixes
Documentation
0.20.1 (2024-02-06)
Performance Improvements
Documentation
0.20.0 (2024-01-30)
Features
Add `DataFrame.peek()` as an efficient alternative to `head()` results preview (#318) (9c34d83)
Add ARIMA_EVALUATE options in forecasting models (#336) (73e997b)
Add Index constructor, repr, copy, get_level_values, to_series (#334) (e5d054e)
Improve error message for drive based BQ table reads (#344) (0794788)
Update cut to work without labels = False and show intervals as dict (#335) (4ff53db)
Bug Fixes
Change default connection name in getting_started.ipynb (#347) (677f014)
Series iteration correctly returns values instead of index (#339) (2c6af9b)
Documentation
0.19.2 (2024-01-22)
Bug Fixes
Documentation
0.19.1 (2024-01-17)
Bug Fixes
Documentation
0.19.0 (2024-01-09)
Features
Allow manually set clustering_columns in dataframe.to_gbq (#302) (9c21323)
Support assigning to columns like a property (#304) (f645c56)
Support upcasting numeric columns in concat (#294) (e3a056a)
Bug Fixes
Documentation
0.18.0 (2024-01-02)
Features
Add IntervalIndex support to bigframes.pandas.cut (#254) (6c1969a)
Specific pyarrow mappings for decimal, bytes types (#283) (a1c0631)
Bug Fixes
Dataframes to_gbq now creates dataset if it doesn’t exist (#222) (bac62f7)
Exclude pandas 2.2.0rc0 to unblock prerelease tests (#292) (ac1a745)
Fix DataFrameGroupby.agg() issue with as_index=False (#273) (ab49350)
Make `Series.str.replace` work for simple strings (#285) (ad67465)
Update dataframe.to_gbq to dedup column names. (#286) (746115d)
Dependencies
Documentation
Add code snippets for explore query result page (#278) (7cbbb7d)
Code samples for `astype` common to DataFrame and Series (#280) (95b673a)
Code samples for `DataFrame.copy` and `Series.copy` (#290) (7cbc2b0)
Code samples for `isna`, `isnull`, `dropna`, `isin` (#289) (ad51035)
Code samples for `reset_index` and `sort_values` (#282) (acc0eb7)
Code samples for `Series.{add, replace, unique, T, transpose}` (#287) (0e1bbfc)
Code samples for `Series.{map, to_list, count}` (#290) (7cbc2b0)
Code samples for `Series.groupby` and `Series.{sum,mean,min,max}` (#280) (95b673a)
Code samples for DataFrame `set_index`, `items` (#295) (c2b1892)
0.17.0 (2023-12-14)
Features
Bug Fixes
Increase recursion limit, cache compilation tree hashes (#184) (b54791c)
Replaced raise `NotImplementedError` with return `NotImplemented` (#258) (a133822)
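The difference behind the `NotImplemented` fix (#258) is part of Python's operator protocol: returning `NotImplemented` from a binary method lets the interpreter try the other operand's reflected method, whereas raising `NotImplementedError` aborts the operation immediately. The `Meters` and `Feet` classes below are hypothetical and exist only to demonstrate that behaviour.

```python
class Meters:
    """Hypothetical value type used only for this illustration."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        # Returning NotImplemented (not raising NotImplementedError) lets
        # Python try the other operand's reflected method next.
        return NotImplemented


class Feet:
    """Hypothetical type that knows how to add itself to Meters."""

    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        if isinstance(other, Meters):
            return Meters(other.value + self.value * 0.3048)
        return NotImplemented


total = Meters(1.0) + Feet(10.0)  # Meters.__add__ defers to Feet.__radd__
```

If neither operand handles the operation, Python raises `TypeError` itself, which is the error callers usually expect for unsupported operand types.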
Documentation
0.16.0 (2023-12-12)
Features
Add DataFrame from_dict and from_records methods (#244) (8d81e24)
Add nunique method to Series/DataFrameGroupby (#256) (c8ec245)
Support dataframe.loc with conditional columns selection (#233) (3febea9)
Bug Fixes
Exclude pandas 2.1.4 from prerelease tests to unblock e2e tests (b02fc2c)
Fix value_counts column label for normalize=True (#245) (d3fa6f2)
Migrate e2e tests to bigframes-load-testing project (8766ac6)
Documentation
Add example for dataframe.melt, dataframe.pivot, dataframe.stac… (#252) (8c63697)
Add example to dataframe.nlargest, dataframe.nsmallest, datafra… (#234) (e735412)
Add examples for dataframe.cummin, dataframe.cummax, dataframe.cumsum, dataframe.cumprod (#243) (0523a31)
Add examples for dataframe.nunique, dataframe.diff, dataframe.a… (#251) (77074ec)
Correct the params rendering for `ml.remote` and `ml.ensemble` modules (#248) (c2829e3)
0.15.0 (2023-11-29)
⚠ BREAKING CHANGES
- model.predict returns all the columns (#204)
Features
Add info and memory_usage methods to dataframe (#219) (9d6613d)
Send warnings on LLM prediction partial failures (#216) (81125f9)
Bug Fixes
Avoid unnecessary row_number() on sort key for io (#211) (a18d40e)
Make to_pandas override enable_downsampling when sampling_method is manually set. (#200) (ae03756)
Update the llm+kmeans notebook with recent change (#236) (f8917ab)
Use anonymous dataset to create `remote_function` (#205) (69b016e)
Documentation
Add code samples for `index` and `column` properties (#212) (c88d38e)
Add code samples for df reshaping, function, merge, and join methods (#203) (010486c)
Add examples for dataframe.kurt, dataframe.std, dataframe.count (#232) (f9c6e72)
Add examples for dataframe.mean, dataframe.median, dataframe.va… (#228) (edd0522)
Add examples for dataframe.min, dataframe.max and dataframe.sum (#227) (3a375e8)
Code samples for `Series.dot` and `DataFrame.dot` (#226) (b62a07a)
Code samples for `Series.where` and `Series.mask` (#217) (52dfad2)
Code samples for dataframe.any, dataframe.all and dataframe.prod (#223) (d7957fa)
Make the code samples reflect default bq connection usage (#206) (71844b0)
Miscellaneous Chores
0.14.1 (2023-11-16)
Bug Fixes
Documentation
0.14.0 (2023-11-14)
Features
Add ‘index’, ‘pad’, ‘nearest’ interpolate methods (#162) (6a28403)
Add series.sample (identical to existing dataframe.sample) (#187) (37914a4)
Log most recent API calls as `recent-bigframes-api-xx` labels on BigQuery jobs (#145) (4ea33b7)
Read_gbq creates order deterministically without table copy (#191) (8ab81de)
Support `date_series.astype("string[pyarrow]")` to cast DATE to STRING (#186) (aee0e8e)
Temporary resources no longer use BigQuery Sessions (#194) (4a02cac)
Bug Fixes
Default to 7 days expiration for `read_csv`, `read_json`, `read_parquet` (#193) (03606cd)
Deprecate the `remote_service_type` in llm model (#180) (a8a409a)
For reset_index on unnamed multiindex, always use level_[n] label (#182) (f95000d)
Match pandas behavior when assigning listlike to empty dfs (#172) (c1d1f42)
Use anonymous dataset instead of session dataset for temp tables (#181) (800d44e)
Use random table when loading data for `read_csv`, `read_json`, `read_parquet` (#175) (9d2e6dc)
Documentation
Add code samples for `read_gbq_function` using community UDFs (#188) (7506eab)
Add docstring code samples for `Series.apply` and `DataFrame.map` (#185) (c816d84)
Add llm kmeans notebook as an included example (#177) (d49ae42)
Use `head()` to get top `n` results, not to preview results (#190) (87f84c9)
0.13.0 (2023-11-07)
Features
`to_gbq` without a destination table writes to a temporary table (#158) (e1817c9)
Add `DataFrame.__iter__`, `DataFrame.iterrows`, `DataFrame.itertuples`, and `DataFrame.keys` methods (#164) (c065071)
Support 32k text-generation and multilingual embedding models (#161) (5f0ea37)
Bug Fixes
0.12.0 (2023-11-01)
Features
Add `DataFrame.to_pandas_batches()` to download large `DataFrame` objects (#136) (3afd4a3)
Add bigframes.options.compute.maximum_bytes_billed option that sets maximum bytes billed on query jobs (#133) (63c7919)
Bug Fixes
Fix bug with column names under repeated column assignment (#150) (29032d0)
Resolve plotly rendering issue by using ipython html for job pro… (#134) (39df43e)
Use indexee’s session for loc listlike cases (#152) (27c5725)
Documentation
Fix indentation on `read_gbq_function` code sample (#163) (0801d96)
Link to ML.EVALUATE BQML page for score() methods (#137) (45c617f)
0.11.0 (2023-10-26)
Features
Add back `reset_session` as an alias for `close_session` (#124) (694a85a)
Change `query` parameter to `query_or_table` in `read_gbq` (#127) (f9bb3c4)
Bug Fixes
Expose `bigframes.pandas.reset_session` as a public API (#128) (b17e1f4)
Use series’s own session in series.reindex listlike case (#135) (95bff3f)
Documentation
Add runnable code samples for DataFrames I/O methods and property (#129) (6fea8ef)
Add runnable code samples for reading methods (#125) (a669919)
0.10.0 (2023-10-19)
Features
0.9.0 (2023-10-18)
⚠ BREAKING CHANGES
- rename `bigframes.pandas.reset_session` to `close_session` (#101)
Features
Add `bigframes.options.bigquery.application_name` for partner attribution (#117) (52d64ff)
Rename `bigframes.pandas.reset_session` to `close_session` (#101) (36693bf)
Send BigQuery cancel request when canceling bigframes process (#103) (e325fbb)
Support external packages in `remote_function` (#98) (ec10c4a)
Use ArrowDtype for STRUCT columns in `to_pandas` (#85) (9238fad)
Bug Fixes
Performance Improvements
Documentation
0.8.0 (2023-10-12)
⚠ BREAKING CHANGES
- The default behavior of `to_parquet` is changing from no compression to `'snappy'` compression.
Features
- Support compression in `to_parquet` (a8c286f)
Bug Fixes
0.7.0 (2023-10-11)
Features
Bug Fixes
Documentation
0.6.0 (2023-10-04)
Features
Bug Fixes
0.5.0 (2023-09-28)
Features
Add `DataFrame.kurtosis` / `DF.kurt` method (c1900c2)
Add `DataFrame.rolling` and `DataFrame.expanding` methods (c1900c2)
Add index `dtype`, `astype`, `drop`, `fillna`, aggregate attributes. (#38) (1a254a4)
Support `calculate_p_values` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
Support `class_weights="balanced"` in `LogisticRegression` model (c1900c2)
Support `df[column_name] = df_only_one_column` (c1900c2)
Support `early_stop` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
Support `enable_global_explain` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
Support `l2_reg` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
Support `learn_rate_strategy` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
Support `ls_init_learn_rate` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
Support `max_iterations` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
Support `min_rel_progress` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
Support `optimize_strategy` parameter in `bigframes.ml.linear_model.LinearRegression` (c1900c2)
Bug Fixes
Generate unique ids on join to avoid id collisions (#65) (7ab65e8)
Loosen filter items tests to accommodate shifting pandas impl (#41) (edabdbb)
Performance Improvements
Add ability to cache dataframe and series to session table (#51) (416d7cb)
Inline small `Series` and `DataFrames` in query text (#45) (5e199ec)
Reimplement unpivot to use cross join rather than union (#47) (f9a93ce)
Simplify join order to use multiple order keys instead of string. (#36) (5056da6)
Documentation
- Link to Remote Functions code samples from README and API reference (c1900c2)
0.4.0 (2023-09-16)
Features
Add `axis` parameter to `droplevel` and `reorder_levels` (7c6b0dd)
Add `bfill` and `ffill` to `DataFrame` and `Series` (7c6b0dd)
Add `DataFrame.combine` and `DataFrame.combine_first` (#27) (7c6b0dd)
Add `DataFrame.nlargest`, `nsmallest` (7c6b0dd)
Add `DataFrame.pct_change` and `Series.pct_change` (7c6b0dd)
Add `DataFrame.skew` and `GroupBy.skew` (7c6b0dd)
Add `DataFrame.to_dict`, `to_excel`, `to_latex`, `to_records`, `to_string`, `to_markdown`, `to_pickle`, `to_orc` (7c6b0dd)
Add `diff` method to `DataFrame` and `GroupBy` (7c6b0dd)
Add `filter` and `reindex` to `Series` and `DataFrame` (7c6b0dd)
Add `reindex_like` to `DataFrame` and `Series` (7c6b0dd)
Add `swaplevel` to `DataFrame` and `Series` (7c6b0dd)
Add partial support for `Series.replace` (7c6b0dd)
Support `DataFrame.loc[bool_series, column] = scalar` (7c6b0dd)
Support a persistent `name` in `remote_function` (7c6b0dd)
Bug Fixes
`remote_function` uses same credentials as other APIs (7c6b0dd)
Add type hints to models (7c6b0dd)
Raise error when ARIMAPlus is used with Pipeline (7c6b0dd)
Remove `transforms` parameter in `model.fit` (breaking change) (7c6b0dd)
Support column joins with “None indexer” (7c6b0dd)
Use `Int64Dtype` for literals in `cut` (7c6b0dd)
Use lowercase strings for parameter literals in `bigframes.ml` (breaking change) (7c6b0dd)
Performance Improvements
`bigframes-api` label to I/O query jobs (7c6b0dd)
Documentation
Document possible parameter values for PaLM2TextGenerator (7c6b0dd)
Document region logic in README (7c6b0dd)
Fix OneHotEncoder sample (7c6b0dd)
0.3.2 (2023-09-06)
Bug Fixes
0.3.1 (2023-09-05)
Bug Fixes
0.3.0 (2023-09-02)
Features
Add `bigframes.get_global_session()` and `bigframes.reset_session()` aliases (a32b747)
Add `bigframes.pandas.read_pickle` function (a32b747)
Add `components_`, `explained_variance_`, and `explained_variance_ratio_` properties to `bigframes.ml.decomposition.PCA` (89b9503)
Add `fit_transform` to `bigquery.ml` transformers (a32b747)
Add `Series.dropna` and `DataFrame.fillna` (8fab755)
Add `Series.str` methods `isalpha`, `isdigit`, `isdecimal`, `isalnum`, `isspace`, `islower`, `isupper`, `zfill`, `center` (a32b747)
Support `bigframes.pandas.merge()` (8fab755)
Support `DataFrame.isin` with list and dict inputs (8fab755)
Support `DataFrame.pivot` (a32b747)
Support `DataFrame.stack` (89b9503)
Support `DataFrame`-`DataFrame` binary operations (8fab755)
Support `df[my_column] = [a python list]` (89b9503)
Support `Index.is_monotonic` (8fab755)
Support `np.arcsin`, `np.arccos`, `np.arctan`, `np.sinh`, `np.cosh`, `np.tanh`, `np.arcsinh`, `np.arccosh`, `np.arctanh`, `np.exp` with Series argument (89b9503)
Support `np.sin`, `np.cos`, `np.tan`, `np.log`, `np.log10`, `np.sqrt`, `np.abs` with Series argument (89b9503)
Support `pow()` and power operator in `DataFrame` and `Series` (8fab755)
Support `read_json` with `engine=bigquery` for newline-delimited JSON files (89b9503)
Support `Series.corr` (89b9503)
Support `Series.map` (8fab755)
Support for `np.add`, `np.subtract`, `np.multiply`, `np.divide`, `np.power` (8fab755)
Support MultiIndex for DataFrame columns (a32b747)
Use `pandas.Index` for column labels (a32b747)
Use default session and connection in `ml.llm` and `ml.imported` (8fab755)
Bug Fixes
Add error message to `set_index` (a32b747)
Align column names with pandas in `DataFrame.agg` results (89b9503)
Allow (but still not recommended) `ORDER BY` in `read_gbq` input when an `index_col` is defined (89b9503)
Check for IAM role on the BigQuery connection when initializing a `remote_function` (89b9503)
Check that types are specified in `read_gbq_function` (a32b747)
Don’t use query cache for Session construction (a32b747)
Include survey link in abstract `NotImplementedError` exception messages (89b9503)
Label temp table creation jobs with `source=bigquery-dataframes-temp` label (89b9503)
Make `X_train` argument names consistent across methods (8fab755)
Raise AttributeError for unimplemented pandas methods (89b9503)
Raise exception for invalid function in `read_gbq_function` (a32b747)
Support spaces in column names in `DataFrame` initializer (89b9503)
Performance Improvements
Add local cache for `__repr_*__` methods (a32b747)
Lazily instantiate client library objects (89b9503)
Use `row_number()` filter for `head` / `tail` (8fab755)
Documentation
Add ML section under Overview (a32b747)
Add release status to table of contents (a32b747)
Add samples and best practices to `read_gbq` docs (a32b747)
Correct the return types of Dataframe and Series (a32b747)
Create subfolders for notebooks (a32b747)
Fix link to GitHub (89b9503)
Highlight bigframes is open-source (a32b747)
Sample ML Drug Name Generation notebook (a32b747)
Set `options.bigquery.project` in sample code (89b9503)
Transform remote function user guide into sample code (a32b747)
Update remote function notebook with read_gbq_function usage (8fab755)
0.2.0 (2023-08-17)
Features
Add KMeans.cluster_centers_.
Allow column labels to be any type handled by bq df, column labels can be integers now.
Add dataframegroupby.agg().
Add Series Property is_monotonic_increasing and is_monotonic_decreasing.
Add match, fullmatch, get, pad str methods.
Add series isin function.
Bug Fixes
Update ML package to use sessions for queries.
Optimize `read_gbq` with `index_col` set to cluster by `index_col`.
Raise ValueError if the location mismatched.
`read_gbq` no longer uses ‘time travel’ with query inputs.
Documentation
- Add docstring to _uniform_sampling to avoid user using it.
0.1.1 (2023-08-14)
Documentation
- Correct link to code repository in `setup.py` and use correct terminology for `console.cloud.google.com` links.
0.1.0 (2023-08-11)
Features
Add `bigframes.pandas` package with an API compatible with pandas. Supported data sources include: BigQuery SQL queries, BigQuery tables, CSV (local and GCS), Parquet (local and Cloud Storage), and more.
Add `bigframes.ml` package with an API inspired by scikit-learn. Train machine learning models and run batch prediction, powered by BigQuery ML.
0.0.0 (2023-02-22)
- Empty package to reserve package name.