Introduction to BigQuery DataFrames
BigQuery DataFrames is a set of open source Python libraries that let you take advantage of BigQuery data processing by using familiar Python APIs. BigQuery DataFrames provides a Pythonic DataFrame powered by the BigQuery engine, and it implements the pandas and scikit-learn APIs by pushing the processing down to BigQuery through SQL conversion. This lets you use BigQuery to explore and process terabytes of data, and to train machine learning (ML) models, all with Python APIs.
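As a rough illustration of the pandas-style workflow described above, the following minimal sketch aggregates a table without loading it into local memory. It assumes that the `bigframes` package is installed, that default Google Cloud credentials and a billing project are already configured, and that the public `bigquery-public-data.ml_datasets.penguins` table (with `species` and `body_mass_g` columns) is readable from your project. The function name and its default argument are illustrative only.

```python
def mean_body_mass_by_species(
    table_id: str = "bigquery-public-data.ml_datasets.penguins",
):
    """Return the average body mass per species, computed by BigQuery."""
    # Imported lazily so this sketch can be loaded without the package installed.
    import bigframes.pandas as bpd  # pip install bigframes

    # Creates a DataFrame backed by the BigQuery table rather than local memory.
    df = bpd.read_gbq(table_id)

    # Familiar pandas syntax; the operation is converted to SQL for BigQuery.
    result = df.groupby("species")["body_mass_g"].mean()

    # Download only the small aggregated result as a local pandas object.
    return result.to_pandas()
```

Calling `mean_body_mass_by_species()` in an authenticated environment runs queries in BigQuery, so the usual BigQuery charges apply.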
[Diagram: the workflow of BigQuery DataFrames]
BigQuery DataFrames benefits
BigQuery DataFrames does the following:
Offers more than 750 pandas and scikit-learn APIs implemented through transparent SQL conversion to BigQuery and BigQuery ML APIs.
Defers the execution of queries for enhanced performance.
Extends data transformations with user-defined Python functions to let you process data in Google Cloud. These functions are automatically deployed as BigQuery remote functions.
Integrates with Vertex AI to let you use Gemini models for text generation.
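To make the idea of deferred execution concrete, here is a toy model written with only the Python standard library. It is not how BigQuery DataFrames is implemented; `LazyTable`, its methods, and the table name are hypothetical. The sketch only shows why recording operations first and compiling them later lets several steps become a single SQL statement.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LazyTable:
    """Toy stand-in for a deferred DataFrame: it records steps and runs nothing."""

    source: str
    filters: Tuple[str, ...] = ()
    columns: Tuple[str, ...] = ("*",)

    def where(self, condition: str) -> "LazyTable":
        # Return a new plan with one more filter; no query is issued here.
        return LazyTable(self.source, self.filters + (condition,), self.columns)

    def select(self, *columns: str) -> "LazyTable":
        # Replace the projected columns; still no query is issued.
        return LazyTable(self.source, self.filters, columns)

    def to_sql(self) -> str:
        # Only now are all recorded steps compiled into one statement.
        sql = f"SELECT {', '.join(self.columns)} FROM `{self.source}`"
        if self.filters:
            sql += " WHERE " + " AND ".join(f"({f})" for f in self.filters)
        return sql


plan = (
    LazyTable("my-project.my_dataset.penguins")  # hypothetical table name
    .where("body_mass_g > 4000")
    .where("sex = 'MALE'")
    .select("species", "island")
)
print(plan.to_sql())
# SELECT species, island FROM `my-project.my_dataset.penguins` WHERE (body_mass_g > 4000) AND (sex = 'MALE')
```

Three chained calls produce one query instead of three round trips, which is the general benefit that deferring execution aims for.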
Quotas and limits
BigQuery quotas apply to BigQuery DataFrames, including hardware, software, and network components.
A subset of pandas and scikit-learn APIs is supported. For more information, see Supported pandas APIs.
You must explicitly clean up any automatically created Cloud Run functions as part of session cleanup. For more information, see Supported pandas APIs.
Pricing
BigQuery DataFrames is a set of open source Python libraries available for download at no extra cost.
BigQuery DataFrames uses BigQuery, Cloud Run functions, Vertex AI, and other Google Cloud services, which incur their own costs.
During regular usage, BigQuery DataFrames stores temporary data, such as intermediate results, in BigQuery tables. These tables persist for seven days by default, and you are charged for the data stored in them. The tables are created in the _anonymous_ dataset in the Google Cloud project that you specify in the bf.options.bigquery.project option.
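The project option mentioned above can be set as in the following minimal sketch. It assumes that the `bigframes` package is installed and that `bf` refers to the `bigframes.pandas` module; the function name `configure_bigframes` and the project ID in the usage note are placeholders.

```python
def configure_bigframes(project_id: str):
    """Set the project that is billed and that holds the temporary tables."""
    # Imported lazily so the sketch loads even where the package is not installed.
    import bigframes.pandas as bf  # pip install bigframes

    # Set this before the first query runs; the session reads the options
    # when it starts.
    bf.options.bigquery.project = project_id
    return bf
```

For example, `bf = configure_bigframes("my-project-id")` followed by `bf.read_gbq(...)` would bill queries to that project and create the temporary tables there.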
Note: There are breaking changes to some default parameters in BigQuery DataFrames version 2.0. To learn about these changes and how to migrate to version 2.0, see Migrate to BigQuery DataFrames 2.0.
Licensing
BigQuery DataFrames is distributed with the Apache-2.0 license (https://github.com/googleapis/python-bigquery-dataframes/blob/main/LICENSE).
BigQuery DataFrames also contains code derived from the following third-party packages:
Ibis (https://ibis-project.org/)
pandas (https://pandas.pydata.org/)
Python (https://www.python.org/)
scikit-learn (https://scikit-learn.org/)
XGBoost (https://xgboost.readthedocs.io/en/stable/)
For details, see the third_party/bigframes_vendored directory in the BigQuery DataFrames GitHub repository (https://github.com/googleapis/python-bigquery-dataframes/tree/main/third_party/bigframes_vendored).
What's next
Try the BigQuery DataFrames quickstart.
Learn how to use BigQuery DataFrames.
Learn how to visualize graphs using BigQuery DataFrames.
Learn how to use the dbt-bigquery adapter.
Last updated 2025-09-04 UTC.