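BigQuery DataFrames is a set of open source Python libraries that let you take advantage of BigQuery data processing by using familiar Python APIs. BigQuery DataFrames provides a Pythonic DataFrame powered by the BigQuery engine, and it implements the pandas and scikit-learn APIs by pushing the processing down to BigQuery through SQL conversion. This lets you use BigQuery to explore and process terabytes of data, and also train machine learning (ML) models, all with Python APIs.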
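As a minimal sketch of what this looks like in practice, the following function reads a public table and computes an aggregate with pandas-style calls. It assumes the `bigframes` package is installed and that Google Cloud credentials are configured. The function name `mean_body_mass_by_species`, the `bpd` alias, and the choice of the public penguins table are illustrative and not part of the original text.

```python
def mean_body_mass_by_species(project_id: str):
    """Return the mean penguin body mass per species as a local pandas object.

    Sketch only: requires the `bigframes` package and valid Google Cloud
    credentials. `project_id` is a placeholder supplied by the caller.
    """
    # Imported lazily so that defining this function does not require the package.
    import bigframes.pandas as bpd

    # The project that runs (and is billed for) the queries.
    # Set this before the first query, because the session is created lazily.
    bpd.options.bigquery.project = project_id

    # Create a DataFrame backed by a BigQuery table; the rows stay in BigQuery.
    df = bpd.read_gbq("bigquery-public-data.ml_datasets.penguins")

    # pandas-style operations are translated to SQL rather than run locally.
    mean_mass = df.groupby("species")["body_mass_g"].mean()

    # Download only the small aggregated result to local memory.
    return mean_mass.to_pandas()
```

Calling `mean_body_mass_by_species("my-project")` with a real project ID in place of the placeholder would run the aggregation in BigQuery and return only the per-species result.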
The following diagram describes the workflow of BigQuery DataFrames.
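BigQuery DataFrames benefits
BigQuery DataFrames does the following:
Offers more than 750 pandas and scikit-learn APIs, implemented through transparent SQL conversion to BigQuery and BigQuery ML APIs.
Defers the execution of queries for enhanced performance.
Extends data transformations with user-defined Python functions so that you can process data in Google Cloud. These functions are automatically deployed as BigQuery remote functions.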
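To make the ideas of SQL conversion and deferred execution concrete, here is a toy, standard-library-only sketch. It is not how BigQuery DataFrames is implemented. The `LazyTable` class and its methods are hypothetical names invented for illustration: each method call only records a step, and a SQL string is built and handed to a query runner when `execute` is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyTable:
    """Toy stand-in for a deferred DataFrame: records steps, runs nothing."""

    source: str
    filters: tuple = ()
    columns: tuple = ("*",)

    def where(self, condition: str) -> "LazyTable":
        # Return a new plan with one more filter; no query is issued here.
        return LazyTable(self.source, self.filters + (condition,), self.columns)

    def select(self, *columns: str) -> "LazyTable":
        # Return a new plan with a different projection; still no query.
        return LazyTable(self.source, self.filters, tuple(columns))

    def to_sql(self) -> str:
        # Convert the recorded steps into a single SQL statement.
        sql = f"SELECT {', '.join(self.columns)} FROM `{self.source}`"
        if self.filters:
            sql += " WHERE " + " AND ".join(f"({c})" for c in self.filters)
        return sql

    def execute(self, run_query):
        # Only at this point is the (caller-supplied) query runner invoked.
        return run_query(self.to_sql())
```

Because every step returns a new plan instead of running a query, several transformations can be combined into one statement before anything is sent to the engine, which is the general motivation for deferring execution.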
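Quotas and limits
BigQuery quotas apply to BigQuery DataFrames, including hardware, software, and network components.
A subset of pandas and scikit-learn APIs is supported. For more information, see Supported pandas APIs.
You must explicitly clean up any automatically created Cloud Run functions as part of session cleanup. For more information, see Supported pandas APIs.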
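One way to make session cleanup hard to forget is to wrap the session in a context manager. This is a sketch under assumptions: it requires the `bigframes` package, and `bigframes_session` is a hypothetical helper name. It relies on `bpd.close_session()` to end the session. Whether that call also removes every Cloud Run function created during the session can depend on the library version and on whether a function was given an explicit name, so verify the behavior against the reference for your version.

```python
from contextlib import contextmanager


@contextmanager
def bigframes_session(project_id: str):
    """Yield the bigframes.pandas module and always close the session afterwards.

    Sketch only: `project_id` is a placeholder supplied by the caller.
    """
    # Imported lazily so that defining this helper does not require the package.
    import bigframes.pandas as bpd

    bpd.options.bigquery.project = project_id
    try:
        yield bpd
    finally:
        # Runs even if the body raises, so session cleanup is not skipped.
        bpd.close_session()
```

Inside `with bigframes_session("my-project") as bpd:` you would use `bpd` as usual; the `finally` clause closes the session when the block exits, including on errors.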
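Pricing
BigQuery DataFrames is a set of open source Python libraries available for download at no extra cost.
BigQuery DataFrames uses BigQuery, Cloud Run functions, Vertex AI, and other Google Cloud services, which incur their own costs.
During regular usage, BigQuery DataFrames stores temporary data, such as intermediate results, in BigQuery tables. These tables persist for seven days by default, and you are charged for the data stored in them. The tables are created in the _anonymous_ dataset in the Google Cloud project that you specify in the bf.options.bigquery.project option.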
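For a rough sense of what temporary tables might cost over the default seven-day retention, the arithmetic can be sketched as a simple linear proration. This is an approximation, not the billing formula. The function name `temp_storage_cost` is hypothetical, the 30-day month is an assumption, and the storage rate is left as a parameter that you must take from the current BigQuery pricing page.

```python
def temp_storage_cost(
    stored_gib: float,
    price_per_gib_month: float,
    retention_days: float = 7.0,
    days_per_month: float = 30.0,
) -> float:
    """Estimate the storage cost of temporary tables by linear proration.

    Assumptions (not from the original text): a flat monthly rate per GiB,
    a 30-day month, and a constant data size for the whole retention period.
    """
    if stored_gib < 0 or price_per_gib_month < 0 or retention_days < 0:
        raise ValueError("sizes, prices, and durations must be non-negative")
    if days_per_month <= 0:
        raise ValueError("days_per_month must be positive")
    # Fraction of a month that the data is kept, times the monthly cost.
    return stored_gib * price_per_gib_month * (retention_days / days_per_month)
```

The estimate scales linearly with both the data size and the retention period, so halving either one halves the result.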