Try BigQuery DataFrames

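In this quickstart, you use the BigQuery DataFrames API in a BigQuery notebook to perform the following analysis and machine learning (ML) tasks:

Create a DataFrame over the bigquery-public-data.ml_datasets.penguins public dataset.
Calculate the average body mass of a penguin.
Create a linear regression model.
Create a DataFrame over a subset of the penguin data to use as training data.
Clean up the training data.
Set the model parameters.
Fit the model.
Score the model.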
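The dataset name in the first task is a fully qualified BigQuery table ID of the form project.dataset.table. As a minimal sketch of that structure, the three parts can be pulled apart in plain Python. The helper name split_table_id is hypothetical and is not part of BigQuery DataFrames, and the sketch assumes that no part contains a dot.

```python
from typing import NamedTuple


class TableId(NamedTuple):
    project: str
    dataset: str
    table: str


def split_table_id(table_id: str) -> TableId:
    """Split a 'project.dataset.table' string into its three parts.

    Hypothetical helper for illustration only; it is not a BigQuery
    DataFrames API and assumes that no part contains a dot.
    """
    parts = table_id.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected 'project.dataset.table', got {table_id!r}")
    return TableId(*parts)


penguins = split_table_id("bigquery-public-data.ml_datasets.penguins")
print(penguins.project)  # bigquery-public-data
print(penguins.dataset)  # ml_datasets
print(penguins.table)    # penguins
```

Here bigquery-public-data is the project that hosts the public dataset, which is separate from the project that you select for billing.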

Before you begin

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Go to project selector

Verify that billing is enabled for your Google Cloud project.

Verify that the BigQuery API is enabled.

Enable the API

If you created a new project, the BigQuery API is automatically enabled.

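Required permissions

To create and run notebooks, you need the following Identity and Access Management (IAM) roles:

Create a notebook

Follow the instructions in "Create a notebook from the BigQuery editor" to create a new notebook.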

Try BigQuery DataFrames

To try BigQuery DataFrames, follow these steps:

Create a new code cell in the notebook.

Copy the following code and paste it into the code cell:

import bigframes.pandas as bpd

# Set BigQuery DataFrames options
# Note: The project option is not required in all environments.
# On BigQuery Studio, the project ID is automatically detected.
bpd.options.bigquery.project = your_gcp_project_id

# Use "partial" ordering mode to generate more efficient queries, but the
# order of the rows in DataFrames may not be deterministic if you have not
# explicitly sorted it. Some operations that depend on the order, such as
# head(), will not function until you explicitly order the DataFrame. Set the
# ordering mode to "strict" (default) for more pandas compatibility.
bpd.options.bigquery.ordering_mode = "partial"

# Create a DataFrame from a BigQuery table
query_or_table = "bigquery-public-data.ml_datasets.penguins"
df = bpd.read_gbq(query_or_table)

# Efficiently preview the results using the .peek() method.
df.peek()

# Use the DataFrame just as you would a pandas DataFrame, but calculations
# happen in the BigQuery query engine instead of the local system.
average_body_mass = df["body_mass_g"].mean()
print(f"average_body_mass: {average_body_mass}")

# Create the Linear Regression model
from bigframes.ml.linear_model import LinearRegression

# Filter down to the data we want to analyze
adelie_data = df[df.species == "Adelie Penguin (Pygoscelis adeliae)"]

# Drop the columns we don't care about
adelie_data = adelie_data.drop(columns=["species"])

# Drop rows with nulls to get our training data
training_data = adelie_data.dropna()

# Pick feature columns and label column
X = training_data[
    [
        "island",
        "culmen_length_mm",
        "culmen_depth_mm",
        "flipper_length_mm",
        "sex",
    ]
]
y = training_data[["body_mass_g"]]

model = LinearRegression(fit_intercept=False)
model.fit(X, y)
model.score(X, y)

Modify the bpd.options.bigquery.project = your_gcp_project_id line to specify your project, for example bpd.options.bigquery.project = "myproject".
Run the code cell.
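Instead of hard-coding the project ID when you modify that line, one option is to read it from the environment. The following is a minimal sketch under stated assumptions: the helper resolve_project_id is hypothetical, and GOOGLE_CLOUD_PROJECT is an assumed environment variable name that may not be set in your environment.

```python
import os


def resolve_project_id(default=None):
    """Return a project ID from the environment, or fall back to `default`.

    Hypothetical helper: GOOGLE_CLOUD_PROJECT is an assumed variable name
    and is not guaranteed to be set in every environment.
    """
    project_id = os.environ.get("GOOGLE_CLOUD_PROJECT", "").strip()
    return project_id or default


project_id = resolve_project_id(default="myproject")
print(f"Using project: {project_id}")

# In the notebook cell, the result would replace the placeholder:
# bpd.options.bigquery.project = project_id
```

As the comments in the quickstart code note, the project option is not required in every environment, so this step can be skipped where the project ID is detected automatically.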

The code cell returns the average body mass of the penguins in the dataset, and then returns the evaluation metrics for the model.
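To make "evaluation metrics" more concrete, the sketch below computes three common regression metrics in plain Python: mean absolute error, mean squared error, and R². It illustrates the definitions only. The values are made up for illustration, the function name regression_metrics is hypothetical, and this quickstart does not list the exact set of metrics that model.score returns.

```python
from statistics import fmean


def regression_metrics(actual, predicted):
    """Compute MAE, MSE, and R^2 for paired actual and predicted values.

    Hypothetical helper for illustration; model.score computes its metrics
    in BigQuery, not with this function.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [a - p for a, p in zip(actual, predicted)]
    mean_actual = fmean(actual)
    ss_res = sum(r * r for r in residuals)                # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    return {
        "mean_absolute_error": fmean([abs(r) for r in residuals]),
        "mean_squared_error": ss_res / len(actual),
        # Undefined when every actual value is identical (ss_tot == 0).
        "r2_score": 1 - ss_res / ss_tot,
    }


# Made-up body masses in grams, not taken from the penguins dataset.
actual_mass = [3700.0, 3800.0, 3250.0, 3450.0]
predicted_mass = [3650.0, 3900.0, 3300.0, 3400.0]

for name, value in regression_metrics(actual_mass, predicted_mass).items():
    print(f"{name}: {value:.4f}")
```

An R² close to 1 means the predictions explain most of the variation in the label. Because the quickstart scores the model on the same rows it was fitted on, the reported metrics describe the fit to the training data and not performance on unseen data.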

Clean up

The easiest way to eliminate billing is to delete the project that you created for the tutorial.

To delete the project:

In the Google Cloud console, go to the Manage resources page.
Go to Manage resources
In the project list, select the project that you want to delete, and then click Delete.
In the dialog, type the project ID, and then click Shut down to delete the project.

Last updated 2025-07-31 UTC.