Quickstart: Using Client Libraries

This page shows you how to get started with the Google BigQuery API in your favorite programming language by using the Google Cloud client libraries.

Before you begin

  1. Sign in to your Google Account.

    If you don't already have a Google Account, sign up for a new account.

  2. Select or create a Google Cloud Platform project.

    Go to the Manage resources page

  3. Enable the Google BigQuery API.

    Enable the API

  4. Set up authentication:
    1. In the GCP Console, go to the Create service account key page.

      Go to the Create service account key page
    2. From the Service account list, select New service account.
    3. In the Service account name field, enter a name.
    4. From the Role list, select Project > Owner.

      Note: The Role field grants your service account access to resources. You can view and change this field later by using the GCP Console. If you are developing a production application, specify more granular permissions than Project > Owner. For more information, see Granting roles to service accounts.
    5. Click Create. A JSON file that contains your key downloads to your computer.
  5. Set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the file path of the JSON file that contains your service account key. This variable only applies to your current shell session, so if you open a new session, set the variable again.
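
    For example, in a Linux or macOS shell you might set the variable as follows (the key file path is a hypothetical placeholder):

    export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/service-account-key.json"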

Install the client library

C#

For more information on setting up your C# development environment, see the C# Development Environment Setup Guide.
Install-Package Google.Cloud.BigQuery.V2 -Pre

Go

go get -u cloud.google.com/go/bigquery

Java

For more information on setting up your Java development environment, see the Java Development Environment Setup Guide. If you are using Maven, add the following to your pom.xml file:
<dependency>
  <groupId>com.google.cloud</groupId>
  <artifactId>google-cloud-bigquery</artifactId>
  <version>1.75.0</version>
</dependency>
If you are using Gradle, add the following to your dependencies:
compile 'com.google.cloud:google-cloud-bigquery:1.75.0'
If you are using SBT, add the following to your dependencies:
libraryDependencies += "com.google.cloud" % "google-cloud-bigquery" % "1.75.0"

If you are using IntelliJ or Eclipse, you can add the client library to your project by using the IDE plugin for your editor.

The plugins provide additional functionality, such as key management for service accounts. Refer to each plugin's documentation for details.

Node.js

For more information on setting up your Node.js development environment, see the Node.js Development Environment Setup Guide.
npm install --save @google-cloud/bigquery

PHP

composer require google/cloud-bigquery

Python

For more information on setting up your Python development environment, see the Python Development Environment Setup Guide.
pip install --upgrade google-cloud-bigquery

Ruby

For more information on setting up your Ruby development environment, see the Ruby Development Environment Setup Guide.
gem install google-cloud-bigquery

Import the libraries

C#

For more information, see the BigQuery C# API reference documentation.


using System;
using Google.Cloud.BigQuery.V2;

Go

For more information, see the BigQuery Go API reference documentation.

import (
	"context"
	"fmt"
	"io"
	"log"
	"os"

	"cloud.google.com/go/bigquery"
	"google.golang.org/api/iterator"
)

Java

For more information, see the BigQuery Java API reference documentation.


import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.FieldValueList;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobId;
import com.google.cloud.bigquery.JobInfo;
import com.google.cloud.bigquery.QueryJobConfiguration;
import com.google.cloud.bigquery.QueryResponse;
import com.google.cloud.bigquery.TableResult;
import java.util.UUID;

Node.js

For more information, see the BigQuery Node.js API reference documentation.

// Import the Google Cloud client library
const {BigQuery} = require('@google-cloud/bigquery');

PHP

For more information, see the BigQuery PHP API reference documentation.

use Google\Cloud\BigQuery\BigQueryClient;

Python

For more information, see the BigQuery Python API reference documentation.

from google.cloud import bigquery

Ruby

For more information, see the BigQuery Ruby API reference documentation.

require "google/cloud/bigquery"

Initialize a BigQuery client

C#

Use the BigQueryClient.Create() function to create the BigQuery client.

string projectId = "YOUR-PROJECT-ID";
var client = BigQueryClient.Create(projectId);

Go

Use the bigquery.NewClient() function to create the BigQuery client.

ctx := context.Background()

client, err := bigquery.NewClient(ctx, proj)
if err != nil {
	return nil, err
}

Java

Use the BigQueryOptions.getDefaultInstance() function to use the default authentication options, then call getService() on it to create the BigQuery client.

BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

Node.js

To create the BigQuery client, instantiate the BigQuery class.

// Create a client
const bigqueryClient = new BigQuery();

PHP

To create the BigQuery client, instantiate the BigQueryClient class.

$bigQuery = new BigQueryClient([
    'projectId' => $projectId,
]);

Python

To create the BigQuery client, instantiate the bigquery.Client class.

client = bigquery.Client()

Ruby

Use the Google::Cloud::Bigquery.new function to create the BigQuery client.

# This uses Application Default Credentials to authenticate.
# @see https://cloud.google.com/bigquery/docs/authentication/getting-started
bigquery = Google::Cloud::Bigquery.new

Run a query

Query the Stack Overflow public dataset to find the most viewed questions tagged with google-bigquery:

SELECT
  CONCAT(
    'https://stackoverflow.com/questions/',
    CAST(id as STRING)) as url,
  view_count
FROM `bigquery-public-data.stackoverflow.posts_questions`
WHERE tags like '%google-bigquery%'
ORDER BY view_count DESC
LIMIT 10

This query uses standard SQL syntax, which is described in the query reference guide. The client libraries default to standard SQL syntax. See Enabling standard SQL to change SQL dialects.
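
As an optional, illustrative sketch (using the Python client that appears later in this quickstart), the dialect can also be set explicitly on an individual query through a job configuration; standard SQL is already the default, so this is only needed if you want to be explicit or to switch to legacy SQL:

from google.cloud import bigquery

client = bigquery.Client()

# Request standard SQL explicitly; use_legacy_sql=True would switch the dialect to legacy SQL.
job_config = bigquery.QueryJobConfig(use_legacy_sql=False)
query_job = client.query("SELECT 1 AS x", job_config=job_config)
print(list(query_job.result()))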

Run the query

Query with the authenticated BigQuery client.

C#

Define the query string, then use the client.ExecuteQuery() function to submit the query and get the results.

string query = @"SELECT
    CONCAT(
        'https://stackoverflow.com/questions/',
        CAST(id as STRING)) as url, view_count
    FROM `bigquery-public-data.stackoverflow.posts_questions`
    WHERE tags like '%google-bigquery%'
    ORDER BY view_count DESC
    LIMIT 10";
var result = client.ExecuteQuery(query, parameters: null);

Go

Use the bigquery.Query() function to define the query, then use the Query.Read() function to submit the query and get the results.

query := client.Query(
	`SELECT
		CONCAT(
			'https://stackoverflow.com/questions/',
			CAST(id as STRING)) as url,
		view_count
	FROM ` + "`bigquery-public-data.stackoverflow.posts_questions`" + `
	WHERE tags like '%google-bigquery%'
	ORDER BY view_count DESC
	LIMIT 10;`)
return query.Read(ctx)

Java

Define the query with a QueryJobConfiguration instance. Start the query job with the BigQuery.create() method.

QueryJobConfiguration queryConfig =
    QueryJobConfiguration.newBuilder(
      "SELECT "
          + "CONCAT('https://stackoverflow.com/questions/', CAST(id as STRING)) as url, "
          + "view_count "
          + "FROM `bigquery-public-data.stackoverflow.posts_questions` "
          + "WHERE tags like '%google-bigquery%' "
          + "ORDER BY favorite_count DESC LIMIT 10")
        // Use standard SQL syntax for queries.
        // See: https://cloud.google.com/bigquery/sql-reference/
        .setUseLegacySql(false)
        .build();

// Create a job ID so that we can safely retry.
JobId jobId = JobId.of(UUID.randomUUID().toString());
Job queryJob = bigquery.create(JobInfo.newBuilder(queryConfig).setJobId(jobId).build());

// Wait for the query to complete.
queryJob = queryJob.waitFor();

// Check for errors
if (queryJob == null) {
  throw new RuntimeException("Job no longer exists");
} else if (queryJob.getStatus().getError() != null) {
  // You can also look at queryJob.getStatus().getExecutionErrors() for all
  // errors, not just the latest one.
  throw new RuntimeException(queryJob.getStatus().getError().toString());
}

Node.js

Use the BigQuery.query() method to start the query.

// The SQL query to run
const sqlQuery = `SELECT
  CONCAT(
    'https://stackoverflow.com/questions/',
    CAST(id as STRING)) as url,
  view_count
  FROM \`bigquery-public-data.stackoverflow.posts_questions\`
  WHERE tags like '%google-bigquery%'
  ORDER BY view_count DESC
  LIMIT 10`;

const options = {
  query: sqlQuery,
  // Location must match that of the dataset(s) referenced in the query.
  location: 'US',
};

// Run the query
const [rows] = await bigqueryClient.query(options);

PHP

Create the query configuration, then use the BigQueryClient.runQuery() method to start the query.

$query = <<<ENDSQL
SELECT
  CONCAT(
    'https://stackoverflow.com/questions/',
    CAST(id as STRING)) as url,
  view_count
FROM `bigquery-public-data.stackoverflow.posts_questions`
WHERE tags like '%google-bigquery%'
ORDER BY view_count DESC
LIMIT 10;
ENDSQL;
$queryJobConfig = $bigQuery->query($query);
$queryResults = $bigQuery->runQuery($queryJobConfig);

Python

Use the Client.query() method to start the query.

query_job = client.query("""
    SELECT
      CONCAT(
        'https://stackoverflow.com/questions/',
        CAST(id as STRING)) as url,
      view_count
    FROM `bigquery-public-data.stackoverflow.posts_questions`
    WHERE tags like '%google-bigquery%'
    ORDER BY view_count DESC
    LIMIT 10""")

results = query_job.result()  # Waits for job to complete.

Ruby

Use the Google::Cloud::Bigquery::Project.query function to start the query and wait for the results.

sql     = "SELECT " +
          "CONCAT('https://stackoverflow.com/questions/', " +
          "       CAST(id as STRING)) as url, view_count " +
          "FROM `bigquery-public-data.stackoverflow.posts_questions` " +
          "WHERE tags like '%google-bigquery%' " +
          "ORDER BY view_count DESC LIMIT 10"
results = bigquery.query sql

For more examples of running BigQuery queries, see the BigQuery documentation.

Display the query results

Display the query results.

C#

Console.Write("\nQuery Results:\n------------\n");
foreach (var row in result)
{
    Console.WriteLine($"{row["url"]}: {row["view_count"]} views");
}

Go

Use the RowIterator.Next() function to load each row into a pointer to a struct.

type StackOverflowRow struct {
	URL       string `bigquery:"url"`
	ViewCount int64  `bigquery:"view_count"`
}

// printResults prints results from a query to the Stack Overflow public dataset.
func printResults(w io.Writer, iter *bigquery.RowIterator) error {
	for {
		var row StackOverflowRow
		err := iter.Next(&row)
		if err == iterator.Done {
			return nil
		}
		if err != nil {
			return err
		}

		fmt.Fprintf(w, "url: %s views: %d\n", row.URL, row.ViewCount)
	}
}

Java

Iterate over the TableResult to get all the rows in the results. The iterator automatically handles pagination. Each FieldValueList exposes the columns by numeric index or by column name.

// Get the results.
TableResult result = queryJob.getQueryResults();

// Print all pages of the results.
for (FieldValueList row : result.iterateAll()) {
  String url = row.get("url").getStringValue();
  long viewCount = row.get("view_count").getLongValue();
  System.out.printf("url: %s views: %d%n", url, viewCount);
}

Node.js

The query results are returned as a list of rows, where each row is a dictionary.

console.log('Query Results:');
rows.forEach(row => {
  const url = row['url'];
  const viewCount = row['view_count'];
  console.log(`url: ${url}, ${viewCount} views`);
});

PHP

Check QueryResults::isComplete() to verify that the query has finished. Each row in the query results is an associative array.

if ($queryResults->isComplete()) {
    $i = 0;
    $rows = $queryResults->rows();
    foreach ($rows as $row) {
        printf('--- Row %s ---' . PHP_EOL, ++$i);
        printf('url: %s, %s views' . PHP_EOL, $row['url'], $row['view_count']);
    }
    printf('Found %s row(s)' . PHP_EOL, $i);
} else {
    throw new Exception('The query failed to complete');
}

Python

Iterate over the RowIterator to get all the rows in the results. The iterator automatically handles pagination. Each row exposes its columns by numeric index, by column name, or as Python attributes.

for row in results:
    print("{} : {} views".format(row.url, row.view_count))

Ruby

The Google::Cloud::Bigquery::Data class exposes each row as a hash.

results.each do |row|
  puts "#{row[:url]}: #{row[:view_count]} views"
end

Learn more about working with rows of data in the BigQuery documentation.

Complete source code

Here is the complete source code for the sample.

C#


using System;
using Google.Cloud.BigQuery.V2;

namespace GoogleCloudSamples
{
    public class Program
    {
        public static void Main(string[] args)
        {
            string projectId = "YOUR-PROJECT-ID";
            var client = BigQueryClient.Create(projectId);
            string query = @"SELECT
                CONCAT(
                    'https://stackoverflow.com/questions/',
                    CAST(id as STRING)) as url, view_count
                FROM `bigquery-public-data.stackoverflow.posts_questions`
                WHERE tags like '%google-bigquery%'
                ORDER BY view_count DESC
                LIMIT 10";
            var result = client.ExecuteQuery(query, parameters: null);
            Console.Write("\nQuery Results:\n------------\n");
            foreach (var row in result)
            {
                Console.WriteLine($"{row["url"]}: {row["view_count"]} views");
            }
        }
    }
}

Go

import (
	"context"
	"fmt"
	"io"
	"log"
	"os"

	"cloud.google.com/go/bigquery"
	"google.golang.org/api/iterator"
)

func main() {
	proj := os.Getenv("GOOGLE_CLOUD_PROJECT")
	if proj == "" {
		fmt.Println("GOOGLE_CLOUD_PROJECT environment variable must be set.")
		os.Exit(1)
	}

	rows, err := query(proj)
	if err != nil {
		log.Fatal(err)
	}
	if err := printResults(os.Stdout, rows); err != nil {
		log.Fatal(err)
	}
}

// query returns a slice of the results of a query.
func query(proj string) (*bigquery.RowIterator, error) {
	ctx := context.Background()

	client, err := bigquery.NewClient(ctx, proj)
	if err != nil {
		return nil, err
	}

	query := client.Query(
		`SELECT
			CONCAT(
				'https://stackoverflow.com/questions/',
				CAST(id as STRING)) as url,
			view_count
		FROM ` + "`bigquery-public-data.stackoverflow.posts_questions`" + `
		WHERE tags like '%google-bigquery%'
		ORDER BY view_count DESC
		LIMIT 10;`)
	return query.Read(ctx)
}

type StackOverflowRow struct {
	URL       string `bigquery:"url"`
	ViewCount int64  `bigquery:"view_count"`
}

// printResults prints results from a query to the Stack Overflow public dataset.
func printResults(w io.Writer, iter *bigquery.RowIterator) error {
	for {
		var row StackOverflowRow
		err := iter.Next(&row)
		if err == iterator.Done {
			return nil
		}
		if err != nil {
			return err
		}

		fmt.Fprintf(w, "url: %s views: %d\n", row.URL, row.ViewCount)
	}
}

Java


import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.FieldValueList;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobId;
import com.google.cloud.bigquery.JobInfo;
import com.google.cloud.bigquery.QueryJobConfiguration;
import com.google.cloud.bigquery.QueryResponse;
import com.google.cloud.bigquery.TableResult;
import java.util.UUID;

public class SimpleApp {
  public static void main(String... args) throws Exception {
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
    QueryJobConfiguration queryConfig =
        QueryJobConfiguration.newBuilder(
          "SELECT "
              + "CONCAT('https://stackoverflow.com/questions/', CAST(id as STRING)) as url, "
              + "view_count "
              + "FROM `bigquery-public-data.stackoverflow.posts_questions` "
              + "WHERE tags like '%google-bigquery%' "
              + "ORDER BY favorite_count DESC LIMIT 10")
            // Use standard SQL syntax for queries.
            // See: https://cloud.google.com/bigquery/sql-reference/
            .setUseLegacySql(false)
            .build();

    // Create a job ID so that we can safely retry.
    JobId jobId = JobId.of(UUID.randomUUID().toString());
    Job queryJob = bigquery.create(JobInfo.newBuilder(queryConfig).setJobId(jobId).build());

    // Wait for the query to complete.
    queryJob = queryJob.waitFor();

    // Check for errors
    if (queryJob == null) {
      throw new RuntimeException("Job no longer exists");
    } else if (queryJob.getStatus().getError() != null) {
      // You can also look at queryJob.getStatus().getExecutionErrors() for all
      // errors, not just the latest one.
      throw new RuntimeException(queryJob.getStatus().getError().toString());
    }

    // Get the results.
    TableResult result = queryJob.getQueryResults();

    // Print all pages of the results.
    for (FieldValueList row : result.iterateAll()) {
      String url = row.get("url").getStringValue();
      long viewCount = row.get("view_count").getLongValue();
      System.out.printf("url: %s views: %d%n", url, viewCount);
    }
  }
}

Node.js

// Import the Google Cloud client library
const {BigQuery} = require('@google-cloud/bigquery');

async function queryStackOverflow() {
  // Queries a public Stack Overflow dataset.

  // Create a client
  const bigqueryClient = new BigQuery();

  // The SQL query to run
  const sqlQuery = `SELECT
    CONCAT(
      'https://stackoverflow.com/questions/',
      CAST(id as STRING)) as url,
    view_count
    FROM \`bigquery-public-data.stackoverflow.posts_questions\`
    WHERE tags like '%google-bigquery%'
    ORDER BY view_count DESC
    LIMIT 10`;

  const options = {
    query: sqlQuery,
    // Location must match that of the dataset(s) referenced in the query.
    location: 'US',
  };

  // Run the query
  const [rows] = await bigqueryClient.query(options);

  console.log('Query Results:');
  rows.forEach(row => {
    const url = row['url'];
    const viewCount = row['view_count'];
    console.log(`url: ${url}, ${viewCount} views`);
  });
}
queryStackOverflow();

PHP

require __DIR__ . '/vendor/autoload.php';

use Google\Cloud\BigQuery\BigQueryClient;

// get the project ID as the first argument
if (2 != count($argv)) {
    die("Usage: php stackoverflow.php YOUR_PROJECT_ID\n");
}

$projectId = $argv[1];

$bigQuery = new BigQueryClient([
    'projectId' => $projectId,
]);
$query = <<<ENDSQL
SELECT
  CONCAT(
    'https://stackoverflow.com/questions/',
    CAST(id as STRING)) as url,
  view_count
FROM `bigquery-public-data.stackoverflow.posts_questions`
WHERE tags like '%google-bigquery%'
ORDER BY view_count DESC
LIMIT 10;
ENDSQL;
$queryJobConfig = $bigQuery->query($query);
$queryResults = $bigQuery->runQuery($queryJobConfig);

if ($queryResults->isComplete()) {
    $i = 0;
    $rows = $queryResults->rows();
    foreach ($rows as $row) {
        printf('--- Row %s ---' . PHP_EOL, ++$i);
        printf('url: %s, %s views' . PHP_EOL, $row['url'], $row['view_count']);
    }
    printf('Found %s row(s)' . PHP_EOL, $i);
} else {
    throw new Exception('The query failed to complete');
}

Python

from google.cloud import bigquery

def query_stackoverflow():
    client = bigquery.Client()
    query_job = client.query("""
        SELECT
          CONCAT(
            'https://stackoverflow.com/questions/',
            CAST(id as STRING)) as url,
          view_count
        FROM `bigquery-public-data.stackoverflow.posts_questions`
        WHERE tags like '%google-bigquery%'
        ORDER BY view_count DESC
        LIMIT 10""")

    results = query_job.result()  # Waits for job to complete.

    for row in results:
        print("{} : {} views".format(row.url, row.view_count))

if __name__ == '__main__':
    query_stackoverflow()

Ruby

require "google/cloud/bigquery"

# This uses Application Default Credentials to authenticate.
# @see https://cloud.google.com/bigquery/docs/authentication/getting-started
bigquery = Google::Cloud::Bigquery.new

sql     = "SELECT " +
          "CONCAT('https://stackoverflow.com/questions/', " +
          "       CAST(id as STRING)) as url, view_count " +
          "FROM `bigquery-public-data.stackoverflow.posts_questions` " +
          "WHERE tags like '%google-bigquery%' " +
          "ORDER BY view_count DESC LIMIT 10"
results = bigquery.query sql

results.each do |row|
  puts "#{row[:url]}: #{row[:view_count]} views"
end

Congratulations! You've sent your first request to BigQuery.

What's next

Learn more about the Google BigQuery API client libraries.
