Package com.google.cloud.bigquery (2.20.0)

A client for BigQuery, a fully managed, petabyte-scale, low-cost enterprise data warehouse for analytics.

The following simple usage example shows how to create a table if it does not exist and load data into it. For the complete source code, see CreateTableAndLoadData.java.


 BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
 TableId tableId = TableId.of("dataset", "table");
 Table table = bigquery.getTable(tableId);
 if (table == null) {
   System.out.println("Creating table " + tableId);
   Field integerField = Field.of("fieldName", StandardSQLTypeName.INT64);
   Schema schema = Schema.of(integerField);
   table = bigquery.create(TableInfo.of(tableId, StandardTableDefinition.of(schema)));
 }
 System.out.println("Loading data into table " + tableId);
 Job loadJob = table.load(FormatOptions.csv(), "gs://bucket/path");
 loadJob = loadJob.waitFor();
 if (loadJob.getStatus().getError() != null) {
   System.out.println("Job completed with errors");
 } else {
   System.out.println("Job succeeded");
 }
 

See Also: Google Cloud BigQuery

Classes

Acl

Access Control for a BigQuery Dataset. BigQuery uses ACLs to manage permissions on datasets. ACLs are not directly supported on tables. A table inherits its ACL from the dataset that contains it. Project roles affect your ability to run jobs or manage the project, while dataset roles affect how you can access or modify the data inside of a project. See Also: Access Control

Acl.DatasetAclEntity

Class for a BigQuery DatasetAclEntity ACL entity. Objects of this class represent a different dataset whose resources are granted access to this dataset. Only views are supported for now. The role field is not required when this field is set. If that dataset is deleted and re-created, its access needs to be granted again via an update operation.

Acl.Domain

Class for a BigQuery Domain entity. Objects of this class represent a domain to grant access to. Any users signed in with the domain specified will be granted the specified access.

Acl.Entity

Base class for BigQuery entities that can be granted access to the dataset.

Acl.Group

Class for a BigQuery Group entity. Objects of this class represent a group to grant access to. A Group entity can be created given the group's email or can be a special group: #ofProjectOwners(), #ofProjectReaders(), #ofProjectWriters() or #ofAllAuthenticatedUsers().

Acl.IamMember

Class for a BigQuery IamMember entity. Objects of this class represent an IAM member to grant access to, given the IAM policy.

Acl.Role

Dataset roles supported by BigQuery. See Also: Dataset Roles

Acl.Routine

Class for a BigQuery Routine entity. Objects of this class represent a routine from a different dataset to grant access to. Queries executed against that routine will have read access to views, tables and routines in this dataset. Only UDFs are supported for now. The role field is not required when this field is set. If that routine is updated by any user, access to the routine needs to be granted again via an update operation.

Acl.User

Class for a BigQuery User entity. Objects of this class represent a user to grant access to given the email address.

Acl.View

Class for a BigQuery View entity. Objects of this class represent a view from a different dataset to grant access to. Queries executed against that view will have read access to tables in this dataset. The role field is not required when this field is set. If that view is updated by any user, access to the view needs to be granted again via an update operation.

AvroOptions

Google BigQuery options for AVRO format. This class wraps some properties of AVRO files used by BigQuery to parse external data.

AvroOptions.Builder

BiEngineReason

BiEngineReason.Builder

BiEngineStats

BIEngineStatistics contains query statistics specific to the use of BI Engine.

BiEngineStats.Builder

BigQuery.DatasetDeleteOption

Class for specifying dataset delete options.

BigQuery.DatasetListOption

Class for specifying dataset list options.

BigQuery.DatasetOption

Class for specifying dataset get, create and update options.

BigQuery.IAMOption

BigQuery.JobListOption

Class for specifying job list options.

BigQuery.JobOption

Class for specifying job get and create options.

BigQuery.ModelListOption

Class for specifying model list options.

BigQuery.ModelOption

Class for specifying model get, create and update options.

BigQuery.QueryOption

BigQuery.QueryResultsOption

Class for specifying query results options.

BigQuery.RoutineListOption

Class for specifying routine list options.

BigQuery.RoutineOption

Class for specifying routine get, create and update options.

BigQuery.TableDataListOption

Class for specifying table data list options.

BigQuery.TableListOption

Class for specifying table list options.

BigQuery.TableOption

Class for specifying table get, create and update options.

BigQueryDryRunResultImpl

BigQueryError

Google Cloud BigQuery Error. Objects of this class represent errors encountered by the BigQuery service while executing a request. A BigQuery Job that terminated with an error has a non-null JobStatus#getError(). A job can also encounter errors during its execution that do not cause the whole job to fail (see JobStatus#getExecutionErrors()). Similarly, queries and insert all requests can cause BigQuery errors that do not mean the whole operation failed (see JobStatus#getExecutionErrors() and InsertAllResponse#getInsertErrors()). When a BigQueryException is thrown the BigQuery Error that caused it, if any, can be accessed with BigQueryException#getError().

BigQueryErrorMessages

BigQueryErrorMessages.RetryRegExPatterns

BigQueryOptions

BigQueryOptions.Builder

BigQueryOptions.DefaultBigQueryFactory

BigQueryOptions.DefaultBigQueryRpcFactory

BigQueryResultImpl<T>

BigQueryResultStatsImpl

BigQueryRetryAlgorithm<ResponseT>

BigQueryRetryConfig

BigQueryRetryConfig.Builder

BigQueryRetryHelper

BigtableColumn

BigtableColumn.Builder

BigtableColumnFamily

List of column families to expose in the table schema along with their types. This list restricts the column families that can be referenced in queries and specifies their value types.

You can use this list to do type conversions - see the 'type' field for more details. If you leave this list empty, all column families are present in the table schema and their values are read as BYTES. During a query only the column families referenced in that query are read from Bigtable.

BigtableColumnFamily.Builder

BigtableOptions

BigtableOptions.Builder

A builder for BigtableOptions objects.

Clustering

Clustering.Builder

ConnectionProperty

ConnectionProperty.Builder

A builder for ConnectionProperty objects.

ConnectionSettings

ConnectionSettings for setting up a BigQuery query connection.

ConnectionSettings.Builder

CopyJobConfiguration

Google BigQuery copy job configuration. A copy job copies an existing table to another new or existing table. Copy job configurations have JobConfiguration.Type#COPY type.

CopyJobConfiguration.Builder

CsvOptions

Google BigQuery options for CSV format. This class wraps some properties of CSV files used by BigQuery to parse external data.

CsvOptions.Builder

Dataset

A Google BigQuery Dataset.

Objects of this class are immutable. Operations that modify the dataset like #update return a new object. To get a Dataset object with the most recent information use #reload. Dataset adds a layer of service-related functionality over DatasetInfo.

Dataset.Builder

A builder for Dataset objects.

DatasetId

Google BigQuery Dataset identity.

DatasetInfo

Google BigQuery Dataset information. A dataset is a grouping mechanism that holds zero or more tables. Datasets are the lowest level unit of access control; you cannot control access at the table level. See Also: Managing Jobs, Datasets, and Projects

DatasetInfo.Builder

A builder for DatasetInfo objects.

DatastoreBackupOptions

Google BigQuery options for Cloud Datastore backup.

DatastoreBackupOptions.Builder

DmlStats

Represents DML statistics information.

DmlStats.Builder

EmptyTableResult

EncryptionConfiguration

EncryptionConfiguration.Builder

ExecuteSelectResponse

ExecuteSelectResponse.Builder

ExternalTableDefinition

Google BigQuery external table definition. BigQuery's external tables are tables whose data reside outside of BigQuery but can be queried as normal BigQuery tables. External tables are experimental and might change or be removed. See Also: Federated Data Sources

ExternalTableDefinition.Builder

ExtractJobConfiguration

Google BigQuery extract job configuration. An extract job exports a BigQuery table to Google Cloud Storage. The extract destination is provided as URIs that point to objects in Google Cloud Storage. Extract job configurations have JobConfiguration.Type#EXTRACT type.

ExtractJobConfiguration.Builder

Field

Google BigQuery Table schema field. A table field has a name, a type, a mode and possibly a description.

Field.Builder

FieldList

Google BigQuery Table schema fields (columns). Each field has a unique name and index. Fields with duplicate names are not allowed in a BigQuery schema.

FieldValue

Google BigQuery Table Field Value class. Objects of this class represent values of a BigQuery Table Field. A list of values forms a table row. Table rows can be obtained as the result of a query or when listing table data.

FieldValueList

Google BigQuery Table Field Values class, which represents a row in a returned query result (table row). Table rows can be retrieved as the result of a query or when listing table data.

Depending on how a corresponding query was executed, each row (an instance of FieldValueList) may or may not contain related schema. If schema is not provided, the individual cells of the row will still be accessible by index but not by name.
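The difference between index access and name access can be pictured with a small model. The sketch below is plain Java and does not use the client library: RowSketch is a hypothetical stand-in for a row, cell values are simplified to strings, and throwing IllegalStateException when no schema is attached is a choice made for this illustration only.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a result row; not a com.google.cloud.bigquery type.
class RowSketch {
  private final List<String> cells;
  private final List<String> fieldNames; // null when no schema is attached

  RowSketch(List<String> cells, List<String> fieldNames) {
    this.cells = cells;
    this.fieldNames = fieldNames;
  }

  // Index access needs only the cell positions, so it always works.
  String get(int index) {
    return cells.get(index);
  }

  // Name access must translate the name to a position, which requires a schema.
  String get(String name) {
    if (fieldNames == null) {
      throw new IllegalStateException("no schema attached; use get(int) instead");
    }
    int index = fieldNames.indexOf(name);
    if (index < 0) {
      throw new IllegalArgumentException("unknown field: " + name);
    }
    return cells.get(index);
  }

  public static void main(String[] args) {
    RowSketch withSchema =
        new RowSketch(Arrays.asList("alice", "42"), Arrays.asList("name", "age"));
    RowSketch withoutSchema = new RowSketch(Arrays.asList("alice", "42"), null);
    System.out.println(withSchema.get("age"));
    System.out.println(withoutSchema.get(1));
  }
}
```

Both calls in main read the same cell; only the first one depends on field names being available.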

FormatOptions

Base class for Google BigQuery format options. These classes define the format of external data used by BigQuery, for either federated tables or load jobs.

Load jobs support the following formats: AVRO, CSV, DATASTORE_BACKUP, GOOGLE_SHEETS, JSON, ORC, PARQUET

Federated tables can be defined against the following formats: AVRO, BIGTABLE, CSV, DATASTORE_BACKUP, GOOGLE_SHEETS, JSON
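The two lists are not identical: ORC and PARQUET appear only for load jobs, and BIGTABLE appears only for federated tables. A minimal sketch of that lookup in plain Java follows. The enum SourceFormat and both sets are hypothetical stand-ins that transcribe the lists as given on this page; they are not types from the client library.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

class FormatSupportSketch {

  // Hypothetical stand-in enum; the names transcribe the format lists above.
  enum SourceFormat {
    AVRO, BIGTABLE, CSV, DATASTORE_BACKUP, GOOGLE_SHEETS, JSON, ORC, PARQUET
  }

  // Formats listed for load jobs.
  static final Set<SourceFormat> LOAD_JOB_FORMATS =
      Collections.unmodifiableSet(
          EnumSet.of(
              SourceFormat.AVRO,
              SourceFormat.CSV,
              SourceFormat.DATASTORE_BACKUP,
              SourceFormat.GOOGLE_SHEETS,
              SourceFormat.JSON,
              SourceFormat.ORC,
              SourceFormat.PARQUET));

  // Formats listed for federated tables.
  static final Set<SourceFormat> FEDERATED_TABLE_FORMATS =
      Collections.unmodifiableSet(
          EnumSet.of(
              SourceFormat.AVRO,
              SourceFormat.BIGTABLE,
              SourceFormat.CSV,
              SourceFormat.DATASTORE_BACKUP,
              SourceFormat.GOOGLE_SHEETS,
              SourceFormat.JSON));

  static boolean supportedForLoad(SourceFormat format) {
    return LOAD_JOB_FORMATS.contains(format);
  }

  static boolean supportedForFederatedTable(SourceFormat format) {
    return FEDERATED_TABLE_FORMATS.contains(format);
  }

  public static void main(String[] args) {
    for (SourceFormat format : SourceFormat.values()) {
      System.out.println(
          format
              + " load=" + supportedForLoad(format)
              + " federated=" + supportedForFederatedTable(format));
    }
  }
}
```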

GoogleSheetsOptions

Google BigQuery options for the Google Sheets format.

GoogleSheetsOptions.Builder

HivePartitioningOptions

Types currently supported by HivePartitioningOptions include AVRO, CSV, JSON, ORC and Parquet.

HivePartitioningOptions.Builder

InsertAllRequest

Google Cloud BigQuery insert all request. This class can be used to stream data into BigQuery one record at a time without needing to run a load job. This approach enables querying data without the delay of running a load job. There are several important trade-offs to consider before choosing an approach. See Also: Streaming Data into BigQuery

InsertAllRequest.Builder

InsertAllRequest.RowToInsert

A Google BigQuery row to be inserted into a table. Each RowToInsert has an associated id used by BigQuery to detect duplicate insertion requests on a best-effort basis.

To ensure proper serialization of numeric data, it is recommended to supply values using a string-typed representation. Additionally, data for fields of LegacySQLTypeName#BYTES must be provided as a base64 encoded string.

Example usage of creating a row to insert:


 List<Long> repeatedFieldValue = Arrays.asList(1L, 2L);
 Map<String, Object> recordContent = new HashMap<String, Object>();
 recordContent.put("subfieldName1", "value");
 recordContent.put("subfieldName2", repeatedFieldValue);
 Map<String, Object> rowContent = new HashMap<String, Object>();
 rowContent.put("booleanFieldName", true);
 rowContent.put("bytesFieldName", "DQ4KDQ==");
 rowContent.put("recordFieldName", recordContent);
 rowContent.put("numericFieldName", "1298930929292.129593272");
 RowToInsert row = RowToInsert.of("rowId", rowContent);
 

See Also: Data Consistency
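The serialization advice for RowToInsert (string-typed numeric values, base64-encoded BYTES values) can be followed with the JDK alone, before the content map is handed to the client. A minimal sketch, assuming only java.util and java.math; the class RowContentSketch and its helper methods are hypothetical and not part of the client library:

```java
import java.math.BigDecimal;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of com.google.cloud.bigquery: prepares the
// content map for one row using only JDK types.
class RowContentSketch {

  // Encodes raw bytes as a base64 string, the form required for BYTES fields.
  static String encodeBytes(byte[] raw) {
    return Base64.getEncoder().encodeToString(raw);
  }

  // Renders a numeric value as a plain decimal string (no exponent notation).
  static String encodeNumeric(BigDecimal value) {
    return value.toPlainString();
  }

  static Map<String, Object> buildRowContent() {
    Map<String, Object> rowContent = new HashMap<>();
    rowContent.put("booleanFieldName", true);
    rowContent.put("bytesFieldName", encodeBytes(new byte[] {0x0D, 0x0E, 0x0A, 0x0D}));
    rowContent.put(
        "numericFieldName", encodeNumeric(new BigDecimal("1298930929292.129593272")));
    return rowContent;
  }

  public static void main(String[] args) {
    System.out.println(buildRowContent());
  }
}
```

The four bytes used here encode to the same "DQ4KDQ==" string that appears in the example above.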

InsertAllResponse

Google Cloud BigQuery insert all response. Objects of this class possibly contain errors for an InsertAllRequest. If a row failed to be inserted, the non-empty list of errors associated with that row's index can be obtained with InsertAllResponse#getErrorsFor(long). InsertAllResponse#getInsertErrors() can be used to return all errors caused by an InsertAllRequest as a map.

Job

A Google BigQuery Job.

Objects of this class are immutable. To get a Job object with the most recent information use #reload. Job adds a layer of service-related functionality over JobInfo.

Job.Builder

A builder for Job objects.

JobConfiguration

Base class for a BigQuery job configuration.

JobConfiguration.Builder<T,B>

Base builder for job configurations.

JobId

Google BigQuery Job identity.

JobId.Builder

JobInfo

Google BigQuery Job information. Jobs are objects that manage asynchronous tasks such as running queries, loading data, and exporting data. Use CopyJobConfiguration for a job that copies an existing table. Use ExtractJobConfiguration for a job that exports a table to Google Cloud Storage. Use LoadJobConfiguration for a job that loads data from Google Cloud Storage into a table. Use QueryJobConfiguration for a job that runs a query. See Also: Jobs

JobInfo.Builder

A builder for JobInfo objects.

JobStatistics

Statistics for a Google BigQuery Job.

JobStatistics.CopyStatistics

Statistics for a Google BigQuery Copy Job.

JobStatistics.ExtractStatistics

Statistics for a Google BigQuery Extract Job.

JobStatistics.LoadStatistics

Statistics for a Google BigQuery Load Job.

JobStatistics.QueryStatistics

Statistics for a Google BigQuery Query Job.

JobStatistics.QueryStatistics.StatementType

StatementType represents possible types of SQL statements reported as part of the QueryStatistics of a BigQuery job.

JobStatistics.ReservationUsage

ReservationUsage contains information about a job's usage of a single reservation.

JobStatistics.ReservationUsage.Builder

JobStatistics.ScriptStatistics

Statistics for a Google BigQuery Script.

JobStatistics.ScriptStatistics.ScriptStackFrame

JobStatistics.SessionInfo

JobStatistics.SessionInfo.Builder

JobStatistics.TransactionInfo

JobStatistics.TransactionInfo.Builder

JobStatus

A Google BigQuery Job status. Objects of this class can be examined when polling an asynchronous job to see if the job completed.
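Polling an asynchronous job amounts to re-reading its state until it reports completion. The package example at the top of this page lets waitFor() handle this; the sketch below only illustrates the idea in plain Java. The class JobPollingSketch, its local State enum (using the names PENDING, RUNNING and DONE) and the method pollUntilDone are hypothetical stand-ins, and the status source is an ordinary Supplier rather than a real job.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

class JobPollingSketch {

  // Local stand-in for job states; not the client library's enum.
  enum State { PENDING, RUNNING, DONE }

  // Polls the status source until it reports DONE.
  // Returns the number of polls made, or -1 if the job was not done after
  // maxAttempts polls or the waiting thread was interrupted.
  static int pollUntilDone(Supplier<State> statusSource, int maxAttempts, long delayMillis) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      if (statusSource.get() == State.DONE) {
        return attempt;
      }
      if (attempt < maxAttempts) {
        try {
          Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // preserve the interrupt for callers
          return -1;
        }
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    Iterator<State> reported =
        Arrays.asList(State.PENDING, State.RUNNING, State.DONE).iterator();
    System.out.println(pollUntilDone(reported::next, 5, 10L));
  }
}
```

A fixed delay keeps the sketch short; a real caller would more likely back off between polls and still check the job's error after completion.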

JobStatus.State

Possible states that a BigQuery Job can assume.

LegacySQLTypeName

A type used in legacy SQL contexts. NOTE: some contexts use a mix of types; for example, for queries that use standard SQL, the return types are the legacy SQL types. See Also: https://cloud.google.com/bigquery/data-types

LoadJobConfiguration

Google BigQuery load job configuration. A load job loads data from one of several formats into a table. Data is provided as URIs that point to objects in Google Cloud Storage. Load job configurations have JobConfiguration.Type#LOAD type.

LoadJobConfiguration.Builder

MaterializedViewDefinition

MaterializedViewDefinition.Builder

Model

A Google BigQuery ML Model.

Objects of this class are immutable. Operations that modify the model like #update return a new object. To get a Model object with the most recent information use #reload.

Model.Builder

ModelId

ModelInfo

Google BigQuery ML model information. Models are not created directly via the API, but by issuing a CREATE MODEL query. See Also: CREATE MODEL statement

ModelInfo.Builder

A builder for ModelInfo objects.

ModelTableDefinition

A Google BigQuery Model table definition. This definition is used to represent a BigQuery ML model. See Also: BigQuery ML Model

ModelTableDefinition.Builder

Parameter

Parameter.Builder

ParquetOptions

ParquetOptions.Builder

A builder for ParquetOptions objects.

PolicyTags

PolicyTags.Builder

QueryJobConfiguration

Google BigQuery Query Job configuration. A Query Job runs a query against BigQuery data. Query job configurations have JobConfiguration.Type#QUERY type.

QueryJobConfiguration.Builder

QueryParameterValue

A value for a QueryParameter along with its type.

A static factory method is provided for each of the possible types (e.g. #int64(Long) for StandardSQLTypeName.INT64). Alternatively, an instance can be constructed by calling #of(Object, Class) with the value and a Class object, which will use these mappings:

  • Boolean: StandardSQLTypeName.BOOL
  • String: StandardSQLTypeName.STRING
  • Integer: StandardSQLTypeName.INT64
  • Long: StandardSQLTypeName.INT64
  • Double: StandardSQLTypeName.FLOAT64
  • Float: StandardSQLTypeName.FLOAT64
  • BigDecimal: StandardSQLTypeName.NUMERIC
  • BigNumeric: StandardSQLTypeName.BIGNUMERIC
  • JSON: StandardSQLTypeName.JSON
  • INTERVAL: StandardSQLTypeName.INTERVAL

No other types are supported through that entry point. The other types can be created by calling #of(Object, StandardSQLTypeName) with the value and a particular StandardSQLTypeName enum value.

Struct parameters are currently not supported.
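The class-to-type mappings listed for #of(Object, Class) amount to a lookup table in which several Java classes share one type name. The sketch below models that table in plain Java for illustration. The class ParameterTypeLookup is hypothetical, type names are plain strings rather than StandardSQLTypeName values, and only the list entries that correspond to standard JDK classes are included.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration of the mapping list; not client-library code.
class ParameterTypeLookup {

  private static final Map<Class<?>, String> TYPE_NAMES = new LinkedHashMap<>();

  static {
    TYPE_NAMES.put(Boolean.class, "BOOL");
    TYPE_NAMES.put(String.class, "STRING");
    TYPE_NAMES.put(Integer.class, "INT64");
    TYPE_NAMES.put(Long.class, "INT64");
    TYPE_NAMES.put(Double.class, "FLOAT64");
    TYPE_NAMES.put(Float.class, "FLOAT64");
    TYPE_NAMES.put(BigDecimal.class, "NUMERIC");
  }

  // Returns the mapped type name, or empty when the class is not in the table.
  static Optional<String> typeNameFor(Class<?> javaType) {
    return Optional.ofNullable(TYPE_NAMES.get(javaType));
  }

  public static void main(String[] args) {
    System.out.println(typeNameFor(Integer.class));
    System.out.println(typeNameFor(java.util.Date.class));
  }
}
```

An empty result corresponds to the case where the type has to be named explicitly, as with #of(Object, StandardSQLTypeName).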

QueryParameterValue.Builder

QueryResponse

QueryStage

BigQuery provides diagnostic information about a completed query's execution plan (or query plan for short). The query plan describes a query as a series of stages, with each stage comprising a number of steps that read from data sources, perform a series of transformations on the input, and emit an output to a future stage (or the final result). This class contains information on a query stage. See Also: Query Plan

QueryStage.QueryStep

Each query stage is made of a number of steps. This class contains information on a query step. See Also: Steps Metadata

RangePartitioning

RangePartitioning.Builder

A builder for RangePartitioning objects.

RangePartitioning.Range

RangePartitioning.Range.Builder

A builder for Range objects.

ReadClientConnectionConfiguration

Represents BigQueryStorage Read client connection information.

ReadClientConnectionConfiguration.Builder

RemoteFunctionOptions

Represents Remote Function Options. Options for a remote user-defined function.

RemoteFunctionOptions.Builder

Routine

A Google BigQuery Routine.

Objects of this class are immutable. Operations that modify the routine like #update return a new object. To get a Routine object with the most recent information use #reload.

Routine.Builder

RoutineArgument

An argument for a BigQuery Routine.

RoutineArgument.Builder

RoutineId

RoutineId represents the identifier for a given Routine.

RoutineInfo

Google BigQuery routine information. A Routine is an API abstraction that encapsulates several related concepts inside the BigQuery service, including scalar user-defined functions (UDFs) and stored procedures.

For more information about the REST representation of routines, see: https://cloud.google.com/bigquery/docs/reference/rest/v2/routines

For more information about working with scalar functions, see: https://cloud.google.com/bigquery/docs/reference/standard-sql/user-defined-functions

RoutineInfo.Builder

Schema

This class represents the schema for a Google BigQuery Table or data source.

SnapshotTableDefinition

SnapshotTableDefinition.Builder

StandardSQLDataType

Represents Standard SQL data type information.

StandardSQLDataType.Builder

StandardSQLField

A Google BigQuery SQL Field.

StandardSQLField.Builder

StandardSQLStructType

A set of fields contained within a SQL STRUCT in Google BigQuery.

StandardSQLStructType.Builder

StandardSQLTableType

Represents Standard SQL table type information.

StandardSQLTableType.Builder

StandardTableDefinition

A Google BigQuery default table definition. This definition is used for standard, two-dimensional tables with individual records organized in rows, and a data type assigned to each column (also called a field). Individual fields within a record may contain nested and repeated children fields. Every table is described by a schema that describes field names, types, and other information. See Also: Managing Tables

StandardTableDefinition.Builder

StandardTableDefinition.StreamingBuffer

Google BigQuery Table's Streaming Buffer information. This class contains information on a table's streaming buffer as the estimated size in number of rows/bytes.

Table

A Google BigQuery Table.

Objects of this class are immutable. Operations that modify the table like #update return a new object. To get a Table object with the most recent information use #reload. Table adds a layer of service-related functionality over TableInfo.

Table.Builder

A builder for Table objects.

TableDataWriteChannel

WriteChannel implementation to stream data into a BigQuery table. Use #getJob() to get the job used to insert streamed data. Note that #getJob() returns null until the channel is closed.

TableDefinition

Base class for a Google BigQuery table definition.

TableDefinition.Builder<T,B>

Base builder for table definitions.

TableDefinition.Type

The table type.

TableId

Google BigQuery Table identity.

TableInfo

Google BigQuery table information. Use StandardTableDefinition to create a simple BigQuery table. Use ViewDefinition to create a BigQuery view. Use ExternalTableDefinition to create a BigQuery table backed by external data. See Also: Managing Tables

TableInfo.Builder

A builder for TableInfo objects.

TableResult

TimePartitioning

Objects of this class allow you to configure table partitioning based on time. By dividing a large table into smaller partitions, you can improve query performance and reduce the number of bytes billed by restricting the amount of data scanned. See Also: Partitioned Tables

TimePartitioning.Builder

TimelineSample

A specific timeline sample. This instruments work progress at a given point in time, providing information about work units active/pending/completed as well as cumulative slot-milliseconds.

TimelineSample.Builder

UserDefinedFunction

Google BigQuery User Defined Function. BigQuery supports user-defined functions (UDFs) written in JavaScript. A UDF is similar to the "Map" function in a MapReduce: it takes a single row as input and produces zero or more rows as output. The output can potentially have a different schema than the input. See Also: User-Defined Functions
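The "one row in, zero or more rows out" shape described for UDFs can be pictured with a flat-map over rows. BigQuery UDFs themselves are written in JavaScript; the sketch below uses plain Java only to show the shape. The class UdfShapeSketch, the input field text and the output fields word and length are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class UdfShapeSketch {

  // Stand-in for a UDF body: one input row in, zero or more output rows out.
  // Input schema: {text}. Output schema: {word, length}, which differs from the input.
  static List<Map<String, Object>> splitWords(Map<String, Object> inputRow) {
    List<Map<String, Object>> out = new ArrayList<>();
    Object value = inputRow.get("text");
    String text = value == null ? "" : value.toString().trim();
    if (text.isEmpty()) {
      return out; // zero output rows for this input row
    }
    for (String word : text.split("\\s+")) {
      Map<String, Object> row = new LinkedHashMap<>();
      row.put("word", word);
      row.put("length", word.length());
      out.add(row);
    }
    return out;
  }

  // Applies the function to every input row and concatenates the outputs.
  static List<Map<String, Object>> applyToAll(List<Map<String, Object>> rows) {
    return rows.stream()
        .flatMap(row -> splitWords(row).stream())
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    Map<String, Object> row = new LinkedHashMap<>();
    row.put("text", "petabyte scale warehouse");
    System.out.println(splitWords(row));
  }
}
```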

ViewDefinition

Google BigQuery view table definition. BigQuery's views are logical views, not materialized views, which means that the query that defines the view is re-executed every time the view is queried. See Also: Views

ViewDefinition.Builder

WriteChannelConfiguration

Google BigQuery Configuration for a load operation. A load configuration can be used to load data into a table with a com.google.cloud.WriteChannel (BigQuery#writer(WriteChannelConfiguration)).

WriteChannelConfiguration.Builder

Interfaces

BigQuery

An interface for Google Cloud BigQuery. See Also: Google Cloud BigQuery

BigQueryDryRunResult

BigQueryFactory

An interface for BigQuery factories.

BigQueryResult<T>

BigQueryResultStats

Connection

A Connection is a session between a Java application and BigQuery. SQL statements are executed and results are returned within the context of a connection.

LoadConfiguration

Common interface for a load configuration. A load configuration (WriteChannelConfiguration) can be used to load data into a table with a com.google.cloud.WriteChannel (BigQuery#writer(WriteChannelConfiguration)). A load configuration (LoadJobConfiguration) can also be used to create a load job (JobInfo#of(JobConfiguration)).

LoadConfiguration.Builder

Enums

Acl.Entity.Type

Types of BigQuery entities.

BigQuery.DatasetField

Fields of a BigQuery Dataset resource. See Also: Dataset Resource

BigQuery.JobField

Fields of a BigQuery Job resource. See Also: Job Resource

BigQuery.ModelField

Fields of a BigQuery Model resource. See Also: Model Resource

BigQuery.RoutineField

Fields of a BigQuery Routine resource. See Also: Routine Resource

BigQuery.TableField

Fields of a BigQuery Table resource. See Also: Table Resource

Field.Mode

Mode for a BigQuery Table field. Mode#NULLABLE fields can be set to null; Mode#REQUIRED fields must be provided. Mode#REPEATED fields can contain more than one value.

FieldValue.Attribute

The field value's attribute, giving information on the field's content type.

JobConfiguration.Type

Type of a BigQuery Job.

JobInfo.CreateDisposition

Specifies whether the job is allowed to create new tables.

JobInfo.SchemaUpdateOption

Specifies options relating to allowing the schema of the destination table to be updated as a side effect of the load or query job.

JobInfo.WriteDisposition

Specifies the action that occurs if the destination table already exists.

QueryJobConfiguration.Priority

Priority levels for a query. If not specified the priority is assumed to be Priority#INTERACTIVE.

StandardSQLTypeName

A type used in standard SQL contexts. For example, these types are used in queries with query parameters, which requires usage of standard SQL. See Also: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types

TimePartitioning.Type

[Optional] The supported types are DAY, HOUR, MONTH, and YEAR, which will generate one partition per day, hour, month, and year, respectively. When the interval is not specified, the default behavior is DAY.
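To make "one partition per day, hour, month, and year" concrete, the sketch below derives a partition identifier from a timestamp for each granularity. It is a plain-JDK illustration and not client-library code: the local enum Granularity and the method partitionId are hypothetical, and the identifier formats (yyyyMMdd, yyyyMMddHH, yyyyMM and yyyy, evaluated in UTC) are assumed from BigQuery's partition decorator convention rather than stated on this page.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

class PartitionIdSketch {

  // Local stand-in for the four granularities; not the client library's enum.
  enum Granularity {
    DAY("yyyyMMdd"),
    HOUR("yyyyMMddHH"),
    MONTH("yyyyMM"),
    YEAR("yyyy");

    private final DateTimeFormatter formatter;

    Granularity(String pattern) {
      // Assumption: partition boundaries are evaluated in UTC.
      this.formatter = DateTimeFormatter.ofPattern(pattern).withZone(ZoneOffset.UTC);
    }
  }

  // Formats the timestamp into the identifier of the partition it falls in.
  static String partitionId(Instant timestamp, Granularity granularity) {
    return granularity.formatter.format(timestamp);
  }

  public static void main(String[] args) {
    Instant timestamp = Instant.parse("2023-03-05T14:30:00Z");
    for (Granularity granularity : Granularity.values()) {
      System.out.println(granularity + " -> " + partitionId(timestamp, granularity));
    }
  }
}
```

Two timestamps share a partition exactly when they produce the same identifier, so a coarser granularity groups more rows into each partition.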

UserDefinedFunction.Type

Type of user-defined function. User-defined functions can be provided inline as code blobs (#INLINE) or as a Google Cloud Storage URI (#FROM_URI).

Exceptions

BigQueryException

BigQuery service exception. See Also: Google Cloud BigQuery error codes

BigQueryRetryHelper.BigQueryRetryHelperException

BigQuerySQLException

BigQuery service exception. See Also: Google Cloud BigQuery error codes

JobException

Exception describing a failure of a job.