Querying massive datasets can be time-consuming and expensive without the right hardware and infrastructure. Google BigQuery solves this problem by enabling super-fast SQL queries against append-only tables using the processing power of Google's infrastructure. Simply move your data into BigQuery and let us handle the hard work. You can control access to both the project and your data based on your business needs, such as giving others the ability to view or query your data.
You can access BigQuery by using a web UI or a command-line tool, or by making calls to the BigQuery REST API using client libraries for languages such as Java, .NET, or Python. There are also a variety of third-party tools that you can use to interact with BigQuery, for tasks such as visualizing or loading data.
There are four main concepts you should understand when using BigQuery: projects, tables, datasets, and jobs.
Projects are top-level containers in Google Cloud Platform. They store information about billing and authorized users, and they contain BigQuery data. Each project has a friendly name and a unique ID.
BigQuery bills on a per-project basis, so it’s usually easiest to create a single project for your company that’s maintained by your billing department. To enable billing, see Sign Up for BigQuery. For more information on how to grant access to your project, see Access Control.
Tables contain your data in BigQuery. Each table has a schema that describes field names, types, and other information. In addition to tables containing data stored in managed storage, BigQuery also supports both views, which are virtual tables defined by a SQL query, and external tables, which are tables defined over data stored in, for example, Cloud Storage.
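As a rough illustration of what a schema looks like, the sketch below writes one as the kind of dictionary the REST API accepts: a list of fields, each with a name, a type, and a mode. The field names are hypothetical and chosen only for the example, and field_names is an illustrative helper, not part of any BigQuery library.

```python
import json

# Hypothetical schema for an example table; the field names are illustrative only.
schema = {
    "fields": [
        {"name": "word", "type": "STRING", "mode": "REQUIRED"},
        {"name": "word_count", "type": "INTEGER", "mode": "NULLABLE"},
        {"name": "corpus", "type": "STRING", "mode": "NULLABLE"},
    ]
}


def field_names(table_schema):
    """Return the field names declared in a schema dictionary, in order."""
    return [field["name"] for field in table_schema["fields"]]


if __name__ == "__main__":
    # Show the schema as JSON, the form in which it would be sent to the API.
    print(json.dumps(schema, indent=2))
```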
Datasets allow you to organize and control access to your tables. Because tables are contained in datasets, you'll need to create at least one dataset before loading data into BigQuery.
You share BigQuery data with others by setting ACLs on datasets, not on the tables within them. For more information, see Access Control.
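To make dataset-level sharing concrete, here is a minimal sketch that adds a read-only entry to the access list of a dataset resource. It only manipulates a local dictionary and does not call the API. The project, dataset, and email values are placeholders, and grant_reader is a hypothetical helper written for this example.

```python
import copy


def grant_reader(dataset_resource, user_email):
    """Return a copy of the dataset resource with a READER entry for user_email."""
    updated = copy.deepcopy(dataset_resource)
    entry = {"role": "READER", "userByEmail": user_email}
    access = updated.setdefault("access", [])
    # Avoid adding the same entry twice.
    if entry not in access:
        access.append(entry)
    return updated


# Placeholder dataset resource; all identifiers are made up for illustration.
dataset = {
    "datasetReference": {"projectId": "my-project", "datasetId": "my_dataset"},
    "access": [{"role": "OWNER", "userByEmail": "owner@example.com"}],
}

shared = grant_reader(dataset, "analyst@example.com")
```

Because the access list lives on the dataset, every table in my_dataset would become readable by the new user once the updated resource is sent back to BigQuery.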
Jobs are actions you construct and BigQuery executes on your behalf to load data, export data, query data, or copy data. Since jobs can potentially take a long time to complete, they execute asynchronously and can be polled for their status. BigQuery saves a history of all jobs associated with a project, accessible via the Google Cloud Platform Console.
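The polling pattern described above can be sketched as follows. This is not a client library call: wait_for_job and make_fake_fetcher are hypothetical helpers, and the fetcher stands in for whatever call retrieves the job resource (for example, a jobs.get request). The sketch assumes the job resource reports its state under status.state, using the values PENDING, RUNNING, and DONE, and reports failures under status.errorResult.

```python
import time


def wait_for_job(fetch_job, poll_interval_s=1.0, timeout_s=60.0, sleep=time.sleep):
    """Call fetch_job() repeatedly until the job is DONE, then return the job."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_job()
        status = job.get("status", {})
        if status.get("state") == "DONE":
            # A job can finish with an error; surface it instead of returning.
            if "errorResult" in status:
                raise RuntimeError(status["errorResult"].get("message", "job failed"))
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(poll_interval_s)


def make_fake_fetcher(states):
    """Build a stand-in for a real status request that walks through the given states."""
    remaining = iter(states)

    def fetch():
        return {"status": {"state": next(remaining)}}

    return fetch


# Simulate a job that is pending, then running, then done (no real waiting).
job = wait_for_job(make_fake_fetcher(["PENDING", "RUNNING", "DONE"]), sleep=lambda s: None)
```

Passing the fetch function in as a parameter keeps the polling loop independent of any particular client library and makes it easy to test.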
Interacting with BigQuery
There are three main ways to interact with BigQuery: loading and exporting data, querying and viewing data, and managing data.
Loading and exporting data
In most cases, you load data into BigQuery Storage. If you want to get the data back out of BigQuery, you can export it. You can also set up a table as a federated data source, which allows you to use a query to transform your data as you load it.
Querying and viewing data
Once you load your data into BigQuery, there are a few ways to query or view the data in your tables:
- Calling the bigquery.jobs.query() method
- Calling the bigquery.jobs.insert() method with a query configuration
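The difference between the two methods is easiest to see in the request bodies. The sketch below builds both as plain dictionaries without sending anything. The helper names and the sample query are hypothetical, and the useLegacySql and timeoutMs fields are optional settings included here as assumptions about what a typical request might set.

```python
def query_request_body(sql, timeout_ms=10000):
    """Body for bigquery.jobs.query(): runs the query and waits up to timeout_ms."""
    return {
        "query": sql,
        "useLegacySql": False,  # assumption: opt in to standard SQL
        "timeoutMs": timeout_ms,
    }


def insert_query_job_body(sql):
    """Body for bigquery.jobs.insert(): the query is wrapped in a job configuration."""
    return {
        "configuration": {
            "query": {
                "query": sql,
                "useLegacySql": False,  # assumption: opt in to standard SQL
            }
        }
    }


# Illustrative query only; it does not refer to a real table.
sql = "SELECT 1 AS answer"
sync_body = query_request_body(sql)
async_body = insert_query_job_body(sql)
```

With jobs.insert(), the response is a job resource whose status you poll; with jobs.query(), the results may come back directly if the query finishes within the timeout.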
Managing data
In addition to querying and viewing data, you can manage data in BigQuery in the following ways:
- Listing projects, jobs, tables, and datasets
- Getting information about jobs, tables, and datasets
- Defining, updating, or patching tables and datasets
- Deleting tables and datasets
For more information, see the API reference.
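As a sketch of how the management operations listed above map onto REST calls for tables, the helper below returns the HTTP method and URL for each action. It builds strings only and sends no requests. The base URL is an assumption about the v2 endpoint, table_request is a hypothetical helper, and the identifiers in the usage lines are placeholders.

```python
# Assumed base URL for version 2 of the BigQuery REST API.
BASE_URL = "https://bigquery.googleapis.com/bigquery/v2"

# HTTP verb used by each single-table action.
TABLE_METHODS = {"get": "GET", "patch": "PATCH", "update": "PUT", "delete": "DELETE"}


def table_request(action, project_id, dataset_id, table_id=None):
    """Return (http_method, url) for a table management action."""
    collection = f"{BASE_URL}/projects/{project_id}/datasets/{dataset_id}/tables"
    if action == "list":
        # Listing operates on the collection, so no table ID is needed.
        return ("GET", collection)
    if action not in TABLE_METHODS:
        raise ValueError(f"unknown action: {action}")
    if table_id is None:
        raise ValueError(f"action '{action}' requires a table_id")
    return (TABLE_METHODS[action], f"{collection}/{table_id}")


# Placeholder identifiers for illustration.
list_call = table_request("list", "my-project", "my_dataset")
delete_call = table_request("delete", "my-project", "my_dataset", "my_table")
```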