Cloud SQL federated queries
As a data analyst, you can query data in Cloud SQL from BigQuery using federated queries.
BigQuery Cloud SQL federation enables BigQuery to query data residing in Cloud SQL in real time, without copying or moving data. Query federation supports both MySQL (2nd generation) and PostgreSQL instances in Cloud SQL.
Alternatively, to replicate data into BigQuery, you can use Cloud Data Fusion or Datastream. For more information about using Cloud Data Fusion, see Replicating data from MySQL to BigQuery.
Before you begin
- Ensure that your BigQuery administrator has created a Cloud SQL connection and shared it with you.
- To get the permissions that you need to query a Cloud SQL instance, ask your administrator to grant you the BigQuery Connection User (roles/bigquery.connectionUser) IAM role on your project. For more information about granting roles, see Manage access.
Query data
To send a federated query to Cloud SQL from a Google Standard SQL query, use the EXTERNAL_QUERY function.
Suppose that you store a customers table in BigQuery and an orders table in Cloud SQL, and you want to join the two tables in a single query. The following example makes a federated query to a Cloud SQL table named orders and joins the results with a BigQuery table named mydataset.customers.
SELECT c.customer_id, c.name, rq.first_order_date
FROM mydataset.customers AS c
LEFT OUTER JOIN EXTERNAL_QUERY(
'us.connection_id',
'''SELECT customer_id, MIN(order_date) AS first_order_date
FROM orders
GROUP BY customer_id''') AS rq ON rq.customer_id = c.customer_id
GROUP BY c.customer_id, c.name, rq.first_order_date;
The example query includes three parts:
- Run the external query SELECT customer_id, MIN(order_date) AS first_order_date FROM orders GROUP BY customer_id in the operational PostgreSQL database through the EXTERNAL_QUERY() function to get the first order date for each customer.
- Join the external query result table with the customers table in BigQuery by customer_id.
- Select customer information and first order date.
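When only part of the remote table is needed, the filter can be placed inside the external query string so that it runs in Cloud SQL together with the rest of that statement. The following is a minimal sketch rather than a recommended pattern: it reuses the us.connection_id connection and the orders table from the example above, it assumes that order_date is a date column, and the cutoff date is a hypothetical value.

```sql
-- Sketch: the WHERE clause is part of the query string sent to Cloud SQL,
-- so BigQuery receives only the rows that match.
-- Assumes orders.order_date is a date column; '2020-01-01' is a made-up cutoff.
SELECT rq.customer_id, rq.order_date
FROM EXTERNAL_QUERY(
  'us.connection_id',
  """SELECT customer_id, order_date
     FROM orders
     WHERE order_date >= '2020-01-01'""") AS rq;
```

The inner statement is written in the SQL dialect of the Cloud SQL database. Triple double quotes around it allow single-quoted literals inside the string without escaping.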
View a Cloud SQL table schema
You can use the EXTERNAL_QUERY() function to query information_schema tables to access database metadata, such as the list of all tables in a database or the schema of a table. The following example information_schema queries work in both MySQL and PostgreSQL. To learn more, see MySQL information_schema tables and PostgreSQL information_schema tables.
-- List all tables in a database.
SELECT * FROM EXTERNAL_QUERY("connection_id",
"select * from information_schema.tables;");
-- List all columns in a table.
SELECT * FROM EXTERNAL_QUERY("connection_id",
"select * from information_schema.columns where table_name='x';");
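For a narrower view of a table's schema, the external query can select specific information_schema columns instead of all of them. The following sketch assumes a table named orders and uses connection_id as a placeholder. The column_name, data_type, and ordinal_position fields are part of information_schema.columns in both MySQL and PostgreSQL. The ordering is applied in the outer BigQuery query, on the assumption that the row order of an external query result should not be relied on.

```sql
-- Sketch: list column names and types for one table, in declaration order.
-- 'orders' and connection_id are placeholders.
SELECT column_name, data_type
FROM EXTERNAL_QUERY("connection_id",
  "select column_name, data_type, ordinal_position from information_schema.columns where table_name='orders';")
ORDER BY ordinal_position;
```

If tables with the same name exist in more than one schema, this query returns columns for all of them. Adding a table_schema condition to the inner query would narrow the result.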
Connection details
The following table shows the Cloud SQL connection properties:
Property name | Value | Description |
---|---|---|
name | string | Name of the connection resource in the format: project_id.location_id.connection_id. |
location | string | Location of the connection, which is the same as the Cloud SQL instance location with the following exceptions: Cloud SQL us-central1 maps to BigQuery US, Cloud SQL europe-west1 maps to BigQuery EU. |
friendlyName | string | A user-friendly display name for the connection. |
description | string | Description of the connection. |
cloudSql.type | string | Can be "POSTGRES" or "MYSQL". |
cloudSql.instanceId | string | Name of the Cloud SQL instance, usually in the format: Project-id:location-id:instance-id. You can find the instance ID on the Cloud SQL instance detail page. |
cloudSql.database | string | The Cloud SQL database that you want to connect to. |
cloudSql.serviceAccountId | string | The service account configured to access the Cloud SQL database. |
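As an illustration of the name format, the following sketch passes a full connection name to EXTERNAL_QUERY. The project my-project and the connection my-connection are hypothetical, and the sketch assumes that the connection is in the US location and that EXTERNAL_QUERY accepts the project_id.location_id.connection_id form in addition to the shorter location_id.connection_id form used earlier on this page.

```sql
-- Sketch: my-project and my-connection are hypothetical names.
-- The inner query is a trivial statement that is valid in both MySQL and PostgreSQL.
SELECT * FROM EXTERNAL_QUERY(
  'my-project.us.my-connection',
  'SELECT 1 AS ok;');
```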
The following table shows the properties for the Cloud SQL instance credential:
Property name | Value | Description |
---|---|---|
username | string | Database username |
password | string | Database password |
What's next
- Learn about federated queries.
- Learn about MySQL to BigQuery data type mapping.
- Learn about PostgreSQL to BigQuery data type mapping.
- Learn about unsupported data types.