Federated query functions

GoogleSQL for BigQuery supports the following federated query functions.

Function list

Name Summary
EXTERNAL_QUERY Executes a query on an external database and returns the results as a temporary table.

EXTERNAL_QUERY

EXTERNAL_QUERY('connection_id', '''external_database_query'''[, 'options'])

Description

Executes a query on an external database and returns the results as a temporary table. The external database data type is converted to a GoogleSQL data type in the temporary result table with these data type mappings.

  • external_database_query: The query to run on the external database.
  • connection_id: The ID of the connection resource. The connection resource contains settings for the connection between the external database and BigQuery. If you do not have a default project configured, prepend the project ID to the connection ID in following format:

    projects/PROJECT_ID/locations/LOCATION/connections/CONNECTION_ID
    

    Replace the following:

    • PROJECT_ID: The project ID.
    • LOCATION: The location of the connection.
    • CONNECTION_ID: The connection ID.

    For example, projects/example-project/locations/us/connections/sql-bq. For more information, see Create a connection resource.

+ options: An optional string of a JSON format map with key value pairs of option name and value (both are case sensitive).

For example::
``` '{"default_type_for_decimal_columns":"numeric"}' ```

Supported options:

|Option Name | Description
|-------- | -------
|"default_type_for_decimal_columns" | Can be "float64", "numeric", "bignumeric" or "string". With this option, the MySQL Decimal type or PostgreSQL Numeric type will be mapped to the provided BigQuery type. When this option is not provided, the MySQL Decimal type or PostgreSQL Numeric type will be mapped to BigQuery NUMERIC type.
|"query_execution_priority" | Can be "low", "medium" or "high". Only supported in Spanner. Specifies priority for execution of the query. Execution priority is "medium" by default.

Additional notes:

  • The EXTERNAL_QUERY function is usually used in a FROM clause.
  • You can use the EXTERNAL_QUERY() function to access metadata about the external database.
  • EXTERNAL_QUERY() won't honor the ordering of the external query result, even if your external query includes ORDER BY.

Return Data Type

BigQuery table

Examples

Suppose you need the date of the first order for each of your customers to include in a report. This data is not currently in BigQuery but is available in your operational PostgreSQL database in . The following federated query example accomplishes this and includes 3 parts:

  1. Run the external query SELECT customer_id, MIN(order_date) AS first_order_date FROM orders GROUP BY customer_id in the operational PostgreSQL database to get the first order date for each customer through the EXTERNAL_QUERY() function.
  2. Join external query result table with customers table in BigQuery by customer_id.
  3. Select customer information and first order date.
SELECT
  c.customer_id, c.name, SUM(t.amount) AS total_revenue, rq.first_order_date
FROM customers AS c
INNER JOIN transaction_fact AS t ON c.customer_id = t.customer_id
LEFT OUTER JOIN
  EXTERNAL_QUERY(
    'connection_id',
    '''SELECT customer_id, MIN(order_date) AS first_order_date
       FROM orders
       GROUP BY customer_id'''
  ) AS rq
  ON rq.customer_id = c.customer_id
GROUP BY c.customer_id, c.name, rq.first_order_date;

You can use the EXTERNAL_QUERY() function to query information_schema tables to access database metadata, such as list all tables in the database or show table schema. The following example information_schema queries work in both MySQL and PostgreSQL.

-- List all tables in a database.
SELECT *
FROM
  EXTERNAL_QUERY(
    'connection_id',
    '''SELECT * FROM information_schema.tables'''
  );
-- List all columns in a table.
SELECT *
FROM
  EXTERNAL_QUERY(
    'connection_id',
    '''SELECT * FROM information_schema.columns WHERE table_name='x';'''
  );

EXTERNAL_QUERY() won't honor the ordering of the external query result, even if your external query includes ORDER BY. The following example query orders rows by customer ID in the external database, but BigQuery will not output the result rows in that order.

-- ORDER BY will not order rows.
SELECT *
FROM
  EXTERNAL_QUERY(
    'connection_id',
    '''SELECT * FROM customers AS c ORDER BY c.customer_id'''
  );

Data type mappings

When you execute a federated query, the data from the external database are converted to GoogleSQL types. Below are the data type mappings from MySQL to BigQuery and PostgreSQL to BigQuery.

Things to know about mapping:

  • Most MySQL data types can be matched to the same BigQuery data type, with a few exceptions such as decimal, timestamp, and time.
  • PostgreSQL supports many non-standard data types which are not supported in BigQuery, for example money, path, uuid, boxer