Stay organized with collections Save and categorize content based on your preferences.

Get query performance insights

This document describes how to use the query execution graph to diagnose query performance issues, and to see query performance insights.

BigQuery offers strong query performance, but it is also a complex distributed system with many internal and external factors that can affect query speed. The declarative nature of the SQL language can also hide the complexity of query execution. This means that when your queries are running slower than anticipated, or slower than prior runs, understanding what happened can be a challenge.

The query execution graph provides an intuitive interface for inspecting query performance details. By using it, you can review the query plan information in graphical format for any query, whether running or completed.

You can also use the query execution graph to get performance insights for queries. Performance insights provide best-effort suggestions to help you improve query performance. Since query performance is multi-faceted, performance insights might only provide a partial picture of the overall query performance.

Required permissions

To use the query execution graph, you must have the following permissions:

  • bigquery.jobs.get
  • bigquery.jobs.listAll

These permissions are available through the following BigQuery predefined Identity and Access Management (IAM) roles:

  • roles/bigquery.admin
  • roles/bigquery.resourceAdmin
  • roles/bigquery.resourceEditor
  • roles/bigquery.resourceViewer

View query execution details

Follow these steps to see query execution details:

  1. Open the BigQuery page in the Google Cloud console.

    Go to the BigQuery page

  2. In the Editor, click either Personal History or Project History.

  3. In the list of jobs, identify the query job that interests you. Click Actions, and choose Open query in editor.

  4. Select the Execution Graph tab to see a graphical representation of each stage of the query:

    The graphical query plan in the execution graph.

    To determine if a query stage has performance insights, look at the icon it displays. Stages that have a information icon have performance insights. Stages that have a check icon don't.

  5. To see details for a stage, click on it to open the Overview pane. There you can see query plan information for the stage, and any applicable performance insights:

    Query stage details.

  6. Optional: If you are inspecting a running query, click Sync to update the execution graph so that it reflects the query's current status.

    Sync the graph to a running query.

  7. Optional: To display information in the execution graph about slot time consumed, stage duration, bytes shuffled, and bytes spilled to disk for each stage of the query, click Show execution info .

    Show stage execution details on the graph.

  8. Optional: To have the top stages by stage duration highlighted on the graph, click Highlight top stages by duration.

    Show top stages by duration.

  9. Optional: To have the top stages by slot time used highlighted on the graph, click Highlight top stages by processing.

    Show top stages by processing.

For any query that has performance regression issues, performance insights are also displayed on the Job Information tab for the query:

The job information tab.

You can also get query execution details in a non-graphical format by calling the jobs.list API method and inspecting the JobStatistics2 information that is returned.

Interpret query performance insights

Use this section to learn more about what performance insights mean and how to address them.

Performance insights are intended for two audiences:

  • Analysts: you run queries in a project. You are interested in finding out why a query you have run before is unexpectedly running slower, and in getting tips on how to improve a query's performance. You have the permissions described in Required permissions.

  • Data lake or data warehouse administrators: you manage your organization's BigQuery resources and reservations. You have the permissions associated with the BigQuery Admin role.

Each of the following sections provides guidance on what you can do to address a performance insight you receive, based on which of these roles you occupy.

Slot contention

When you run a query, BigQuery attempts to break up the work needed by your query into tasks. A task is a single slice of data that is input into and output from a stage. A single slot picks up a task and executes that slice of data for the stage. Ideally, BigQuery slots execute these tasks in parallel to achieve high performance. Slot contention occurs when your query has many tasks ready to start executing, but BigQuery can't get enough available slots to execute them.

What to do if you're an analyst

Reduce the data you are processing in your query by following the guidance in Reduce data processed in queries.

What to do if you're an administrator

Increase slot availability or decrease slot usage by taking the following actions:

  • If you use BigQuery's on-demand pricing, your queries use a shared pool of slots. Consider switching to flat-rate analysis pricing by purchasing reservations instead. Reservations let you reserve dedicated slots for your organization's queries.
  • If you are using BigQuery reservations, ensure that there are enough slots in the reservation that is assigned to the project that was running the query. The reservation might not have enough slots in these scenarios:

    • There are other jobs that are consuming reservation slots. You can use Admin Resource Charts to see how your organization is using the reservation.
    • The reservation does not have enough assigned slots to run queries fast enough. You can use the slot estimator to get an estimate of how large your reservations should be to efficiently process your queries' tasks.

    To address this, you can try one of the following solutions:

    • Add more slots to that reservation.
    • Create an additional reservation and assign it to the project running the query.
    • Spread out resource-intensive queries, either over time within a reservation or over different reservations.
  • Ensure that the tables you are querying are clustered. Clustering helps to ensure that BigQuery can quickly read columns with correlated data.

  • Ensure that the tables you are querying are partitioned. For unpartitioned tables, BigQuery reads the entire table. Partitioning your tables helps ensure that you query only the subset of your tables that you are interested in.

Insufficient shuffle quota

Before running your query, BigQuery breaks up your query's logic into stages. BigQuery slots execute the tasks for each stage. When a slot completes the execution of a stage's tasks, it stores the intermediate results in shuffle. Subsequent stages in your query read data from shuffle to continue your query's execution. Insufficient shuffle quota occurs when you have more data that needs to get written to shuffle than you have shuffle capacity.

What to do if you're an analyst

Similarly to slot contention, reducing the amount of data that your query processes might reduce shuffle usage. To do this, follow the guidance in Reduce data processed in queries.

Certain operations in SQL tend to make more extensive usage of shuffle, particularly JOIN operations and GROUP BY clauses. Where possible, reducing the amount of data in these operations might reduce shuffle usage.

What to do if you're an administrator

Reduce shuffle quota contention by taking the following actions:

  • Similarly to slot contention, if you use BigQuery's on-demand pricing, your queries use a shared pool of slots. Consider switching to flat-rate analysis pricing by purchasing reservations instead. Reservations give you dedicated slots and shuffle capacity for your projects' queries.
  • If you are using BigQuery reservations, slots come with dedicated shuffle capacity. If your reservation is running some queries that make extensive use of shuffle, this might cause other queries running in parallel to not get enough shuffle capacity. You can identify which jobs use shuffle capacity extensively by querying the period_shuffle_ram_usage_ratio column in the INFORMATION_SCHEMA.JOBS_TIMELINE view.

    To address this, you can try one or more of the following solutions:

    • Add more slots to that reservation.
    • Create an additional reservation and assign it to the project running the query.
    • Spread out shuffle-intensive queries, either over time within a reservation or over different reservations.

Data input scale change

Getting this performance insight indicates that your query is reading at least 50% more data for a given input table than the last time you ran the query. You can use table change history to see if the size of any of the tables used in the query has recently increased.

What to do if you're an analyst

Reduce the data you are processing in your query by following the guidance in Reduce data processed in queries.

Interpret query stage information

In addition to using query performance insights, you can also use the following guidelines when you are reviewing query stage details to help determine if there is an issue with a query:

  • If the Wait ms value for one or more stages is high compared to previous runs of the query:
    • See if you have enough slots available to accommodate your workload. If not, load-balance when you run resource-intensive queries so they don't compete with each other.
    • If the Wait ms value is higher than it has been for just one stage, look at the stage prior to it to see if a bottleneck has been introduced there. Things like substantial changes to the data or schema of the tables involved in the query might affect the query performance.
  • If the Shuffle output bytes value for a stage is high compared to previous runs of the query, or compared to a previous stage, evaluate the steps processed in that stage to see if any create unexpectedly large amounts of data. One common cause for this is when a step processes an INNER JOIN where there are duplicate keys on both sides of the join. This can return an unexpectedly large amount of data.
  • Use the execution graph to look at the top stages by duration and processing. Consider the amount of data they produce and whether it is commensurate with the size of the tables referenced in the query. If it isn't, review the steps in those stages to see if any of them might produce an unexpected amount of interim data.

What's next