Live spreadsheets in databases

Stay organized with collections Save and categorize content based on your preferences.

The BigQuery team has developed the ability to treat a Google Sheet like a table in a database. That means you can join a billion-row BigQuery table with a 100-row mapping table you type into a spreadsheet in a single query.

This feature lets you do all kinds of things, like configuring data models, pumping some ad hoc data into a data pipeline to see what comes out the other end, or comparing real numbers to your latest goals and projections.

Example

To show how it works, we will walk through an example.

Using Google Forms, collect data into a Google sheet. Every submission adds a row in the sheet. Currently, the contents of the spreadsheet look like this:

Once you have generated a model from your database, follow these steps:

  1. Share the spreadsheet with the service account you use to connect to Looker or make the spreadsheet available to all via link access (view only).
  2. Make sure the Drive API and Sheets API are enabled for your project.
  3. Create a table from the BigQuery interface that gets its content from the spreadsheet.
  4. In your Looker project, craft a simple data model joining the spreadsheet table with some data we might already know about the names, such as rank, population, and perhaps gender. We can also use SQL to normalize the names coming from the database. We will cache the data in BigQuery so that we don't hit the spreadsheet too hard.

    Here is an example.

    explore: names_sheet {
      persist_for: "60 seconds"
    
      join: names_facts {
        sql_on: ${names_sheet.normalized_name} = ${names_facts.normalized_name} ;;
        sql_where: ${names_facts.is_dominant_gender} ;;
        relationship: one_to_one
      }
    
      view_name: names_sheet {
        derived_table: {
          persist_for: "2 minutes"
          sql:
            SELECT row_number() OVER() as id, name, UPPER(CASE WHEN REGEXP_MATCH(name, r'\,')
              THEN REGEXP_EXTRACT(name, r', (\w+)')
              ELSE REGEXP_EXTRACT(name, r'^(\w+)')
              END
              ) as normalized_name FROM namesheet.names ;;
        }
    
        dimension: id {
          type: number
        }
    
        dimension: name {
          order_by_field: id    # keep the rows in the original order
        }
    
        dimension: normalized_name {
        }
    
        measure: count {
          type: count
          drill_fields: [id, name, names_facts.gender]
        }
    
        measure: count_male {
          type: count
          filters: [names_facts.gender: "M"]
        }
    
        measure: percentage_male {
          type: number
          sql: ${count_male}/${count} ;;
          value_format_name: percent_2
        }
    
        measure: average_year_median {
          type: average
          sql: ${names_facts.year_median} ;;
          value_format: "0000"
        }
      }
    }
    

From that model, you can build out a simple dashboard that shows some statistics about all the names entered into the spreadsheet as well as facts about each individual name. We have data flowing from an editable spreadsheet, right into a Looker query.