Live spreadsheets in databases

The BigQuery team has developed the ability to treat a Google Sheet like a table in a database. That means you can join a billion-row BigQuery table with a 100-row mapping table that you type into a spreadsheet in a single query.

This feature lets you configure data models, create a data pipeline for ad hoc data, or compare up-to-date numbers to your latest goals and projections.

Example

For example, you might want to model usernames in Looker that were collected using a Google form.

To collect and model the information in Looker, you would perform the following steps:

  1. Using Google Forms, collect the data into a Google sheet. Every submission adds a row in the sheet.
  2. Share the spreadsheet with the service account that you use to connect to Looker, or grant access to users with a link (view only).
  3. Make sure that the Drive API and Sheets API are enabled for your project.
  4. Create a table from the BigQuery interface that gets its content from the spreadsheet.
  5. In your Looker project, generate a data model that joins the spreadsheet table with other user data that you might already have collected, such as users' locations. You can also use SQL to normalize the names that are coming from the database. The data can be cached in BigQuery to avoid heavy load on the spreadsheet.

The following is an example of what the resulting LookML might look like:

explore: names_sheet {
  persist_for: "60 seconds"

  join: names_facts {
    sql_on: ${names_sheet.normalized_name} = ${names_facts.normalized_name} ;;
    sql_where: ${names_facts.city} ;;
    relationship: one_to_one
  }

  view_name: names_sheet {
    derived_table: {
      persist_for: "2 minutes"
      sql:
        SELECT row_number() OVER() as id, name, UPPER(CASE WHEN REGEXP_MATCH(name, r'\,')
          THEN REGEXP_EXTRACT(name, r', (\w+)')
          ELSE REGEXP_EXTRACT(name, r'^(\w+)')
          END
          ) as normalized_name FROM namesheet.names ;;
    }

    dimension: id {
      type: number
    }

    dimension: name {
      order_by_field: id    # keep the rows in the original order
    }

    dimension: normalized_name {
    }

    measure: count {
      type: count
      drill_fields: [id, name, names_facts.city]
    }

    measure: count_from_new_york_city {
      type: count
      filters: [names_facts.city: "New York City"]
    }

    measure: percentage_from_new_york_city {
      type: number
      sql: ${count_from_new_york_city}/${count} ;;
      value_format_name: percent_2
    }

    measure: average_age_median {
      type: average
      sql: ${names_facts.age_median} ;;
      value_format: "0000"
    }
  }
}

From this model, you can create Explores with your data and build Looks and dashboards that show metrics about all the usernames that you've collected from Google Forms and entered into the Google Sheet, as well as other information about each individual user.