Creating a unified app analytics platform using Firebase, BigQuery, and Looker

This document shows you how to centralize your data sources in a data warehouse and dig deeper into customer behavior to make informed business decisions. This type of solution helps you track information across multiple channels, enabling your organization to determine, for example, how product enhancements affect customer usage behavior, or how feature and application fixes should be prioritized. This document is especially useful for digital native organizations, including software-as-a-service (SaaS), gaming, and ecommerce industries.

This document is intended for data analysts and developers, and it assumes that you are experienced with Structured Query Language (SQL) and have basic knowledge of the following:

  • BigQuery
  • Looker
  • Firebase
  • Your organization’s customer relationship management (CRM), issue tracking, and customer support tools

This document includes sections that describe how to address real-world business needs for a broader audience:

Procedure Audience Business goal(s)
Reporting Executives and account managers Visualize progress towards goals and track average daily active users.
Prioritizing bugs Technical program managers, product managers, and developers Identify crashes experienced by high-value customers and issues raised in chat support.
Managing accounts Sales representatives and account managers Identify customers affected by crashes and usage trends for your accounts.

Architecture

Architecture

In this architecture, BigQuery is the single source of truth for analytics, receiving data from various sources on a regular basis. Within Looker, data analysts combine Looker Blocks with custom first-party LookML models to analyze data for different users.

Looker is Google Cloud's business intelligence platform, which lets you build out live dashboards that give users the ability to drill into the details of data and take direct action on their findings.

This architecture uses the following data sources:

  • Google Analytics 4: Tracks customer interactions in your application.
  • Firebase Crashlytics: Collects and organizes Firebase application crash information.
  • Firestore: Provides a backend database for your Firebase application.
  • Google Sheets: Collects manually entered first-party data in spreadsheets.
  • Customer relationship management platform (CRM): Manages customer data. This document refers to Salesforce as an example; you can apply the same ideas to other tools.
  • Issue tracking or project management software: Helps teams track bug fixes and feature development. This document refers to JIRA as an example; you can apply the same ideas to other tools.
  • Customer support software or a help desk: These types of tools organize customer communications to help businesses respond to customers more quickly and effectively. This document refers to Zendesk as an example; you can apply the same ideas to other tools.

Costs

The procedures in this document use the following billable components of Google Cloud:

This document also refers to third-party data integration platforms and data sources, which have their own pricing models. See Fivetran pricing information.

Before you begin

  1. In the Google Cloud Console, go to the project selector page.

    Go to project selector

  2. Select or create a Google Cloud project.

  3. Enable billing on your project for all transfers. You are billed $0 for free transfers.

    Enabling billing is only required once per project, even if you are transferring data from multiple sources. Billing must also be enabled to query the data in BigQuery, after the data is transferred.

    Learn how to confirm that billing is enabled on your project.

  4. BigQuery is automatically enabled in new projects. To activate BigQuery in an existing project, enable the BigQuery API.

    Enable the BigQuery API

Preparing the environment

Centralizing data in BigQuery

The procedures in this section describe how to do the following:

  • Export your raw events from Google Analytics 4, as well as crash data.

  • Export from Firestore to BigQuery.

  • Query data in Google Sheets.

  • Export data with third-party tools.

Export from Google Analytics 4

You can export all of your raw events from Google Analytics 4 properties to BigQuery for more complex analyses. To export the Google Analytics 4 data, see BigQuery Export.

Export from Crashlytics

Crashlytics provides a great way to track and monitor crashes in your application. Exporting this crash data into BigQuery gives you more flexibility on the types of questions you can answer with this data. For information on exporting data into BigQuery, see Export Crashlytics Data to BigQuery.

In the Firebase Integrations tab, you can configure a connection to BigQuery. Optionally, you can toggle on streaming, which pushes crash data to BigQuery in real time.

Export Crashlytics

Export from Firestore to BigQuery

Firestore is a flexible, scalable database for mobile, web, and server development from Firebase and Google Cloud. To set up batch exports of documents from Firestore into BigQuery, which lets you incorporate data directly from your application into your analyses, see Loading data from Firestore exports. Alternatively, to use the Firebase extension, see Firestore BigQuery export.

Query data in Google Sheets

Many companies store business information that is manually entered or modified in spreadsheets, like Google Sheets. One example of this scenario is business goals, or revenue targets. It’s important to centralize this information into BigQuery so you can do things like track progress to goals. BigQuery supports creating an external table so that you can directly query and join data onto information stored in Google Sheets. To learn more, see Querying Drive data.

Export data with third-party tools

There are several data integration tools that export data with little or no code. Fivetran offers connectors from many different software-as-a-service (SaaS) applications, including CRM, issue tracking, and customer support software. To learn how to configure BigQuery as the destination for your sources, see BigQuery Setup Guide.

For information on setting up the sources used in this document, see the following Fivetran setup guides:

Unifying data sources

With your data piped into BigQuery, you may have a project that looks something like this, where each data source is in its own dataset:

example files in a project

To write queries that join tables from different datasets, use standardized IDs across platforms.

Map user IDs across data sources

The user ID from your application database, like Firestore, can be a good source of truth and may be used to populate the user IDs in other applications. By assigning a unique, persistent ID to each user, you can join data sources and present a cross-platform view of user behavior.

map user IDs across data sources

Each of the data sources discussed in Architecture can be configured so that a custom user ID is set. After the data is exported into BigQuery, this user ID is available as a column in the specified table and can be used to join the tables in a query.

  • Crashlytics

    Query to get the user ID:

    SELECT user.id FROM crashlytics.[my_crashlytics_table]
    

    To set user IDs so that they match other applications, use the Crashlytics SDK.

  • Google Analytics 4

    Query to get the user ID:

    SELECT user_id FROM google_analytics.events
    

    To learn how to include user IDs with the data that you send to Google Analytics, see these articles:

  • Zendesk

    Query to get the user ID:

    SELECT external_id FROM zendesk.users</code>
    

    Zendesk supports an external user ID that you can set for a user during a bulk user import or while using the API.

Map data sources to your CRM

Mapping the data sources back to the CRM tool may differ between business-to-business (B2B) and business-to-consumer (B2C) organizations. B2B companies need to map users to account IDs, and potentially map account IDs across each platform.

The following diagram illustrates an example of mapping data sources to a CRM tool, Salesforce:

map sources to CRM

  • Salesforce (B2C): To access the user ID from Salesforce, map the unified user ID to the Salesforce user ID. The external user ID is dependent on the custom field naming in Salesforce. You can use a custom field in the Salesforce Contact object that contains the unified user ID. Custom fields are populated in the Salesforce export as their own column.

  • Salesforce (B2B): To access the account ID from Salesforce, map the unified user ID to the Salesforce account ID. You can create a custom table in your own database, Firestore), that tracks your unified user ID in one column and Salesforce account ID in another, and replicate that table to BigQuery.

    Another option is to use custom variables in Google Analytics 4 and Crashlytics to include the Salesforce account ID:

    • To access Google Analytics 4 custom user properties, run the following:

      SELECT up.value.string_value FROM google_analytics.events, UNNEST(user_properties) as up WHERE up.key = 'sfdc_account_id'
      
    • To access custom keys in Crashlytics, run the following:

      SELECT ck.value FROM crashlytics.[my_crashlytics_table], UNNEST(custom_keys) as ck WHERE ck.key = 'sfdc_account_id'
      

You can set external Organization ID in Zendesk with the following:

SELECT external_id FROM zendesk.organization

Map issue IDs across data sources

To track the effect of issues on customer behavior and the path to resolution, you can join customer issues across Crashlytics, JIRA, and Zendesk. In this case, you can use the Crashlytics issue ID as the source-of-truth ID across all platforms.

You can get the Crashlytics issue ID with this query:

SELECT issue_id FROM [my_crashlytics_table]

map issue IDs across data sources

  • JIRA

    Query to get the Crashlytics issue ID:

    SELECT external_id FROM jira.issue
    

    Firebase provides direct integrations between JIRA and Crashlytics. See Add JIRA integrations to your project. This integration lets you create a new issue from the Crashlytics page or automatically generate JIRA issues based on a certain threshold. For information about how to use data across multiple sources for automatic JIRA creation with Looker actions, see Prioritizing bugs.

  • Zendesk

    Query to get the Zendesk ticket ID:

    SELECT ticket_id FROM zendesk.ticket
    

    When a support ticket is created, the team might create a bug or a feature request in JIRA on the customer's behalf. Zendesk has direct integrations with JIRA that make this process even easier for support agents. For more information, see JIRA App Integration with Zendesk Support. Also see Using the Zendesk Support for JIRA Integration, which lets you access this information in custom fields within JIRA. The column name in JIRA depends on the way you set up the integration and the specific connector you use.

Analyzing data in Looker

Connect BigQuery to Looker

With the data available in BigQuery, you can connect your project to Looker. See How to set up a connection in Looker to Google BigQuery.

Use Looker Blocks

The Looker marketplace has a variety of pre-built data models and dashboards specific to certain sources. Looker Blocks let you spend less time defining metrics and developing LookML, and more time asking deeper questions of the data.

Currently, there are blocks available for Salesforce, Zendesk, JIRA, and Crashlytics. There is also a Google Analytics 360 block. You can install these blocks from the Marketplace by navigating to the storefront symbol in your Looker instance.

Use the search bar to find specific blocks:

search for Looker Blocks

After you locate the blocks, you can install the model:

install the model

After the model is installed, you can extend and customize definitions to suit your business needs. In this document, you create a new project to use the Zendesk, Salesforce, JIRA, and Crashlytics blocks that are currently available in the marketplace.

You can find the sample LookML for a SaaS (B2B) company for this new project in the Unified Application Analytics GitHub repository. If you're just getting started with LookML, see Getting Started with LookML before jumping into the code.

The sample code, uses local project import to bring in the LookML that is needed from each block.

In the model file, you can specify which files you want to include. For example:

include: "//marketplace_crashlytics/*.model"

Another option is to directly copy and paste the LookML that you want to use from the blocks into your own project (as opposed to importing it) to simplify future development.

Use the identifiers discussed in Unifying data sources to create Explores that join across the data sources. This does not create one wide table; instead, Looker uses this semantic model to execute the necessary joins at query runtime. The following example creates the relationships between Crashlytics, JIRA, and Salesforce data:

# Crash events, combined with JIRA issues and Salesforce accounts
explore: crashlytics {
  fields: [ALL_FIELDS*, -issue.needs_triage, -issue.is_approaching_sla,]
  join: user_details {
    type: inner
    relationship: many_to_one
    sql_on: ${user_details.data__user_id} = ${crashlytics.user__id} ;;
  }
  join: account {
    type: inner
    relationship: many_to_many
    sql_on: ${account.id} = ${user_details.data__account_id} ;;
  }
  join: issue {
    type: left_outer
    relationship: many_to_one
    sql_on: ${issue.external_id} = ${crashlytics.issue_id} ;;
  }
}

With the Explores in place, end users have access to a self-service analytics environment that supports cross-platform questions. You can also import and copy the LookML dashboards provided in the blocks to develop more effective dashboards using metrics that combine information from each data source.

LookML dashboards

Automate operational workflows

To incorporate data into operational activities and to automate repetitive tasks, use Looker actions to send data to other applications. To learn more about Looker actions and how to enable Looker actions, see Admin settings - Actions.

This document describes how to use Looker actions to do the following:

With data sources centralized in BigQuery, you can analyze information across different use cases. See Reporting, Prioritizing bugs, and Managing accounts for examples of metrics that you can calculate in BigQuery, and workflows that can be automated inside Looker.

Reporting

Leadership teams need accurate, up-to-date views of their business. Some key metrics that may be relevant for executives include revenue targets to goals and average daily active users.

Visualize tracking towards goals

By comparing revenue targets stored in Google Sheets, stored in the salesforce.goals table, with customer information in the CRM, leads can visualize how they are tracking towards goals.

# Total Bookings Goal for Current Quarter plus Closed Revenue
SELECT
goals._Total_Bookings_ as goal,
SUM(CASE WHEN opportunity.stage_name = 'Closed Won' THEN opportunity.amount  ELSE NULL END) AS total_closed_won_amount,
FROM salesforce.opportunity  AS opportunity
LEFT JOIN salesforce.goals AS goals
ON CONCAT('Q' , EXTRACT(QUARTER FROM opportunity.close_date)) = goals.Quarter
AND EXTRACT(YEAR FROM opportunity.close_date) = goals.Year
WHERE DATE_TRUNC(opportunity.close_date, QUARTER) = DATE_TRUNC(CURRENT_DATE(), QUARTER)
GROUP BY 1

In LookML, you can use Liquid variables to dynamically pull in the correct goal for a specific team, region, or time period based on the filters applied by the user. For more details on how to use Liquid, see Liquid Variable Reference. For an example of using Liquid, see this demo view file, which is modeled on this example Google Sheet.

visualize tracking towards goals

Calculate average daily active users

The average daily active user metric represents the average number of users that logged on each day for each account, based on Google Analytics event logs that are joined onto a user-account mapping table from Firestore, and the account details from Salesforce.

    # Average number of active users each day for each account
    WITH account_facts AS
    (SELECT
        events.event_date AS event_date,
        account.id  AS account_id,
        COUNT(DISTINCT events.user_id ) AS number_of_users
    FROM google_analytics.events AS events
    LEFT JOIN firestore.user_details AS user_details ON events.user_id=user_details.data.user_id
    FULL OUTER JOIN salesforce.account  AS account ON account.id = user_details.data.account_id
    GROUP BY 1, 2)

    SELECT
        account_facts.event_date AS account_facts_event_date,
        AVG(account_facts.number_of_users) AS account_facts_average_number_users
    FROM account_facts
    GROUP BY 1

Operationalize workflows

Beyond visualizing metrics in a dashboard, you can also operationalize these workflows. For example, you might create an alert to send a notification when a team reaches its goals.

operationalize workflows

Another example is to use information from the metrics to populate a Google Slide deck for company-wide meetings. To do this, schedule your report to Google Sheets.

populate Google Slides

Next, in Google Slides you can create a chart based on data in Google Sheets.

Prioritizing bugs

To prioritize, resolve, and minimize customer disruptions, technical program managers, product managers, and developers need to measure the impact of problems in their applications. Metrics that help to address this use case include crashes experienced by high-value customers and issues raised in chat support.

Filter crashes experienced by high value customers

Using CRM data, you can filter issues from Crashlytics that affect customers with a value over a certain threshold.

# Issues that resulted in a fatal crash for accounts worth more than $5000
SELECT
    DISTINCT crashlytics.issue_id
FROM crashlytics.[my_table] AS crashlytics
INNER JOIN firestore.user_details  AS user_details ON user_details.data.user_id = crashlytics.user.id
INNER JOIN salesforce.account  AS account ON account.id = user_details.data.account_id
WHERE (select sum(amount) from salesforce.opportunity where account_id = account.id) > 5000 and crashlytics.is_fatal

Filter issues raised in chat support

By combining support logs data alongside Crashlytics, you can filter issues that are raised by customers in chat support.

# Issues that are linked to at least one Zendesk ticket
SELECT crash.issue_id, count(distinct zen.id) as num_tickets
FROM crashlytics.[my_table] as crash
INNER JOIN jira.issue as jira on jira.external_issue_id = crash.issue_id
INNER JOIN zendesk.ticket as zen on zen.id in unnest(jira.c__zendesk_ticket_ids)
GROUP BY 1
HAVING COUNT(distinct zen.id) > 0
ORDER BY 2 DESC

With this information, end users can set up notifications for new bugs that meet specific criteria, such as when a bug identified in a new release also affects a high-value customer.

Aside from getting a notification about these issues, end users can also use the JIRA Looker action to automatically create bugs with the information provided by the fields pulled into the query.

create bugs from JIRA

Managing accounts

To improve retention and take advantage of opportunities to upsell, sales representatives and account managers need to assess customer health and prioritize which accounts need attention. Metrics that may be relevant to these personas include customers affected by crashes and usage trends for accounts.

Identify customers affected by crashes

By joining CRM data and Crashlytics information, end users can filter crash information to look only at their own accounts.

#Get account information for customers that experience a fatal or non-fatal crash
SELECT
    account.id  AS account_id,
    account.name  AS account_name
FROM crashlytics.[my_table] AS crashlytics
INNER JOIN firestore.user_details  AS user_details ON user_details.data.user_id = crashlytics.user.id
INNER JOIN salesforce.account  AS account ON account.id = user_details.data.account_id
LEFT JOIN salesforce.user  AS account_owner ON account.owner_id = account_owner.id
WHERE (account_owner.name ) = 'myname'
GROUP BY
    1,
    2

Identify usage trends for accounts

Identifying usage trends for your accounts lets you focus on customers who experience an average week-over-week decline in the number of active users based on Google Analytics data.

# Calculates the average week-over-week change in event counts for each account over the past 4 weeks
WITH week_over_week as (
  SELECT
  account_id,
  (events_count - LAG(events_count,1) OVER (PARTITION BY account_id ORDER BY event_week DESC)
    )/LAG(events_count,1) OVER (PARTITION BY account_id ORDER BY event_week DESC) AS wow_change
  FROM
  (
    SELECT
      account.id  AS account_id,
      DATE_TRUNC(PARSE_DATE('%Y%m%d', events.event_date), WEEK(MONDAY)) AS event_week,
      COUNT(*) AS events_count
  FROM google_analytics.events AS events
  LEFT JOIN firestore.user_details  AS user_details ON events.user_id=user_details.data.user_id
  FULL OUTER JOIN salesforce.account  AS account ON account.id = user_details.data.account_id
  WHERE PARSE_DATE('%Y%m%d', events.event_date) >= (current_date() - 4*7)
  GROUP BY
      1,
      2
  )
)
SELECT account_id, AVG(wow_change) avg_wow
FROM week_over_week
GROUP BY 1
ORDER BY 2 ASC

Using the Slack integration, you can get weekly updates on your customers and take immediate action based on that information. In the Looker application, you can create custom field-level actions that send a POST request to a specified endpoint; this lets you directly update fields in Salesforce or reach out to your customers from the Looker application.

create custom field-level actions in Looker

What's next