Importing Firebase Event Logs into BigQuery

This tutorial describes how to export event logs from Google Analytics for Firebase and Firebase Crashlytics into Google BigQuery in order to do in-depth analysis of their data.

Firebase is a platform for building mobile apps that includes features such as data and file storage, realtime synchronization, authentication, and more. Google Analytics for Firebase tracks usage of your Firebase features and provides insight on app usage and user engagement, while Firebase Crashlytics logs your app's crashes, giving you insight into the kind of issues your users experience and how often they occur.

BigQuery is a petabyte-scale data warehouse that you can use to run SQL-like queries over vast amounts of data in near realtime.

Importing your Firebase data into BigQuery provides several advantages:

  • Aggregate data from multiple sources—If you track user events in multiple places, such as Google Analytics for Firebase, Firebase Crashlytics, Google Analytics 360, and custom analytics collected by a mobile backend service, you can import the data from all of these sources into BigQuery to perform analysis that provides a complete picture of your event data.

  • Access to raw data—When you import your data into BigQuery, you can access the raw data values and have flexibility in the way you analyze events.

  • Access control—When you export data to BigQuery, you own that data, and you can use BigQuery ACLs to manage permissions on projects and datasets, making it possible to share data without providing access to your Firebase app.

  • Archiving cold data—If you need to retain raw data, you can pull the data into BigQuery, and then archive it into Google Cloud Storage or another storage location for inexpensive long-term storage.

  • Custom event parameters—In Google Analytics, you can define custom events and parameters to track. By importing your data into BigQuery, you gain the ability to do rich analysis over both standard and custom event parameters.

Objectives

This tutorial describes how to:

  • Link a Firebase project to BigQuery, which lets Firebase Analytics and Firebase Crashlytics export event logs to BigQuery on a daily basis.
  • Query data stored in BigQuery.

Costs

To transmit Firebase data to BigQuery, you must upgrade your project to the Blaze level of service. For more information, see Firebase pricing.

In addition, storing and querying data in BigQuery incurs a fee. For more information, see BigQuery pricing.

Before you begin

  1. Create a Firebase project.

  2. Build an app that sends Firebase events. To learn how, read Get started with Firebase Crashlytics or Get Started with Google Analytics for Firebase for iOS or Android.

Linking project data to BigQuery

Linking your Firebase project to BigQuery allows you to access your raw, unsampled event data.

  1. Visit the Integrations page in the Firebase console.

  2. On the BigQuery card, click Link.

  3. Use the switches to determine whether you'd like to export Analytics data, Crashlytics data, or both. If exporting Analytics data, you can also opt to include advertising identifiers, which provide richer data but result in additional storage and query costs in BigQuery.

  4. Click Link to BigQuery to complete the task.

After you link your project, BigQuery creates a corresponding dataset and associates it with your project. Each day, your project adds a new table to the dataset for each linked app. The dataset should be available after the first daily export of your project's events.

Manage which apps send data to BigQuery

When you link your project to BigQuery, Firebase exports events for every app in your project. You can change that by opting apps out of the daily export:

  1. Visit the Integrations page in the Firebase console.

  2. On the BigQuery card, click Manage.

  3. To halt an app from sending data to BigQuery, click the switch next to that app's name, and select Stop Export.

  1. Visit the Integrations page in the Firebase console.

  2. On the BigQuery card, click Manage.

  3. Select Unlink Project at the bottom of the page, and confirm by clicking Unlink BigQuery.

Querying Firebase data with BigQuery

After your data is loaded into BigQuery, you can run SQL-like queries over it using BigQuery. For more information, see Querying Data in the BigQuery documentation.

BigQuery provides a web UI that makes running queries against your data as simple as filling out a form and clicking Run query. For more information, see Quickstart Using the Web UI in the BigQuery documentation.

Working with Firebase Analytics data in BigQuery

The following are examples of queries you can run on your Analytics data. These queries generate reports that are not available in the Google Analytics dashboard.

Querying a custom event parameter

In Google Analytics, you can set custom parameter keys and values on events. The keys are typed as String, and values can have the type String, Long, or Double.

The following queries demonstrate how to generate reports that aggregate custom event values to provide business insights.

Example 1:

A mobile game developer wants to know which level of her game triggers the most in-game purchase flows. To do so, she adds the following custom event call to the game that runs when a player triggers an in-game purchase flow.

In this example, the name of the custom event is trigger_purchase, which has a key, level, that specifies an integer value that indicates the level the player was at when they triggered a purchase.

iOS - Objective C

[FIRAnalytics logEventWithName:@"trigger_purchase"
                parameters:@{
                              @"level": 3,
}];

iOS - Swift

FIRAnalytics.logEventWithName("trigger_purchase", parameters: [
  "level": 3
])

Android - Java

Bundle params = new Bundle();
params.putInt("level", 3);
mFirebaseAnalytics.logEvent("trigger_purchase", params);

The resulting entries in BigQuery for this custom event are the following:

Field Value
event_dim.name trigger_purchase
event_dim.params.key level
event_dim.params.value.int_value 3

To determine the number of purchases per level, she writes a query that reports the distribution of the trigger_purchase event count, grouped by the level the user is currently at. This query runs across the date range 1/1/2015 to 12/31/2016.

SELECT COUNT(event_dim.name) as NumberOfPurchases,
event_dim.params.value.int_value as level
FROM
TABLE_DATE_RANGE(com_game_example_ANDROID.app_events_, TIMESTAMP('2015-01-01'), TIMESTAMP('2016-12-31'))
WHERE event_dim.name = 'trigger_purchase'
GROUP BY level

Example 2:

A second developer working on the same game wants to determine which character classes defeat the most opponents. To do so, he adds the following custom event logging to the game when a player defeats an opponent.

In this example, the name of the custom event is defeat_opponent, and it has a key, character, that specifies a string value that specifies the opponent’s name.

iOS - Objective C

[FIRAnalytics logEventWithName:@"defeat_opponent"
                parameters:@{
                              @"character": "Hercules"
}];

iOS - Swift

FIRAnalytics.logEventWithName("defeat_opponent", parameters: [
  "character": "Hercules"
])

Android - Java

Bundle params = new Bundle();
params.putString("character", "Hercules");
mFirebaseAnalytics.logEvent("defeat_opponent", params);

The resulting entries in BigQuery for this custom event would be the following:

Field Value
event_dim.name defeat_opponent
event_dim.params.key character
event_dim.params.value.string_value Hercules

To find the most successful character class, he writes a query that counts the number of times a character class is involved in a defeat-opponent custom event.

SELECT COUNT(event_dim.name) as DefeatEvents,
event_dim.params.value.string_value as CharacterClass
FROM
TABLE_DATE_RANGE( com_game_example_ANDROID.app_events_, TIMESTAMP('2015-01-01'), TIMESTAMP('2016-12-31'))
WHERE event_dim.name = 'defeat_opponent'
AND event_dim.params.key = 'character'
GROUP BY CharacterClass
ORDER BY DefeatEvents desc

Querying user events

In addition to querying on custom events, you can generate reports that aggregate user event values to provide business insights.

Example 3:

An app developer wants to use Firebase Cloud Messaging to send re-engagement push notifications to users who haven't used the app recently. To build the contact list for the notifications, the developer uses the following query to find users from the past year who have not visited the app in the previous two weeks. This query returns both iOS and Android users, as well as the date the users initially opened the app.

In the query below, the user_dim.user_id field contains the user id that the developer set in the app by calling the setUserId API. After this value is set, Google Analytics persists it and includes it in all future rows associated with that user. The value is not added retroactively to past rows.

If the developer does not set the user_dim.user_id identifier, they can use the app_info.app_instance_id field, which contains the default identifier generated by Firebase, in place of user_dim.user_id. Note that Firebase generates a new app_info.app_instance_id each time the Firebase app is uninstalled and re-installed on the same device.

The data table exported to BigQuery for the IOS app is com_retail_example_IOS.app_events_[DATE] and the corresponding data table for Android is com_retail_example_ANDROID.app_events_[DATE], where [DATE] is the current date.

SELECT
  userId,
  DATE(MIN( firstOpenTime )) firstOpenTime
FROM (
  SELECT
    user_dim.user_id AS userId,
    user_dim.first_open_timestamp_micros AS firstOpenTime
  FROM (TABLE_DATE_RANGE([com_retail_example_IOS.app_events_],
    DATE_ADD(CURRENT_TIMESTAMP(), -1, 'YEAR'), CURRENT_TIMESTAMP())),
    (TABLE_DATE_RANGE([com_retail_example_ANDROID.app_events_],
    DATE_ADD(CURRENT_TIMESTAMP(), -1, 'YEAR'), CURRENT_TIMESTAMP())) )
WHERE
  userId NOT IN (
  SELECT
    user_dim.user_id AS userId
  FROM (TABLE_DATE_RANGE([com_retail_example_IOS.app_events_],
    DATE_ADD(CURRENT_TIMESTAMP(), -14, 'DAY'), CURRENT_TIMESTAMP())),
    (TABLE_DATE_RANGE([com_retail_example_ANDROID.app_events_],
    DATE_ADD(CURRENT_TIMESTAMP(), -14, 'DAY'), CURRENT_TIMESTAMP())) )
GROUP BY
  userId

Example 4:

An app developer wants to determine the countries with the most active users, across both iOS and Android platforms. To do so, they use the following query to list the number of users and events per country on a given date.

The inner query returns the user_id and country of a row and then does a count of all the events within that row. The query uses the WITHIN RECORD clause as event_dim in a RECORD and is the best way of handling aggregations for nested records.

The outer query makes use of EXACT_COUNT_DISTINCT of the userId to provide an accurate number for the total number of users. Alternatively, the developer could use COUNT(DISTINCT user_id), but this gives an estimate instead of an exact value. An estimate might be faster when there is a very large dataset and you are looking for trends over time, a case when accuracy might be less important than speed.

SELECT
  country,
  EXACT_COUNT_DISTINCT(user_id) AS users,
  SUM(noOfEvents) AS totalEvents
FROM (
  SELECT
    user_dim.app_info.app_instance_id AS user_id,
    user_dim.geo_info.country AS country,
    COUNT(event_dim.name) WITHIN RECORD noOfEvents
  FROM
    [com_retail_example_IOS.app_events_20160723],
    [com_retail_example_ANDROID.app_events_20160723]
    )
GROUP BY
  1
ORDER BY
  2 DESC

Understanding the Google Analytics schema in BigQuery

When you connect Google Analytics for Firebase to BigQuery, Firebase exports data to BigQuery on a daily basis. It does so in the following formats:

Datasets

Google Analytics for Firebase creates a new dataset in BigQuery for each Firebase app. The dataset name is formatted as: [app_name]_[PLATFORM], where app_name is either the bundle ID (iOS) or the package name (Android).

Google Analytics converts periods into underscores in the bundle ID or package name to match BigQuery naming conventions. Analytics specifies the platform name in capital letters to match Firebase naming conventions.

For example, the dataset corresponding to an iOS app with a bundle ID of “com.username.myapp”, would result in a dataset named com_username_myapp_IOS.

Tables

Each day, Google Analytics creates a new table in the BigQuery dataset corresponding to the app. The tables are named using the pattern app_events_YYYYMMDD and contain the events recorded for the specified day.

Rows

Each row within a table corresponds to a bundle of events. The size of the bundle is determined by the app, which sends the data in calls to the Google Analytics for Firebase SDK.

Columns

Google Analytics exports user events as columns, which includes custom data events. For a list of the standard columns exported to BigQuery, see BigQuery Export schema in the Firebase Help documentation.

Working with Firebase Crashlytics data in BigQuery

The following examples demonstrate queries you can run on your Crashlytics data. These queries generate reports that aren't available in the Crashlytics dashboard.

Examples of Crashlytics queries

The following examples demonstrate how to generate reports that aggregate crash event data into more easily-understood summaries.

Example 1:

After working to squash as many bugs as they could, the lead developer at Friendly Pix thinks they're finally ready to launch their new photo-sharing app. Before they do, they want to check the number of crashes per day for the past month, to be sure their bug-bash made the app more stable over time:

#standardSQL
SELECT
  COUNT(DISTINCT event_id) AS number_of_crashes,
  FORMAT_TIMESTAMP("%F", event_timestamp) AS date_of_crashes
FROM
  `projectId.crashlytics.package_name_ANDROID`
GROUP BY
  date_of_crashes
ORDER BY
  date_of_crashes DESC
LIMIT
  30;

Example 2:

To properly prioritize production plans, a project manager at PineapplePlusPlus ponders how to point out the most pervasive crashes in their product. They produce a query that provides the pertinent points of data:

#standardSQL
SELECT
  DISTINCT issue_id,
  COUNT(DISTINCT event_id) AS number_of_crashes,
  COUNT(DISTINCT installation_uuid) AS number_of_impacted_user,
  blame_frame.file,
  blame_frame.line
FROM
  `projectId.crashlytics.package_name_ANDROID`
WHERE
  event_timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(),INTERVAL 168 HOUR)
  AND event_timestamp < CURRENT_TIMESTAMP()
GROUP BY
  issue_id,
  blame_frame.file,
  blame_frame.line
ORDER BY
  number_of_crashes DESC
LIMIT
  10;

Example 3:

Fall is new phone season! A developer at Planned Obsolescence, Inc. knows that also means it's new device-specific issues season. To get ahead of the coming compatibility concerns, they put together a query that identifies the 10 devices that experienced the most crashes in the past week:

#standardSQL
SELECT
  device.model,
  COUNT(DISTINCT event_id) AS number_of_crashes
FROM
  `projectId.crashlytics.package_name_ANDROID`
WHERE
  event_timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 168 HOUR)
  AND event_timestamp < CURRENT_TIMESTAMP()
GROUP BY
  device.model
ORDER BY
  number_of_crashes DESC
LIMIT
  10;

Example 4:

A game developer for Mecha Hamster 2: Hammy's Revenge wants to know which level of their game experiences the most crashes. To help them track that stat, they set a custom Crashlytics key current_level, and update it every time the user reaches a new level:

iOS - Objective C

[CrashlyticsKit setIntValue:3 forKey:@"current_level"];

iOS - Swift

Crashlytics.sharedInstance().setIntValue(3, forKey: "current_level")

Android - Java

Crashlytics.setInt("current_level", 3);

With that key in their BigQuery export, they then write a query to report the distribution of current_level values associated with each crash event:

#standardSQL
SELECT
  COUNT(DISTINCT event_id) AS num_of_crashes,
  value
FROM
  `projectId.crashlytics.package_name_ANDROID`,
  UNNEST(custom_keys)
WHERE
  key = "current_level"
GROUP BY
  key,
  value
ORDER BY
  num_of_crashes DESC

Example 5:

A developer with Mostly Okay Software has an app in early access. Most of their users love it, but a few have experienced an unusual amount of crashes. To get to the bottom of the problem, they write a query that pulls all the crash events for those users, using their user id:

#standardSQL
SELECT
  *
FROM
  `projectId.crashlytics.package_name_ANDROID`
WHERE
  user.id IN ("userid1",
    "userid2",
    "userid3")
ORDER BY
  user.id

Understanding the Firebase Crashlytics schema in BigQuery

When you link Crashlytics with BigQuery, Firebase exports recent fatal and non-fatal crash events, including events from up to two days before the link. From that point until you disable the link, Firebase exports Crashlytics events on a daily basis. It can take a few minutes for the data to be available in BigQuery after each export.

Datasets

Firebase Crashlytics creates a new dataset in BigQuery for Crashlytics data. The dataset covers your entire project, even if it has multiple apps, but does not include Analytics exports.

Tables

Firebase Crashlytics creates a table in the dataset for each app in your project, unless you've opted out of exporting data for that app. Firebase names the tables based on the app's bundle identifier, with periods converted to underscores, and a platform name appended to the end. For example data for an app with the ID com.google.test would be in a table named com_google_test_ANDROID.

Rows

Each row in a table represents an error the app encountered.

Columns

The columns in a table are identical for fatal and non-fatal errors. For a list columns exported to BigQuery, see Crashlytics BigQuery Export schema.

Visualizing exported Crashlytics data with Data Studio

Google Data Studio turns your Crashlytics datasets in BigQuery into reports that are easy to read, easy to share, and fully customizable.

To learn more about using Data Studio, try the Data Studio quick start guide, Welcome to Data Studio.

Using a Crashlytics report template

Data Studio has a sample report for Crashlytics that includes a comprehensive set of dimensions and metrics from exported Crashlytics BigQuery schema. You can use the sample as template to quickly create new reports and visualizations based on your own app’s raw crash data:

  1. Open the Crashlytics Data Studio Dashboard template.
  2. Click Use Template in the upper-right corner.
  3. In the Select a datasource dropdown, select Create New Data Source.
  4. Click Select on the BigQuery card.
  5. Select a table containing exported Crashlytics data by choosing My Projects > [your-project-name] > firebase_crashlytics > [your-table-name].
  6. Under Configuration, set Crashlytics Template level to Default.
  7. Click Connect to create the new data source.
  8. Click Add to Report to return to the Crashlytics template.
  9. Finally, click Create Report to create your copy of the Crashlytics Data Studio Dashboard template.

Cleaning up

To avoid incurring charges to your Google Cloud Platform account for the resources used in this tutorial:

The simplest way to stop billing charges is to delete the project. Since the Firebase project and Cloud Platform project are one and the same, you can delete it using either the Firebase or the Cloud Platform console.

  1. In the GCP Console, go to the Projects page.

    Go to the Projects page

  2. In the project list, select the project you want to delete and click Delete project. After selecting the checkbox next to the project name, click
      Delete project
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

If you don't want to delete your Cloud Platform and Firebase project, you can reduce costs by unlinking the project from BigQuery.

What's next

Was this page helpful? Let us know how we did:

Send feedback about...