Importing Google Analytics for Firebase Data into BigQuery

This tutorial describes how to export event logs from Google Analytics for Firebase into Google BigQuery in order to do in-depth analysis of the data.

Firebase is a platform for building mobile apps that includes features such as data and file storage, realtime synchronization, authentication, and more. Google Analytics for Firebase tracks usage of your Firebase features and provides insight on app usage and user engagement.

BigQuery is a petabyte-scale analytics data warehouse that you can use to run SQL-like queries over vast amounts of data in near realtime.

Importing your Google Analytics data into BigQuery provides several advantages:

  • Custom event parameters—In Google Analytics, you can define custom events and parameters to track. By importing your data into BigQuery, you gain the ability to do rich analysis over both standard and custom event parameters.

  • Aggregate data from multiple sources—If you track user events in multiple places, such as Google Analytics 360, Google Analytics for Firebase, and custom analytics collected by a mobile backend service, you can import the data from all of these sources into BigQuery to perform analysis that provides a complete picture of your event data.

  • Access to raw data—The Google Analytics dashboard provides an aggregate view of your event data. If you import your data into BigQuery, you can access the raw data values and have more flexibility in the way you analyze the events. You have access to all 25 event parameters and user properties.

  • Access control—When you export data to BigQuery, you own that data, and you can use BigQuery ACLs to manage permissions on projects and datasets, making it possible to share data without providing access to your Firebase app.

  • Archiving cold data—If you need to retain raw analytics data, you can pull the data into BigQuery, and then archive it into Google Cloud Storage or another storage location for inexpensive long-term storage.

Objectives

This tutorial describes how to:

  • Link Google Analytics for Firebase to BigQuery. This causes Analytics to export event logs to BigQuery on a daily basis.
  • Add code to an iOS or Android app that sends event logs to Analytics.
  • Query Analytics data stored in BigQuery.

Costs

To transmit Analytics data to BigQuery, you must upgrade your Firebase project to the Blaze level of service. For more information, see Firebase pricing.

In addition, storing and querying data in BigQuery incurs a fee. For more information, see BigQuery pricing.

Before you begin

  1. Create a Firebase project.

  2. In the Cloud Platform console, select your Firebase project. Enable the BigQuery API.

    Enable the API

  3. Build an app that sends events to Google Analytics. Sample apps are available for iOS, Android, and C++.

Linking Google Analytics data to BigQuery

Linking your Firebase app to BigQuery allows you to access your raw, unsampled event data along with all of your parameters and user properties. After you link a Firebase app to a BigQuery project, Google Analytics exports your event data to a corresponding BigQuery dataset on a daily basis.

Additionally, a table is imported for events received throughout the current day. This table is named app_events_intraday_YYYYMMDD. Some fields are not populated in intra-day tables, for more information see BigQuery Export schema.

You can query, export, or join your dataset with data from external sources to perform custom analysis.

Linking your Firebase project to BigQuery

To link your Firebase project to BigQuery, do the following:

  1. Open the Firebase console.

  2. Click the settings gear and select Project settings.

  3. Select the Account linking tab.

  4. On the BigQuery card, select Link to BigQuery.

  5. Choose whether to include Advertising Identifiers as part of your BigQuery export. Doing so provides richer data but will incur additional storage and query costs in BigQuery.

  6. After a project is linked to BigQuery, a corresponding dataset will be created in the associated BigQuery project upon the first daily export of events. Each day, raw event data for each linked app populates a new table in the associated dataset.

Managing which apps send data to BigQuery

By default, when you link your Firebase project to BigQuery, all apps in that project are linked to BigQuery as well as any apps that you add to the project later.

To configure which apps send data to BigQuery, do the following:

  1. Open the Firebase console.

  2. Click the settings gear and select Project settings.

  3. Select the Account linking tab.

  4. On the BigQuery card, click Manage linking.

  5. To halt an app from sending data to BigQuery, unselect Send data to BigQuery.

  6. Confirm by clicking Suspend data.

Unlinking a Firebase project from BigQuery

To unlink your project from BigQuery:

  1. Open the Firebase console.

  2. Click the settings gear and select Project settings.

  3. Select the Account linking tab.

  4. On the BigQuery card, click Manage linking.

  5. Under Project details, click Unlink project.

  6. Confirm by clicking Suspend data.

Querying Google Analytics for Firebase data using BigQuery

After your data is loaded into BigQuery, you can run SQL-like queries over it using BigQuery. For more information, see Querying Data in the BigQuery documentation.

BigQuery provides a web UI that makes running queries against your data as simple as filling out a form and clicking Run query. For more information, see Quickstart Using the Web UI in the BigQuery documentation.

The following are examples of queries you can run on your Analytics data. These queries generate reports that are not available in the Google Analytics dashboard.

Querying a custom event parameter

In Google Analytics, you can set custom parameter keys and values on events. The keys are typed as String, and values can have the type String, Long, or Double.

The following queries demonstrate how to generate reports that aggregate custom event values to provide business insights.

Example 1:

A mobile game developer wants to know which level of her game triggers the most in-game purchase flows. To do so, she adds the following custom event call to the game that runs when a player triggers an in-game purchase flow.

In this example, the name of the custom event is trigger_purchase, which has a key, level, that specifies an integer value that indicates the level the player was at when they triggered a purchase.

iOS - Objective C

[FIRAnalytics logEventWithName:@"trigger_purchase"
                parameters:@{
                              @"level": 3,
}];

iOS - Swift

FIRAnalytics.logEventWithName("trigger_purchase", parameters: [
  "level": 3
])

Android - Java

Bundle params = new Bundle();
params.putInt("level", 3);
mFirebaseAnalytics.logEvent("trigger_purchase", params);

The resulting entries in BigQuery for this custom event are the following:

Field Value
event_dim.name trigger_purchase
event_dim.params.key level
event_dim.params.value.int_value 3

To determine the number of purchases per level, she writes a query that reports the distribution of the ‘trigger_in_app_purchase’ event count, grouped by the level the user is currently at. This query runs across the date range 1/1/2015 to 12/31/2016.

SELECT COUNT(event_dim.name) as NumberOfPurchases,
event_dim.params.value.int_value as level
FROM
TABLE_DATE_RANGE(com_game_example_ANDROID.app_events_, TIMESTAMP('2015-01-01'), TIMESTAMP('2016-12-31'))
WHERE event_dim.name = 'trigger_purchase'
GROUP BY level

Example 2:

A second developer working on the same game wants to determine which character classes defeat the most opponents. To do so, he adds the following custom event logging to the game when a player defeats an opponent.

In this example, the name of the custom event is defeat_opponent, and it has a key, character, that specifies a string value that specifies the opponent’s name.

iOS - Objective C

[FIRAnalytics logEventWithName:@"defeat_opponent"
                parameters:@{
                              @"character": "Hercules"
}];

iOS - Swift

FIRAnalytics.logEventWithName("defeat_opponent", parameters: [
  "character": "Hercules"
])

Android - Java

Bundle params = new Bundle();
params.putString("character", "Hercules");
mFirebaseAnalytics.logEvent("defeat_opponent", params);

The resulting entries in BigQuery for this custom event would be the following:

Field Value
event_dim.name defeat_opponent
event_dim.params.key character
event_dim.params.value.string_value Hercules

To find the most successful character class, he writes a query that counts the number of times a character class is involved in a defeat-opponent custom event.

SELECT COUNT(event_dim.name) as DefeatEvents,
event_dim.params.value.string_value as CharacterClass
FROM
TABLE_DATE_RANGE( com_game_example_ANDROID.app_events_, TIMESTAMP('2015-01-01'), TIMESTAMP('2016-12-31'))
WHERE event_dim.name = 'defeat_opponent'
AND event_dim.params.key = 'character'
GROUP BY CharacterClass
ORDER BY DefeatEvents desc

Querying user events

In addition to querying on custom events, you can generate reports that aggregate user event values to provide business insights.

Example 3:

An app developer wants to use Firebase Cloud Messaging to send re-engagement push notifications to users who haven't used the app recently. To build the contact list for the notifications, the developer uses the following query to find users from the past year who have not visited the app in the previous two weeks. This query returns both iOS and Android users, as well as the date the users initially opened the app.

In the query below, the user_dim.user_id field contains the user id that the developer set in the app by calling the setUserId API. After this value is set, Google Analytics persists it and includes it in all future rows associated with that user. The value is not added retroactively to past rows.

If the developer does not set the user_dim.user_id identifier, they can use the app_info.app_instance_id field, which contains the default identifier generated by Firebase, in place of user_dim.user_id. Note that Firebase generates a new app_info.app_instance_id each time the Firebase app is uninstalled and re-installed on the same device.

The data table exported to BigQuery for the IOS app is com_retail_example_IOS.app_events_[DATE] and the corresponding data table for Android is com_retail_example_ANDROID.app_events_[DATE], where [DATE] is the current date.

SELECT
  userId,
  DATE(MIN( firstOpenTime )) firstOpenTime
FROM (
  SELECT
    user_dim.user_id AS userId,
    user_dim.first_open_timestamp_micros AS firstOpenTime
  FROM (TABLE_DATE_RANGE([com_retail_example_IOS.app_events_],
    DATE_ADD(CURRENT_TIMESTAMP(), -1, 'YEAR'), CURRENT_TIMESTAMP())),
    (TABLE_DATE_RANGE([com_retail_example_ANDROID.app_events_],
    DATE_ADD(CURRENT_TIMESTAMP(), -1, 'YEAR'), CURRENT_TIMESTAMP())) )
WHERE
  userId NOT IN (
  SELECT
    user_dim.user_id AS userId
  FROM (TABLE_DATE_RANGE([com_retail_example_IOS.app_events_],
    DATE_ADD(CURRENT_TIMESTAMP(), -14, 'DAY'), CURRENT_TIMESTAMP())),
    (TABLE_DATE_RANGE([com_retail_example_ANDROID.app_events_],
    DATE_ADD(CURRENT_TIMESTAMP(), -14, 'DAY'), CURRENT_TIMESTAMP())) )
GROUP BY
  userId

Example 4:

An app developer wants to determine the countries with the most active users, across both iOS and Android platforms. To do so, they use the following query to list the number of users and events per country on a given date.

The inner query returns the user_id and country of a row and then does a count of all the events within that row. The query uses the WITHIN RECORD clause as event_dim in a RECORD and is the best way of handling aggregations for nested records.

The outer query makes use of EXACT_COUNT_DISTINCT of the userId to provide an accurate number for the total number of users. Alternatively, the developer could use COUNT(DISTINCT user_id), but this gives an estimate instead of an exact value. An estimate might be faster when there is a very large dataset and you are looking for trends over time, a case when accuracy might be less important than speed.

SELECT
  country,
  EXACT_COUNT_DISTINCT(user_id) AS users,
  SUM(noOfEvents) AS totalEvents
FROM (
  SELECT
    user_dim.app_info.app_instance_id AS user_id,
    user_dim.geo_info.country AS country,
    COUNT(event_dim.name) WITHIN RECORD noOfEvents
  FROM
    [com_retail_example_IOS.app_events_20160723],
    [com_retail_example_ANDROID.app_events_20160723]
    )
GROUP BY
  1
ORDER BY
  2 DESC

Understanding the Google Analytics schema in BigQuery

When you connect Google Analytics for Firebase to BigQuery, Firebase exports data to BigQuery on a daily basis. It does so in the following formats:

Datasets

Google Analytics for Firebase creates a new dataset in BigQuery for each Firebase app. The dataset name is formatted as: [app_name]_[PLATFORM], where app_name is either the bundle ID (iOS) or the package name (Android).

Google Analytics converts periods into underscores in the bundle ID or package name to match BigQuery naming conventions. Analytics specifies the platform name in capital letters to match Firebase naming conventions.

For example, the dataset corresponding to an iOS app with a bundle ID of “com.username.myapp”, would result in a dataset named com_username_myapp_IOS.

Tables

Each day, Google Analytics creates a new table in the BigQuery dataset corresponding to the app. The tables are named using the pattern app_events_YYYYMMDD and contain the events recorded for the specified day.

Rows

Each row within a table corresponds to a bundle of events. The size of the bundle is determined by the app, which sends the data in calls to the Google Analytics for Firebase SDK.

Columns

Google Analytics exports user events as columns, which includes custom data events. For a list of the standard columns exported to BigQuery, see BigQuery Export schema in the Firebase Help documentation.

Cleaning up

To avoid incurring charges to your Google Cloud Platform account for the resources used in this tutorial:

The simplest way to stop billing charges is to delete the project. Since the Firebase project and Cloud Platform project are one and the same, you can delete it using either the Firebase or the Cloud Platform console.

  1. In the Cloud Platform Console, go to the Projects page.

    Go to the Projects page

  2. In the project list, select the project you want to delete and click Delete project. After selecting the checkbox next to the project name, click
      Delete project
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

If you don't want to delete your Cloud Platform and Firebase project, you can reduce costs by deleting the dataset from BigQuery.

  1. Go to the BigQuery web UI.

    Go to the BigQuery web UI.

  2. In the navigation, hover on the dataset for your app.

  3. Click the down arrow icon down arrow image next to your dataset name in the navigation, then click Delete dataset.

  4. In the Delete dataset dialog box, confirm the delete command by typing the name of your dataset and clicking OK.

What's next

  • Learn more about writing queries for BigQueryQuerying Data in the BigQuery documentation explains how to run sychronous and asynchronous queries, create user-defined functions (UDFs), and more.

  • Explore BigQuery syntax — BigQuery uses a SQL-like syntax that is described in the Query Reference (legacy SQL).

Send feedback about...