Jump to Content
Google Cloud

How to do cross-platform analytics with Google BigQuery

April 13, 2017
Tahir Fayyaz

Sales Engineer

Arun Venkatesan

Developer Advocate

Learn how to analyze web (in Google Analytics 360) and mobile (in Firebase Analytics) data together using BigQuery.

It's a common requirement for developers to track user behavior and usage patterns across mobile and web platforms. For example, a developer may need to track users who sign up on a mobile app and complete a purchase on the web app. Another may need to simply track the number of web users that also use the mobile app. Yet another one may want to see the number of pageviews for comparable screens on both the web and mobile apps.

In the context of Google Cloud, Google Analytics 360 offers a comprehensive suite of web analytics products for larger businesses and enterprises, while Firebase Analytics, a no-cost tool, has been built ground-up for mobile app analytics and measurement. So, how do you combine them to do cross-platform analytics reporting like the examples we described? Let’s find out!

Google BigQuery, Google’s cloud-native data warehouse, facilitates some of this cross-platform tracking. In order to view the data in BigQuery:

  1. Google Analytics 360 needs to be linked to BigQuery.
  2. Firebase needs to be linked to BigQuery.
Once linked, BigQuery creates a dataset for each Google Analytics 360 view linked and for each Firebase app linked. These datasets contain tables that each contain a day’s worth of data.

In order to determine a unique user across both GA and Firebase in BigQuery, you need to set a custom user id value in both Google Analytics 360 and Firebase. However, user id based joins are only possible when the user logs in on all devices with the same user id (also sometimes known as customer ID or CRM ID) defined by your backend platform / database. For apps or websites that don't require a login and hence cannot set a custom user id in Google Analytics 360 and Firebase, it makes more sense to analyze aggregated metrics such as total screen views for screens on the web and the corresponding views in the mobile version.

User ID-based joins

There are two options for storing a custom user ID in Google Analytics 360:
  1. As a session or user scoped custom dimension, which is stored in the customDimensions.index and customDimensions.value fields in BigQuery.
  2. As a User ID field in GA — if you use this option, you must link your User ID view to BigQuery and therefore will only have authenticated user data in BigQuery as you can only link one view per property to BigQuery.
In our examples, we'll be making use of Option 1, using a custom dimension to store the User ID in Google Analytics 360.

Firebase’s custom user id value can be set using the setUserId API and is stored in the user_dim.user_id field in BigQuery.

Once the user id is set in both Google Analytics 360 and Firebase, a number of queries can be performed to determine user behavior across both Google Analytics 360 (web) and Firebase (mobile).

In this post, we're using Standard SQL for all our queries. If you've previously used Legacy SQL for Google Analytics 360 or Firebase Analytics, please read our guide on migrating from Legacy SQL.

Example 1: Determine the total number of unique users in the past 30 days, across both the mobile apps (firebase IOS + Android) and the web app (Google Analytics)

In this example, we make use of a customDimensions.index and customDimensions.value field in Google Analytics 360 and the user_dim.user_id field in Firebase. In our example, we're using the custom dimension with index 12 in Google Analytics 360. You can get the index number from the admin section of Google Analytics, where you set up and manage custom dimensions. For your queries, replace it with the relevant index where you've stored the user ID.

In the following query, we're combining data from a single Google Analytics 360 property and two Firebase Apps (iOS and Android). We also have a CASE WHEN statement to check if a user ID is set and if not, we make use of the default fullVisitorId (which is a hashed version of the client ID) in Google Analytics 360 or the default app_info.app_instance_id in Firebase analytics.

Query

Loading...

Result

https://storage.googleapis.com/gweb-cloudblog-publish/images/cross-platform-77xbq.max-900x900.PNG

Example 2: Determine daily unique users in the past 30 days, broken down by date, across both the mobile apps (Firebase IOS + Android) and the web app (Google Analytics)

To break this report down by date, we need to select the dates for Google Analytics 360 and Firebase data. Google Analytics 360 provides a date column for each session. For Firebase, we can extract this from the event_dim.timestamp_micros field or the event_dim.date field, which requires us to flatten the data using the UNNEST function.

Note: the event_dim.date field was added to the schema from November 2016 onwards, so you can use the event_dim.timestamp_micros as we've shown in the examples below, if you query data from before that date.

Standard SQL - Using WITH clause

Writing long queries can get quite complex, and so for Standard SQL queries, we'll use the WITH clause, which allows us to create one or more named subqueries. We can then reference these named queries in our final SELECT statement. 

Query

Loading...

Result

https://storage.googleapis.com/gweb-cloudblog-publish/images/cross-platform-2ghkq.max-900x900.PNG

Example 3: Cross-device usage overlap

This query returns the total number of users on a given device group and the ratio of this number to the overall number of users across different device groups. The results are similar to the cross device overlap report in Google Analytics 360.

In this query, we use the user ID to determine which users have logged in on different devices and define a group of the devices on which they've logged in. For example, if a user logged in on the web, on the desktop and on their Android device, they'll fall into the group "web - desktop, app - android".

We can see the total users that fall into each device category group, and what percentage that device category group represents of the total users using count divided by the total SUM as a WINDOW function in standard SQL and the RATIO_TO_REPORT WINDOW function in Legacy SQL.

Query

Loading...

Result

https://storage.googleapis.com/gweb-cloudblog-publish/images/cross-platform-5m9g9.max-900x900.PNG

Example 4: Cross-device funnel reports

Next, let’s see how you might figure out what’s causing users to spend money in your app. In order to answer this question, we'll build a sequence of events per user that leads up to that user actually spending money in the app. In effect, we'll build a custom event funnel report.

Step 1: Apply LEAD function to create steps for all users

Here's a query that generates a set of sequences of three events. You can modify this query to include any number of events in the sequence. The PARTITION BY clause breaks input rows into separate partitions, by app instance id in this case. The LEAD function returns the next event based on an ordering criteria, which is the event timestamp, thus creating a series of events ordered by timestamp. This is a WINDOW function, which is a powerful way to analyze your data. 

Query

Loading...

Result

https://storage.googleapis.com/gweb-cloudblog-publish/images/cross-platform-3nzes.max-1000x1000.PNG

The results for the Step 1 of our query below show a set of three events for users. As noticed in the interim Result 1, the second event in each row appears as the first event in the next row and so on. User Ids have been masked in the results above.

Step 2: Calculate users for journey that ends with spending virtual currency

In the second step of our query, we simply define our end event as the “spend_virtual_currency” event, since we would like to determine what users did before they spent money in the app. The query below calculates the number of users and the frequency with which this funnel has been navigated. Note how we apply results from Part 1 of our query to the second part of the query below.

Query

Loading...

Result

https://storage.googleapis.com/gweb-cloudblog-publish/images/cross-platform-1tj4l.max-1000x1000.PNG

In the results above, you can see all the different paths that users took before they spent money in your app. For instance, many users spend money after completing a round, which is great. On the other hand, it also shows that you, the developer, could do more in the app to entice users to spend money after "spinning slots" or "levelling up". The ability to construct closed funnels like the above and dynamically set the start and end events for a funnel offers a powerful way to make data-enabled decisions in your app.

Lets now see a more advanced version of the funnel query above and join with it with our Google Analytics 360 data to create a cross-device funnel in the next example.

Step 3: Create three views for web, iOS app and Android app events

Creating views is a easy and useful way to simplify your queries and make SQL share reusable. (Read the tutorial on creating views.) Create a new data set in the same project to save your views in. Let’s call the new data set cross_device_reporting.

To create this cross-device funnel, we require the event timestamp to be in the same format —​​but Google Analytics 360 and Firebase Analytics provide the event timestamp in different formats. In Google Analytics 360, the hit timestamp is calculated by combining the visitStartTime and hits.time values, while Firebase Analytics uses event_dim.timestamp_micros.

To make the timestamps the same format, we'll convert the Google Analytics 360 timestamp into microseconds using the following logic. This gives us the timestamp in UTC.

Loading...

Use the following queries to create a view for the web, iOS app and android app events.

Google Analytics 360 web events query

Save as view named - ga360_events_last_7days

Loading...

Firebase Analytics Android app events query

Save as view named - fb_android_events_last_7days

Loading...

Firebase Analytics iOS app events query

Save as view named - fb_ios_events_last_7days

Loading...

Step 4: Union data and use LEAD function to create steps for all users

We can now query the data from our views that we created treating them just like tables. In this query, we union all three views and then just as we did in Step 1, use the LEAD function to get all the steps.

Loading...

Step 5: Aggregate to get the most popular cross device paths

As we're now making use of the user_id we're able to discover the most popular cross-device paths. In this query, we'll filter to the query to only see people who have signed up to a special offer on our website and have then gone on to do something in the app. We do that by applying the WHERE statement earlier in the query to only use the event "currency special offer" from the web, but we want to include all app events.

Loading...

Result

https://storage.googleapis.com/gweb-cloudblog-publish/images/cross-platform-4zkmh.max-1000x1000.PNG

Step 6: Calculate percentage going to each step

In this final query, we can build on Step 5 to calculate how many users go to each step using the SUM WINDOW function and then calculate the ratio from one step to the next.

Loading...

Result

https://storage.googleapis.com/gweb-cloudblog-publish/images/cross-platform-6f2rr.max-1100x1100.PNG

Step 7: Remove the WHERE statement

If we remove the WHERE statement from Step 6, we no longer define the journeys that we want to analyze but rather can discover what the most popular journeys are, and where there might be issues due to users dropping off.

Give it a try and see which user journeys are the most popular for your app!

Next steps

You can try out some of the queries above on your own data set or on the Android and iOS sample data sets. Furthermore:

  • View the BigQuery cookbook for Google Analytics 360 and run the queries on the sample Google Analytics dataset.
  • Learn more on importing Firebase data into BigQuery and listen to the talk on ideas to analyze Firebase data on BigQuery.

Posted in