How to do cross-platform analytics with Google BigQuery
Learn how to analyze web (in Google Analytics 360) and mobile (in Firebase Analytics) data together using BigQuery.
It's a common requirement for developers to track user behavior and usage patterns across mobile and web platforms. For example, a developer may need to track users who sign up on a mobile app and complete a purchase on the web app. Another may need to simply track the number of web users that also use the mobile app. Yet another one may want to see the number of pageviews for comparable screens on both the web and mobile apps.
In the context of Google Cloud, Google Analytics 360 offers a comprehensive suite of web analytics products for larger businesses and enterprises, while Firebase Analytics, a no-cost tool, has been built ground-up for mobile app analytics and measurement. So, how do you combine them to do cross-platform analytics reporting like the examples we described? Let’s find out!
Google BigQuery, Google’s cloud-native data warehouse, facilitates some of this cross-platform tracking. In order to view the data in BigQuery:
In order to determine a unique user across both GA and Firebase in BigQuery, you need to set a custom user id value in both Google Analytics 360 and Firebase. However, user id based joins are only possible when the user logs in on all devices with the same user id (also sometimes known as customer ID or CRM ID) defined by your backend platform / database. For apps or websites that don't require a login and hence cannot set a custom user id in Google Analytics 360 and Firebase, it makes more sense to analyze aggregated metrics such as total screen views for screens on the web and the corresponding views in the mobile version.
User ID-based joinsThere are two options for storing a custom user ID in Google Analytics 360:
- As a session or user scoped custom dimension, which is stored in the customDimensions.index and customDimensions.value fields in BigQuery.
- As a User ID field in GA — if you use this option, you must link your User ID view to BigQuery and therefore will only have authenticated user data in BigQuery as you can only link one view per property to BigQuery.
Firebase’s custom user id value can be set using the setUserId API and is stored in the
user_dim.user_id field in BigQuery.
Once the user id is set in both Google Analytics 360 and Firebase, a number of queries can be performed to determine user behavior across both Google Analytics 360 (web) and Firebase (mobile).
In this post, we're using Standard SQL for all our queries. If you've previously used Legacy SQL for Google Analytics 360 or Firebase Analytics, please read our guide on migrating from Legacy SQL.
Example 1: Determine the total number of unique users in the past 30 days, across both the mobile apps (firebase IOS + Android) and the web app (Google Analytics)In this example, we make use of a
customDimensions.valuefield in Google Analytics 360 and the
user_dim.user_idfield in Firebase. In our example, we're using the custom dimension with index 12 in Google Analytics 360. You can get the index number from the admin section of Google Analytics, where you set up and manage custom dimensions. For your queries, replace it with the relevant index where you've stored the user ID.
In the following query, we're combining data from a single Google Analytics 360 property and two Firebase Apps (iOS and Android). We also have a
CASE WHEN statement to check if a user ID is set and if not, we make use of the default
fullVisitorId (which is a hashed version of the client ID) in Google Analytics 360 or the default
app_info.app_instance_id in Firebase analytics.
Example 2: Determine daily unique users in the past 30 days, broken down by date, across both the mobile apps (Firebase IOS + Android) and the web app (Google Analytics)To break this report down by date, we need to select the dates for Google Analytics 360 and Firebase data. Google Analytics 360 provides a date column for each session. For Firebase, we can extract this from the
event_dim.timestamp_microsfield or the event_dim.date field, which requires us to flatten the data using the UNNEST function.
event_dim.date field was added to the schema from November 2016 onwards, so you can use the event_dim.timestamp_micros as we've shown in the examples below, if you query data from before that date.
Standard SQL - Using WITH clauseWriting long queries can get quite complex, and so for Standard SQL queries, we'll use the
WITHclause, which allows us to create one or more named subqueries. We can then reference these named queries in our final
Example 3: Cross-device usage overlapThis query returns the total number of users on a given device group and the ratio of this number to the overall number of users across different device groups. The results are similar to the cross device overlap report in Google Analytics 360.
In this query, we use the user ID to determine which users have logged in on different devices and define a group of the devices on which they've logged in. For example, if a user logged in on the web, on the desktop and on their Android device, they'll fall into the group "web - desktop, app - android".
We can see the total users that fall into each device category group, and what percentage that device category group represents of the total users using count divided by the total SUM as a
WINDOW function in standard SQL and the
RATIO_TO_REPORT WINDOW function in Legacy SQL.
Example 4: Cross-device funnel reportsNext, let’s see how you might figure out what’s causing users to spend money in your app. In order to answer this question, we'll build a sequence of events per user that leads up to that user actually spending money in the app. In effect, we'll build a custom event funnel report.
Step 1: Apply LEAD function to create steps for all usersHere's a query that generates a set of sequences of three events. You can modify this query to include any number of events in the sequence. The
PARTITION BYclause breaks input rows into separate partitions, by app instance id in this case. The
LEADfunction returns the next event based on an ordering criteria, which is the event timestamp, thus creating a series of events ordered by timestamp. This is a
WINDOWfunction, which is a powerful way to analyze your data.
The results for the Step 1 of our query below show a set of three events for users. As noticed in the interim Result 1, the second event in each row appears as the first event in the next row and so on. User Ids have been masked in the results above.
Step 2: Calculate users for journey that ends with spending virtual currencyIn the second step of our query, we simply define our end event as the “spend_virtual_currency” event, since we would like to determine what users did before they spent money in the app. The query below calculates the number of users and the frequency with which this funnel has been navigated. Note how we apply results from Part 1 of our query to the second part of the query below.
In the results above, you can see all the different paths that users took before they spent money in your app. For instance, many users spend money after completing a round, which is great. On the other hand, it also shows that you, the developer, could do more in the app to entice users to spend money after "spinning slots" or "levelling up". The ability to construct closed funnels like the above and dynamically set the start and end events for a funnel offers a powerful way to make data-enabled decisions in your app.
Lets now see a more advanced version of the funnel query above and join with it with our Google Analytics 360 data to create a cross-device funnel in the next example.
Step 3: Create three views for web, iOS app and Android app eventsCreating views is a easy and useful way to simplify your queries and make SQL share reusable. (Read the tutorial on creating views.) Create a new data set in the same project to save your views in. Let’s call the new data set
To create this cross-device funnel, we require the event timestamp to be in the same format —but Google Analytics 360 and Firebase Analytics provide the event timestamp in different formats. In Google Analytics 360, the hit timestamp is calculated by combining the
hits.time values, while Firebase Analytics uses
To make the timestamps the same format, we'll convert the Google Analytics 360 timestamp into microseconds using the following logic. This gives us the timestamp in UTC.
Use the following queries to create a view for the web, iOS app and android app events.
Google Analytics 360 web events query
Save as view named - ga360_events_last_7days
Firebase Analytics Android app events query
Save as view named - fb_android_events_last_7days
Firebase Analytics iOS app events query
Save as view named - fb_ios_events_last_7days
Step 4: Union data and use LEAD function to create steps for all usersWe can now query the data from our views that we created treating them just like tables. In this query, we union all three views and then just as we did in Step 1, use the
LEADfunction to get all the steps.
Step 5: Aggregate to get the most popular cross device pathsAs we're now making use of the user_id we're able to discover the most popular cross-device paths. In this query, we'll filter to the query to only see people who have signed up to a special offer on our website and have then gone on to do something in the app. We do that by applying the
WHEREstatement earlier in the query to only use the event "currency special offer" from the web, but we want to include all app events.
Step 6: Calculate percentage going to each stepIn this final query, we can build on Step 5 to calculate how many users go to each step using the
SUM WINDOWfunction and then calculate the ratio from one step to the next.
Step 7: Remove the WHERE statementIf we remove the
WHEREstatement from Step 6, we no longer define the journeys that we want to analyze but rather can discover what the most popular journeys are, and where there might be issues due to users dropping off.
Give it a try and see which user journeys are the most popular for your app!