Building a Scalable Geolocation Telemetry System using the Maps API

In this solution, you learn how to use the Google Maps Platform to add location-based context to telemetry data. This solution describes the architectural design and considerations implemented in a sample app. The sample app collects street traffic data captured from freeways around San Diego, California, in the United States, and then shows traffic density heat maps superimposed on a Google map.

To run and understand the sample code, see the tutorial.

Understanding telemetry

Telemetry is the process of automatically collecting periodic measurements from remote devices. For example, if you’re building a fitness-tracking or ride-sharing app, you probably collect geolocation and other sensor data from a large number of moving objects, such as people or vehicles.

Telemetry and telematics data is usually sent as a periodic report of GPS position, plus some additional sensor data, such as velocity and time. The combined, reported data is provided in a format called a sentence. The sentence usually conforms to a standard, such as one of the many NMEA standards.

For vehicles, the sensor data could include accelerometer data, from which speed, heading, and maneuvers such as cornering and braking can be derived. The data can also include vehicle-dashboard information, if an onboard device supports it. Here’s an example of a sentence for a single data-point from a moving vehicle:

Timestamp    Latitude      Longitude     Speed   Bearing
97197600000  51.2345678    -0.1234556    34      261
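
As an illustration only, a data point in this layout could be parsed as follows. This is a minimal sketch: the names TelemetryPoint and parse_sentence are hypothetical, and the units of the timestamp, speed, and bearing fields are assumptions because the example above does not state them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TelemetryPoint:
    timestamp: int    # raw device timestamp; unit not stated in the example
    latitude: float   # decimal degrees
    longitude: float  # decimal degrees
    speed: float      # unit depends on the device (assumption)
    bearing: float    # assumed to be degrees clockwise from north


def parse_sentence(line: str) -> TelemetryPoint:
    """Parse one whitespace-separated data line into a TelemetryPoint."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    timestamp, latitude, longitude, speed, bearing = fields
    point = TelemetryPoint(
        int(timestamp), float(latitude), float(longitude),
        float(speed), float(bearing),
    )
    # Reject coordinates that cannot be a valid position.
    if not (-90 <= point.latitude <= 90 and -180 <= point.longitude <= 180):
        raise ValueError(f"coordinates out of range: {line!r}")
    return point


point = parse_sentence("97197600000  51.2345678    -0.1234556    34      261")
print(point)
```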

Daily and seasonal variation can affect the rate at which this information comes in, which can make it hard to know how much capacity to plan for. Google Cloud can help you get up and running, no matter what scale you think you might need in the future.

Deriving meaning from the data

Knowing specifics about a location can add useful context. For example, knowing the speed of a car is useful, but the risk associated with a particular speed changes depending on whether the car is on a residential road or a highway.

After the data is received, usually some processing is required to understand what is happening in the physical realm. The system might respond by triggering an alert, or the next step might be to enrich the data before storing it for other apps to access. For example, the data could be summarized in a reporting dashboard or it could be stored in a database for future use.

Specific uses for such data can include:

  • Looking at a sequence of data points to infer trends, such as sudden increases in vehicle acceleration or sharp cornering.

  • Reverse-geocoding, which entails translating latitude and longitude coordinates to identify a specific region, town, street, address, and postal code for each location.

  • Determining the time zone and converting Coordinated Universal Time (UTC) to local time.

  • Correcting GPS positions to align to a road location.

  • Geofencing, which raises an alert when moving objects enter or leave areas of interest.

  • Aggregating data to calculate cumulative stats by region, such as average taxi fare by zip code.

  • Analyzing the behavior of individual drivers. For example, you might identify risky maneuvers such as sharp braking, sudden cornering, or unreported collisions.
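
To make one of these uses concrete, the following sketch shows a simple form of geofencing. It assumes a circular area of interest and uses the haversine great-circle distance; the function names, the center point, and the radius are hypothetical, and a production system might use polygon regions instead.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def geofence_events(points, center, radius_m):
    """Yield (index, 'enter' or 'leave') when consecutive points cross the fence."""
    was_inside = None
    for index, (lat, lon) in enumerate(points):
        inside = haversine_m(lat, lon, center[0], center[1]) <= radius_m
        if was_inside is not None and inside != was_inside:
            yield index, "enter" if inside else "leave"
        was_inside = inside


# Hypothetical fence: 1 km around a point in downtown San Diego.
track = [(32.80, -117.16), (32.716, -117.161), (32.80, -117.16)]
events = list(geofence_events(track, center=(32.7157, -117.1611), radius_m=1000))
print(events)
```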

Solution pipeline

You can process telemetric data by using a pipeline, which is a series of processing phases. Google Cloud provides various services that you can use for each of these processing stages. The following table lists services according to pipeline stage. These stages are discussed in more detail in an upcoming section.

Processing stage
Google Cloud products
Data input
Cloud Storage import or streaming transfers
Cloud Logging
Cloud Storage
Cloud Bigtable
Visualization and analysis
Cloud SQL

This solution uses one combination of these services to build a pipeline that captures, processes, and analyzes geolocation data by using just a few lines of Python code.

The following diagram shows the organization of the sample pipeline:

Data pipeline starts with traffic data and ends with BigQuery

Data input

Data input is the process that imports data from a telemetric device into the system. This phase provides network transport, security, and the routing mechanism to transport the data to other parts of the system where the data can be re-routed, formatted, or sent to storage. This step acts as the system's shock absorber, which creates a buffer for unexpected bursts of incoming data or delays caused by issues downstream of the input source.

It's important that the data input process isn't strongly coupled to the rest of the data pipeline. A decoupled approach enables you to build a more robust app. For example, if one component of the data input process fails, or the pipeline needs to buffer data because the incoming data rate grows unexpectedly large, the app can continue to work. This approach also enables you to upgrade components in isolation and make other changes to the system without having to shut down the whole pipeline.

The first part of this solution's sample pipeline imports traffic data by using Pub/Sub. Pub/Sub delivers real-time and reliable messaging in one global, managed service that helps you create simple, reliable, and flexible apps. By providing many-to-many, asynchronous messaging that decouples senders and receivers, Pub/Sub enables secure and highly available communication between independently written apps. With Pub/Sub, you can push your events to another webhook, or pull them as they happen.

For an overview of how Pub/Sub works, see What is Pub/Sub.

In the sample app, the Python script that simulates sending messages from a vehicle is the publisher. This script reads vehicle journey data from a set of CSV files, adds a vehicle ID, geocodes the files, and then pushes the data to a Pub/Sub topic.
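
The publishing step could look roughly like the following sketch. It is not the sample app's actual script: the function names, the JSON payload layout, and the project and topic IDs are assumptions, and the publish_csv function requires the google-cloud-pubsub package and valid credentials.

```python
import csv
import io
import json


def encode_row(row: dict, vehicle_id: str) -> bytes:
    """Serialize one CSV row plus a vehicle ID as a UTF-8 JSON payload."""
    return json.dumps({**row, "vehicle_id": vehicle_id}, sort_keys=True).encode("utf-8")


def publish_csv(csv_text: str, vehicle_id: str, project_id: str, topic_id: str) -> list:
    """Publish every CSV row to a Pub/Sub topic and return the message IDs."""
    # Imported here so that encode_row works without google-cloud-pubsub installed.
    from google.cloud import pubsub_v1

    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(project_id, topic_id)
    futures = [
        # Keyword arguments after data are sent as string message attributes.
        publisher.publish(topic_path, data=encode_row(row, vehicle_id), vehicle_id=vehicle_id)
        for row in csv.DictReader(io.StringIO(csv_text))
    ]
    # Each future resolves to the server-assigned message ID.
    return [future.result() for future in futures]


payload = encode_row({"latitude": "32.7157", "longitude": "-117.1611"}, "vehicle-1")
print(payload)
```

Sending the vehicle ID as a message attribute as well as in the payload is a choice made for this sketch; it lets subscribers filter or route messages without decoding the body.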


Processing

The processing phase is the step where you transform the data from its unprocessed state to a format that's appropriate for storage. Data in this stage is moved, filtered, shaped, enriched, and transformed. Incoming data can be processed in one of two modes:

  • Streaming. Data is processed virtually in real time, as it becomes available.
  • Batch. Data is collected over a period of time and aggregated into chunks that are processed together.

In the sample, the Python script that simulates receiving messages is the consumer. It processes the data as a stream, not a batch. This script pulls the pending messages from its Pub/Sub subscription and acknowledges each one to the Pub/Sub service. The script then processes the data. It reverse-geocodes the data to convert latitude and longitude to a street address, calculates the elevation above sea level, and computes the local time offset from UTC by using the timezone for each location.
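
A pull-based consumer could be sketched as follows. The names pull_batch and decode_message and the JSON payload format are assumptions, and the code needs the google-cloud-pubsub package and credentials to run against a real subscription. Unlike the order described above, this sketch acknowledges messages after handling them, so that a record whose processing fails is redelivered; that is a design choice of the sketch, not a description of the sample script.

```python
import json


def decode_message(data: bytes) -> dict:
    """Decode a UTF-8 JSON payload into a dictionary."""
    return json.loads(data.decode("utf-8"))


def pull_batch(project_id: str, subscription_id: str, handle, max_messages: int = 10) -> int:
    """Pull up to max_messages, pass each record to handle, then acknowledge them."""
    # Imported here so that decode_message works without google-cloud-pubsub installed.
    from google.cloud import pubsub_v1

    with pubsub_v1.SubscriberClient() as subscriber:
        subscription_path = subscriber.subscription_path(project_id, subscription_id)
        response = subscriber.pull(
            request={"subscription": subscription_path, "max_messages": max_messages}
        )
        ack_ids = []
        for received in response.received_messages:
            handle(decode_message(received.message.data))
            ack_ids.append(received.ack_id)
        if ack_ids:
            subscriber.acknowledge(
                request={"subscription": subscription_path, "ack_ids": ack_ids}
            )
    return len(ack_ids)


record = decode_message(b'{"speed": 34, "bearing": 261}')
print(record)
```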

Adding local context by using the Google Maps Platform

The Google Maps Platform web services provide a number of ways to add local context to data that has a geographic position attached. For example, you can get the street address of a vehicle location from its latitude and longitude by using the Geocoding API. Geocoding can help you understand where vehicles are, or run queries that relate to real-world geography or administrative regions such as cities and zip codes.

Similarly, you can use the Time Zone API to convert the UTC timestamps into local time. You can use the Elevation API to get the height above sea level, which can be useful to get an idea of fuel efficiency or to explain changes in speed.

You can integrate the Google Maps Platform into your solution by using a client library. The tutorial provided with this solution uses the Python Client Library for the Google Maps Platform web services. To use the Python client, you first import the library:

import googlemaps

To create a Google Maps Platform client, instantiate an object by using an API key, or a client ID and client secret if using Google Maps. The following example shows the syntax for creating a client object named gmaps:

gmaps = googlemaps.Client(key="YOUR_API_KEY")

To convert a latitude and longitude into an address, call the reverse_geocode method on the client object. The methods timezone and elevation are also straightforward:

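# Returns a list of candidate addresses for the position.
address_list = gmaps.reverse_geocode((latitude, longitude))

# Returns the time zone ID and the offsets from UTC for the position.
timezone = gmaps.timezone((latitude, longitude))

# Returns a list of elevation results for the position.
elevation = gmaps.elevation((latitude, longitude))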
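
The three calls can be combined into a single enrichment step, as in the following sketch. The function names enrich and local_time and the output field names are hypothetical. The sketch relies on the formatted_address field of geocoding results, the elevation field of elevation results, and the rawOffset, dstOffset, and timeZoneId fields of the Time Zone API response, and it assumes that the timestamp is in seconds since the Unix epoch.

```python
from datetime import datetime, timedelta, timezone


def local_time(utc_seconds: int, tz_response: dict):
    """Apply the rawOffset and dstOffset fields (in seconds) to a UTC timestamp."""
    if "rawOffset" not in tz_response:
        return None  # no time zone was found for this position
    offset = timedelta(
        seconds=tz_response["rawOffset"] + tz_response.get("dstOffset", 0)
    )
    return datetime.fromtimestamp(utc_seconds, tz=timezone(offset))


def enrich(gmaps, latitude: float, longitude: float, utc_seconds: int) -> dict:
    """Look up address, elevation, and local time for one position."""
    location = (latitude, longitude)
    addresses = gmaps.reverse_geocode(location)
    elevations = gmaps.elevation(location)
    tz_response = gmaps.timezone(location, timestamp=utc_seconds)
    local = local_time(utc_seconds, tz_response)
    return {
        "address": addresses[0]["formatted_address"] if addresses else None,
        "elevation_m": elevations[0]["elevation"] if elevations else None,
        "time_zone_id": tz_response.get("timeZoneId"),
        "local_time": local.isoformat() if local else None,
    }
```

Passing the client in as the gmaps argument keeps the function easy to exercise with a stand-in object, without calling the web services.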

Usage of the Google Maps Platform web services is limited by the number of requests in a specified time period. The Python library automatically limits request rates. Pub/Sub helps by buffering the unprocessed data, which arrives at the varying rates reported from vehicles in motion. By using a Pub/Sub pull subscription, your client object can call the Maps API services at a steady rate. This approach can help you stay within the queries-per-second (QPS) level prescribed by the usage limits.
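
One way to keep calls at a steady rate is a small throttle placed in the consumer loop, as in the following sketch. The Throttle class is hypothetical and is not part of any Google library; the Python client also accepts a queries_per_second argument when it is created, which may be sufficient on its own.

```python
import time


class Throttle:
    """Space out successive wait() calls by at least 1/max_qps seconds."""

    def __init__(self, max_qps: float, clock=time.monotonic, sleep=time.sleep):
        if max_qps <= 0:
            raise ValueError("max_qps must be positive")
        self._interval = 1.0 / max_qps
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = clock()

    def wait(self) -> None:
        """Block until the next call is allowed, then reserve the following slot."""
        now = self._clock()
        if now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self._interval


# Usage: call throttle.wait() before each web service request.
throttle = Throttle(max_qps=10)
throttle.wait()
```

The clock and sleep arguments exist only so that the behavior can be checked without real delays.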


Storage

Storage is the long-term persistence layer for the data. Its role includes storing both unprocessed data and processed data that can be used for subsequent analytics. The sample script loads the enriched record into BigQuery for storage.
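
Loading records into BigQuery could be sketched as follows, using the streaming insert method of the google-cloud-bigquery client. The function names, the row layout, and the table ID are assumptions, not the sample's actual schema, and store_rows requires the google-cloud-bigquery package, credentials, and an existing table.

```python
def to_row(point: dict, enrichment: dict) -> dict:
    """Merge a telemetry point and its enrichment into one flat row."""
    row = dict(point)
    row.update(enrichment)
    return row


def store_rows(table_id: str, rows: list) -> None:
    """Stream rows into a table named like 'project.dataset.table'."""
    # Imported here so that to_row works without google-cloud-bigquery installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    # insert_rows_json returns a list of per-row errors; empty means success.
    errors = client.insert_rows_json(table_id, rows)
    if errors:
        raise RuntimeError(f"BigQuery rejected some rows: {errors}")


row = to_row(
    {"vehicle_id": "vehicle-1", "latitude": 32.7157, "longitude": -117.1611},
    {"elevation_m": 12.5},
)
print(row)
```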

It’s important to understand the Google Maps Platform terms of service, which impose limits on storing the data obtained from the Maps API.

To make sure the latest and most accurate Google Maps Platform data is accessible for your users, periodically refresh the data retrieved through the API, rather than storing it permanently.

Analysis and visualization

The business actions on data are usually the result of one or more questions, or queries. The goal might be to provide reports in a web frontend, drive the UI of an app, provide input into further computations, or provide insight into the data.

This solution uses BigQuery for analysis and Google Maps to visualize the data.

To view data stored in BigQuery on a Google map, you can run queries by using the BigQuery API and then use one of the many available methods for drawing on the map provided by the Google Maps JavaScript API. For example, you can use SQL to query for all the records contained in the visible area of the map. First, you retrieve the coordinates of the corners of the map, and then you build a SQL query to select all the rows that have latitude and longitude values that fall inside the boundaries of the visible area. The BigQuery reference has an example of such a query in the Advanced examples section.
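
A bounding-box query of this kind could be built as in the following sketch, shown in Python for consistency with the rest of the sample code. The function names, the table name, and the latitude and longitude column names are assumptions. The sketch uses named query parameters for the coordinates and does not handle a visible area that crosses the 180° meridian.

```python
def bounding_box_query(table: str, south: float, west: float,
                       north: float, east: float, limit: int = 1000):
    """Return (sql, params) selecting rows inside the visible map area."""
    if south > north:
        raise ValueError("south must not exceed north")
    if west > east:
        raise ValueError("areas that cross the 180° meridian are not handled")
    sql = (
        f"SELECT latitude, longitude FROM `{table}` "
        "WHERE latitude BETWEEN @south AND @north "
        "AND longitude BETWEEN @west AND @east "
        f"LIMIT {int(limit)}"
    )
    params = {"south": south, "west": west, "north": north, "east": east}
    return sql, params


def run_query(sql: str, params: dict) -> list:
    """Run the query with named FLOAT64 parameters and return rows as dicts."""
    # Requires the google-cloud-bigquery package and credentials.
    from google.cloud import bigquery

    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter(name, "FLOAT64", value)
            for name, value in params.items()
        ]
    )
    client = bigquery.Client()
    return [dict(row) for row in client.query(sql, job_config=job_config).result()]


# Hypothetical table name and a box around San Diego.
sql, params = bounding_box_query("my_project.traffic.points", 32.6, -117.3, 32.9, -117.0, limit=500)
print(sql)
```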

You can use the Google API Client Library for JavaScript to send a query to BigQuery as an HTTP POST and then retrieve the results. You can then draw selected rows on a map in a number of ways. Drawing many individual markers might make the map unreadable, so it’s a good idea to consider ways to reduce the volume of results to a manageable level. For example, you can use a WHERE clause and a LIMIT clause to reduce the amount of data the browser needs to download. To learn more about techniques for managing many markers on a Google map, see Too Many Markers.
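
Another way to reduce the volume of results is to aggregate nearby points before drawing them, for example as weighted points for a heat map. The following sketch bins points into a regular latitude and longitude grid. The grid_counts function, the cell size, and the lat, lng, and weight keys are assumptions made for this illustration.

```python
import math
from collections import Counter


def grid_counts(points, cell_deg: float = 0.01) -> list:
    """Count points per grid cell and return one weighted point per cell."""
    counts = Counter()
    for lat, lon in points:
        # Identify the cell by the integer index of its south-west corner.
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return [
        {
            "lat": (i + 0.5) * cell_deg,  # cell center latitude
            "lng": (j + 0.5) * cell_deg,  # cell center longitude
            "weight": n,
        }
        for (i, j), n in counts.items()
    ]


cells = grid_counts(
    [(32.711, -117.161), (32.712, -117.162), (32.755, -117.105)],
    cell_deg=0.01,
)
print(cells)
```

With a fixed cell size, the number of drawn points is bounded by the number of cells in the visible area rather than by the number of stored records.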

The following screenshot shows an example that renders results from BigQuery as a heat map that indicates traffic density.

Google map of San Diego displays a traffic heat map.

For details about how the sample solution uses the API to retrieve data from BigQuery and then renders the results on a map, see the tutorial article.


Tutorial

The complete contents of the tutorial, including instructions and source code, are available on GitHub. You can follow the tutorial steps in one of two ways:

  • Follow all the steps manually and run the Python code on your computer. See the tutorial article.
  • Follow the steps by using Cloud Shell and deploying the code by using Docker containers. See the readme on GitHub.

What's next

  • Explore reference architectures, diagrams, tutorials, and best practices about Google Cloud. Take a look at our Cloud Architecture Center.