Guide to using Google BigQuery on Microsoft PowerShell
Alex Handley
Technical Intern
Google BigQuery is a versatile tool that solves the problem of storing and querying massive datasets without making you worry about data formats, underlying resources and other things that distract you from the analysis at hand. To help those BigQuery users who virtually live in Microsoft PowerShell, we've released PowerShell cmdlets for BigQuery, allowing them to script BigQuery commands straight from the PowerShell environment that they know and love.
In this post, I’ll briefly demonstrate how to use these cmdlets. The demo is based on a public BigQuery dataset, so it’s easy to follow along. A more detailed version of this demo can be found here.
BigQuery setup
To get your environment set up, go to the Cloud Tools for PowerShell GitHub repository and follow the “Installation” steps. Then set your project as the default project for PowerShell.

The public BigQuery dataset of historical New York City cab rides is an interesting playground for data analysis. Its trips table contains pickup and dropoff locations, timestamps, fare information and other assorted metadata for yellow cab rides from 2009 onward. We'll use that data, combined with BigQuery, to identify the best places for drivers who want to maximize their profit.
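Concretely, the setup can look like the following sketch. The module name comes from the PowerShell Gallery, and `MY-PROJECT-ID` is a placeholder you should replace with your own project:

```powershell
# Install the Cloud Tools for PowerShell module from the PowerShell Gallery
# (see the repository's "Installation" steps for details).
Install-Module GoogleCloud

# Authenticate, then set the default project that subsequent
# cmdlets will operate on. Replace MY-PROJECT-ID with your own.
gcloud auth login
gcloud config set project MY-PROJECT-ID
```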
The next few steps set up the data structures we need for this demo. First, make a new dataset in your project. Then create a schema and a table, which gives you a place to put the selected data.
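A sketch of those steps with the repo's `Bq*` cmdlets might look like this. The dataset, table and field names are illustrative choices for this demo, not fixed values; check the cmdlet docs for exact parameter names:

```powershell
# Create a dataset to hold the demo tables.
New-BqDataset "taxi_trip_info" -Name "Taxi Trip Info"

# Create an empty destination table in that dataset.
$table = New-BqTable "scored_trips" -DatasetId "taxi_trip_info"

# Build a schema by piping field definitions together,
# then apply it to the new table. Fields here are assumptions.
New-BqSchema "pickup_latitude" "FLOAT" |
  New-BqSchema "pickup_longitude" "FLOAT" |
  New-BqSchema "score" "FLOAT" |
  Set-BqSchema $table
```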
Running queries
Now you can start the actual data analysis. Run a query that scores each trip on a simple metric: total ride price divided by distance traveled. The query should pull a number of additional fields and save them to a destination table. This step lets you make subsequent queries against the curated subset of data with ease, reducing the amount of data processed from ~50 GB to ~15 MB per query.

Run another query on the initial results table to list just the latitude and longitude of the top 200 scoring records.
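The two-step query flow above can be sketched as follows. The SQL, table names and the `-DestinationTable` parameter are assumptions based on the cmdlet docs, so treat this as an outline rather than a definitive script:

```powershell
# Score each trip (price per mile) and save the results,
# plus the pickup coordinates, to the destination table.
$sql = @"
SELECT
  pickup_latitude, pickup_longitude,
  (total_amount / trip_distance) AS score
FROM [nyc-tlc:yellow.trips]
WHERE trip_distance > 0
"@
$dest = Get-BqTable "scored_trips" -DatasetId "taxi_trip_info"
Start-BqJob -Query $sql -DestinationTable $dest | Receive-BqJob

# Query the much smaller results table for the top 200 coordinates.
$top = "SELECT pickup_latitude, pickup_longitude
        FROM [taxi_trip_info.scored_trips]
        ORDER BY score DESC LIMIT 200"
Start-BqJob -Query $top | Receive-BqJob
```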
Export the query results as a CSV file to Google Cloud Storage and then download it. This can be done with the web UI or with the Cloud Storage PowerShell cmdlets.
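Using the cmdlets, the export-and-download step could look roughly like this. The extract parameter set and the bucket name (`my-demo-bucket`) are assumptions; consult the BigQuery and Cloud Storage cmdlet docs for the exact signatures:

```powershell
# Extract the results table to a CSV object in Cloud Storage.
$table = Get-BqTable "scored_trips" -DatasetId "taxi_trip_info"
Start-BqJob -Extract CSV $table "gs://my-demo-bucket/top_trips.csv" |
  Receive-BqJob

# Download the CSV locally with the Cloud Storage cmdlets.
Read-GcsObject -Bucket "my-demo-bucket" -ObjectName "top_trips.csv" `
  -OutFile "top_trips.csv"
```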
Mapping and analysis
Now that you have a table with just the refined coordinate data, you can plot it on a custom Google Map. Create a new map and import your locations from the CSV file. Follow the dialog box prompts on the site to import the file and select the latitude and longitude columns. When asked for a column of names, select either one, because you don't need the points to be named.

There you have it! Scroll toward New York City to focus on the center of the mass of dots. Observe the locations plotted on the map, and identify clusters and holes. The green regions are clusters of high-scoring trips, meaning short, high-value rides. Drivers should stay in these areas to get the best trips, and steer clear of the red regions, which have a much smaller projected yield.
Next steps
When you’re finished with your analysis, clean up the BigQuery dataset and the Cloud Storage bucket to avoid unnecessary charges.

I’ve just shown you how to use BigQuery within PowerShell, and there’s an entire universe of data that can be analyzed. To get started on your own analytics projects:
- Install the BigQuery cmdlets from the GitHub repo.
- Review the cmdlet docs.
- For more on BigQuery itself, see the official product page.
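If you followed along, the cleanup mentioned above might look like the sketch below. The `-Force` flags and the bucket name are assumptions based on the cmdlet docs:

```powershell
# Delete the demo dataset (and the tables inside it), then the
# staging bucket, so they stop incurring storage charges.
Remove-BqDataset "taxi_trip_info" -Force
Remove-GcsBucket "my-demo-bucket" -Force
```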