Filter, format, and transform data with gcloud, Google Cloud's command line interface
The gcloud command line tool is your gateway to managing and interacting with Google Cloud Platform. Because it is a command line tool, you're probably already thinking of using system tools like cat|sed|awk|grep|cut to extract the information gcloud returns. In fact, gcloud itself offers a variety of options that help you avoid those commands. In this article, we describe a few options you can use to automatically parse and format the results. We'll also show you how to chain these commands together in a bash or PowerShell script to extract the embedded data.
We’re going to demonstrate three gcloud features which you can extend and combine in a variety of ways:
- filters to return a subset of the result
- format to change how that data is rendered
- projections to apply transforms or logic directly to the data returned
Format
Let's start by formatting the output of a simple command you are already familiar with, which lists the projects to which you have access:
gcloud projects list
Now let's see the raw output of this command by asking for the response in JSON format:
gcloud projects list --format="json"
Seeing the raw JSON lets us select the resources we're interested in and the formats we'd like. Let's display the same response in a formatted box, sorted by createTime, and select only certain properties to display:
gcloud projects list --format="table[box,title='My Project List'](createTime:sort=1,name,projectNumber,projectId:label=ProjectID,parent.id:label=Parent)"
Tip: you can derive the JSON path value for a property by using
Say you don't want a formatted box, just a table without a border with a simple display of the date property in the format year-month-day:
gcloud projects list --format="table(createTime.date('%Y-%m-%d'),name,projectNumber,projectId)"
Now let's do some more complex formatting. To see this, list out the Compute Engine zones and peek at the JSON:
gcloud compute zones list --format="json"
Note the selfLink. It's the fully qualified name that you'd like to parse.
gcloud can help here too by giving you functions to select the JSON value and then extract and parse it. Let's grab the last segment of the selfLink URL by using the .scope() projection:
gcloud compute zones list --format="value(selfLink.scope())"
Alternatively, you can extract the value using .basename():
gcloud compute zones list --format="value(selfLink.basename())"
Suppose you want to extract part of the selfLink starting from the /projects path:
gcloud compute zones list --format="value(selfLink.scope(projects))"
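To make the difference between these transforms concrete, here is a plain-shell approximation of what they should return for one zone selfLink. This is only an illustration, not how gcloud implements them: the helper names uri_basename and uri_scope and the project ID my-project are made up for this sketch.

```shell
#!/usr/bin/env bash
# Plain-shell approximations of .basename() and .scope(NAME).
# Illustration only: gcloud's own implementation may differ in edge cases.

# Keep only the text after the last '/'.
uri_basename() {
  local uri="$1"
  printf '%s\n' "${uri##*/}"
}

# Keep only the text after the first "/NAME/" segment.
uri_scope() {
  local uri="$1" name="$2"
  printf '%s\n' "${uri#*/${name}/}"
}

# Hypothetical selfLink for a zone in a project called my-project.
link="https://www.googleapis.com/compute/v1/projects/my-project/zones/us-central1-a"

uri_basename "$link"        # prints us-central1-a
uri_scope "$link" projects  # prints my-project/zones/us-central1-a
uri_scope "$link" zones     # prints us-central1-a
```

Doing this inside the --format expression means the trimming happens before the data ever reaches the shell, so no extra text processing is needed.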
Some GCP objects have multi-valued resources and we often need to enumerate them. For example, consider listing out all scopes enabled for a given GCE instance:
gcloud compute instances list --format="json"
What we actually want to do here is flatten the multi-valued resources:
gcloud compute instances list --format="flattened(name,serviceAccounts.email,serviceAccounts.scopes.basename())"
Or flatten multi-values to a separate line per value:
gcloud compute instances list --filter=name:instance-1 --flatten="serviceAccounts.scopes" --format="csv(name,id,serviceAccounts.email,serviceAccounts.scopes.basename())"
Here is the same information in an easy-to-read, structured format:
gcloud compute instances list --filter=name:instance-1 --format="table[box,no-heading](name,id,serviceAccounts:format='table[box,no-heading](email,scopes:format=\"table[box,no-heading](.)\")')"
The final formatting example parses a multi-valued resource to display service account keys alongside their service account. Start with the following raw output:
gcloud beta iam service-accounts keys list --iam-account email@example.com --project mineral-minutia-820 --format="json"
Use .scope() to extract just the service account part, then grab the first '/'-delimited part with .segment(0):
gcloud beta iam service-accounts keys list --iam-account firstname.lastname@example.org --project mineral-minutia-820 --format="table(name.scope(serviceAccounts).segment(0):label='service Account',name.scope(keys):label='keyID',validAfterTime)"
Filters
Let's talk about filters. Filters allow you to select only the resources to which you want to apply formatting. For example, suppose you labeled your resources (projects, VMs, etc.) with specific names, and you want to list only those projects where the labels match specific values (e.g., env=test and version=alpha):
gcloud projects list --format="json" --filter="labels.env=test AND labels.version=alpha"
You can also apply projections on keys. In the example below, the filter is applied to the createTime key after the date formatting is set:
gcloud projects list --format="table(projectNumber,projectId,createTime)" --filter="createTime.date('%Y-%m-%d', Z)='2016-05-11'"
Notice that the labels filter shown earlier actually references a nested JSON structure (labels.env=test). You can reference nested keys like this and combine them in any number of ways.
Projection transforms
Projection transforms allow you to alter the rendered value directly. We already showed several examples above (e.g., .extract(), .scope(), .basename(), .segment()). One interesting capability of transforms is that you can chain them together and, with .map(), apply them to multi-valued data.
For example, the following applies a conditional projection to the parent.id key: if the parent.id key exists, the output is "YES"; otherwise it's "NO". This is a quick way to see which of your projects meet a specific criterion (in this case, whether the project is part of the Organization Node):
gcloud projects list --format="table(projectId,parent.id.yesno(yes='YES',no='NO'):label='Has Parent':sort=2)"
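Similarly, the following uses .map() to apply .scope() to each value of the multi-valued serviceAccounts.scopes key:
gcloud compute instances list --format="flattened(name,serviceAccounts.email,serviceAccounts.scopes.map().scope())"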
Scripts
Finally, let's see how we can combine gcloud commands into a script that will help us easily extract embedded information. In the following example, we list all the keys associated with all your projects' service accounts. To do this, we first enumerate all the projects, then, for each project, get all of its service accounts. Finally, for each service account, we list all the keys created against it. This is basically a nested loop to iterate over:
As a bash script:
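A minimal sketch of that nested loop follows. It assumes the beta IAM commands used earlier in this article, and that project IDs, service account emails, and key IDs contain no whitespace (the loops rely on word splitting). The function name list_all_keys is made up for this illustration.

```shell
#!/usr/bin/env bash
# Sketch only: list every key of every service account in every project
# visible to the active gcloud account.

list_all_keys() {
  local project account key
  # Outer loop: one iteration per project ID.
  for project in $(gcloud projects list --format="value(projectId)"); do
    echo "ProjectId: $project"
    # Middle loop: service accounts that belong to this project.
    for account in $(gcloud beta iam service-accounts list \
        --project "$project" --format="value(email)"); do
      echo "    -> Service account: $account"
      # Inner loop: basename() keeps only the key ID from the full key name.
      for key in $(gcloud beta iam service-accounts keys list \
          --iam-account "$account" --project "$project" \
          --format="value(name.basename())"); do
        echo "        $key"
      done
    done
  done
}

# Run the listing only when gcloud is actually installed.
if command -v gcloud >/dev/null 2>&1; then
  list_all_keys
else
  echo "gcloud not found on PATH" >&2
fi
```

Because each loop asks gcloud for a bare value() list, no grep, cut, or awk is needed to pull the IDs out of the responses.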
Or as Windows PowerShell:
You'll also often need to parse response fields into arrays for processing. The following example parses the service account information associated with an instance into an array for easy manipulation. Notice that the serviceAccounts.scopes field is multi-valued within the CSV and delimited by semicolons, since we defined "separator=;". That is, each response line from the gcloud command below will be in the form name,id,email,scope_1;scope_2;scope_3. The script below essentially parses the response from example 12 above:
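As a hedged sketch of that parsing step, the function below splits one line of the form name,id,email,scope_1;scope_2;scope_3 into bash arrays. The function name print_instance and the sample values are invented for illustration, and the gcloud format expression in the trailing comment is an assumption that has not been run here. It also assumes one service account per instance, so the email field contains no commas.

```shell
#!/usr/bin/env bash
# Sketch: parse "name,id,email,scope_1;scope_2;..." into bash arrays.

print_instance() {
  local line="$1"
  local -a fields scopes
  local scope
  # Split the CSV line on commas into name, id, email, and the scope list.
  IFS=',' read -r -a fields <<< "$line"
  # Split the fourth field on semicolons into individual scopes.
  IFS=';' read -r -a scopes <<< "${fields[3]}"
  echo "name: ${fields[0]}"
  echo "id: ${fields[1]}"
  echo "email: ${fields[2]}"
  for scope in "${scopes[@]}"; do
    echo "  scope: $scope"
  done
}

# Made-up sample line in the expected shape.
print_instance "instance-1,1234567890,svc@example.com,https://www.googleapis.com/auth/devstorage.read_only;https://www.googleapis.com/auth/logging.write"

# Possible usage against live data (assumed format expression, untested):
#   gcloud compute instances list \
#     --format="csv[no-heading](name,id,serviceAccounts[].email.list(),serviceAccounts[].scopes[].map().list(separator=;))" |
#   while IFS= read -r line; do print_instance "$line"; done
```

Once the fields are in arrays, each scope can be tested or transformed individually instead of being matched with string patterns.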
Hopefully, this has given you ideas for how to effectively filter and format gcloud command output. You can apply these techniques and extend them to any gcloud response: just look at the raw response, think about what you want to do, and then format away!