Filter, format, and transform data with gcloud, Google Cloud's command line interface

June 14, 2016
Salmaan Rashid

Google Cloud Platform

The gcloud command line tool is your gateway to managing and interacting with Google Cloud Platform. Because it's a command line tool, you're probably already thinking of piping its output through system tools like cat, sed, awk, grep, and cut to extract the information gcloud returns. In fact, gcloud itself offers a variety of options that help you avoid those commands. In this article, we describe a few options you can use to automatically parse and format the results. We’ll also show you how to chain these commands together in a bash or PowerShell script to extract the embedded data.

We’re going to demonstrate three gcloud features which you can extend and combine in a variety of ways:

  • filters to return a subset of the result
  • format to change how that data is rendered
  • projections to apply transforms or logic directly to the data returned

Format

Let's start by formatting the output of a simple command you're probably already familiar with, which lists the projects you have access to:

1. gcloud projects list

Now let’s see the raw output of this command by asking for the JSON format of the response:

2. gcloud projects list --format="json"

Seeing the raw JSON lets us select the resources we're interested in and the formats we'd like. Let's display the same response in a formatted box sorted by createTime, selecting only certain properties to display:

3. gcloud projects list --format="table[box,title='My Project List'](createTime:sort=1,name,projectNumber,projectId:label=ProjectID,parent.id:label=Parent)"

Tip: you can derive the JSON path value for a property by using the --format='flattened' flag.

Say you don't want a formatted box, just a borderless table that displays the date property in year-month-day format:

4. gcloud projects list --format="table(createTime.date('%Y-%m-%d'),name,projectNumber,projectId)"

Now let's do some more complex formatting. To see this, list out the Compute Engine zones and peek at the JSON:

5. gcloud compute zones list --format="json"

Note the selfLink. It's the fully qualified name that you'd like to parse. gcloud can help here too by giving you functions to select the JSON value and then extract and parse it. Let’s grab the last segment of the selfLink URL by using the selfLink.scope() function:

6. gcloud compute zones list --format="value(selfLink.scope())"

Alternatively, you can extract the value using .basename():

7. gcloud compute zones list --format="value(selfLink.basename())"

Suppose you want to extract part of the selfLink starting from the /projects path:

8. gcloud compute zones list --format="value(selfLink.scope(projects))"
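
To make concrete what these transforms return, here is a rough bash equivalent of the string handling, applied to a hypothetical selfLink value. The URL, project name, and variable names are assumptions for illustration only, and the parameter expansions merely approximate what .basename() and .scope(projects) do inside gcloud:

```shell
# Hypothetical selfLink, shaped like the values in the zones JSON.
self_link="https://www.googleapis.com/compute/v1/projects/my-project/zones/us-central1-a"

# Last '/'-delimited segment, comparable to selfLink.basename().
last_segment="${self_link##*/}"

# Everything after the "/projects/" segment, comparable to selfLink.scope(projects).
after_projects="${self_link#*/projects/}"

echo "$last_segment"    # us-central1-a
echo "$after_projects"  # my-project/zones/us-central1-a
```

Letting gcloud do this work keeps the parsing next to the query, so a script doesn't need its own URL-slicing logic.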

Some GCP objects have multi-valued resources and we often need to enumerate them. For example, consider listing out all scopes enabled for a given GCE instance:

9. gcloud compute instances list --format="json"

What we actually want to do here is flatten the multi-valued resources:

10. gcloud compute instances list --format="flattened(name,serviceAccounts[].email,serviceAccounts[].scopes[].basename())"

Or flatten multi-values to a separate line per value:

11. gcloud compute instances list --filter=name:instance-1 --flatten="serviceAccounts[].scopes[]" --format="csv(name,id,serviceAccounts.email,serviceAccounts.scopes.basename())"

Here is the same information in an easy-to-read, structured format:

12. gcloud compute instances list --filter=name:instance-1 --format="table[box,no-heading](name,id,serviceAccounts:format='table[box,no-heading](email,scopes:format=\"table[box,no-heading](.)\")')"

The final formatting example parses a multi-valued resource to display each service account key alongside its service account. Start with the raw output:

13. gcloud beta iam service-accounts keys list --iam-account svc-2-429@mineral-minutia-820.iam.gserviceaccount.com --project mineral-minutia-820 --format="json"

Use .scope(serviceAccounts) to extract the part of the name that follows serviceAccounts, then grab its first '/'-delimited part with segment(0):

14. gcloud beta iam service-accounts keys list --iam-account svc-2-429@mineral-minutia-820.iam.gserviceaccount.com --project mineral-minutia-820 --format="table(name.scope(serviceAccounts).segment(0):label='service Account',name.scope(keys):label='keyID',validAfterTime)"

https://storage.googleapis.com/gweb-cloudblog-publish/images/gcloud-command2ka9a.max-700x700.PNG
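
The chained transforms in that command can be pictured as plain string slicing. The sketch below mimics them in bash on a hypothetical key name; the project, account, and key ID are made up, and the expansions only approximate gcloud's own .scope() and .segment() behavior:

```shell
# Hypothetical key resource name, in the shape the raw JSON reports for "name".
key_name="projects/my-project/serviceAccounts/svc@my-project.iam.gserviceaccount.com/keys/0123abcd"

# name.scope(serviceAccounts): keep what follows "/serviceAccounts/".
after_accounts="${key_name#*/serviceAccounts/}"

# .segment(0): keep the first '/'-delimited piece of that remainder.
service_account="${after_accounts%%/*}"

# name.scope(keys): keep what follows "/keys/".
key_id="${key_name#*/keys/}"

echo "$service_account"  # svc@my-project.iam.gserviceaccount.com
echo "$key_id"           # 0123abcd
```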

Filters

Let's talk about filters. Filters allow you to select only the resources to which you want to apply formatting. For example, suppose you labeled your resources (projects, VMs, etc.) with specific values, and you want to list only those projects whose labels match those values (e.g., labels.env=test and labels.version=alpha):

15. gcloud projects list --format="json" --filter="labels.env=test AND labels.version=alpha"
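
When the label values come from script variables, it can help to assemble the filter expression in one place. The helper below is a hypothetical sketch (labels_filter is not part of gcloud); it only joins key=value pairs into the same "labels.KEY=VALUE AND ..." form used in the command above:

```shell
# Joins key=value arguments into a gcloud label filter expression.
# labels_filter is a hypothetical helper name, not a gcloud feature.
labels_filter() {
  local expr="" pair
  for pair in "$@"; do
    # Add " AND " only between terms, never before the first one.
    expr+="${expr:+ AND }labels.${pair}"
  done
  printf '%s\n' "$expr"
}

labels_filter env=test version=alpha
# prints: labels.env=test AND labels.version=alpha
```

The result can then be passed along, for example as --filter="$(labels_filter env=test version=alpha)".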

You can also apply projections on keys. In the example below, the filter is applied on the createTime key after the date formatting is set:

16. gcloud projects list --format="table(projectNumber,projectId,createTime)" --filter="createTime.date('%Y-%m-%d', Z)='2016-05-11'"

Notice the filter selected above actually references a JSON structure (labels.env=test). You can of course use that and combine it in any number of ways.

Projection transforms

Projection transforms allow you to alter the rendered value directly. We already showed several examples above (e.g., .date(), .scope(), .basename(), .segment()). One interesting capability of transforms is that you can chain them together and, with .map(), apply them to multi-valued data.

For example, the following applies a conditional projection to the parent.id key: if parent.id exists, the output is "YES"; otherwise it's "NO". This is a quick way to see which of your projects meet a specific criterion (in this case, whether the project is part of the Organization Node).

17. gcloud projects list --format="table(projectId,parent.id.yesno(yes="YES", no="NO"):label='Has Parent':sort=2)"

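As an example of .map(), the following applies .scope() to the items of each service account's scopes list:

18. gcloud compute instances list --format="flattened(name,serviceAccounts[].email,serviceAccounts[].scopes.map().scope())"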

Scripts

Finally, let's see how we can combine gcloud commands into a script that helps us easily extract embedded information. In the following example, we list all the keys associated with all your projects’ service accounts. To do this, we first enumerate all the projects, then, for each project, get all of its service accounts. Finally, for each service account, we list all the keys created against it. This is basically a nested loop:

As a bash script:
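
The nested loop described above can be sketched roughly as follows. Treat it as an illustration under assumptions: the function name list_all_keys is hypothetical, the gcloud iam service-accounts list command does not appear elsewhere in this article, and the tab-separated output is just one convenient choice:

```shell
# Prints one tab-separated line per key: project, service account, key ID.
# Assumes project IDs, emails, and key IDs contain no whitespace, so plain
# word splitting in the for loops is safe.
list_all_keys() {
  local project account key
  # Outer loop: every project ID visible to the current credentials.
  for project in $(gcloud projects list --format="value(projectId)"); do
    # Middle loop: every service account email in that project.
    for account in $(gcloud iam service-accounts list \
        --project "$project" --format="value(email)"); do
      # Inner loop: every key ID belonging to that service account.
      for key in $(gcloud beta iam service-accounts keys list \
          --iam-account "$account" --project "$project" \
          --format="value(name.scope(keys))"); do
        printf '%s\t%s\t%s\n' "$project" "$account" "$key"
      done
    done
  done
}
```

Calling list_all_keys runs the three levels in order. Using the value() format at each level means every loop receives bare identifiers and needs no extra parsing.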

Or as Windows PowerShell:

Loading...

You'll also often need to parse response fields into arrays for processing. The following example parses the service account information associated with an instance into an array for easy manipulation. Notice the serviceAccounts[].scopes field is multi-valued within the csv and delimited by a semicolon since we defined "separator=;". That is, each response line from the gcloud command below will be in the form name,id,email,scope_1;scope_2;scope_3. The script below essentially parses the response from example 12 above:
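
One way to split a line of that form is sketched below. The function name print_instance_scopes and the sample row are hypothetical, and the gcloud command that produces the rows is deliberately left out; the sketch only shows the two-stage split, first on commas and then on semicolons:

```shell
# Splits one CSV row of the form name,id,email,scope_1;scope_2;scope_3
# and prints the instance name and email, then one indented scope per line.
print_instance_scopes() {
  local line="$1" name id email scopes scope
  local -a scope_list
  # First split: the four comma-separated fields.
  IFS=',' read -r name id email scopes <<< "$line"
  # Second split: the semicolon-separated scopes into an array.
  IFS=';' read -r -a scope_list <<< "$scopes"
  echo "$name ($email)"
  for scope in "${scope_list[@]}"; do
    echo "  $scope"
  done
}

# Hypothetical sample row.
print_instance_scopes "instance-1,1234567890,svc@example.iam.gserviceaccount.com,devstorage.read_only;logging.write"
```

In a real script, each row of gcloud output would be passed to the function from a while read loop.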

Hopefully, this has given you ideas for how to effectively filter and format gcloud command output. You can apply these techniques and extend them to any gcloud response — just look at the raw response, think about what you want to do, and then format away!