EXAMPLE - LIST Function

You have the following orders from two months. You want to list the colors of the items sold in each month, along with the total quantity sold in each month.

Source:

OrderId  Date     Item   Qty  Color
1001     1/15/15  Pants  1    red
1002     1/15/15  Shirt  2    green
1003     1/15/15  Hat    3    blue
1004     1/16/15  Shirt  4    yellow
1005     1/16/15  Hat    5    red
1006     1/20/15  Pants  6    green
1007     1/15/15  Hat    7    blue
1008     4/15/15  Shirt  8    yellow
1009     4/15/15  Shoes  9    brown
1010     4/16/15  Pants  1    red
1011     4/16/15  Hat    2    green
1012     4/16/15  Shirt  3    blue
1013     4/20/15  Shoes  4    black
1014     4/20/15  Hat    5    blue
1015     4/20/15  Pants  6    black

Transform:

To track by month, replace each value in the Date column with its month and year:

set col:Date value:DATEFORMAT(Date, 'MMM yyyy')
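
As an illustration only, the effect of this reformatting step can be sketched in Python. This is not Dataprep code: the helper name month_label and the MONTHS tuple are hypothetical, and the sketch assumes the source dates are written as M/D/YY.

```python
from datetime import datetime

# English month abbreviations, spelled out so the output does not depend
# on the system locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def month_label(date_text: str) -> str:
    """Convert an M/D/YY string such as '1/15/15' to a label like 'Jan 2015'."""
    parsed = datetime.strptime(date_text, "%m/%d/%y")
    return f"{MONTHS[parsed.month - 1]} {parsed.year}"


print(month_label("1/15/15"))
print(month_label("4/20/15"))
```

Every order placed in the same month ends up with the same label, which is what lets the later aggregation group by Date.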

You can aggregate the data in your dataset, grouped by the reformatted Date values, and apply the LIST function to the Color column. In the same aggregation, you can include a summation function for the Qty column:

pivot value: LIST(Color, 1000) SUM(Qty) group: Date limit:1

Results:

Date      list_Color                                                      sum_Qty
Jan 2015  ["green","blue","blue","red","green","red","yellow"]            28
Apr 2015  ["brown","blue","red","yellow","black","blue","black","green"]  38
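
For readers who want to check the numbers, the aggregation can be approximated in plain Python. This is a rough sketch, not Dataprep's implementation: the function name list_and_sum is hypothetical, the rows are the source orders with Date already reformatted, and the limit parameter assumes that the second argument of LIST caps how many values are collected per group. The order of the elements within each list may differ from the order Dataprep produces.

```python
from collections import defaultdict

# (month, qty, color) rows from the source table, with Date already
# reformatted to 'MMM yyyy'.
ORDERS = [
    ("Jan 2015", 1, "red"),
    ("Jan 2015", 2, "green"),
    ("Jan 2015", 3, "blue"),
    ("Jan 2015", 4, "yellow"),
    ("Jan 2015", 5, "red"),
    ("Jan 2015", 6, "green"),
    ("Jan 2015", 7, "blue"),
    ("Apr 2015", 8, "yellow"),
    ("Apr 2015", 9, "brown"),
    ("Apr 2015", 1, "red"),
    ("Apr 2015", 2, "green"),
    ("Apr 2015", 3, "blue"),
    ("Apr 2015", 4, "black"),
    ("Apr 2015", 5, "blue"),
    ("Apr 2015", 6, "black"),
]


def list_and_sum(rows, limit=1000):
    """Group rows by month; collect colors (duplicates kept) and sum quantities."""
    colors = defaultdict(list)
    totals = defaultdict(int)
    for month, qty, color in rows:
        # Assumed behavior: stop collecting once the cap is reached.
        if len(colors[month]) < limit:
            colors[month].append(color)
        totals[month] += qty
    return {month: (colors[month], totals[month]) for month in totals}


result = list_and_sum(ORDERS)
for month, (color_list, total_qty) in result.items():
    print(month, color_list, total_qty)
```

Because duplicates are kept, a color appears once for every order that included it, which is why "blue" shows up twice in each month's list.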

If needed, you can unpack the list array data using the following:

unnest col:list_Color
