EXAMPLE - Flatten and Valuestocols Transforms

This example shows how you can cross-reference columns of data using the following transforms:

  • flatten - Flatten values in an array into separate rows in the dataset. See Flatten Transform.
  • valuestocols - Extract unique instances of values into separate columns, with an indicator added to each row where the unique value is found. See Valuestocols Transform.


The following data covers magazine subscriptions for individual customers. Their subscriptions are stored in an array of values. You are interested in who is subscribing to each magazine.

Anne Aimes["Little House and Garden","Sporty Pants","Life on the Range"]
Barry Barnes["Sporty Pants","Investing for Tomorrow"]
Cindy Compton["Cakes and Pies","Powerlifting Plus","Running for Days"]
Darryl Diaz["Investing for Tomorrow","Cakes and Pies"]


When this data is loaded into the Transformer, you might need to apply a header to it. If it is in CSV format, you might need to apply some replace transforms to clean up the Subscriptions column so it looks like the above.

When the Subscriptions column contains cleanly formatted arrays, the column is re-typed as Array type. You can then apply the flatten transform:

flatten col:Subscriptions

Each CustId/Subscription combination is now written to a separate row. You can use this new data structure to break out instances of magazine subscriptions. Using the following transform, you can add the corresponding CustId value to the column:

valuestocols col:Subscriptions value:CustId

Drop the two source columns:

drop col:CustId,Subscriptions


Anne Aimes
Anne Aimes
Anne Aimes
Barry Barnes
Barry Barnes
Cindy Compton
Cindy Compton
Cindy Compton
Darryl Diaz
Darry Diaz

Send feedback about...

Google Cloud Dataprep Documentation