Mask data

This page explains how to mask sensitive data when you prepare data in the Wrangler workspace of the Cloud Data Fusion Studio. You can mask data in columns of any data type, except boolean and bytes types.

  1. Go to the Wrangler workspace in Cloud Data Fusion.
  2. On the Data tab, go to a column name and click the expander arrow.
  3. Select Mask, and then select an option, for example, Custom selection. The options are explained in the following sections.

The transformation is applied to the preview data shown on the Data tab of the Wrangler workspace. Wrangler adds a masking directive to the recipe. When you run the data pipeline, the transformation is applied to all values in the column.

Show the last four characters only

The Show the last 4 characters only masking option masks every character in a value except the last four. It adds the mask-number directive as a transformation step to the recipe.
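The effect of this option can be approximated outside of Wrangler with a minimal Python sketch. This is not Wrangler's implementation: the function name `show_last`, the mask character `x`, and the handling of values shorter than the kept length are all assumptions made for illustration.

```python
def show_last(value: str, keep: int = 4, mask_char: str = "x") -> str:
    """Mask every character except the last `keep` characters.

    Hypothetical helper: the mask character and the treatment of
    short values are assumptions, not documented Wrangler behavior.
    """
    if keep <= 0:
        return mask_char * len(value)
    # Number of leading characters to hide; never negative.
    hidden = max(len(value) - keep, 0)
    return mask_char * hidden + value[hidden:]


# Apply the same rule to every value in a column.
column = ["4111111111111111", "5500005555555559", "123"]
masked = [show_last(v) for v in column]
print(masked)
```

Passing `keep=2` gives the equivalent sketch for the option that shows only the last two characters.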

Show the last two characters only

The Show the last 2 characters only masking option masks every character in a value except the last two. It adds the mask-number directive as a transformation step to the recipe.

Custom selection

The Custom selection masking option lets you select the position of the characters to mask in a cell and masks similarly positioned characters in each row of the column. For example, in a cell that contains a 10-character string, selecting the first 8 characters of the string causes the first 8 characters to be masked in each row of the column.

To select specific characters to mask:

  1. Go to the Wrangler workspace in Cloud Data Fusion.
  2. On the Data tab, go to a column name and click the expander arrow.
  3. Select Mask > Custom selection. The column values that you can mask appear with a blue background.
  4. In any cell of the column, select the characters to mask.
  5. Click Apply.

The characters in the selected positions are masked in every row of the column.

Custom selection adds the mask-number directive to the recipe. When you run the data pipeline, the transformation is applied to all values in the column.
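The position-based behavior described in this section can be illustrated with a short Python sketch. It only models the idea that a selection made in one cell is applied by character position to every row. The function name `mask_span`, the mask character `x`, and the clamping of the range for shorter values are assumptions, not a description of how Wrangler implements the directive.

```python
def mask_span(value: str, start: int, end: int, mask_char: str = "x") -> str:
    """Mask the characters at positions [start, end) and keep the rest.

    Hypothetical helper: positions are zero-based, and the range is
    clamped to the length of the value (an assumption for this sketch).
    """
    end = min(end, len(value))
    start = min(max(start, 0), end)
    return value[:start] + mask_char * (end - start) + value[end:]


# Selecting the first 8 characters in one cell masks the same
# positions in every row of the column.
column = ["AB12345678", "CD87654321", "short"]
masked = [mask_span(v, 0, 8) for v in column]
print(masked)
```

In this sketch, a value shorter than the selected range is masked completely. Check the preview data in Wrangler to confirm how short values are handled in practice.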

By shuffling

The By shuffling masking option applies a random masking pattern to each field in the column. Wrangler adds the mask-shuffle directive as a transformation step to the recipe.
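As a rough illustration of random masking, the following Python sketch replaces each letter with a random letter of the same case and each digit with a random digit, and leaves other characters unchanged. This is one plausible interpretation only. The function name `mask_shuffle` is hypothetical, and the exact substitution rules that Wrangler applies might differ.

```python
import random
import string


def mask_shuffle(value: str, rng: random.Random) -> str:
    """Replace letters and digits with random ones of the same kind.

    Hypothetical sketch: preserving length, case, and punctuation is
    an assumption, not documented Wrangler behavior.
    """
    out = []
    for ch in value:
        if ch in string.digits:
            out.append(rng.choice(string.digits))
        elif ch in string.ascii_lowercase:
            out.append(rng.choice(string.ascii_lowercase))
        elif ch in string.ascii_uppercase:
            out.append(rng.choice(string.ascii_uppercase))
        else:
            # Spaces, punctuation, and other characters stay as they are.
            out.append(ch)
    return "".join(out)


rng = random.Random()  # pass a seed, for example random.Random(42), for repeatable output
column = ["150 Mars Ave.", "42 Main St."]
print([mask_shuffle(v, rng) for v in column])
```

Because each value is masked independently, identical inputs generally produce different outputs in this sketch.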
