Sanitize Column Names

If needed, you can clean the names of the columns in your dataset. When column names are sanitized:

  • alphanumeric characters and underscores (_) are permitted
  • spaces are converted to underscores
  • all other characters are removed
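
The rules above can be approximated outside the product with a short Python sketch. This is only an illustration of the three rules as listed, not the product's actual implementation. The function name sanitize_column_name is hypothetical, and the sketch assumes that "alphanumeric" means ASCII letters and digits, which the page does not specify.

```python
import re


def sanitize_column_name(name: str) -> str:
    """Approximate the documented rules (hypothetical helper, not a product API)."""
    # Rule 2: convert each space to an underscore.
    with_underscores = name.replace(" ", "_")
    # Rules 1 and 3: keep ASCII letters, digits, and underscores; remove the rest.
    # Assumption: "alphanumeric" is treated as ASCII-only here.
    return re.sub(r"[^A-Za-z0-9_]", "", with_underscores)


columns = ["Order ID", "unit-price ($)", "customer_name"]
sanitized = [sanitize_column_name(c) for c in columns]
print(sanitized)  # ['Order_ID', 'unitprice_', 'customer_name']
```

Note that two different source names can collapse to the same sanitized name under these rules (for example, "a-b" and "ab"). How the product handles such collisions is not described here.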

Although Cloud Dataprep by TRIFACTA® INC. supports a wider range of characters in column names, you may wish to sanitize your column names to simplify publishing to and importing into downstream systems.

Sanitize during Import

The above sanitization can be applied to your column names when the dataset is imported.

Tip: If you notice issues with references to your column names in your recipes, you may be able to fix them by re-importing the dataset and choosing to sanitize during import.

Steps:

  1. In the left nav bar, click the Datasets icon.
  2. Click Import Data.
  3. Select the file or table to import.
  4. Click Edit Settings.
  5. In the dialog, select Remove special characters from column names.
  6. Complete the import of the dataset.

For more information, see Import Data Page.

Sanitize via Transformation

Through the Transform Builder, you can add a step to sanitize column names in your recipe.

Transformation Name    Rename columns by removing special characters
Parameter: Option      Clean current column names

Tip: If you are sanitizing your column names for downstream systems, you should add this step at the end of your recipe.

You can perform more fine-grained column renaming operations. See Rename Columns.
