Header Transform

Uses one row from the dataset sample as the header row for the table. Each value in this row becomes the name of the column in which it is located. This transform might be automatically applied as one of the first steps of your recipe. See Initial Parsing Steps.

Basic Usage

header sourcerownumber: 4

Output: The values from Row #4 of the original dataset are used, if available, as the names for each column. If the row is not available, the specified row data cannot be retrieved, and the transform fails.

Parameters

header sourcerownumber: row_num

| Token | Required? | Transform Builder | Data Type | Description |
|---|---|---|---|---|
| header | Y | Add header | transform | Name of the transform |
| sourcerownumber | N | Row number | integer (positive) | Row number from the original data to use as the header. If not specified, the current row #1 is used. |

For more information on syntax standards, see Language Documentation Syntax Notes.

sourcerownumber

The sourcerownumber parameter specifies which row the transform step uses as the header.

This parameter references the original row number of the sample in the dataset.

  • The sourcerownumber parameter must be an integer that is less than or equal to the total number of rows in the original sample.
  • If the corresponding row has been deleted from the dataset, the transform step generates an error.

If no sourcerownumber parameter is specified, the first row in the dataset is used for the header values.
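The rules above can be sketched in Python. This is only an illustrative model of the described behavior, not the product's implementation: the function name pick_header_row, the (source row number, values) tuple layout, and the specific exception types are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical model: each row carries its original source row number.
Row = Tuple[int, List[str]]


def pick_header_row(
    rows: List[Row],
    original_row_count: int,
    sourcerownumber: Optional[int] = None,
) -> List[str]:
    """Return the values of the row that would be used as the header."""
    if sourcerownumber is None:
        # No parameter: use the first row currently in the dataset.
        if not rows:
            raise ValueError("dataset has no rows")
        return rows[0][1]

    # The parameter must be a positive integer (bool is excluded on purpose).
    if (
        isinstance(sourcerownumber, bool)
        or not isinstance(sourcerownumber, int)
        or sourcerownumber < 1
    ):
        raise ValueError("sourcerownumber must be a positive integer")

    # It may not exceed the number of rows in the original sample.
    if sourcerownumber > original_row_count:
        raise ValueError("sourcerownumber exceeds the original row count")

    # Look the row up by its original number, not its current position.
    for number, values in rows:
        if number == sourcerownumber:
            return values

    # The row existed originally but has since been deleted.
    raise LookupError("source row %d is no longer in the dataset" % sourcerownumber)
```

In this sketch, a row that has been deleted and a row number beyond the original sample both fail, but with different exceptions so that the two cases are easy to tell apart.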

NOTE: In some cases, the source row number information might no longer be available. See SOURCEROWNUMBER Function.

Example:

header sourcerownumber: 4

Output: Uses row #4 from the source row numbers of the sample as the header for the columns.

Usage Notes:

| Required? | Data Type |
|---|---|
| No | integer (positive) |

Examples

Example - Header from row that is not the first one

Source:

You have imported the following racer data on heat times from a CSV file. When loaded in the Transformer page, it looks like the following:

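| (rowId) | column2 | column3 | column4 | column5 |
|---|---|---|---|---|
| 1 | Racer | Heat 1 | Heat 2 | Heat 3 |
| 2 | Racer X | 37.22 | 38.22 | 37.61 |
| 3 | Racer Y | 41.33 | DQ | 38.04 |
| 4 | Racer Z | 39.27 | 39.04 | 38.85 |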

In the above, the (rowId) column references the row numbers displayed in the data grid; it is not part of the dataset. This information is available when you hover over the black dot on the left side of the screen.

Transform:

You have examined the heat time columns to determine the best performance in each heat according to the sample. You then notice that the data contains headers, but you forget how it was originally sorted. The data now looks like the following:

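| (rowId) | column2 | column3 | column4 | column5 |
|---|---|---|---|---|
| 1 | Racer Y | 41.33 | DQ | 38.04 |
| 2 | Racer | Heat 1 | Heat 2 | Heat 3 |
| 3 | Racer X | 37.22 | 38.22 | 37.61 |
| 4 | Racer Z | 39.27 | 39.04 | 38.85 |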

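While you can undo your sort steps to return to the original sort order, this approach works best if you did not include other steps in between that are based on the sort order. Because sourcerownumber references the original row numbers, you can instead point the header transform at original row #1, which still holds the header values:

header sourcerownumber: 1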

Results:

After you have applied the last header transform, your data should look like the following:

| (rowId) | Racer | Heat_1 | Heat_2 | Heat_3 |
|---|---|---|---|---|
| 3 | Racer Y | 41.33 | DQ | 38.04 |
| 2 | Racer X | 37.22 | 38.22 | 37.61 |
| 4 | Racer Z | 39.27 | 39.04 | 38.85 |
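The example can be reproduced with a small Python model. This is a sketch under stated assumptions rather than the product's logic: apply_header is a hypothetical helper, rows are modeled as (original row number, values) pairs, and replacing whitespace with underscores in column names is inferred only from the Heat_1 style names in the results above.

```python
import re
from typing import List, Tuple

Row = Tuple[int, List[str]]  # (original source row number, cell values)


def apply_header(rows: List[Row], sourcerownumber: int) -> Tuple[List[str], List[Row]]:
    """Promote the row with the given original number to column names."""
    header = None
    remaining: List[Row] = []
    for number, values in rows:
        if header is None and number == sourcerownumber:
            header = values  # this row leaves the data and becomes the header
        else:
            remaining.append((number, values))
    if header is None:
        raise LookupError("source row %d not found" % sourcerownumber)
    # Assumption: whitespace in header values becomes underscores.
    names = [re.sub(r"\s+", "_", value.strip()) for value in header]
    return names, remaining


# The sample after sorting: the header row is no longer first,
# but it still carries original row number 1.
sorted_rows: List[Row] = [
    (3, ["Racer Y", "41.33", "DQ", "38.04"]),
    (1, ["Racer", "Heat 1", "Heat 2", "Heat 3"]),
    (2, ["Racer X", "37.22", "38.22", "37.61"]),
    (4, ["Racer Z", "39.27", "39.04", "38.85"]),
]

names, data = apply_header(sorted_rows, 1)
print(names)
for number, values in data:
    print(number, values)
```

The remaining rows keep both their sorted order and their original row numbers, which is why the results table lists rows 3, 2, and 4.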
