# VAR Function

Computes the variance among all values in a column. The input column can be of Integer or Decimal data type. If no numeric values are detected in the input column, the function returns `0`.

The variance of a set of values measures the spread of the values around the mean. A variance of zero means that all values are the same, and a small variance means that the values are closely bunched together. A high variance indicates that the numbers are spread out widely. Variance is never negative.

$$\mathrm{Var}(X) = \frac{\sum \left(X - \mathrm{mean}(X)\right)^2}{\mathrm{Count}(X)}$$

If a row contains a missing or null value, it is not factored into the calculation.
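As a rough illustration of the formula and the null handling described above, the following Python sketch computes a population variance while skipping missing values. The function name `population_variance` is hypothetical and is not part of Wrangle; it only mirrors the documented behavior under those assumptions.

```python
def population_variance(values):
    """Population variance of the numeric entries in `values`.

    None entries are skipped, and 0 is returned when no numeric
    values remain, mirroring the behavior described for VAR.
    """
    # Keep only numeric entries; bool is excluded even though it subclasses int.
    numbers = [
        v for v in values
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if not numbers:
        return 0
    mean = sum(numbers) / len(numbers)
    # Sum of squared deviations from the mean, divided by the count.
    return sum((x - mean) ** 2 for x in numbers) / len(numbers)


print(population_variance([2, 4, None, 4, 6]))
print(population_variance([None, None]))
```

In the first call the `None` entry is ignored, so the calculation runs over `2, 4, 4, 6` only.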

Relevant terms:

| Term | Description |
| --- | --- |
| Population | Population statistical functions are computed from all possible values. See https://en.wikipedia.org/wiki/Statistical_population. |
| Sample | Sample-based statistical functions are computed from a subset or sample of all values. See https://en.wikipedia.org/wiki/Sampling_(statistics). These function names include `SAMP` in their name. |

NOTE: Statistical sampling has no relationship to the samples taken within the product. When statistical functions are computed during job execution, they are applied across the entire dataset. Sample method calculations are computed at that time.
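To make the population and sample distinction concrete, here is a small Python sketch. It uses Python's standard `statistics` module, which is an assumption of this illustration and is unrelated to Wrangle. The population form divides by the count of values, as in the formula for this function, while the sample form divides by one less than the count.

```python
import statistics

values = [2, 4, 4, 6]

# Population variance: sum of squared deviations divided by n.
population = statistics.pvariance(values)

# Sample variance: sum of squared deviations divided by n - 1.
sample = statistics.variance(values)

print(population)
print(sample)
```

For the same input, the sample variance is always at least as large as the population variance, because the divisor is smaller.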

The square root of variance is standard deviation, which is used to measure variance under the assumption of a bell curve distribution. See STDEV Function.

For a version of this function computed over a rolling window of rows, see ROLLINGVAR Function.

Wrangle vs. SQL: This function is part of Wrangle, a proprietary data transformation language. Wrangle is not SQL. For more information, see Wrangle Language.

## Basic Usage

`var(myRating)`

Output: Returns the variance of the group of values from the `myRating` column.

## Syntax and Arguments

`var(function_col_ref) [group:group_col_ref] [limit:limit_count]`

| Argument | Required? | Data Type | Description |
| --- | --- | --- | --- |
| function_col_ref | Y | string | Name of column to which to apply the function |

For more information on the `group` and `limit` parameters, see Pivot Transform.
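The effect of grouping can be pictured as computing the variance separately for each distinct value of the grouping column. The Python sketch below illustrates that idea only. The `(group, value)` rows and all names in it are hypothetical, it does not model the `limit` parameter, and it is not a description of how the product implements grouping.

```python
from collections import defaultdict
import statistics

# Hypothetical rows of (group, value); the names are illustrative only.
rows = [("A", 84), ("A", 71), ("B", 76), ("B", 87), ("B", 85)]

# Collect the values belonging to each group.
by_group = defaultdict(list)
for group, value in rows:
    by_group[group].append(value)

# Population variance per group, matching the divide-by-count formula.
group_variance = {g: statistics.pvariance(v) for g, v in by_group.items()}
print(group_variance)
```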

### function_col_ref

Name of the column whose values you want to use to calculate the variance. The column must contain Integer or Decimal values.

• Literal values are not supported as inputs.
• Multiple columns and wildcards are not supported.

Usage Notes:

| Required? | Data Type | Example Value |
| --- | --- | --- |
| Yes | String (column reference) | `myValues` |

## Examples

Tip: For additional examples, see How-to Guides.

This example illustrates how you can apply statistical functions to your dataset. Calculations include average (mean), max, min, standard deviation, and variance.

Source:

Students took a test and recorded the following scores. You want to perform some statistical analysis on them:

| Student | Score |
| --- | --- |
| Anna | 84 |
| Ben | 71 |
| Caleb | 76 |
| Danielle | 87 |
| Evan | 85 |
| Faith | 92 |
| Gabe | 85 |
| Hannah | 99 |
| Ian | 73 |
| Jane | 68 |

Transformation:

You can use the following transformations to calculate the average (mean), minimum, and maximum scores:

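| Transformation Name | `New formula` |
| --- | --- |
| Parameter: Formula type | `Single row formula` |
| Parameter: Formula | `AVERAGE(Score)` |
| Parameter: New column name | `'avgScore'` |

| Transformation Name | `New formula` |
| --- | --- |
| Parameter: Formula type | `Single row formula` |
| Parameter: Formula | `MIN(Score)` |
| Parameter: New column name | `'minScore'` |

| Transformation Name | `New formula` |
| --- | --- |
| Parameter: Formula type | `Single row formula` |
| Parameter: Formula | `MAX(Score)` |
| Parameter: New column name | `'maxScore'` |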

To apply statistical functions to your data, you can use the `VAR` and `STDEV` functions, which can be used as the basis for other statistical calculations.

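| Transformation Name | `New formula` |
| --- | --- |
| Parameter: Formula type | `Single row formula` |
| Parameter: Formula | `VAR(Score)` |
| Parameter: New column name | `'var_Score'` |

| Transformation Name | `New formula` |
| --- | --- |
| Parameter: Formula type | `Single row formula` |
| Parameter: Formula | `STDEV(Score)` |
| Parameter: New column name | `'stdev_Score'` |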
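As a cross-check outside the product, the variance and standard deviation of the example scores can be reproduced with a short Python sketch. The use of Python's `statistics` and `math` modules is an assumption of this illustration. Python's exact arithmetic yields `87` for the variance, whereas the Results on this page show `87.00000000000001`, presumably a floating-point rounding artifact.

```python
import math
import statistics

scores = [84, 71, 76, 87, 85, 92, 85, 99, 73, 68]

# Population variance of the scores (divide by the number of scores).
var_score = statistics.pvariance(scores)

# Standard deviation is the square root of the variance.
stdev_score = math.sqrt(var_score)

print(var_score)
print(round(stdev_score, 2))
```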

For each score, you can now calculate how many standard deviations it lies from the average, using the following:

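| Transformation Name | `New formula` |
| --- | --- |
| Parameter: Formula type | `Single row formula` |
| Parameter: Formula | `((Score - avgScore) / stdev_Score)` |
| Parameter: New column name | `'stDevs'` |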
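The same per-row calculation can be sketched in Python to check the `stDevs` values. The dictionary layout and variable names are assumptions made for this illustration; the scores are the ones from the Source table.

```python
import statistics

scores = {
    "Anna": 84, "Ben": 71, "Caleb": 76, "Danielle": 87, "Evan": 85,
    "Faith": 92, "Gabe": 85, "Hannah": 99, "Ian": 73, "Jane": 68,
}

values = list(scores.values())
avg_score = statistics.mean(values)       # average (mean) of all scores
stdev_score = statistics.pstdev(values)   # population standard deviation

# Number of standard deviations each score lies from the average.
st_devs = {name: (score - avg_score) / stdev_score
           for name, score in scores.items()}

for name, value in st_devs.items():
    print(f"{name}: {value:.2f}")
```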

Now, you want to apply grades based on a formula:

| Grade | Condition |
| --- | --- |
| A | stDevs > 1 |
| B | stDevs > 0.5 |
| C | -1 <= stDevs <= 0.5 |
| D | stDevs < -1 |
| F | stDevs < -2 |

You can build the following transformation using the `IF` function to calculate grades.

| Transformation Name | `New formula` |
| --- | --- |
| Parameter: Formula type | `Single row formula` |
| Parameter: Formula | `IF((stDevs > 1),'A',IF((stDevs < -2),'F',IF((stDevs < -1),'D',IF((stDevs > 0.5),'B','C'))))` |
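Because the grade conditions overlap (for example, a value below -2 is also below -1), the order in which the nested `IF` tests them matters. The Python sketch below mirrors that order. The function name `grade` is hypothetical and exists only for this illustration.

```python
def grade(st_devs):
    """Return a letter grade, testing conditions in the same order as the nested IF."""
    if st_devs > 1:
        return "A"
    if st_devs < -2:   # checked before the D test, since it is the stricter condition
        return "F"
    if st_devs < -1:
        return "D"
    if st_devs > 0.5:
        return "B"
    return "C"         # everything from -1 through 0.5


for value in (1.07, 0.54, 0.21, -1.18, -2.5):
    print(value, grade(value))
```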

To clean up the content, you might want to apply some formatting to the score columns. The following reformats the `stdev_Score` and `stDevs` columns to display two decimal places:

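| Transformation Name | `Edit column with formula` |
| --- | --- |
| Parameter: Columns | `stdev_Score` |
| Parameter: Formula | `NUMFORMAT(stdev_Score, '##.00')` |

| Transformation Name | `Edit column with formula` |
| --- | --- |
| Parameter: Columns | `stDevs` |
| Parameter: Formula | `NUMFORMAT(stDevs, '##.00')` |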

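You can also calculate the most frequently occurring score with the `MODE` function:

| Transformation Name | `New formula` |
| --- | --- |
| Parameter: Formula type | `Single row formula` |
| Parameter: Formula | `MODE(Score)` |
| Parameter: New column name | `'modeScore'` |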

Results:

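| Student | Score | modeScore | avgScore | minScore | maxScore | var_Score | stdev_Score | stDevs | Grade |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anna | 84 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | 0.21 | C |
| Ben | 71 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | -1.18 | D |
| Caleb | 76 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | -0.64 | C |
| Danielle | 87 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | 0.54 | B |
| Evan | 85 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | 0.32 | C |
| Faith | 92 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | 1.07 | A |
| Gabe | 85 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | 0.32 | C |
| Hannah | 99 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | 1.82 | A |
| Ian | 73 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | -0.96 | C |
| Jane | 68 | 85 | 82 | 68 | 99 | 87.00000000000001 | 9.33 | -1.50 | D |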
