EXAMPLE - Conditional Calculations Functions

This example illustrates how you can use the following conditional calculation functions to analyze weather data:

  • AVERAGEIF - Average of a set of values by group that meet a specified condition. See AVERAGEIF Function.
  • MINIF - Minimum of a set of values by group that meet a specified condition. See MINIF Function.
  • MAXIF - Maximum of a set of values by group that meet a specified condition. See MAXIF Function.
  • VARIF - Variance of a set of values by group that meet a specified condition. See VARIF Function.
  • STDEVIF - Standard deviation of a set of values by group that meet a specified condition. See STDEVIF Function.

Source:

Here is some example weather data:

datecityrain_cmtemp_Cwind_mph
1/23/17Valleyville0.0012.86.7
1/23/17Center Town0.319.45.3
1/23/17Magic Mountain0.000.07.3
1/24/17Valleyville0.2517.23.3
1/24/17Center Town0.541.17.6
1/24/17Magic Mountain0.325.08.8
1/25/17Valleyville0.023.36.8
1/25/17Center Town0.833.35.1
1/25/17Magic Mountain0.59-1.76.4
1/26/17Valleyville1.0815.04.2
1/26/17Center Town0.966.17.6
1/26/17Magic Mountain0.77-3.93.0
1/27/17Valleyville1.007.22.8
1/27/17Center Town1.3220.00.2
1/27/17Magic Mountain0.775.65.2
1/28/17Valleyville0.12-6.15.1
1/28/17Center Town0.145.04.9
1/28/17Magic Mountain1.501.10.4
1/29/17Valleyville0.3613.37.3
1/29/17Center Town0.756.19.0
1/29/17Magic Mountain0.603.36.0


Transform:

The following computes average temperature for rainy days by city:

derive type:single value:AVERAGEIF(temp_C, rain_cm > 0) group:city as:'avgTempWRain'

The following computes maximum wind for sub-zero days by city:

derive type:single value:MAXIF(wind_mph,temp_C < 0) group:city as:'maxWindSubZero'

This step calculates the minimum temp when the wind is less than 5 mph by city:

derive type:single value:MINIF(temp_C,wind_mph<5) group:city as:'minTempWind5'

This step computes the variance in temperature for rainy days by city:

derive type:single value:VARIF(temp_C,rain_cm >0) group:city as:'varTempWRain'

The following computes the standard deviation in rainfall for Center Town:

derive type:single value:STDEVIF(rain_cm,city=='Center Town') as:'stDevRainCenterTown'

You can use the following transforms to format the generated output. Note the $col placeholder value for the multi-column transforms:

set col:stDevRainCenterTown,maxWindSubZero value:numformat($col,'##.##')

Since the following rely on data that has only one significant digit, you should format them differently:

set col:varTempWRain,avgTempWRain,minTempWind5 value:numformat($col,'##.#')

Results:

Here is some example weather data:

datecityrain_cmtemp_Cwind_mphavgTempWRainmaxWindSubZerominTempWind5varTempWRainstDevRainCenterTown
1/23/17Valleyville0.0012.86.78.35.17.263.80.37
1/23/17Center Town0.319.45.37.3 532.60.37
1/23/17Magic Mountain0.000.07.31.66.43-3.9120.37
1/24/17Valleyville0.2517.23.38.35.17.263.80.37
1/24/17Center Town0.541.17.67.3 532.60.37
1/24/17Magic Mountain0.325.08.81.66.43-3.9120.37
1/25/17Valleyville0.023.36.88.35.17.263.80.37
1/25/17Center Town0.833.35.17.3 532.60.37
1/25/17Magic Mountain0.59-1.76.41.66.43-3.9120.37
1/26/17Valleyville1.0815.04.28.35.17.263.80.37
1/26/17Center Town0.966.17.67.3 532.60.37
1/26/17Magic Mountain0.77-3.93.01.66.43-3.9120.37
1/27/17Valleyville1.007.22.88.35.17.263.80.37
1/27/17Center Town1.3220.00.27.3 532.60.37
1/27/17Magic Mountain0.775.65.21.66.43-3.9120.37
1/28/17Valleyville0.12-6.15.18.35.17.263.80.37
1/28/17Center Town0.145.04.97.3 532.60.37
1/28/17Magic Mountain1.501.10.41.66.43-3.9120.37
1/29/17Valleyville0.3613.37.38.35.17.263.80.37
1/29/17Center Town0.756.19.07.3 532.60.37
1/29/17Magic Mountain0.603.36.01.66.43-3.9120.37

Was this page helpful? Let us know how we did:

Send feedback about...

Google Cloud Dataprep Documentation
Need help? Visit our support page.