- JSON representation
- RangeExpectation
- NonNullExpectation
- SetExpectation
- RegexExpectation
- UniquenessExpectation
- StatisticRangeExpectation
- ColumnStatistic
- RowConditionExpectation
- TableConditionExpectation
- SqlAssertion
A rule captures data quality intent about a data source.
JSON representation |
---|
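{ "column": string, "ignoreNull": boolean, "dimension": string, "threshold": number, "name": string, "description": string, "suspended": boolean, // Union field rule_type can be only one of the following: "rangeExpectation": { object (RangeExpectation) }, "nonNullExpectation": { object (NonNullExpectation) }, "setExpectation": { object (SetExpectation) }, "regexExpectation": { object (RegexExpectation) }, "uniquenessExpectation": { object (UniquenessExpectation) }, "statisticRangeExpectation": { object (StatisticRangeExpectation) }, "rowConditionExpectation": { object (RowConditionExpectation) }, "tableConditionExpectation": { object (TableConditionExpectation) }, "sqlAssertion": { object (SqlAssertion) } // End of list of possible types for union field rule_type. } |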
Fields | |
---|---|
column | Optional. The unnested column which this rule is evaluated against. |
ignoreNull | Optional. Rows with null values automatically fail a rule unless ignoreNull is true, in which case such rows are trivially considered passing. This field is only valid for the following types of rules: RangeExpectation, RegexExpectation, SetExpectation, UniquenessExpectation. |
dimension | Required. The dimension a rule belongs to. Results are also aggregated at the dimension level. Supported dimensions are ["COMPLETENESS", "ACCURACY", "CONSISTENCY", "VALIDITY", "UNIQUENESS", "FRESHNESS", "VOLUME"] |
threshold | Optional. The minimum ratio of passing_rows / total_rows required to pass this rule, in the range [0.0, 1.0]. A value of 0 indicates the default (1.0). This field is only valid for row-level rules. |
name | Optional. A mutable name for the rule. |
description | Optional. Description of the rule. |
suspended | Optional. Whether the rule is active or suspended. Default is false. |
Union field rule_type. The rule-specific configuration. rule_type can be only one of the following: |
rangeExpectation | Row-level rule which evaluates whether each column value lies within a specified range. |
nonNullExpectation | Row-level rule which evaluates whether each column value is non-null. |
setExpectation | Row-level rule which evaluates whether each column value is contained in a specified set. |
regexExpectation | Row-level rule which evaluates whether each column value matches a specified regex. |
uniquenessExpectation | Row-level rule which evaluates whether each column value is unique. |
statisticRangeExpectation | Aggregate rule which evaluates whether the column aggregate statistic lies within a specified range. |
rowConditionExpectation | Row-level rule which evaluates whether each row in a table passes the specified condition. |
tableConditionExpectation | Aggregate rule which evaluates whether the provided expression is true for a table. |
sqlAssertion | Aggregate rule which evaluates the number of rows returned for the provided statement. If any rows are returned, this rule fails. |
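The constraints described above (a required dimension, a threshold in [0.0, 1.0] where 0 means the default of 1.0, and exactly one rule_type field) can be sketched as a small client-side check. This is a minimal illustration, not part of the API: `validate_rule` and `effective_threshold` are hypothetical helpers, and the union field keys such as `rangeExpectation` are assumed to be the camel-cased forms of the type names listed on this page.

```python
# Hypothetical client-side checks; not part of the API itself.
DIMENSIONS = {
    "COMPLETENESS", "ACCURACY", "CONSISTENCY", "VALIDITY",
    "UNIQUENESS", "FRESHNESS", "VOLUME",
}

# Assumed key names for the rule_type union field.
RULE_TYPE_KEYS = {
    "rangeExpectation", "nonNullExpectation", "setExpectation",
    "regexExpectation", "uniquenessExpectation", "statisticRangeExpectation",
    "rowConditionExpectation", "tableConditionExpectation", "sqlAssertion",
}


def validate_rule(rule: dict) -> list:
    """Return a list of problems found in a rule dict (empty if none)."""
    problems = []
    if rule.get("dimension") not in DIMENSIONS:
        problems.append("dimension is required and must be a supported value")
    threshold = rule.get("threshold", 0)
    if not 0.0 <= threshold <= 1.0:
        problems.append("threshold must be within [0.0, 1.0]")
    present = RULE_TYPE_KEYS.intersection(rule)
    if len(present) != 1:
        problems.append("exactly one rule_type field must be set")
    return problems


def effective_threshold(rule: dict) -> float:
    """A threshold of 0 (or an absent threshold) means the default, 1.0."""
    return rule.get("threshold", 0) or 1.0
```

For example, `validate_rule({"column": "price", "dimension": "VALIDITY", "threshold": 0.95, "rangeExpectation": {"minValue": "0"}})` returns an empty list under these assumptions.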
RangeExpectation
Evaluates whether each column value lies between a specified range.
JSON representation |
---|
{ "minValue": string, "maxValue": string, "strictMinEnabled": boolean, "strictMaxEnabled": boolean } |
Fields | |
---|---|
minValue | Optional. The minimum column value allowed for a row to pass this validation. At least one of minValue and maxValue must be provided. |
maxValue | Optional. The maximum column value allowed for a row to pass this validation. At least one of minValue and maxValue must be provided. |
strictMinEnabled | Optional. Whether each value needs to be strictly greater than ('>') the minimum, or if equality is allowed. Only relevant if a minValue has been defined. |
strictMaxEnabled | Optional. Whether each value needs to be strictly less than ('<') the maximum, or if equality is allowed. Only relevant if a maxValue has been defined. |
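The interaction between the bounds and the strict flags can be sketched as follows. This is an illustrative reading of the field descriptions, not the service's implementation: `in_range` is a hypothetical helper, it uses numeric bounds for simplicity although the JSON fields are strings, and it assumes at least one bound must be supplied.

```python
def in_range(value, min_value=None, max_value=None,
             strict_min=False, strict_max=False):
    """Return True if value satisfies the configured bounds."""
    if min_value is None and max_value is None:
        raise ValueError("at least one of min_value or max_value is required")
    if min_value is not None:
        # With strict_min, equality with the minimum fails.
        if value < min_value or (strict_min and value == min_value):
            return False
    if max_value is not None:
        # With strict_max, equality with the maximum fails.
        if value > max_value or (strict_max and value == max_value):
            return False
    return True
```

Under this sketch, `in_range(0, min_value=0, max_value=10)` passes, while the same call with `strict_min=True` fails because equality is no longer allowed.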
NonNullExpectation
This type has no fields.
Evaluates whether each column value is non-null.
SetExpectation
Evaluates whether each column value is contained by a specified set.
JSON representation |
---|
{ "values": [ string ] } |
Fields | |
---|---|
values[] | Optional. Expected values for the column value. |
RegexExpectation
Evaluates whether each column value matches a specified regex.
JSON representation |
---|
{ "regex": string } |
Fields | |
---|---|
regex | Optional. A regular expression the column value is expected to match. |
UniquenessExpectation
This type has no fields.
Evaluates whether the column has duplicates.
StatisticRangeExpectation
Evaluates whether the column aggregate statistic lies between a specified range.
JSON representation |
---|
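{ "statistic": enum (ColumnStatistic), "minValue": string, "maxValue": string, "strictMinEnabled": boolean, "strictMaxEnabled": boolean } |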
Fields | |
---|---|
statistic | Optional. The aggregate metric to evaluate. |
minValue | Optional. The minimum column statistic value allowed for a row to pass this validation. At least one of minValue and maxValue must be provided. |
maxValue | Optional. The maximum column statistic value allowed for a row to pass this validation. At least one of minValue and maxValue must be provided. |
strictMinEnabled | Optional. Whether the column statistic needs to be strictly greater than ('>') the minimum, or if equality is allowed. Only relevant if a minValue has been defined. |
strictMaxEnabled | Optional. Whether the column statistic needs to be strictly less than ('<') the maximum, or if equality is allowed. Only relevant if a maxValue has been defined. |
ColumnStatistic
The list of aggregate metrics a rule can be evaluated against.
Enums | |
---|---|
STATISTIC_UNDEFINED | Unspecified statistic type |
MEAN | Evaluate the column mean |
MIN | Evaluate the column min |
MAX | Evaluate the column max |
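A statistic range check computes one aggregate over the column and then applies the same bound logic as a range check. The sketch below is illustrative only: `statistic_in_range` and the `STATISTICS` mapping are hypothetical names, the column is a plain in-memory list, and the bounds are numeric rather than the string values used in the JSON.

```python
from statistics import mean

# Maps the ColumnStatistic enum names to Python aggregates (illustrative).
STATISTICS = {"MEAN": mean, "MIN": min, "MAX": max}


def statistic_in_range(values, statistic, min_value=None, max_value=None,
                       strict_min=False, strict_max=False):
    """Return True if the chosen column statistic satisfies the bounds."""
    stat = STATISTICS[statistic](values)
    if min_value is not None:
        if stat < min_value or (strict_min and stat == min_value):
            return False
    if max_value is not None:
        if stat > max_value or (strict_max and stat == max_value):
            return False
    return True
```

Because the rule is aggregate, the result is a single pass or fail for the column rather than a ratio of passing rows.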
RowConditionExpectation
Evaluates whether each row passes the specified condition.
The SQL expression needs to use BigQuery standard SQL syntax and should produce a boolean value per row as the result.
Example: col1 >= 0 AND col2 < 10
JSON representation |
---|
{ "sqlExpression": string } |
Fields | |
---|---|
sqlExpression | Optional. The SQL expression. |
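To illustrate how a per-row boolean expression relates to the rule's threshold, the sketch below computes the ratio of passing rows for the example condition. It is only an approximation of the idea: SQLite stands in for BigQuery, the table `t` with columns `col1` and `col2` is invented for the example, `passing_ratio` is a hypothetical helper, and at least one row is assumed.

```python
import sqlite3


def passing_ratio(rows, sql_expression):
    """Return passing_rows / total_rows for a per-row boolean expression.

    SQLite is used here as a stand-in for BigQuery; rows is a non-empty
    list of (col1, col2) tuples.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (col1 INTEGER, col2 INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    passed, total = conn.execute(
        f"SELECT SUM(CASE WHEN {sql_expression} THEN 1 ELSE 0 END), COUNT(*) "
        "FROM t"
    ).fetchone()
    conn.close()
    return passed / total


ratio = passing_ratio(
    [(1, 2), (-1, 3), (5, 20), (0, 9)],
    "col1 >= 0 AND col2 < 10",
)
```

Two of the four sample rows satisfy the condition, so the rule would pass only if its threshold were 0.5 or lower.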
TableConditionExpectation
Evaluates whether the provided expression is true.
The SQL expression needs to use BigQuery standard SQL syntax and should produce a scalar boolean result.
Example: MIN(col1) >= 0
JSON representation |
---|
{ "sqlExpression": string } |
Fields | |
---|---|
sqlExpression | Optional. The SQL expression. |
SqlAssertion
A SQL statement that is evaluated to return rows that match an invalid state. If any rows are returned, this rule fails.
The SQL statement must use BigQuery standard SQL syntax, and must not contain any semicolons.
You can use the data reference parameter ${data()} to reference the source table with all of its precondition filters applied. Examples of precondition filters include row filters, incremental data filters, and sampling. For more information, see Data reference parameter.
Example: SELECT * FROM ${data()} WHERE price < 0
JSON representation |
---|
{ "sqlStatement": string } |
Fields | |
---|---|
sqlStatement | Optional. The SQL statement. |
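The pass/fail semantics of an assertion (no semicolons allowed, and any returned row means failure) can be sketched as below. This is an assumption-laden illustration: SQLite stands in for BigQuery, `assertion_passes` is a hypothetical helper, and `${data()}` is expanded by plain string substitution with a table name, whereas the real parameter resolves to the source table with its precondition filters applied.

```python
import sqlite3


def assertion_passes(conn, sql_statement, source):
    """Return True if the assertion statement returns no rows."""
    if ";" in sql_statement:
        raise ValueError("SQL assertion must not contain semicolons")
    # Simplified stand-in for the data reference parameter.
    query = sql_statement.replace("${data()}", source)
    return conn.execute(query).fetchone() is None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (price INTEGER)")
conn.executemany("INSERT INTO items VALUES (?)", [(1,), (2,)])

ok_before = assertion_passes(
    conn, "SELECT * FROM ${data()} WHERE price < 0", "items"
)

conn.execute("INSERT INTO items VALUES (-1)")
ok_after = assertion_passes(
    conn, "SELECT * FROM ${data()} WHERE price < 0", "items"
)
```

Here `ok_before` is true because no row has a negative price, and `ok_after` is false once a row matching the invalid state exists.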