Class DataQualityRule (2.3.0)

DataQualityRule(mapping=None, *, ignore_unknown_fields=False, **kwargs)

A rule captures data quality intent about a data source.

This message has oneof_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
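The clearing behavior described above can be modelled in plain Python. This is a minimal sketch, not the proto-plus implementation: the rule is an ordinary dict, and `RULE_TYPE_MEMBERS` and `set_rule_type` are hypothetical names introduced only for illustration. The member names are the `rule_type` fields listed under Attributes.

```python
# Hypothetical local model of the rule_type oneof; not the library's code.
RULE_TYPE_MEMBERS = (
    "range_expectation",
    "non_null_expectation",
    "set_expectation",
    "regex_expectation",
    "uniqueness_expectation",
    "statistic_range_expectation",
    "row_condition_expectation",
    "table_condition_expectation",
    "sql_assertion",
)


def set_rule_type(rule: dict, member: str, value: dict) -> dict:
    """Set one rule_type member and clear every other member."""
    if member not in RULE_TYPE_MEMBERS:
        raise ValueError(f"{member!r} is not a member of rule_type")
    for other in RULE_TYPE_MEMBERS:
        rule.pop(other, None)  # drop any previously set member
    rule[member] = value
    return rule


rule = {"column": "price", "dimension": "VALIDITY"}
set_rule_type(rule, "non_null_expectation", {})
set_rule_type(rule, "uniqueness_expectation", {})
# Only the most recently set member remains alongside the regular fields.
print(sorted(rule))
```

Fields outside the oneof, such as `column` and `dimension`, are left untouched when a member is replaced.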

Attributes

Name Description
range_expectation google.cloud.dataplex_v1.types.DataQualityRule.RangeExpectation
Row-level rule which evaluates whether each column value lies within a specified range. This field is a member of oneof_ rule_type.
non_null_expectation google.cloud.dataplex_v1.types.DataQualityRule.NonNullExpectation
Row-level rule which evaluates whether each column value is non-null. This field is a member of oneof_ rule_type.
set_expectation google.cloud.dataplex_v1.types.DataQualityRule.SetExpectation
Row-level rule which evaluates whether each column value is contained in a specified set. This field is a member of oneof_ rule_type.
regex_expectation google.cloud.dataplex_v1.types.DataQualityRule.RegexExpectation
Row-level rule which evaluates whether each column value matches a specified regex. This field is a member of oneof_ rule_type.
uniqueness_expectation google.cloud.dataplex_v1.types.DataQualityRule.UniquenessExpectation
Row-level rule which evaluates whether each column value is unique. This field is a member of oneof_ rule_type.
statistic_range_expectation google.cloud.dataplex_v1.types.DataQualityRule.StatisticRangeExpectation
Aggregate rule which evaluates whether the column aggregate statistic lies within a specified range. This field is a member of oneof_ rule_type.
row_condition_expectation google.cloud.dataplex_v1.types.DataQualityRule.RowConditionExpectation
Row-level rule which evaluates whether each row in a table passes the specified condition. This field is a member of oneof_ rule_type.
table_condition_expectation google.cloud.dataplex_v1.types.DataQualityRule.TableConditionExpectation
Aggregate rule which evaluates whether the provided expression is true for a table. This field is a member of oneof_ rule_type.
sql_assertion google.cloud.dataplex_v1.types.DataQualityRule.SqlAssertion
Aggregate rule which evaluates the number of rows returned for the provided statement. If any rows are returned, this rule fails. This field is a member of oneof_ rule_type.
column str
Optional. The unnested column which this rule is evaluated against.
ignore_null bool
Optional. Rows with null values automatically fail a rule unless ignore_null is true, in which case such rows are trivially considered passing. This field is only valid for the following rule types: RangeExpectation, RegexExpectation, SetExpectation, and UniquenessExpectation.
dimension str
Required. The dimension a rule belongs to. Results are also aggregated at the dimension level. Supported dimensions are **["COMPLETENESS", "ACCURACY", "CONSISTENCY", "VALIDITY", "UNIQUENESS", "INTEGRITY"]**
threshold float
Optional. The minimum ratio of **passing_rows / total_rows** required to pass this rule, in the range [0.0, 1.0]. A value of 0 indicates the default (that is, 1.0). This field is only valid for row-level rules.
name str
Optional. A mutable name for the rule. The name must contain only letters (a-z, A-Z), numbers (0-9), or hyphens (-); must start with a letter; must end with a number or a letter; and must be at most 63 characters long.
description str
Optional. Description of the rule. The maximum length is 1,024 characters.
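The constraints on `name` can be checked locally before a rule is submitted. The following is a minimal sketch using only the standard library; `is_valid_rule_name` is a hypothetical helper, not part of the client library, and the service remains the authority on what it accepts.

```python
import re

# Letters, digits, and hyphens only; starts with a letter; ends with a letter
# or a digit. A single-letter name satisfies both the start and end rules.
_NAME_PATTERN = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?")
_MAX_NAME_LENGTH = 63


def is_valid_rule_name(name: str) -> bool:
    """Return True if name meets the documented constraints."""
    if len(name) > _MAX_NAME_LENGTH:
        return False
    return _NAME_PATTERN.fullmatch(name) is not None


for candidate in ("price-non-negative", "1st-rule", "trailing-", "has_underscore"):
    print(candidate, is_valid_rule_name(candidate))
```

The helper rejects an empty string because it cannot start with a letter; since `name` is optional, a caller would typically skip the check when no name is supplied.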

Classes

NonNullExpectation

NonNullExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Evaluates whether each column value is non-null.

RangeExpectation

RangeExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Evaluates whether each column value lies within a specified range.

RegexExpectation

RegexExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Evaluates whether each column value matches a specified regex.

RowConditionExpectation

RowConditionExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Evaluates whether each row passes the specified condition.

The SQL expression needs to use BigQuery standard SQL syntax and should produce a boolean value per row as the result.

Example: col1 >= 0 AND col2 < 10
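To illustrate how a per-row boolean condition combines with the `threshold` attribute, here is a minimal local sketch. It uses the standard-library `sqlite3` module as a stand-in for BigQuery (an assumption; the two SQL dialects differ), and `row_condition_passes` is a hypothetical helper rather than anything the service or client library exposes.

```python
import sqlite3


def row_condition_passes(conn, table, condition, threshold=0.0):
    """Return True if the ratio of rows satisfying condition meets threshold."""
    # A threshold of 0 means "use the default", which is 1.0.
    effective = threshold if threshold > 0 else 1.0
    # The table name and condition are interpolated as-is: this sketch
    # trusts its inputs and is not safe for untrusted strings.
    total, passing = conn.execute(
        f"SELECT COUNT(*), COALESCE(SUM(CASE WHEN {condition} THEN 1 ELSE 0 END), 0) "
        f"FROM {table}"
    ).fetchone()
    if total == 0:
        return True  # assumption of this sketch: no rows means nothing failed
    return passing / total >= effective


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (col1 INTEGER, col2 INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(1, 5), (0, 9), (-1, 3), (4, 12)],
)
condition = "col1 >= 0 AND col2 < 10"
# Two of the four rows satisfy the condition, so the passing ratio is 0.5.
half_ok = row_condition_passes(conn, "readings", condition, threshold=0.5)
default_ok = row_condition_passes(conn, "readings", condition)
print(half_ok, default_ok)
```

With a threshold of 0.5 the sample data passes; with the default of 1.0 it does not, because two rows violate the condition.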

SetExpectation

SetExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Evaluates whether each column value is contained in a specified set.

SqlAssertion

SqlAssertion(mapping=None, *, ignore_unknown_fields=False, **kwargs)

A SQL statement that is evaluated to return rows that match an invalid state. If any rows are returned, this rule fails.

The SQL statement must use BigQuery standard SQL syntax, and must not contain any semicolons.

You can use the data reference parameter ${data()} to reference the source table with all of its precondition filters applied. Examples of precondition filters include row filters, incremental data filters, and sampling. For more information, see `Data reference parameter <https://cloud.google.com/dataplex/docs/auto-data-quality-overview#data-reference-parameter>`__.

Example: SELECT * FROM ${data()} WHERE price < 0
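The pass/fail semantics of a SQL assertion can be sketched locally. This example is an approximation under stated assumptions: `sqlite3` stands in for BigQuery, `${data()}` is replaced with a plain table name (the real service also applies precondition filters), and `sql_assertion_passes` is a hypothetical helper.

```python
import sqlite3

DATA_REF = "${data()}"


def sql_assertion_passes(conn, statement, source_table):
    """Return True if the statement returns no rows, False otherwise."""
    if ";" in statement:
        raise ValueError("statement must not contain semicolons")
    # Stand-in for the service's substitution of the data reference parameter.
    query = statement.replace(DATA_REF, source_table)
    # Fetch at most one row: a single returned row is enough to fail the rule.
    return conn.execute(query).fetchone() is None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (price REAL)")
conn.executemany("INSERT INTO orders VALUES (?)", [(10.0,), (0.0,), (-5.0,)])

statement = "SELECT * FROM ${data()} WHERE price < 0"
before = sql_assertion_passes(conn, statement, "orders")
conn.execute("DELETE FROM orders WHERE price < 0")
after = sql_assertion_passes(conn, statement, "orders")
print(before, after)
```

The assertion fails while a negative price exists and passes once the offending row is removed.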

StatisticRangeExpectation

StatisticRangeExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Evaluates whether the column aggregate statistic lies within a specified range.

TableConditionExpectation

TableConditionExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Evaluates whether the provided expression is true.

The SQL expression needs to use BigQuery standard SQL syntax and should produce a scalar boolean result.

Example: MIN(col1) >= 0
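A table condition differs from a row condition in that the expression yields one boolean for the whole table. The sketch below shows this with `sqlite3` as an assumed stand-in for BigQuery; `table_condition_passes` is a hypothetical helper, and treating a NULL result as a failure is a choice made by this sketch, not documented behavior.

```python
import sqlite3


def table_condition_passes(conn, table, expression):
    """Evaluate an aggregate expression to a single boolean for the table."""
    # An aggregate query without GROUP BY always returns exactly one row.
    (result,) = conn.execute(f"SELECT {expression} FROM {table}").fetchone()
    # SQLite returns 1, 0, or NULL; NULL (None) is treated as a failure here.
    return bool(result)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (col1 INTEGER)")
conn.executemany("INSERT INTO readings VALUES (?)", [(0,), (3,), (7,)])

ok = table_condition_passes(conn, "readings", "MIN(col1) >= 0")
conn.execute("INSERT INTO readings VALUES (-2)")
not_ok = table_condition_passes(conn, "readings", "MIN(col1) >= 0")
print(ok, not_ok)
```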

UniquenessExpectation

UniquenessExpectation(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Evaluates whether the column has duplicates.