DataProfileResult

DataProfileResult defines the output of DataProfileScan. Each field of the table has a field-type-specific profile result.

JSON representation
{ "rowCount": string, "profile": { object (`Profile`) }, "scannedData": { object (`ScannedData`) }, "postScanActionsResult": { object (`PostScanActionsResult`) } }

Fields
`rowCount`	`string (int64 format)` The count of rows scanned.
`profile`	`object (Profile)` The profile information per field.
`scannedData`	`object (ScannedData)` The data scanned for this result.
`postScanActionsResult`	`object (PostScanActionsResult)` Output only. The result of post scan actions.
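
Because `rowCount` is an int64 serialized as a JSON string, a client has to convert it before doing arithmetic. The following is a minimal Python sketch, assuming the result has already been retrieved as JSON text. The sample payload and the helper name `parse_row_count` are made up for illustration, and treating an absent `rowCount` as 0 is an assumption rather than documented behavior.

```python
import json

# Hypothetical payload shaped like the JSON representation above.
sample = '{"rowCount": "1500", "profile": {"fields": []}}'


def parse_row_count(result: dict) -> int:
    """Convert the int64-as-string rowCount into a Python int.

    Assumption: a missing rowCount is treated as 0.
    """
    return int(result.get("rowCount", "0"))


result = json.loads(sample)
row_count = parse_row_count(result)
print(row_count)
```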

Profile

Contains the name, type, mode, and field-type-specific profile information for each field.

JSON representation
{ "fields": [ { object (`Field`) } ] }

Fields
`fields[]`	`object (Field)` List of fields with structural and profile information for each field.

Field

A field within a table.

JSON representation
{ "name": string, "type": string, "mode": string, "profile": { object (`ProfileInfo`) } }

Fields
`name`	`string` The name of the field.
`type`	`string` The data type retrieved from the schema of the data source. For instance, for a BigQuery native table, it is the BigQuery Table Schema. For a Dataplex Entity, it is the Entity Schema.
`mode`	`string` The mode of the field. Possible values include: REQUIRED, if it is a required field. NULLABLE, if it is an optional field. REPEATED, if it is a repeated field.
`profile`	`object (ProfileInfo)` Profile information for the corresponding field.

ProfileInfo

The profile information for each field type.

JSON representation

{
  "nullRatio": number,
  "distinctRatio": number,
  "topNValues": [
    {
      object (TopNValue)
    }
  ],

  // Union field field_info can be only one of the following:
  "stringProfile": {
    object (StringFieldInfo)
  },
  "integerProfile": {
    object (IntegerFieldInfo)
  },
  "doubleProfile": {
    object (DoubleFieldInfo)
  }
  // End of list of possible types for union field field_info.
}

Fields
`nullRatio`	`number` Ratio of rows with null value against total scanned rows.
`distinctRatio`	`number` Ratio of rows with distinct values against total scanned rows. Not available for complex non-groupable field types, including RECORD, ARRAY, GEOGRAPHY, and JSON, or for fields with REPEATED mode.
`topNValues[]`	`object (TopNValue)` The list of top N non-null values, with the frequency and ratio at which they occur in the scanned data. N is 10 or the number of distinct values in the field, whichever is smaller. Not available for complex non-groupable field types, including RECORD, ARRAY, GEOGRAPHY, and JSON, or for fields with REPEATED mode.
Union field `field_info`. Structural and profile information for a specific field type. Not available if mode is REPEATED. `field_info` can be only one of the following:
`stringProfile`	`object (StringFieldInfo)` String type field information.
`integerProfile`	`object (IntegerFieldInfo)` Integer type field information.
`doubleProfile`	`object (DoubleFieldInfo)` Double type field information.
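
Since `field_info` is a union, at most one of the three profile keys is expected in a given `ProfileInfo`. The sketch below shows one way a client might pick out whichever member is present. The helper name `field_info_of` and the sample values are hypothetical and not part of the API.

```python
from typing import Optional, Tuple

# Keys of the field_info union as listed above.
FIELD_INFO_KEYS = ("stringProfile", "integerProfile", "doubleProfile")


def field_info_of(profile_info: dict) -> Optional[Tuple[str, dict]]:
    """Return (key, value) for the union member that is present, or None."""
    present = [key for key in FIELD_INFO_KEYS if key in profile_info]
    if len(present) > 1:
        raise ValueError(f"expected at most one field_info member, got {present}")
    if not present:
        return None
    return present[0], profile_info[present[0]]


# Hypothetical ProfileInfo for a string column.
example = {
    "nullRatio": 0.1,
    "distinctRatio": 0.5,
    "stringProfile": {"minLength": "1", "maxLength": "12", "averageLength": 6.4},
}

kind, info = field_info_of(example)
print(kind, int(info["maxLength"]))
```

Returning `None` when no member is present covers fields for which type-specific information is not available.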

TopNValue

Top N non-null values in the scanned data.

JSON representation
{ "value": string, "count": string, "ratio": number }

Fields
`value`	`string` String value of a top N non-null value.
`count`	`string (int64 format)` Count of the corresponding value in the scanned data.
`ratio`	`number` Ratio of the corresponding value in the field against the total number of rows in the scanned data.
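
As a worked example of how `count` and `ratio` relate, the sketch below recomputes the ratio from a value's count and the total number of scanned rows. The helper name `top_n_ratio` and the sample numbers are illustrative assumptions. It also assumes the denominator is the result's `rowCount`.

```python
def top_n_ratio(count: str, row_count: str) -> float:
    """Ratio of a value's occurrences to the total scanned rows.

    Both arguments are int64 values serialized as strings.
    """
    total = int(row_count)
    return int(count) / total if total else 0.0


# Hypothetical TopNValue entry and rowCount.
top_value = {"value": "US", "count": "300", "ratio": 0.2}
computed = top_n_ratio(top_value["count"], "1500")
print(computed)
```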

StringFieldInfo

The profile information for a string type field.

JSON representation
{ "minLength": string, "maxLength": string, "averageLength": number }

Fields
`minLength`	`string (int64 format)` Minimum length of non-null values in the scanned data.
`maxLength`	`string (int64 format)` Maximum length of non-null values in the scanned data.
`averageLength`	`number` Average length of non-null values in the scanned data.

IntegerFieldInfo

The profile information for an integer type field.

JSON representation
{ "average": number, "standardDeviation": number, "min": string, "quartiles": [ string ], "max": string }

Fields
`average`	`number` Average of non-null values in the scanned data. NaN, if the field has a NaN.
`standardDeviation`	`number` Standard deviation of non-null values in the scanned data. NaN, if the field has a NaN.
`min`	`string (int64 format)` Minimum of non-null values in the scanned data. NaN, if the field has a NaN.
`quartiles[]`	`string (int64 format)` Quartiles divide the data points into four parts of roughly equal size. The three main quartiles are: the first quartile (Q1), which splits off the lowest 25% of the data from the highest 75% (also known as the lower or 25th empirical quartile, because 25% of the data lies below this point); the second quartile (Q2), which is the median of the data set, so 50% of the data lies below this point; and the third quartile (Q3), which splits off the highest 25% of the data from the lowest 75% (also known as the upper or 75th empirical quartile, because 75% of the data lies below this point). The quartiles are provided as an ordered list of approximate quartile values for the scanned data, in the order Q1, median, Q3.
`max`	`string (int64 format)` Maximum of non-null values in the scanned data. NaN, if the field has a NaN.
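
The `quartiles` list arrives as three int64 strings in the order Q1, median, Q3. A minimal sketch of unpacking them and deriving the interquartile range follows. The helper name `integer_quartiles` and the sample values are hypothetical, and the sketch assumes all three quartiles are present.

```python
def integer_quartiles(info: dict) -> dict:
    """Parse quartiles (int64 strings) into ints and derive the IQR."""
    q1, median, q3 = (int(q) for q in info["quartiles"])
    return {"q1": q1, "median": median, "q3": q3, "iqr": q3 - q1}


# Hypothetical IntegerFieldInfo for an "age" column.
sample_info = {
    "average": 41.5,
    "standardDeviation": 12.0,
    "min": "18",
    "quartiles": ["30", "40", "52"],
    "max": "90",
}

summary = integer_quartiles(sample_info)
print(summary)
```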

DoubleFieldInfo

The profile information for a double type field.

JSON representation
{ "average": number, "standardDeviation": number, "min": number, "quartiles": [ number ], "max": number }

Fields
`average`	`number` Average of non-null values in the scanned data. NaN, if the field has a NaN.
`standardDeviation`	`number` Standard deviation of non-null values in the scanned data. NaN, if the field has a NaN.
`min`	`number` Minimum of non-null values in the scanned data. NaN, if the field has a NaN.
`quartiles[]`	`number` Quartiles divide the data points into four parts of roughly equal size. The three main quartiles are: the first quartile (Q1), which splits off the lowest 25% of the data from the highest 75% (also known as the lower or 25th empirical quartile, because 25% of the data lies below this point); the second quartile (Q2), which is the median of the data set, so 50% of the data lies below this point; and the third quartile (Q3), which splits off the highest 25% of the data from the lowest 75% (also known as the upper or 75th empirical quartile, because 75% of the data lies below this point). The quartiles are provided as an ordered list of quartile values for the scanned data, in the order Q1, median, Q3.
`max`	`number` Maximum of non-null values in the scanned data. NaN, if the field has a NaN.
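
Standard JSON has no NaN literal, so a client may see NaN statistics encoded as the string `"NaN"` rather than as a number. That is how the proto3 JSON mapping represents it, and this sketch assumes the API follows that mapping. The helper names `to_float` and `has_nan` are hypothetical.

```python
import math


def to_float(value) -> float:
    """Accept a JSON number or the strings "NaN", "Infinity", "-Infinity"."""
    return float(value)


def has_nan(info: dict) -> bool:
    """True if any scalar statistic in a DoubleFieldInfo is NaN."""
    keys = ("average", "standardDeviation", "min", "max")
    return any(math.isnan(to_float(info[k])) for k in keys if k in info)


# Hypothetical DoubleFieldInfo values: one clean, one affected by a NaN.
clean = {"average": 2.5, "standardDeviation": 1.1, "min": 0.5, "max": 4.0}
tainted = {"average": "NaN", "standardDeviation": "NaN", "min": "NaN", "max": "NaN"}

print(has_nan(clean), has_nan(tainted))
```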

PostScanActionsResult

The result of the post-scan actions of a DataProfileScan job.

JSON representation
{ "bigqueryExportResult": { object (`BigQueryExportResult`) } }

Fields
`bigqueryExportResult`	`object (BigQueryExportResult)` Output only. The result of BigQuery export post scan action.

BigQueryExportResult

The result of BigQuery export post scan action.

JSON representation
{ "state": enum (`State`), "message": string }

Fields
`state`	`enum (State)` Output only. Execution state for the BigQuery exporting.
`message`	`string` Output only. Additional information about the BigQuery exporting.

State

Execution state for the exporting.

Enums
`STATE_UNSPECIFIED`	The exporting state is unspecified.
`SUCCEEDED`	The exporting completed successfully.
`FAILED`	The exporting is no longer running due to an error.
`SKIPPED`	The exporting is skipped because there is no valid scan result to export (usually because the scan failed).
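
A client that relies on the BigQuery export can check `state` before reading the exported data. The sketch below mirrors the enum values listed above. The class name `ExportState`, the helper `export_succeeded`, and the sample payload are illustrative assumptions, as is treating a missing `state` as `STATE_UNSPECIFIED`.

```python
from enum import Enum


class ExportState(Enum):
    STATE_UNSPECIFIED = "STATE_UNSPECIFIED"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    SKIPPED = "SKIPPED"


def export_succeeded(post_scan_result: dict) -> bool:
    """True only if the BigQuery export reports SUCCEEDED."""
    export = post_scan_result.get("bigqueryExportResult", {})
    # Assumption: an absent state is treated as STATE_UNSPECIFIED.
    state = ExportState(export.get("state", "STATE_UNSPECIFIED"))
    return state is ExportState.SUCCEEDED


# Hypothetical PostScanActionsResult payload.
skipped = {"bigqueryExportResult": {"state": "SKIPPED", "message": "no valid scan result"}}
print(export_succeeded(skipped))
```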