REST Resource: organizations.locations.columnDataProfiles

Resource: ColumnDataProfile

The profile for a scanned column within a table.

JSON representation
{
  "name": string,
  "profileStatus": {
    object (ProfileStatus)
  },
  "state": enum (State),
  "profileLastGenerated": string,
  "tableDataProfile": string,
  "tableFullResource": string,
  "datasetProjectId": string,
  "datasetLocation": string,
  "datasetId": string,
  "tableId": string,
  "column": string,
  "sensitivityScore": {
    object (SensitivityScore)
  },
  "dataRiskLevel": {
    object (DataRiskLevel)
  },
  "columnInfoType": {
    object (InfoTypeSummary)
  },
  "otherMatches": [
    {
      object (OtherInfoTypeSummary)
    }
  ],
  "estimatedNullPercentage": enum (NullPercentageLevel),
  "estimatedUniquenessScore": enum (UniquenessScoreLevel),
  "freeTextScore": number,
  "columnType": enum (ColumnDataType),
  "policyState": enum (ColumnPolicyState)
}
Fields
name

string

The name of the profile.

profileStatus

object (ProfileStatus)

Success or error status from the most recent profile generation attempt. May be empty if the profile is still being generated.

state

enum (State)

State of a profile.

profileLastGenerated

string (Timestamp format)

The last time the profile was generated.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".

tableDataProfile

string

The resource name of the table data profile.

tableFullResource

string

The resource name of the resource that contains this column.

datasetProjectId

string

The Google Cloud project ID that owns the profiled resource.

datasetLocation

string

The BigQuery location where the dataset's data is stored. See https://cloud.google.com/bigquery/docs/locations for supported locations.

datasetId

string

The BigQuery dataset ID.

tableId

string

The BigQuery table ID.

column

string

The name of the column.

sensitivityScore

object (SensitivityScore)

The sensitivity of this column.

dataRiskLevel

object (DataRiskLevel)

The data risk level for this column.

columnInfoType

object (InfoTypeSummary)

Set when the column can be identified as a single type. Otherwise, the column either has unidentifiable content or contains mixed types.

otherMatches[]

object (OtherInfoTypeSummary)

Other types found within this column. The list is unordered.

estimatedNullPercentage

enum (NullPercentageLevel)

Approximate percentage of null entries in the column.

estimatedUniquenessScore

enum (UniquenessScoreLevel)

Approximate uniqueness of the column.

freeTextScore

number

The likelihood that this column contains free-form text. The value ranges from 0 to 1; a value close to 1 indicates that the column is likely to contain free-form or natural language text.

columnType

enum (ColumnDataType)

The data type of a given column.

policyState

enum (ColumnPolicyState)

Indicates if a policy tag has been applied to the column.
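
The profileLastGenerated field above can carry up to nine fractional digits, which is more precision than Python's datetime type stores. The following is a minimal sketch of one way to parse such a value, assuming the string always ends in "Z" as described above and that truncating to microseconds is acceptable. The function name parse_rfc3339_zulu is hypothetical and not part of any API.

```python
from datetime import datetime, timezone


def parse_rfc3339_zulu(value: str) -> datetime:
    """Parse an RFC 3339 UTC "Zulu" timestamp into an aware datetime.

    Assumption: digits beyond microsecond precision are truncated,
    because datetime cannot represent nanoseconds.
    """
    if not value.endswith("Z"):
        raise ValueError(f"expected a 'Z'-suffixed timestamp, got {value!r}")
    body = value[:-1]
    if "." in body:
        seconds, fraction = body.split(".", 1)
        # Keep at most six fractional digits, right-padded with zeros.
        micros = int(fraction[:6].ljust(6, "0"))
    else:
        seconds, micros = body, 0
    base = datetime.strptime(seconds, "%Y-%m-%dT%H:%M:%S")
    return base.replace(microsecond=micros, tzinfo=timezone.utc)
```

For example, parse_rfc3339_zulu("2014-10-02T15:01:23.045123456Z") yields a UTC datetime whose microsecond component is 45123; the trailing "456" nanoseconds are dropped.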

State

Possible states of a profile. New items may be added.

Enums
STATE_UNSPECIFIED Unused.
RUNNING The profile is currently running. Once a profile has finished, it transitions to DONE.
DONE The profile is no longer generating. If profileStatus.status.code is 0, the profile succeeded; otherwise, it failed.
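
The rule for DONE can be expressed as a small helper. This is a sketch only: it assumes the profile has already been decoded from JSON into a Python dict, and that a missing code field should be read as 0 (JSON encodings of this style commonly omit zero-valued fields, but that is an assumption here). The function name and the returned labels are hypothetical.

```python
def profile_outcome(profile: dict) -> str:
    """Classify a decoded ColumnDataProfile dict by its generation outcome."""
    state = profile.get("state", "STATE_UNSPECIFIED")
    if state == "RUNNING":
        return "running"
    if state != "DONE":
        # STATE_UNSPECIFIED, or a state added after this sketch was written.
        return "unknown"
    # Assumption: an absent profileStatus, status, or code means code 0.
    status = profile.get("profileStatus", {}).get("status", {})
    return "succeeded" if status.get("code", 0) == 0 else "failed"
```

Because new states may be added, the helper reports anything it does not recognize as "unknown" rather than raising.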

NullPercentageLevel

Bucketized nullness percentage levels. A higher level means a higher percentage of the column is null.

Enums
NULL_PERCENTAGE_LEVEL_UNSPECIFIED Unused.
NULL_PERCENTAGE_VERY_LOW Very few null entries.
NULL_PERCENTAGE_LOW Some null entries.
NULL_PERCENTAGE_MEDIUM A moderate number of null entries.
NULL_PERCENTAGE_HIGH A lot of null entries.
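
Since the levels are ordered buckets rather than numbers, comparing them requires an explicit ranking. The sketch below assumes the order VERY_LOW < LOW < MEDIUM < HIGH, inferred from the enum names, and ranks the unspecified value (and any unrecognized value) lowest. The helper names are hypothetical.

```python
# Assumed ordering, lowest to highest, inferred from the enum names.
NULL_LEVEL_ORDER = [
    "NULL_PERCENTAGE_LEVEL_UNSPECIFIED",
    "NULL_PERCENTAGE_VERY_LOW",
    "NULL_PERCENTAGE_LOW",
    "NULL_PERCENTAGE_MEDIUM",
    "NULL_PERCENTAGE_HIGH",
]


def null_level_rank(level: str) -> int:
    """Return the position of a level in the assumed ordering (0 if unknown)."""
    try:
        return NULL_LEVEL_ORDER.index(level)
    except ValueError:
        return 0


def null_level_at_least(level: str, threshold: str) -> bool:
    """True when `level` is at or above `threshold` in the assumed ordering."""
    return null_level_rank(level) >= null_level_rank(threshold)
```

A caller could use null_level_at_least(profile["estimatedNullPercentage"], "NULL_PERCENTAGE_MEDIUM") to pick out sparsely populated columns.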

UniquenessScoreLevel

Bucketized uniqueness score levels. A higher uniqueness score is a strong signal that the column may contain a unique identifier, such as a user ID. A low value indicates that the column contains few unique values, such as booleans or other classifiers.

Enums
UNIQUENESS_SCORE_LEVEL_UNSPECIFIED Some columns do not have estimated uniqueness. Possible reasons include having too few values.
UNIQUENESS_SCORE_LOW Low uniqueness, possibly a boolean, enum, or similarly typed column.
UNIQUENESS_SCORE_MEDIUM Medium uniqueness.
UNIQUENESS_SCORE_HIGH High uniqueness, possibly a column of free text or unique identifiers.

ColumnDataType

Data types of the data in a column. Types may be added over time.

Enums
COLUMN_DATA_TYPE_UNSPECIFIED Invalid type.
TYPE_INT64 Encoded as a string in decimal format.
TYPE_BOOL Encoded as a boolean "false" or "true".
TYPE_FLOAT64 Encoded as a number, or string "NaN", "Infinity" or "-Infinity".
TYPE_STRING Encoded as a string value.
TYPE_BYTES Encoded as a base64 string per RFC 4648, section 4.
TYPE_TIMESTAMP Encoded as an RFC 3339 timestamp with mandatory "Z" time zone string: 1985-04-12T23:20:50.52Z
TYPE_DATE Encoded as RFC 3339 full-date format string: 1985-04-12
TYPE_TIME Encoded as RFC 3339 partial-time format string: 23:20:50.52
TYPE_DATETIME Encoded as RFC 3339 full-date "T" partial-time: 1985-04-12T23:20:50.52
TYPE_GEOGRAPHY Encoded as well-known text (WKT).
TYPE_NUMERIC Encoded as a decimal string.
TYPE_RECORD Container of ordered fields, each with a type and field name.
TYPE_BIGNUMERIC Decimal type.
TYPE_JSON JSON type.
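
The encodings listed above can be illustrated with a small decoder. This is a sketch under stated assumptions: the ColumnDataProfile representation shown here carries no column values, so the raw values are assumed to come from some other source that uses these encodings. Only four types are handled; everything else is returned unchanged. The function name decode_value is hypothetical.

```python
import base64


def decode_value(column_type: str, raw):
    """Decode one value according to the encoding described for its type."""
    if column_type == "TYPE_INT64":
        # Decimal string; Python ints have no 64-bit precision limit.
        return int(raw)
    if column_type == "TYPE_BOOL":
        if isinstance(raw, bool):
            return raw
        return {"true": True, "false": False}[raw]
    if column_type == "TYPE_FLOAT64":
        # float() accepts numbers as well as "NaN", "Infinity", "-Infinity".
        return float(raw)
    if column_type == "TYPE_BYTES":
        return base64.b64decode(raw, validate=True)
    # Other types (and types added later) are passed through untouched.
    return raw
```

Passing unknown types through keeps the decoder usable when new types are added over time.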

ColumnPolicyState

The possible policy states for a column.

Enums
COLUMN_POLICY_STATE_UNSPECIFIED No policy tags.
COLUMN_POLICY_TAGGED Column has policy tag applied.

Methods

get

Gets a column data profile.

list

Lists column data profiles for an organization.
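
As an illustration of how the two methods might be addressed, the sketch below builds request URLs from resource names. Everything specific in it is an assumption and should be checked against the method reference pages: the base URL, the name pattern organizations/{organization}/locations/{location}/columnDataProfiles/{profile} (inferred from the resource path in the title), and all helper names. No request is sent.

```python
import re

# Assumption: service base URL and version; confirm in the method reference.
BASE_URL = "https://dlp.googleapis.com/v2"

# Assumption: name pattern inferred from organizations.locations.columnDataProfiles.
_NAME_RE = re.compile(
    r"^organizations/(?P<organization>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/columnDataProfiles/(?P<profile>[^/]+)$"
)


def parse_profile_name(name: str) -> dict:
    """Split a profile resource name into its path components."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected profile name: {name!r}")
    return match.groupdict()


def get_url(name: str) -> str:
    """URL for the get method, given a full profile resource name."""
    parse_profile_name(name)  # validate before building the URL
    return f"{BASE_URL}/{name}"


def list_url(organization: str, location: str) -> str:
    """URL for the list method under one organization and location."""
    return (
        f"{BASE_URL}/organizations/{organization}"
        f"/locations/{location}/columnDataProfiles"
    )
```

Validating the name before building the URL catches malformed input early, for example a project-scoped name passed to an organization-scoped helper.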