REST Resource: organizations.locations.discoveryConfigs

Resource: DiscoveryConfig

Configuration for discovery to scan resources for profile generation. Only one discovery configuration may exist per organization, folder, or project.

The generated data profiles are retained according to the data retention policy.

JSON representation
{
  "name": string,
  "displayName": string,
  "orgConfig": {
    object (OrgConfig)
  },
  "inspectTemplates": [
    string
  ],
  "actions": [
    {
      object (DataProfileAction)
    }
  ],
  "targets": [
    {
      object (DiscoveryTarget)
    }
  ],
  "errors": [
    {
      object (Error)
    }
  ],
  "createTime": string,
  "updateTime": string,
  "lastRunTime": string,
  "status": enum (Status)
}
Fields
name

string

Unique resource name for the DiscoveryConfig, assigned by the service when the DiscoveryConfig is created, for example projects/dlp-test-project/locations/global/discoveryConfigs/53234423.

displayName

string

Display name (maximum 100 characters).

orgConfig

object (OrgConfig)

Only set when the parent is an organization.

inspectTemplates[]

string

Detection logic for profile generation.

Not all template features are used by Discovery. FindingLimits, includeQuote and excludeInfoTypes have no impact on Discovery.

Multiple templates may be provided if there is data in multiple regions. At most one template may be specified per region (including "global"). Each region is scanned using the applicable template. If no region-specific template is specified but a "global" template is, the global template is copied to that region and used instead. If neither a global nor a region-specific template is provided for a region with data, that region's data is not scanned.

For more information, see https://cloud.google.com/sensitive-data-protection/docs/data-profiles#data-residency.
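The region fallback described above can be sketched as a small lookup. This is an illustration only: the function name pick_template, the region-keyed dictionary, and the template resource names are hypothetical and not part of the API, which takes a flat list of template names.

```python
from typing import Optional


def pick_template(region: str, templates_by_region: dict[str, str]) -> Optional[str]:
    """Return the inspect template used for `region`, or None if it is not scanned."""
    # A region-specific template wins when one is provided.
    if region in templates_by_region:
        return templates_by_region[region]
    # Otherwise fall back to the "global" template, if any.
    return templates_by_region.get("global")


# Hypothetical template names, keyed by the region they live in.
templates = {
    "global": "projects/example-project/locations/global/inspectTemplates/default",
    "europe-west1": "projects/example-project/locations/europe-west1/inspectTemplates/eu",
}

print(pick_template("europe-west1", templates))  # region-specific template
print(pick_template("us-central1", templates))   # falls back to the global template
print(pick_template("us-central1", {}))          # None: the region is not scanned
```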

actions[]

object (DataProfileAction)

Actions to execute at the completion of scanning.

targets[]

object (DiscoveryTarget)

Target to match against for determining what to scan and how frequently.

errors[]

object (Error)

Output only. A stream of errors encountered when the config was activated. Repeated errors may result in the config automatically being paused. Returns the last 100 errors. The list is cleared whenever the config is modified.

createTime

string (Timestamp format)

Output only. The creation timestamp of a DiscoveryConfig.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".

updateTime

string (Timestamp format)

Output only. The last update timestamp of a DiscoveryConfig.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".

lastRunTime

string (Timestamp format)

Output only. The timestamp of the last time this config was executed.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".

status

enum (Status)

Required. A status for this configuration.
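When a fetched DiscoveryConfig is reused as the basis for a new request body, one conservative approach is to drop the fields marked "Output only" first. The sketch below assumes that approach; the helper name writable_fields and the sample values are hypothetical, and this page does not say whether the service rejects or simply ignores output-only fields in a request.

```python
import json

# Fields documented above as "Output only".
OUTPUT_ONLY_FIELDS = frozenset({"errors", "createTime", "updateTime", "lastRunTime"})


def writable_fields(config: dict) -> dict:
    """Return a shallow copy of `config` without the output-only fields."""
    return {key: value for key, value in config.items() if key not in OUTPUT_ONLY_FIELDS}


# Hypothetical config, shaped like the JSON representation above.
fetched = {
    "name": "projects/dlp-test-project/locations/global/discoveryConfigs/53234423",
    "displayName": "Example discovery config",
    "inspectTemplates": [
        "projects/dlp-test-project/locations/global/inspectTemplates/example"
    ],
    "targets": [{"bigQueryTarget": {"filter": {"otherTables": {}}}}],
    "errors": [],
    "createTime": "2014-10-02T15:01:23Z",
    "updateTime": "2014-10-02T15:01:23.045123456Z",
    "status": "RUNNING",
}

request_body = writable_fields(fetched)
print(json.dumps(request_body, indent=2))
```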

OrgConfig

Project and scan location information. Only set when the parent is an org.

JSON representation
{
  "location": {
    object (DiscoveryStartingLocation)
  },
  "projectId": string
}
Fields
location

object (DiscoveryStartingLocation)

The data to scan: a folder, organization, or project.

projectId

string

The project that will run the scan. The DLP service account that exists within this project must have access to all resources that are profiled, and the Cloud DLP API must be enabled.

DiscoveryStartingLocation

The location to begin a discovery scan. Denotes an organization ID or folder ID within an organization.

JSON representation
{

  // Union field location can be only one of the following:
  "organizationId": string,
  "folderId": string
  // End of list of possible types for union field location.
}
Fields
Union field location. The location to be scanned. location can be only one of the following:
organizationId

string (int64 format)

The ID of an organization to scan.

folderId

string (int64 format)

The ID of the Folder within an organization to scan.
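A client-side check for this union might look like the sketch below. The helper name parse_starting_location is hypothetical, and requiring exactly one member (rather than at most one) is an assumption made here because a starting location with neither ID set has nothing to scan.

```python
INT64_MAX = 2**63 - 1
LOCATION_FIELDS = ("organizationId", "folderId")


def parse_starting_location(location: dict) -> tuple[str, int]:
    """Return (field name, numeric ID) for a DiscoveryStartingLocation-shaped dict."""
    present = [field for field in LOCATION_FIELDS if field in location]
    # Assumption: exactly one union member must be set.
    if len(present) != 1:
        raise ValueError(f"expected exactly one of {LOCATION_FIELDS}, got {present}")
    field = present[0]
    value = location[field]
    # int64 values are carried as decimal strings in JSON.
    if not (isinstance(value, str) and value.isascii() and value.isdigit()):
        raise ValueError(f"{field} must be a string of decimal digits, got {value!r}")
    number = int(value)
    if number > INT64_MAX:
        raise ValueError(f"{field} does not fit in a signed 64-bit integer")
    return field, number


print(parse_starting_location({"folderId": "123456789012"}))
```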

DiscoveryTarget

Target used to match against for Discovery.

JSON representation
{

  // Union field target can be only one of the following:
  "bigQueryTarget": {
    object (BigQueryDiscoveryTarget)
  }
  // End of list of possible types for union field target.
}
Fields
Union field target. A target to match against for Discovery. target can be only one of the following:
bigQueryTarget

object (BigQueryDiscoveryTarget)

BigQuery target for Discovery. The first target to match a table will be the one applied.

BigQueryDiscoveryTarget

Target used to match against for discovery with BigQuery tables.

JSON representation
{
  "filter": {
    object (DiscoveryBigQueryFilter)
  },
  "conditions": {
    object (DiscoveryBigQueryConditions)
  },

  // Union field frequency can be only one of the following:
  "cadence": {
    object (DiscoveryGenerationCadence)
  },
  "disabled": {
    object (Disabled)
  }
  // End of list of possible types for union field frequency.
}
Fields
filter

object (DiscoveryBigQueryFilter)

Required. The tables the discovery cadence applies to. The first target with a matching filter will be the one to apply to a table.

conditions

object (DiscoveryBigQueryConditions)

In addition to matching the filter, these conditions must be true before a profile is generated.

Union field frequency. The generation rule includes the logic on how frequently to update the data profiles. If not specified, discovery will re-run and update no more than once a month if new columns appear in the table. frequency can be only one of the following:
cadence

object (DiscoveryGenerationCadence)

How often and when to update profiles. New tables that match both the filter and conditions are scanned as quickly as possible depending on system capacity.

disabled

object (Disabled)

Tables that match this filter will not have profiles created.
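Because frequency is a union field, a target carries either cadence or disabled, never both. The sketch below builds a DiscoveryTarget-shaped dictionary under that rule; make_bigquery_target and its parameters are hypothetical names used only for illustration.

```python
from typing import Optional


def make_bigquery_target(
    table_filter: dict,
    conditions: Optional[dict] = None,
    cadence: Optional[dict] = None,
    disabled: bool = False,
) -> dict:
    """Build a DiscoveryTarget-shaped dict with at most one `frequency` member."""
    if cadence is not None and disabled:
        raise ValueError("set either cadence or disabled, not both")
    target: dict = {"filter": table_filter}  # filter is required
    if conditions is not None:
        target["conditions"] = conditions
    if disabled:
        target["disabled"] = {}  # Disabled has no fields
    elif cadence is not None:
        target["cadence"] = cadence
    return {"bigQueryTarget": target}


# A target that switches profiling off for every table not matched elsewhere.
skip_everything_else = make_bigquery_target({"otherTables": {}}, disabled=True)
print(skip_everything_else)
```

Leaving both cadence and disabled unset is allowed by the helper, in which case the default described for the frequency union applies.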

DiscoveryBigQueryFilter

Determines what tables will have profiles generated within an organization or project. Includes the ability to filter by regular expression patterns on project ID, dataset ID, and table ID.

JSON representation
{

  // Union field filter can be only one of the following:
  "tables": {
    object (BigQueryTableCollection)
  },
  "otherTables": {
    object (AllOtherBigQueryTables)
  }
  // End of list of possible types for union field filter.
}
Fields
Union field filter. Whether the filter applies to a specific set of tables or all other tables within the location being profiled. The first filter to match will be applied, regardless of the condition. If none is set, the filter defaults to other_tables. filter can be only one of the following:
tables

object (BigQueryTableCollection)

A specific set of tables for this filter to apply to. A table collection must be specified in only one filter per config. If a table ID or dataset ID is empty, Cloud DLP assumes all tables in that collection must be profiled. Must specify a project ID.

otherTables

object (AllOtherBigQueryTables)

Catch-all. This should always be the last filter in the list because anything above it will apply first. Should only appear once in a configuration. If none is specified, a default one will be added automatically.
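Since targets are evaluated in order and the first match wins, a misplaced catch-all hides every target after it. The following sketch checks a targets list for that mistake before it is submitted; check_other_tables_placement is a hypothetical helper, not something the service provides, and it only looks at filters that set otherTables explicitly.

```python
def check_other_tables_placement(targets: list[dict]) -> list[str]:
    """Return a list of problems with where `otherTables` appears in `targets`."""
    catch_all_positions = [
        index
        for index, target in enumerate(targets)
        if "otherTables" in target.get("bigQueryTarget", {}).get("filter", {})
    ]
    problems = []
    if len(catch_all_positions) > 1:
        problems.append("otherTables appears more than once")
    # Anything listed after the first catch-all can never be the first match.
    if catch_all_positions and catch_all_positions[0] != len(targets) - 1:
        problems.append("otherTables is not the last filter")
    return problems


targets_ok = [
    {"bigQueryTarget": {"filter": {"tables": {"includeRegexes": {
        "patterns": [{"datasetIdRegex": "sales_.*"}]}}}}},
    {"bigQueryTarget": {"filter": {"otherTables": {}}}},
]
targets_bad = list(reversed(targets_ok))

print(check_other_tables_placement(targets_ok))
print(check_other_tables_placement(targets_bad))
```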

BigQueryTableCollection

Specifies a collection of BigQuery tables. Used for Discovery.

JSON representation
{

  // Union field pattern can be only one of the following:
  "includeRegexes": {
    object (BigQueryRegexes)
  }
  // End of list of possible types for union field pattern.
}
Fields
Union field pattern. Maximum of 100 entries. The first filter containing a pattern that matches a table will be used. pattern can be only one of the following:
includeRegexes

object (BigQueryRegexes)

A collection of regular expressions to match a BigQuery table against.

BigQueryRegexes

A collection of regular expressions to determine what tables to match against.

JSON representation
{
  "patterns": [
    {
      object (BigQueryRegex)
    }
  ]
}
Fields
patterns[]

object (BigQueryRegex)

A single BigQuery regular expression pattern to match against one or more tables, datasets, or projects that contain BigQuery tables.

BigQueryRegex

A pattern to match against one or more tables, datasets, or projects that contain BigQuery tables. At least one pattern must be specified. Regular expressions use RE2 syntax; a guide can be found under the google/re2 repository on GitHub.

JSON representation
{
  "projectIdRegex": string,
  "datasetIdRegex": string,
  "tableIdRegex": string
}
Fields
projectIdRegex

string

For organizations, if unset, this property matches all projects. Has no effect for data profile configurations created within a project.

datasetIdRegex

string

If unset, this property matches all datasets.

tableIdRegex

string

If unset, this property matches all tables.
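To preview which tables a set of patterns would select, the matching rules can be approximated locally. The sketch below rests on several assumptions that are not confirmed by this page: Python's re module stands in for RE2 (the syntaxes overlap but are not identical), each regex must match the whole identifier, and an empty string is treated the same as an unset field. The function names are hypothetical.

```python
import re


def regex_matches(pattern: dict, project_id: str, dataset_id: str, table_id: str) -> bool:
    """Check one BigQueryRegex-shaped dict against a table's identifiers."""
    checks = (
        ("projectIdRegex", project_id),
        ("datasetIdRegex", dataset_id),
        ("tableIdRegex", table_id),
    )
    for field, value in checks:
        expression = pattern.get(field)
        # An unset (or empty) regex matches everything.
        # Assumption: a set regex must match the full identifier.
        if expression and re.fullmatch(expression, value) is None:
            return False
    return True


def collection_matches(regexes: dict, project_id: str, dataset_id: str, table_id: str) -> bool:
    """Check a BigQueryRegexes-shaped dict: any single pattern matching is enough."""
    return any(
        regex_matches(pattern, project_id, dataset_id, table_id)
        for pattern in regexes.get("patterns", [])
    )


sales_orders = {"datasetIdRegex": "sales_.*", "tableIdRegex": "orders"}
print(regex_matches(sales_orders, "example-project", "sales_eu", "orders"))
print(regex_matches(sales_orders, "example-project", "hr", "orders"))
```

For a configuration created within a project, projectIdRegex has no effect, so a local preview for that case would leave it out of the pattern.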

AllOtherBigQueryTables

This type has no fields.

Catch-all for all other tables not specified by other filters. Should always be last, except for single-table configurations, which will only have a TableReference target.

DiscoveryBigQueryConditions

Requirements that must be true before a table is scanned in discovery for the first time. There is an AND relationship between the top-level attributes. In addition, minimum conditions with an OR relationship (such as a minimum row count or a minimum table age) can be set; at least one of them must be met before Cloud DLP scans a table.

JSON representation
{
  "createdAfter": string,
  "orConditions": {
    object (OrConditions)
  },

  // Union field included_types can be only one of the following:
  "types": {
    object (BigQueryTableTypes)
  },
  "typeCollection": enum (BigQueryTableTypeCollection)
  // End of list of possible types for union field included_types.
}
Fields
createdAfter

string (Timestamp format)

BigQuery table must have been created after this date. Used to avoid backfilling.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".

orConditions

object (OrConditions)

At least one of the conditions must be true for a table to be scanned.

Union field included_types. The type of BigQuery tables to scan. If nothing is set, the default behavior is to scan only tables of type TABLE and to give errors for all unsupported tables. included_types can be only one of the following:
types

object (BigQueryTableTypes)

Restrict discovery to specific table types.

typeCollection

enum (BigQueryTableTypeCollection)

Restrict discovery to categories of table types.

BigQueryTableTypes

The types of BigQuery tables supported by Cloud DLP.

JSON representation
{
  "types": [
    enum (BigQueryTableType)
  ]
}
Fields
types[]

enum (BigQueryTableType)

A set of BigQuery table types.

BigQueryTableType

Over time new types may be added. Currently VIEW, MATERIALIZED_VIEW, SNAPSHOT, and non-BigLake external tables are not supported.

Enums
BIG_QUERY_TABLE_TYPE_UNSPECIFIED Unused.
BIG_QUERY_TABLE_TYPE_TABLE A normal BigQuery table.
BIG_QUERY_TABLE_TYPE_EXTERNAL_BIG_LAKE A table that references data stored in Cloud Storage.

BigQueryTableTypeCollection

Over time new types may be added. Currently VIEW, MATERIALIZED_VIEW, and SNAPSHOT are not supported.

Enums
BIG_QUERY_COLLECTION_UNSPECIFIED Unused.
BIG_QUERY_COLLECTION_ALL_TYPES Automatically generate profiles for all tables, even if the table type is not yet fully supported for analysis. Profiles for unsupported tables will be generated with errors to indicate their partial support. When full support is added, the tables will automatically be profiled during the next scheduled run.
BIG_QUERY_COLLECTION_ONLY_SUPPORTED_TYPES Only those types fully supported will be profiled. Will expand automatically as Cloud DLP adds support for new table types. Unsupported table types will not have partial profiles generated.

OrConditions

There is an OR relationship between these attributes. They are used to determine if a table should be scanned or not in Discovery.

JSON representation
{
  "minRowCount": integer,
  "minAge": string
}
Fields
minRowCount

integer

Minimum number of rows that should be present before Cloud DLP profiles a table.

minAge

string (Duration format)

Minimum age a table must have before Cloud DLP can profile it. Value must be 1 hour or greater.

A duration in seconds with up to nine fractional digits, ending with 's'. Example: "3.5s".
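The AND and OR relationships in DiscoveryBigQueryConditions and OrConditions can be sketched as follows. This is a rough local model, not the service's logic: the function names are hypothetical, an unset attribute is assumed to impose no constraint, fractional seconds are truncated to microseconds, and the table-type restriction (included_types) is left out.

```python
import re
from datetime import datetime, timedelta, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an RFC 3339 UTC "Zulu" timestamp, truncating to microseconds."""
    match = re.fullmatch(r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?Z", value)
    if match is None:
        raise ValueError(f"not a Zulu timestamp: {value!r}")
    base = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    micros = int((match.group(2) or "").ljust(9, "0")[:6])
    return base + timedelta(microseconds=micros)


def parse_duration(value: str) -> timedelta:
    """Parse a Duration string such as "3.5s"."""
    if re.fullmatch(r"\d+(\.\d{1,9})?s", value) is None:
        raise ValueError(f"not a duration in seconds: {value!r}")
    return timedelta(seconds=float(value[:-1]))


def or_conditions_met(or_conditions: dict, row_count: int, age: timedelta) -> bool:
    """OR relationship: any one of the set attributes being satisfied is enough."""
    checks = []
    if "minRowCount" in or_conditions:
        checks.append(row_count >= or_conditions["minRowCount"])
    if "minAge" in or_conditions:
        checks.append(age >= parse_duration(or_conditions["minAge"]))
    # Assumption: with no attribute set there is nothing to satisfy.
    return not checks or any(checks)


def ready_for_first_scan(conditions: dict, created: datetime, row_count: int, now: datetime) -> bool:
    """AND relationship between createdAfter and orConditions."""
    created_after = conditions.get("createdAfter")
    if created_after is not None and created <= parse_timestamp(created_after):
        return False
    return or_conditions_met(conditions.get("orConditions", {}), row_count, now - created)


conditions = {
    "createdAfter": "2014-10-02T15:01:23Z",
    "orConditions": {"minRowCount": 1000, "minAge": "86400s"},
}
created = datetime(2014, 10, 3, tzinfo=timezone.utc)
# Two hours old with 10 rows: neither OR condition holds yet.
print(ready_for_first_scan(conditions, created, 10, created + timedelta(hours=2)))
# Two days old with 10 rows: the minimum age is met.
print(ready_for_first_scan(conditions, created, 10, created + timedelta(days=2)))
```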

DiscoveryGenerationCadence

What must take place for a profile to be updated and how frequently it should occur. New tables are scanned as quickly as possible depending on system capacity.

JSON representation
{
  "schemaModifiedCadence": {
    object (DiscoverySchemaModifiedCadence)
  },
  "tableModifiedCadence": {
    object (DiscoveryTableModifiedCadence)
  }
}
Fields
schemaModifiedCadence

object (DiscoverySchemaModifiedCadence)

Governs when to update data profiles when a schema is modified.

tableModifiedCadence

object (DiscoveryTableModifiedCadence)

Governs when to update data profiles when a table is modified.

DiscoverySchemaModifiedCadence

The cadence at which to update data profiles when a schema is modified.

JSON representation
{
  "types": [
    enum (BigQuerySchemaModification)
  ],
  "frequency": enum (DataProfileUpdateFrequency)
}
Fields
types[]

enum (BigQuerySchemaModification)

The type of events to consider when deciding if the table's schema has been modified and should have the profile updated. Defaults to SCHEMA_NEW_COLUMNS.

frequency

enum (DataProfileUpdateFrequency)

How frequently profiles may be updated when schemas are modified. Defaults to monthly.

BigQuerySchemaModification

Attributes evaluated to determine if a schema has been modified. New values may be added at a later time.

Enums
SCHEMA_MODIFICATION_UNSPECIFIED Unused.
SCHEMA_NEW_COLUMNS Profiles should be regenerated when new columns are added to the table. Default.
SCHEMA_REMOVED_COLUMNS Profiles should be regenerated when columns are removed from the table.

DataProfileUpdateFrequency

How frequently data profiles can be updated. New options can be added at a later time.

Enums
UPDATE_FREQUENCY_UNSPECIFIED Unspecified.
UPDATE_FREQUENCY_NEVER After the data profile is created, it will never be updated.
UPDATE_FREQUENCY_DAILY The data profile can be updated up to once every 24 hours.
UPDATE_FREQUENCY_MONTHLY The data profile can be updated up to once every 30 days. Default.
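The "up to once every" wording amounts to a minimum interval between updates. A minimal sketch of that reading, assuming the interval is measured from the time the profile was last generated (the service's actual scheduling is not described here, and may_update is a hypothetical name):

```python
from datetime import datetime, timedelta, timezone

# Minimum time between updates for each frequency that allows updates.
MIN_INTERVAL = {
    "UPDATE_FREQUENCY_DAILY": timedelta(hours=24),
    "UPDATE_FREQUENCY_MONTHLY": timedelta(days=30),
}


def may_update(frequency: str, last_profiled: datetime, now: datetime) -> bool:
    """Return True if a profile last generated at `last_profiled` may be updated at `now`."""
    if frequency == "UPDATE_FREQUENCY_NEVER":
        return False
    interval = MIN_INTERVAL.get(frequency)
    if interval is None:
        # Includes UPDATE_FREQUENCY_UNSPECIFIED: this helper does not apply defaults.
        raise ValueError(f"unsupported frequency: {frequency}")
    return now - last_profiled >= interval


last = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(may_update("UPDATE_FREQUENCY_DAILY", last, last + timedelta(hours=25)))
print(may_update("UPDATE_FREQUENCY_MONTHLY", last, last + timedelta(days=10)))
```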

DiscoveryTableModifiedCadence

The cadence at which to update data profiles when a table is modified.

JSON representation
{
  "types": [
    enum (BigQueryTableModification)
  ],
  "frequency": enum (DataProfileUpdateFrequency)
}
Fields
types[]

enum (BigQueryTableModification)

The type of events to consider when deciding if the table has been modified and should have the profile updated. Defaults to TABLE_MODIFIED_TIMESTAMP.

frequency

enum (DataProfileUpdateFrequency)

How frequently data profiles can be updated when tables are modified. Defaults to never.

BigQueryTableModification

Attributes evaluated to determine if a table has been modified. New values may be added at a later time.

Enums
TABLE_MODIFICATION_UNSPECIFIED Unused.
TABLE_MODIFIED_TIMESTAMP A table will be considered modified when the lastModifiedTime from BigQuery has been updated.
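The documented defaults for the schema and table cadences can be made explicit on the client side, which helps when comparing a submitted cadence with what the service will effectively do. The sketch below is an assumption-laden illustration: with_cadence_defaults is a hypothetical helper, a missing cadence object is treated like an empty one, and an empty types list or UPDATE_FREQUENCY_UNSPECIFIED is treated as unset.

```python
import copy

SCHEMA_DEFAULTS = {"types": ["SCHEMA_NEW_COLUMNS"], "frequency": "UPDATE_FREQUENCY_MONTHLY"}
TABLE_DEFAULTS = {"types": ["TABLE_MODIFIED_TIMESTAMP"], "frequency": "UPDATE_FREQUENCY_NEVER"}


def _is_unset(value) -> bool:
    """Assumption: these values all mean "not specified"."""
    return value in (None, [], "UPDATE_FREQUENCY_UNSPECIFIED")


def _fill(section: dict, defaults: dict) -> dict:
    """Overlay the explicitly set fields of `section` on top of `defaults`."""
    filled = copy.deepcopy(defaults)
    for key, value in section.items():
        if not _is_unset(value):
            filled[key] = copy.deepcopy(value)
    return filled


def with_cadence_defaults(cadence: dict) -> dict:
    """Return a DiscoveryGenerationCadence-shaped dict with defaults spelled out."""
    return {
        "schemaModifiedCadence": _fill(cadence.get("schemaModifiedCadence", {}), SCHEMA_DEFAULTS),
        "tableModifiedCadence": _fill(cadence.get("tableModifiedCadence", {}), TABLE_DEFAULTS),
    }


resolved = with_cadence_defaults(
    {"tableModifiedCadence": {"frequency": "UPDATE_FREQUENCY_DAILY"}}
)
print(resolved)
```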

Disabled

This type has no fields.

Do not profile the tables.

Status

Whether the discovery config is currently active. New options may be added at a later time.

Enums
STATUS_UNSPECIFIED Unused.
RUNNING The discovery config is currently active.
PAUSED The discovery config is paused temporarily.

Methods

create

Creates a config for discovery to scan and profile storage.

delete

Deletes a discovery configuration.

get

Gets a discovery configuration.

list

Lists discovery configurations.

patch

Updates a discovery configuration.