Package google.cloud.datacatalog.lineage.v1

Lineage

Lineage is used to track data flows between assets over time. You can create LineageEvents to record lineage between multiple sources and a single target, for example, when table data is based on data from multiple tables.

BatchSearchLinkProcesses

rpc BatchSearchLinkProcesses(BatchSearchLinkProcessesRequest) returns (BatchSearchLinkProcessesResponse)

Retrieves information about LineageProcesses associated with specific links. LineageProcesses are transformation pipelines that result in data flowing from source to target assets. Links between assets represent this operation.

If you have specific link names, you can use this method to verify which LineageProcesses contribute to creating those links. See the SearchLinks method for more information on how to retrieve link names.

You can retrieve the LineageProcess information in every project where you have the datalineage.events.get permission. The project provided in the URL is used for billing and quota.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the parent resource:

  • datalineage.locations.searchLinks

For more information, see the IAM documentation.

CreateLineageEvent

rpc CreateLineageEvent(CreateLineageEventRequest) returns (LineageEvent)

Creates a new lineage event.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the parent resource:

  • datalineage.events.create

For more information, see the IAM documentation.

CreateProcess

rpc CreateProcess(CreateProcessRequest) returns (Process)

Creates a new process.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the parent resource:

  • datalineage.processes.create

For more information, see the IAM documentation.

CreateRun

rpc CreateRun(CreateRunRequest) returns (Run)

Creates a new run.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the parent resource:

  • datalineage.runs.create

For more information, see the IAM documentation.

DeleteLineageEvent

rpc DeleteLineageEvent(DeleteLineageEventRequest) returns (Empty)

Deletes the lineage event with the specified name.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the name resource:

  • datalineage.events.delete

For more information, see the IAM documentation.

DeleteProcess

rpc DeleteProcess(DeleteProcessRequest) returns (Operation)

Deletes the process with the specified name.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the name resource:

  • datalineage.processes.delete

For more information, see the IAM documentation.

DeleteRun

rpc DeleteRun(DeleteRunRequest) returns (Operation)

Deletes the run with the specified name.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the name resource:

  • datalineage.runs.delete

For more information, see the IAM documentation.

GetLineageEvent

rpc GetLineageEvent(GetLineageEventRequest) returns (LineageEvent)

Gets the details of the specified lineage event.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the name resource:

  • datalineage.events.get

For more information, see the IAM documentation.

GetProcess

rpc GetProcess(GetProcessRequest) returns (Process)

Gets the details of the specified process.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the name resource:

  • datalineage.processes.get

For more information, see the IAM documentation.

GetRun

rpc GetRun(GetRunRequest) returns (Run)

Gets the details of the specified run.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the name resource:

  • datalineage.runs.get

For more information, see the IAM documentation.

ListLineageEvents

rpc ListLineageEvents(ListLineageEventsRequest) returns (ListLineageEventsResponse)

Lists lineage events in the given project and location. The list order is not defined.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the parent resource:

  • datalineage.events.list

For more information, see the IAM documentation.

ListProcesses

rpc ListProcesses(ListProcessesRequest) returns (ListProcessesResponse)

Lists processes in the given project and location. List order is descending by insertion time.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the parent resource:

  • datalineage.processes.list

For more information, see the IAM documentation.

ListRuns

rpc ListRuns(ListRunsRequest) returns (ListRunsResponse)

Lists runs in the given project and location. List order is descending by start_time.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the parent resource:

  • datalineage.runs.list

For more information, see the IAM documentation.

ProcessOpenLineageRunEvent

rpc ProcessOpenLineageRunEvent(ProcessOpenLineageRunEventRequest) returns (ProcessOpenLineageRunEventResponse)

Creates new lineage events together with their parents: process and run. Updates the process and run if they already exist. Mapped from the OpenLineage specification: https://github.com/OpenLineage/OpenLineage/blob/main/spec/OpenLineage.json.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

UpdateProcess

rpc UpdateProcess(UpdateProcessRequest) returns (Process)

Updates a process.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the name resource:

  • datalineage.processes.update

For more information, see the IAM documentation.

UpdateRun

rpc UpdateRun(UpdateRunRequest) returns (Run)

Updates a run.

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

IAM Permissions

Requires the following IAM permission on the name resource:

  • datalineage.runs.update

For more information, see the IAM documentation.

BatchSearchLinkProcessesRequest

Request message for BatchSearchLinkProcesses.

Fields
parent

string

Required. The project and location where you want to search.

page_size

int32

The maximum number of processes to return in a single page of the response. A page may contain fewer results than this value.

page_token

string

The page token received from a previous BatchSearchLinkProcesses call. Use it to get the next page.

When requesting subsequent pages of a response, remember that all parameters must match the values you provided in the original request.

BatchSearchLinkProcessesResponse

Response message for BatchSearchLinkProcesses.

Fields
next_page_token

string

The token to specify as page_token in the subsequent call to get the next page. Omitted if there are no more pages in the response.

CreateLineageEventRequest

Request message for CreateLineageEvent.

Fields
parent

string

Required. The name of the run that should own the lineage event.

lineage_event

LineageEvent

Required. The lineage event to create.

request_id

string

A unique identifier for this request. Restricted to 36 ASCII characters. A random UUID is recommended. This request is idempotent only if a request_id is provided.
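
The request_id guidance above can be illustrated with a short sketch. It uses only Python's standard uuid module; the new_request_id helper, the dictionary layout, and the resource names are illustrative assumptions, not part of the API.

```python
import uuid


def new_request_id() -> str:
    """Return a random UUID in its canonical 36-character ASCII form."""
    return str(uuid.uuid4())


# Generate the ID once and reuse it on every retry of the same logical
# request, so that a retried call carries the same identifier.
request_id = new_request_id()

# Hypothetical request body mirroring the CreateLineageEventRequest fields.
create_request = {
    "parent": "projects/my-project/locations/us/processes/my-process/runs/my-run",
    "lineage_event": {"start_time": "2024-01-01T00:00:00Z"},
    "request_id": request_id,
}
```

The same pattern applies to the other request messages that carry a request_id field.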

CreateProcessRequest

Request message for CreateProcess.

Fields
parent

string

Required. The name of the project and its location that should own the process.

process

Process

Required. The process to create.

request_id

string

A unique identifier for this request. Restricted to 36 ASCII characters. A random UUID is recommended. This request is idempotent only if a request_id is provided.

CreateRunRequest

Request message for CreateRun.

Fields
parent

string

Required. The name of the process that should own the run.

run

Run

Required. The run to create.

request_id

string

A unique identifier for this request. Restricted to 36 ASCII characters. A random UUID is recommended. This request is idempotent only if a request_id is provided.

DeleteLineageEventRequest

Request message for DeleteLineageEvent.

Fields
name

string

Required. The name of the lineage event to delete.

allow_missing

bool

If set to true and the lineage event is not found, the request succeeds but the server doesn't perform any actions.

DeleteProcessRequest

Request message for DeleteProcess.

Fields
name

string

Required. The name of the process to delete.

allow_missing

bool

If set to true and the process is not found, the request succeeds but the server doesn't perform any actions.

DeleteRunRequest

Request message for DeleteRun.

Fields
name

string

Required. The name of the run to delete.

allow_missing

bool

If set to true and the run is not found, the request succeeds but the server doesn't perform any actions.

EntityReference

The soft reference to everything you can attach a lineage event to.

Fields
fully_qualified_name

string

Required. Fully Qualified Name (FQN) of the entity.

GetLineageEventRequest

Request message for GetLineageEvent.

Fields
name

string

Required. The name of the lineage event to get.

GetProcessRequest

Request message for GetProcess.

Fields
name

string

Required. The name of the process to get.

GetRunRequest

Request message for GetRun.

Fields
name

string

Required. The name of the run to get.

LineageEvent

A lineage event represents an operation on assets. Within the operation, the data flows from the source to the target defined in the links field.

Fields
name

string

Immutable. The resource name of the lineage event. Format: projects/{project}/locations/{location}/processes/{process}/runs/{run}/lineageEvents/{lineage_event}. Can be specified or auto-assigned. {lineage_event} must be no longer than 200 characters and contain only characters from the set a-zA-Z0-9_-:.
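
As a rough illustration of the name format, the following sketch builds a lineage event resource name and checks the ID on the client side. The helper name is hypothetical, and the sketch assumes that the allowed set consists of letters, digits, and the characters _ - : . (that is, that the trailing period belongs to the set) and that a caller-specified ID is non-empty.

```python
import re

# Assumption: letters, digits, and the characters _ - : . are allowed,
# with a length between 1 and 200 characters.
_ID_PATTERN = re.compile(r"[a-zA-Z0-9_\-:.]{1,200}")


def lineage_event_name(project: str, location: str, process: str,
                       run: str, lineage_event: str) -> str:
    """Build a lineage event resource name after validating the event ID."""
    if not _ID_PATTERN.fullmatch(lineage_event):
        raise ValueError(f"invalid lineage event ID: {lineage_event!r}")
    return (
        f"projects/{project}/locations/{location}"
        f"/processes/{process}/runs/{run}/lineageEvents/{lineage_event}"
    )
```

For example, lineage_event_name("p", "us", "proc", "run1", "evt:2024.01") yields a name ending in /lineageEvents/evt:2024.01, while an ID that contains a space raises ValueError.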

start_time

Timestamp

Required. The beginning of the transformation which resulted in this lineage event. For streaming scenarios, it should be the beginning of the period from which the lineage is being reported.

end_time

Timestamp

Optional. The end of the transformation which resulted in this lineage event. For streaming scenarios, it should be the end of the period from which the lineage is being reported.

ListLineageEventsRequest

Request message for ListLineageEvents.

Fields
parent

string

Required. The name of the run that owns the collection of lineage events to get.

page_size

int32

The maximum number of lineage events to return.

The service may return fewer events than this value. If unspecified, at most 50 events are returned. The maximum value is 100; values greater than 100 are cut to 100.
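
The default and the cap described above can be modeled as a small helper. This is only a client-side sketch of the documented rule; it assumes that None or 0 means "unspecified", and because the text does not say how negative values are handled, the sketch rejects them.

```python
from typing import Optional

DEFAULT_PAGE_SIZE = 50   # used when page_size is unspecified
MAX_PAGE_SIZE = 100      # larger values are cut to this limit


def effective_page_size(requested: Optional[int] = None) -> int:
    """Return the upper bound on events per page for a requested page_size."""
    if requested is None or requested == 0:
        return DEFAULT_PAGE_SIZE
    if requested < 0:
        # Assumption of this sketch: negative values are treated as an error.
        raise ValueError("page_size must not be negative")
    return min(requested, MAX_PAGE_SIZE)
```

The returned value is an upper bound only, since the service may return fewer events than requested.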

page_token

string

The page token received from a previous ListLineageEvents call. Specify it to get the next page.

When paginating, all other parameters specified in this call must match the parameters of the call that provided the page token.

ListLineageEventsResponse

Response message for ListLineageEvents.

Fields
lineage_events[]

LineageEvent

Lineage events from the specified project and location.

next_page_token

string

The token to specify as page_token in the next call to get the next page. If this field is omitted, there are no subsequent pages.

ListProcessesRequest

Request message for ListProcesses.

Fields
parent

string

Required. The name of the project and its location that own this collection of processes.

page_size

int32

The maximum number of processes to return. The service may return fewer than this value. If unspecified, at most 50 processes are returned. The maximum value is 100; values greater than 100 are cut to 100.

page_token

string

The page token received from a previous ListProcesses call. Specify it to get the next page.

When paginating, all other parameters specified in this call must match the parameters of the call that provided the page token.

ListProcessesResponse

Response message for ListProcesses.

Fields
processes[]

Process

The processes from the specified project and location.

next_page_token

string

The token to specify as page_token in the next call to get the next page. If this field is omitted, there are no subsequent pages.

ListRunsRequest

Request message for ListRuns.

Fields
parent

string

Required. The name of the process that owns this collection of runs.

page_size

int32

The maximum number of runs to return. The service may return fewer than this value. If unspecified, at most 50 runs are returned. The maximum value is 100; values greater than 100 are cut to 100.

page_token

string

The page token received from a previous ListRuns call. Specify it to get the next page.

When paginating, all other parameters specified in this call must match the parameters of the call that provided the page token.

ListRunsResponse

Response message for ListRuns.

Fields
runs[]

Run

The runs from the specified project and location.

next_page_token

string

The token to specify as page_token in the next call to get the next page. If this field is omitted, there are no subsequent pages.

OperationMetadata

Metadata describing the operation.

Fields
state

State

Output only. The current operation state.

operation_type

Type

Output only. The type of the operation being performed.

resource

string

Output only. The relative name of the resource being operated on.

resource_uuid

string

Output only. The UUID of the resource being operated on.

create_time

Timestamp

Output only. The timestamp of the operation submission to the server.

end_time

Timestamp

Output only. The timestamp of the operation termination, regardless of its success. This field is unset if the operation is still ongoing.

State

An enum with the state of the operation.

Enums
STATE_UNSPECIFIED Unused.
PENDING The operation has been created but is not yet started.
RUNNING The operation is underway.
SUCCEEDED The operation completed successfully.
FAILED The operation is no longer running and did not succeed.
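
When polling a long-running operation such as the one returned by DeleteProcess or DeleteRun, a caller typically needs to know whether a state is final. A minimal sketch follows, assuming the state arrives as its enum name; the OperationState class and is_terminal helper are illustrative and not part of any client library.

```python
from enum import Enum


class OperationState(Enum):
    """Mirror of the State enum above, keyed by enum name."""
    STATE_UNSPECIFIED = "STATE_UNSPECIFIED"
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"


# States in which the operation is no longer running.
_TERMINAL_STATES = frozenset({OperationState.SUCCEEDED, OperationState.FAILED})


def is_terminal(state: OperationState) -> bool:
    """Return True when polling can stop for an operation in this state."""
    return state in _TERMINAL_STATES
```

A polling loop would keep waiting while is_terminal returns False and then inspect whether the final state is SUCCEEDED or FAILED.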

Type

Type of the long running operation.

Enums
TYPE_UNSPECIFIED Unused.
DELETE The resource deletion operation.
CREATE The resource creation operation.

Origin

Origin of a process.

Fields
source_type

SourceType

Type of the source.

Use of a source_type other than CUSTOM for process creation or updating is highly discouraged, and may be restricted in the future without notice.

name

string

If the source_type isn't CUSTOM, the value of this field should be the Google Cloud resource name of the system that reports lineage. The project and location parts of the resource name must match the project and location of the lineage resource being created. Examples:

  • {source_type: COMPOSER, name: "projects/foo/locations/us/environments/bar"}
  • {source_type: BIGQUERY, name: "projects/foo/locations/eu"}
  • {source_type: CUSTOM, name: "myCustomIntegration"}
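
The project and location matching rule can be checked on the client before a process is created. The sketch below is an illustration under assumptions: the origin is represented as a plain dictionary, source_type is given as its enum name, and the helper name is hypothetical.

```python
import re

# Captures the project and location segments at the start of a resource name.
_PREFIX_RE = re.compile(r"^projects/([^/]+)/locations/([^/]+)")


def origin_matches_parent(origin: dict, parent: str) -> bool:
    """Check that a non-CUSTOM origin name shares the parent's project and location."""
    if origin.get("source_type") == "CUSTOM":
        return True  # CUSTOM names are free-form, so there is nothing to compare
    origin_match = _PREFIX_RE.match(origin.get("name", ""))
    parent_match = _PREFIX_RE.match(parent)
    if origin_match is None or parent_match is None:
        return False
    return origin_match.groups() == parent_match.groups()
```

With the examples above, the COMPOSER origin matches the parent projects/foo/locations/us, whereas the BIGQUERY origin in locations/eu does not.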

SourceType

Type of the source of a process.

Enums
SOURCE_TYPE_UNSPECIFIED Source is unspecified
CUSTOM A custom source
BIGQUERY BigQuery
DATA_FUSION Data Fusion
COMPOSER Composer
LOOKER_STUDIO Looker Studio
DATAPROC Dataproc

Process

A process is the definition of a data transformation operation.

Fields
name

string

Immutable. The resource name of the lineage process. Format: projects/{project}/locations/{location}/processes/{process}. Can be specified or auto-assigned. {process} must be no longer than 200 characters and contain only characters from the set a-zA-Z0-9_-:.

display_name

string

Optional. A human-readable name you can set to display in a user interface. Must be no longer than 200 characters and contain only UTF-8 letters or numbers, spaces, or characters like _-:&.

attributes

map<string, Value>

Optional. The attributes of the process. Should only be used for the purpose of non-semantic management (classifying, describing or labeling the process).

Up to 100 attributes are allowed.

origin

Origin

Optional. The origin of this process and its runs and lineage events.

ProcessLinkInfo

Link details.

Fields
start_time

Timestamp

The start of the first event establishing this link-process tuple.

end_time

Timestamp

The end of the last event establishing this link-process tuple.

ProcessOpenLineageRunEventRequest

Request message for ProcessOpenLineageRunEvent.

Fields
parent

string

Required. The name of the project and its location that should own the process, run, and lineage event.

open_lineage

Struct

Required. An OpenLineage message that follows the OpenLineage format: https://github.com/OpenLineage/OpenLineage/blob/main/spec/OpenLineage.json

request_id

string

A unique identifier for this request. Restricted to 36 ASCII characters. A random UUID is recommended. This request is idempotent only if a request_id is provided.
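
To make the open_lineage field more concrete, the sketch below assembles a minimal OpenLineage run event as a plain dictionary and wraps it in a request body. The field names follow the OpenLineage RunEvent schema; the producer URL, the schemaURL version, the dataset namespaces and names, and the helper functions are all placeholder assumptions that should be adapted to the spec version and assets you actually use.

```python
import json
import uuid
from datetime import datetime, timezone


def dataset(namespace: str, name: str) -> dict:
    """Describe an input or output dataset by namespace and name."""
    return {"namespace": namespace, "name": name}


def build_run_event(job_namespace: str, job_name: str,
                    inputs: list, outputs: list,
                    event_type: str = "COMPLETE") -> dict:
    """Assemble a minimal OpenLineage run event."""
    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "producer": "https://example.com/my-pipeline",  # placeholder
        # Placeholder version; pin this to the spec version you target.
        "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json",
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": list(inputs),
        "outputs": list(outputs),
    }


event = build_run_event(
    "my-namespace", "daily-join",
    inputs=[dataset("bigquery", "proj.ds.orders"),
            dataset("bigquery", "proj.ds.customers")],
    outputs=[dataset("bigquery", "proj.ds.orders_enriched")],
)

# Hypothetical request body mirroring ProcessOpenLineageRunEventRequest.
request = {
    "parent": "projects/my-project/locations/us",
    "open_lineage": event,
    "request_id": str(uuid.uuid4()),
}
body = json.dumps(request)
```

Because open_lineage is a Struct, any JSON object can be carried in it; whether a given event is accepted depends on the service's mapping from the OpenLineage specification.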

ProcessOpenLineageRunEventResponse

Response message for ProcessOpenLineageRunEvent.

Fields
process

string

Created process name. Format: projects/{project}/locations/{location}/processes/{process}.

run

string

Created run name. Format: projects/{project}/locations/{location}/processes/{process}/runs/{run}.

lineage_events[]

string

Created lineage event names. Format: projects/{project}/locations/{location}/processes/{process}/runs/{run}/lineageEvents/{lineage_event}.
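
The three name formats in the response nest inside one another, so a lineage event name also identifies its run and process. A possible helper for splitting such a name is sketched here; the function names are hypothetical and the parsing relies only on the format string given above.

```python
import re

_EVENT_NAME_RE = re.compile(
    r"projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/processes/(?P<process>[^/]+)/runs/(?P<run>[^/]+)"
    r"/lineageEvents/(?P<lineage_event>[^/]+)"
)


def parse_lineage_event_name(name: str) -> dict:
    """Split a lineage event resource name into its ID components."""
    match = _EVENT_NAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a lineage event name: {name!r}")
    return match.groupdict()


def run_name_of(event_name: str) -> str:
    """Return the resource name of the run that owns the lineage event."""
    parts = parse_lineage_event_name(event_name)
    return (
        f"projects/{parts['project']}/locations/{parts['location']}"
        f"/processes/{parts['process']}/runs/{parts['run']}"
    )
```

The derived run name can then be passed as the parent of ListLineageEvents or as the name of GetRun.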

Run

A lineage run represents an execution of a process that creates lineage events.

Fields
name

string

Immutable. The resource name of the run. Format: projects/{project}/locations/{location}/processes/{process}/runs/{run}. Can be specified or auto-assigned. {run} must be no longer than 200 characters and contain only characters from the set a-zA-Z0-9_-:.

display_name

string

Optional. A human-readable name you can set to display in a user interface. Must be no longer than 1024 characters and contain only UTF-8 letters or numbers, spaces, or characters like _-:&.

attributes

map<string, Value>

Optional. The attributes of the run. Should only be used for the purpose of non-semantic management (classifying, describing or labeling the run).

Up to 100 attributes are allowed.

start_time

Timestamp

Required. The timestamp of the start of the run.

end_time

Timestamp

Optional. The timestamp of the end of the run.

state

State

Required. The state of the run.

State

The current state of the run.

Enums
UNKNOWN The state is unknown. The true state may be any of the below or a different state that is not supported here explicitly.
STARTED The run is still executing.
COMPLETED The run completed.
FAILED The run failed.
ABORTED The run aborted.
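
The required and optional Run fields can be gathered by a small builder such as the one sketched below. It is an illustration only: the dictionary keys mirror the field names above, timestamps are rendered as RFC 3339 strings (the usual JSON form of Timestamp), and the build_run helper with its client-side checks is an assumption rather than part of the API.

```python
from datetime import datetime, timezone
from typing import Optional

RUN_STATES = {"UNKNOWN", "STARTED", "COMPLETED", "FAILED", "ABORTED"}


def build_run(start_time: datetime, state: str,
              end_time: Optional[datetime] = None,
              display_name: str = "") -> dict:
    """Assemble a Run body containing the required start_time and state."""
    if state not in RUN_STATES:
        raise ValueError(f"unknown run state: {state!r}")
    if start_time.tzinfo is None:
        raise ValueError("start_time must be timezone-aware")
    if len(display_name) > 1024:
        raise ValueError("display_name is longer than 1024 characters")
    run = {"start_time": start_time.isoformat(), "state": state}
    if end_time is not None:
        run["end_time"] = end_time.isoformat()
    if display_name:
        run["display_name"] = display_name
    return run


started = build_run(datetime(2024, 1, 1, tzinfo=timezone.utc), "STARTED")
```

A run would typically be created in the STARTED state and later updated with an end_time and a final state such as COMPLETED or FAILED.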

SearchLinksRequest

Request message for SearchLinks.

Fields
parent

string

Required. The project and location you want to search in.

page_size

int32

Optional. The maximum number of links to return in a single page of the response. A page may contain fewer links than this value. If unspecified, at most 10 links are returned.

Maximum value is 100; values greater than 100 are reduced to 100.

page_token

string

Optional. The page token received from a previous SearchLinks call. Use it to get the next page.

When requesting subsequent pages of a response, remember that all parameters must match the values you provided in the original request.

Union field criteria. The asset for which you want to retrieve links. criteria can be only one of the following:
source

EntityReference

Optional. Send asset information in the source field to retrieve all links that lead from the specified asset to downstream assets.

target

EntityReference

Optional. Send asset information in the target field to retrieve all links that lead from upstream assets to the specified asset.
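
Because source and target belong to the same union field, a request carries one or the other. The sketch below builds a request dictionary and enforces that choice on the client side; the helper name and the example fully qualified name are hypothetical, and requiring exactly one criterion is an assumption of this sketch.

```python
from typing import Optional


def build_search_links_request(parent: str,
                               source_fqn: Optional[str] = None,
                               target_fqn: Optional[str] = None,
                               page_size: Optional[int] = None,
                               page_token: Optional[str] = None) -> dict:
    """Build a SearchLinks request body with exactly one criteria field set."""
    if (source_fqn is None) == (target_fqn is None):
        raise ValueError("set exactly one of source_fqn or target_fqn")
    request = {"parent": parent}
    if source_fqn is not None:
        # Links that lead from this asset to downstream assets.
        request["source"] = {"fully_qualified_name": source_fqn}
    else:
        # Links that lead from upstream assets to this asset.
        request["target"] = {"fully_qualified_name": target_fqn}
    if page_size is not None:
        request["page_size"] = page_size
    if page_token:
        request["page_token"] = page_token
    return request


downstream_request = build_search_links_request(
    "projects/my-project/locations/us",
    source_fqn="bigquery:my-project.my_dataset.my_table",  # hypothetical FQN
)
```

Swapping source_fqn for target_fqn turns the same call into an upstream search.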

SearchLinksResponse

Response message for SearchLinks.

Fields
next_page_token

string

The token to specify as page_token in the subsequent call to get the next page. Omitted if there are no more pages in the response.

UpdateProcessRequest

Request message for UpdateProcess.

Fields
process

Process

Required. The lineage process to update.

The process's name field is used to identify the process to update.

update_mask

FieldMask

The list of fields to update. Currently not used. The whole message is updated.

allow_missing

bool

If set to true and the process is not found, the request inserts it.
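
Since update_mask is currently not used and the whole message is updated, a caller who wants to change one field may prefer to start from the complete current process rather than send a partial one. The following client-side sketch shows that read-modify-write preparation; the helper name and the dictionary representation are assumptions.

```python
import copy


def prepare_process_update(current: dict, changes: dict,
                           allow_missing: bool = False) -> dict:
    """Return an UpdateProcess request body with changes applied to a full copy."""
    if "name" in changes:
        raise ValueError("name is immutable and identifies the process to update")
    updated = copy.deepcopy(current)  # leave the caller's copy untouched
    for field, value in changes.items():
        updated[field] = copy.deepcopy(value)
    return {"process": updated, "allow_missing": allow_missing}


existing = {
    "name": "projects/my-project/locations/us/processes/my-process",  # hypothetical
    "display_name": "Nightly load",
    "attributes": {"team": "analytics"},
}
update_request = prepare_process_update(existing, {"display_name": "Nightly load v2"})
```

Here the attributes of the existing process travel with the request, so a full-message update does not drop them.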

UpdateRunRequest

Request message for UpdateRun.

Fields
run

Run

Required. The lineage run to update.

The run's name field is used to identify the run to update.

Format: projects/{project}/locations/{location}/processes/{process}/runs/{run}.

update_mask

FieldMask

The list of fields to update. Currently not used. The whole message is updated.

allow_missing

bool

If set to true and the run is not found, the request creates it.