Connect to a third-party Git repository

This document shows you how to link a Dataform repository to a remote Git repository.

After you link the repositories, you can push changes from a Dataform development workspace to the remote Git repository and pull changes from it.

You can link a Dataform repository to a repository hosted by the following Git providers:

  • GitHub

  • GitLab

Before you begin

If you haven't done so already, create a Dataform repository. You need it later to share a secret with your Dataform service account.

Required roles

To get the permissions that you need to link a Dataform repository to a remote Git repository, ask your administrator to grant you the Dataform Admin (roles/dataform.admin) IAM role on repositories. For more information about granting roles, see Manage access.

You might also be able to get the required permissions through custom roles or other predefined roles.

Create and share a secret

Dataform authenticates to your Git provider with a personal access token. You create the token with your Git provider, store it in a Secret Manager secret, and share that secret with your Dataform service account.

To connect to a GitHub repository, create either a classic personal access token or a fine-grained personal access token, which lets you customize token permissions. To connect to a GitLab repository, create a personal access token.

After you create the token, create a secret in Secret Manager that contains it. Then grant your Dataform service account access to the secret.

Dataform then uses the access token to log in to your Git provider to commit changes on behalf of the developers. Dataform makes these commits using the developer's Google Cloud email address so you can tell which developer made each commit.
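As a rough illustration of storing the token, the following Python sketch builds the JSON body that the Secret Manager REST method for adding a secret version expects, with the secret bytes base64-encoded. The function name build_add_version_body and the sample token value are hypothetical, and the sketch only constructs the body; it does not send a request. It strips surrounding whitespace on the assumption that a stray trailing newline would otherwise be stored as part of the secret.

```python
import base64
import json


def build_add_version_body(token: str) -> dict:
    """Build a request body for adding a Secret Manager secret version.

    Hypothetical helper: it only prepares the payload and sends nothing.
    """
    # Strip whitespace so a trailing newline is not stored with the token.
    cleaned = token.strip()
    if not cleaned:
        raise ValueError("token must not be empty")
    # The REST API carries secret bytes as base64 text in payload.data.
    encoded = base64.b64encode(cleaned.encode("utf-8")).decode("ascii")
    return {"payload": {"data": encoded}}


if __name__ == "__main__":
    # Placeholder value for illustration; never hard-code a real token.
    print(json.dumps(build_add_version_body("example-token-value\n")))
```

In practice you would more likely create the secret in the Google Cloud console or with a client library; the sketch is only meant to show what is stored.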

To create and share a secret for connecting a Dataform repository to a GitHub or a GitLab repository, follow these steps:

  1. In GitHub or GitLab, create a personal access token.

  2. When you create a GitHub classic personal access token, do the following:

    1. Select the repo scope for the token.

    2. Make sure to set a token expiration time appropriate to your needs.

    3. If your organization uses SAML single sign-on (SSO), authorize the token.

  3. If you instead create a GitHub fine-grained personal access token, do the following:

    1. Select repository access to only selected repositories, then select the repository that you want to connect to.

    2. Grant read and write access on contents of the repository.

    3. Make sure to set a token expiration time appropriate to your needs.

    4. If your organization uses SAML single sign-on (SSO), authorize the token.

  4. When you create a GitLab personal access token, do the following:

    1. Name the token dataform. The token must have exactly this name.

    2. Select the api, read_repository, and write_repository scopes for the token.

    3. Make sure to set a token expiration time appropriate to your needs.

  5. In Secret Manager, create a secret that contains the personal access token you created for your Git provider.

  6. Grant your Dataform service account access to the secret.

    Your Dataform service account is in the following format:

    service-PROJECT_NUMBER@gcp-sa-dataform.iam.gserviceaccount.com
    
    When you grant access, assign the roles/secretmanager.secretAccessor role to your Dataform service account.
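The service account format and role from the last step can be sketched in Python as follows. The helper names and the sample project number are hypothetical. The binding dictionary mirrors the general shape of an IAM policy binding (a role plus a list of members) and is meant as an illustration, not as a complete policy.

```python
def dataform_service_account(project_number: str) -> str:
    """Return the Dataform service account email for a project number."""
    # Project numbers are numeric; reject project IDs passed by mistake.
    if not (project_number.isascii() and project_number.isdigit()):
        raise ValueError("expected a numeric project number")
    return f"service-{project_number}@gcp-sa-dataform.iam.gserviceaccount.com"


def secret_accessor_binding(project_number: str) -> dict:
    """Build an IAM binding granting the service account secret access."""
    member = f"serviceAccount:{dataform_service_account(project_number)}"
    return {
        "role": "roles/secretmanager.secretAccessor",
        "members": [member],
    }


if __name__ == "__main__":
    # 123456789012 is a made-up project number used for illustration.
    print(secret_accessor_binding("123456789012"))
```

Checking that the input is numeric guards against a common mix-up between the project number and the project ID.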

Connect a Dataform repository

To link a Dataform repository to a remote Git repository, follow these steps:

  1. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  2. Select the repository you want to connect.

  3. On the repository page, click Settings > Connect with Git.

  4. In the Link to remote repository pane, define the following options:

    1. In the Remote Git repository URL field, enter the URL of the remote Git repository, ending with .git.

    2. In the Default remote branch name field, enter the name of the main development branch of the remote Git repository.

    3. In the Secret drop-down, select your secret for the remote Git repository.

  5. Click Link.
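The three options in the Link to remote repository pane can be sketched as one settings object in Python. The function name is hypothetical, and the field names url, defaultBranch, and authenticationTokenSecretVersion are an assumption about how the Dataform API represents Git remote settings, so check them against the API reference before relying on them. The repository URL and secret name in the example are made up.

```python
def build_git_remote_settings(
    remote_url: str, default_branch: str, secret_version: str
) -> dict:
    """Collect the connection options into one settings dictionary.

    Hypothetical helper; the returned field names are assumed.
    """
    url = remote_url.strip()
    branch = default_branch.strip()
    # The remote Git repository URL must end with .git.
    if not url.endswith(".git"):
        raise ValueError("remote URL must end with .git")
    if not branch:
        raise ValueError("default branch name must not be empty")
    return {
        "url": url,
        "defaultBranch": branch,
        "authenticationTokenSecretVersion": secret_version,
    }


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    print(
        build_git_remote_settings(
            "https://github.com/example-org/example-repo.git",
            "main",
            "projects/123456789012/secrets/example-git-token/versions/1",
        )
    )
```

Validating the .git suffix before submitting avoids a connection that is saved with a URL the pane does not expect.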

Edit the remote repository connection

To edit a connection between a Dataform repository and a remote Git repository, follow these steps:

  1. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  2. Click the repository that you want to edit.

  3. On the repository page, click Settings > Edit Git connection.

  4. In the Link to remote repository pane, edit any of the following options:

    1. In the Remote Git repository URL field, edit the URL of the linked remote Git repository.

    2. In the Default remote branch name field, edit the name of the main development branch of the remote Git repository.

    3. In the Secret drop-down, select your secret for the remote Git repository.

  5. Click Update.
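Because you can edit any subset of the three options, it can help to record which ones actually changed before you click Update. The following Python sketch compares two settings dictionaries and lists the differing keys. The function name and the key names are hypothetical and do not correspond to a documented Dataform interface.

```python
def changed_options(current: dict, updated: dict) -> list:
    """Return the names of options whose values differ, sorted by name."""
    keys = set(current) | set(updated)
    return sorted(key for key in keys if current.get(key) != updated.get(key))


if __name__ == "__main__":
    # Placeholder settings; only the default branch differs here.
    current = {
        "remote_url": "https://github.com/example-org/example-repo.git",
        "default_branch": "main",
        "secret": "example-git-token",
    }
    updated = dict(current, default_branch="develop")
    print(changed_options(current, updated))
```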

What's next