Connect a third-party data source

This page describes how to connect third-party data sources to Vertex AI Search.

When you connect a third-party data source, Vertex AI Search creates a data connector, and associates data stores (called entity data stores) with it for the entities that you specify. Entity types are specific to the data source that you're connecting to. For example, Jira Cloud entities include issues, attachments, comments, and worklogs.
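
The relationship between a data connector and its entity data stores can be pictured with a small data model. This is only an illustrative sketch: the class names, the `add_entities` method, and the `jira_cloud` source label are hypothetical and are not part of any Vertex AI Search API.

```python
from dataclasses import dataclass, field


@dataclass
class EntityDataStore:
    """One data store per synced entity type (hypothetical model)."""
    entity: str


@dataclass
class DataConnector:
    """A connector to one third-party source plus its entity data stores."""
    source: str
    entity_data_stores: list[EntityDataStore] = field(default_factory=list)

    def add_entities(self, entities: list[str]) -> None:
        # Create one entity data store for each entity not already present.
        existing = {store.entity for store in self.entity_data_stores}
        for entity in entities:
            if entity not in existing:
                self.entity_data_stores.append(EntityDataStore(entity))
                existing.add(entity)


# The Jira Cloud entities named above, attached to a single connector.
connector = DataConnector(source="jira_cloud")
connector.add_entities(["issues", "attachments", "comments", "worklogs"])
print([store.entity for store in connector.entity_data_stores])
```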

Third-party data sources are available only for generic search apps. Chat, recommendations, and agent apps can't use third-party data sources.

Third-party connectors are not CMEK-compliant.

To import data from a Google data source instead, see Create a search data store.

Before you begin

  1. Contact your Google account team and ask to be added to the allowlist for third-party data source connectors.

  2. Go to the section for the source you plan to use:

Connect Confluence Cloud

Use the following procedure to sync data from Confluence Cloud to Vertex AI Search.

After you set up your data source and import data the first time, the data store syncs data from that source at a frequency that you select during setup.

Before setting up your connection:

  • Set up access control for your data source. For information about setting up access control, see Use data source access control.

  • Have the following authentication information ready:

    • Instance URL. In the form https://EXAMPLE.atlassian.net. For example, https://google.atlassian.net.

    • Instance ID. This is the cloudId value, which you can find by going to https://EXAMPLE.atlassian.net/_edge/tenant_info and copying the value of cloudId.

  • Enable OAuth 2.0 and get the client ID and client secret.

    Use https://vertexaisearch.cloud.google.com/console/oauth/confluence_oauth.html as the callback URL. For information about enabling OAuth 2.0 for Confluence Cloud and getting the client ID and client secret, see OAuth 2.0 (3LO) apps in the Atlassian Developer documentation.

  • When setting OAuth 2.0 permission scopes, configure the following scopes:

  • For user permissions to apply correctly, each Confluence Cloud user must make their email visible to all users. To do so, change the email visibility settings in Confluence Cloud and set the visibility to Anyone. For more information, see Set your email visibility in the Atlassian documentation.
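
The instance ID lookup described above can also be scripted. The following is a minimal sketch that assumes the tenant_info endpoint returns a JSON object with a top-level cloudId key; the function names are hypothetical, and only the URL path and the cloudId field name come from this page.

```python
import json
import urllib.request


def tenant_info_url(instance_url: str) -> str:
    """Build the tenant_info URL from an instance URL."""
    return instance_url.rstrip("/") + "/_edge/tenant_info"


def parse_cloud_id(body: str) -> str:
    """Extract cloudId from a tenant_info response body (assumed to be JSON)."""
    data = json.loads(body)
    return data["cloudId"]


def fetch_cloud_id(instance_url: str, timeout: float = 10.0) -> str:
    """Request tenant_info over HTTPS and return the instance ID."""
    url = tenant_info_url(instance_url)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_cloud_id(response.read().decode("utf-8"))
```

For example, `fetch_cloud_id("https://EXAMPLE.atlassian.net")` would return the value to enter as the instance ID, provided the endpoint is reachable from where the script runs.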

Console

To use the Google Cloud console to sync data from Confluence Cloud to Vertex AI Search, follow these steps:

  1. In the Google Cloud console, go to the Agent Builder page.

  2. In the navigation menu, click Data Stores.

  3. Click New data store.

  4. On the Select a data source page, go to the Third-party sources section and select Confluence.

  5. Enter your authentication information and click Authenticate.

  6. A new window appears. Enter the instance username and password. Check that the authentication succeeded before returning to the Specify the Confluence source for your data store page.

  7. Select which entities to sync, then click Continue.

  8. Select a region for your data connector.

  9. Enter a name for your data connector.

  10. Select a synchronization frequency.

  11. Click Create. Vertex AI Search creates your data store and displays your data stores on the Data Stores page.

  12. To check the status of your ingestion, go to the Data Stores page and click your data store name to see details about it on its Data page. The Connector state changes from Creating to Running when it starts synchronizing data. When ingestion is complete, the state changes to Active to indicate that the connection to your data source is set up and awaiting the next scheduled synchronization.

    Depending on the size of your data, ingestion can take several minutes or several hours.
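
The state progression described in the last step (Creating, then Running, then Active) can be expressed as a simple polling loop. This is a sketch only: `get_state` is a hypothetical callable that you would supply (this page documents only the console view, not an API for reading the state), and treating any other state as an error is an assumption.

```python
import time
from typing import Callable

# Connector states named on this page, in the order they occur.
STATES = ("Creating", "Running", "Active")


def wait_until_active(
    get_state: Callable[[], str],
    poll_seconds: float = 60.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> list[str]:
    """Poll until the state is Active; return the distinct states seen, in order."""
    seen: list[str] = []
    for _ in range(max_polls):
        state = get_state()
        if state not in STATES:
            # Assumption: anything outside the documented states is unexpected.
            raise ValueError(f"unexpected connector state: {state!r}")
        if not seen or seen[-1] != state:
            seen.append(state)
        if state == "Active":
            return seen
        sleep(poll_seconds)
    raise TimeoutError("connector did not reach Active within the polling budget")
```

Because ingestion can take hours, the defaults poll once a minute for up to two hours; adjust both values to the size of your data.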

Connect Jira Cloud

Use the following procedure to sync data from Jira Cloud to Vertex AI Search.

After you set up your data source and import data the first time, you can choose how often the data store syncs with that source.

Before you begin

Before setting up your connection:

  • Set up access control for your data source so that only authorized users can access and manage the data. For more information, see Use data source access control.
  • For user permissions to apply correctly, Jira Cloud users must provide sharing consent.
  • Make sure that you have an Atlassian account, Jira instance, and project.
  • Verify that you have administrator access to the Jira instance and project.

Set up authentication and permissions in Jira

Using the instructions in the following sections, ensure you have the necessary authentication details and admin access to your Jira instance. Create a Client ID and Client Secret through the Atlassian Developer Console, configure the required OAuth 2.0 scopes, and set up permissions for users. Finally, retrieve your instance URL and ID, configure roles, and authenticate to sync data between Jira Cloud and Vertex AI Search.

Create client ID and client secret

NOTE: To enable OAuth 2.0 and obtain the client ID and secret, see OAuth 2.0 (3LO) apps in the Atlassian Developer documentation.

  1. Sign in to developer.atlassian.com.
  2. Click the profile icon in the top right corner and select Developer Console.
  3. Click Create and select OAuth 2.0 Integration.
  4. Enter a name for the app.
    • Check the terms and conditions checkbox.
    • Click Create.
  5. You will find five options: Overview, Distribution, Permissions, Authorization, and Settings. Start with Authorization:

    a. Click Authorization.

    b. In the Authorization type table, select Add for OAuth 2.0 (3LO).

  6. In the Callback URL field, enter https://vertexaisearch.cloud.google.com/console/oauth/jira_oauth.html, and then click Save changes.

    NOTE: If you see the warning "Your app doesn't have any APIs. Add APIs to your app.", continue to the next step, which adds the required APIs.

  7. Select Permissions:

    a. Go to Jira API, click Add, then click Configure.

    NOTE: When you click Add, the button changes to Configure.

    b. Go to the Classic scopes tab and click Edit Scopes. Select the following scopes:

    Confirm that 7 scopes are selected, then save your changes.

  8. Click Distribution, select Edit, and do the following:

    • Select the Sharing radio button first to enable editing other fields.
    • Fill out the remaining fields.
    • Select Yes when asked: Does your app store personal data?
  9. Select Settings to copy your Client ID and Client Secret.

Retrieve instance URL and instance ID

To get the instance URL:

  1. Go to atlassian.net and sign in with your admin account.
  2. Select the app you want to sync. For example, sync the first app.
  3. Find the instance URL, which is the subdomain in the address bar. It has the form https://YOUR-INSTANCE.atlassian.net.

To get the instance ID:

  1. In a new browser tab, enter the instance URL with /_edge/tenant_info appended. For example, https://YOUR-INSTANCE.atlassian.net/_edge/tenant_info.
  2. Navigate to the link to find the cloudId value. The cloudId is your Instance ID.
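
If you copy the full address from the browser rather than only the subdomain, the instance URL can be derived from it programmatically. The sketch below assumes that every Jira Cloud instance is served from an `atlassian.net` subdomain over HTTPS, as in the examples on this page; the function names are hypothetical.

```python
from urllib.parse import urlsplit


def instance_url_from_address(address: str) -> str:
    """Reduce a browser address to the https://SUBDOMAIN.atlassian.net form."""
    parts = urlsplit(address)
    host = (parts.hostname or "").lower()
    if parts.scheme != "https" or not host.endswith(".atlassian.net"):
        raise ValueError(f"not an Atlassian Cloud address: {address!r}")
    # Drop the path, query, and fragment; keep only scheme and host.
    return f"https://{host}"


def tenant_info_address(address: str) -> str:
    """Return the tenant_info link for the instance behind a browser address."""
    return instance_url_from_address(address) + "/_edge/tenant_info"
```

Note that host names are case-insensitive, so the sketch lowercases the subdomain before returning it.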

Set up permissions and roles

  1. Sign in to atlassian.com with your admin account.
  2. Click the menu icon on the top left or go to admin.atlassian.com.
  3. On the Admin page, click Manage users and go to the Groups page.

  4. Click Create group. Enter a name for the group and create it.

  5. In the Group product access section of your new group's page, click Add products to group.

  6. For Jira, select User access admin as the product role.

  7. For Jira Admin, select Product admin as the product role and save your changes.

  8. On the Groups page, click Add group members and add users or accounts that the connector will authenticate as.

Create a Jira Cloud connector

Console

To use the Google Cloud console to sync data from Jira Cloud to Vertex AI Search, follow these steps:

  1. In the Google Cloud console, go to the Agent Builder page.

  2. In the navigation menu, click Data Stores.

  3. Click Create data store.

  4. On the Select a data source page, go to the Third-party sources section and select Jira.

  5. Enter your authentication information and click Authenticate.

  6. A new window appears. Enter the instance username and password. Check that the authentication succeeded before returning to the Specify the Jira source for your data store page.

  7. Select which entities to sync, then click Continue.

  8. Select a region for your data store.

  9. Enter a name for your data store.

  10. Select a synchronization frequency.

  11. Click Create. Vertex AI Search creates your data store and displays your data stores on the Data Stores page.

  12. To check the status of your ingestion, go to the Data Stores page and click your data store name to see details about it on its Data page. The Connector state changes from Creating to Running when it starts synchronizing data. When ingestion is complete, the state changes to Active to indicate that the connection to your data source is set up and awaiting the next scheduled synchronization.

    Depending on the size of your data, ingestion can take several minutes or several hours.

Connect Salesforce

Use the following procedure to sync data from Salesforce to Vertex AI Search.

After you set up your data source and import data the first time, the data store syncs data from that source at a frequency that you select during setup.

Before setting up your connection:

The following limitation applies:

  • To sync a user as an entity, the user must provide sharing consent.

Console

To use the Google Cloud console to sync data from Salesforce to Vertex AI Search, follow these steps:

  1. Add Google Cloud to your Salesforce CORS allowlist. If you have already done this, skip to the next step.

    1. Follow the instructions in the Salesforce documentation to configure the CORS allowlist.

    2. Enter https://console.cloud.google.com/ as an origin URL and save your configuration.

  2. In the Google Cloud console, go to the Agent Builder page.

  3. In the navigation menu, click Data Stores.

  4. Click Create data store.

  5. On the Select a data source page, go to the Third-party sources section and select Salesforce.

  6. Enter your Salesforce authentication information.

  7. Select which entities to sync and click Continue.

  8. Select a region for your data store.

  9. Enter a name for your data store.

  10. Select a synchronization frequency.

  11. Click Create. Vertex AI Search creates your data store and displays your data stores on the Data Stores page.

  12. To check the status of your ingestion, go to the Data Stores page and click your data store name to see details about it on its Data page. The Connector state changes from Creating to Running when it starts synchronizing data. When ingestion is complete, the state changes to Active to indicate that the connection to your data source is set up and awaiting the next scheduled synchronization.

    Depending on the size of your data, ingestion can take several minutes or several hours.

Connect ServiceNow

Use the following procedure to sync data from ServiceNow to Vertex AI Search.

After you set up your data source and import data the first time, the data store syncs data from that source at a frequency that you select during setup.

Before setting up your connection:

  • Set up access control for your data source. For information about setting up access control, see Use data source access control.

  • Have the following authentication information ready:

    • Instance URL in the form of https://<domain-name>.service-now.com/.
    • Client ID and client secret. For information about endpoint setup and getting the client ID and client secret, see Create an endpoint for clients to access the instance in the ServiceNow documentation.
    • Username and password for one of the following ServiceNow role types:

      • Administrator role. See Base system roles in the ServiceNow documentation.
      • A custom ServiceNow role. This is an alternative to using an administrator role. To use a custom ServiceNow role, create access control rules with the following fields:

        For more information, see Create a role and Create an ACL rule in the ServiceNow documentation.
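
Before entering the instance URL in the console, you might want to check that it matches the documented form. The following is a minimal sketch; the regular expression is an assumption about which characters a ServiceNow domain name can contain, and the function name is hypothetical.

```python
import re

# Matches https://<domain-name>.service-now.com/ with an optional trailing slash.
# Assumption: the domain name uses only letters, digits, and hyphens.
_INSTANCE_RE = re.compile(r"^https://([A-Za-z0-9][A-Za-z0-9-]*)\.service-now\.com/?$")


def servicenow_domain(instance_url: str) -> str:
    """Return the <domain-name> part of a ServiceNow instance URL, or raise."""
    match = _INSTANCE_RE.match(instance_url.strip())
    if match is None:
        raise ValueError(
            "expected a URL of the form https://<domain-name>.service-now.com/, "
            f"got {instance_url!r}"
        )
    return match.group(1)
```

A check like this catches common mistakes such as a missing scheme or a URL that includes a page path.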

Console

To use the Google Cloud console to sync data from ServiceNow to Vertex AI Search, follow these steps:

  1. In the Google Cloud console, go to the Agent Builder page.

  2. In the navigation menu, click Data Stores.

  3. Click Create data store.

  4. On the Select a data source page, go to the Third-party sources section and select ServiceNow.

  5. Enter your ServiceNow authentication information.

  6. Select which entities to sync and click Continue.

  7. Select a region for your data connector.

  8. Enter a name for your data connector.

  9. Select a synchronization frequency.

  10. Click Create. Vertex AI Search creates your data store and displays your data stores on the Data Stores page.

  11. To check the status of your ingestion, go to the Data Stores page and click your data connector name to see details about it on its Data page. The Connector state changes from Creating to Running when it starts synchronizing data. When ingestion is complete, the state changes to Active to indicate that the connection to your data source is set up and awaiting the next scheduled synchronization.

    Depending on the size of your data, ingestion can take several minutes or several hours.

Connect SharePoint Online

Use the following procedure to sync data from SharePoint Online to Vertex AI Search.

After you set up your data source and import data the first time, the data store syncs data from that source at a frequency that you select during setup.

Before setting up your connection:

  • Set up access control for your data source. For information about setting up access control, see Use data source access control.

  • Two-factor authentication (2FA) must be turned off for the SharePoint account. Only basic OAuth 2.0 password authentication is supported.

  • Grant administrator consent. For information about how to grant consent, see Grant tenant-wide admin consent to an application in the Microsoft documentation.

  • Prepare the following SharePoint Online authentication information to use during setup:

    • Instance URL. In the form http://DOMAIN_OR_SERVER/[sites/]WEBSITE. For more information about URLs, see URLs and tokens in SharePoint in the SharePoint documentation.
    • Tenant ID, client ID, and client secret. To register the application, select Accounts in this organizational directory only for the sign-in audience, and then locate this authentication information. For more information, see Quickstart: Register an application with the Microsoft identity platform in the Microsoft documentation.
    • Username and password. These must correspond to either a SharePoint Site Admin or a SharePoint Site Collection Admin with 2FA disabled.
  • The following table describes the roles that are recommended for configuration and their limitations.

Console

To use the Google Cloud console to sync data from SharePoint Online to Vertex AI Search, follow these steps:

  1. In the Google Cloud console, go to the Agent Builder page.

  2. In the navigation menu, click Data Stores.

  3. Click Create data store.

  4. On the Select a data source page, go to the Third-party sources section and select SharePoint Online.

  5. Enter your SharePoint Online authentication information.

  6. Select the entities to sync and click Continue.

  7. Select a region for your data store.

  8. Enter a name for your data store.

  9. Select a synchronization frequency for your data store.

  10. Click Create. Vertex AI Search creates your data store and displays your data stores on the Data Stores page.

  11. To check the status of your ingestion, go to the Data Stores page and click your data store name to see details about it on its Data page. The Connector state changes from Creating to Running when it starts synchronizing data. When ingestion is complete, the state changes to Active to indicate that the connection to your data source is set up and awaiting the next scheduled synchronization.

    Depending on the size of your data, ingestion can take several minutes or several hours.

Connect Slack

Use the following procedure to sync data from Slack to Vertex AI Search.

After you set up your data source and import data the first time, the data store syncs data from that source at a frequency that you select during setup.

Before setting up your connection:

  • Set up access control for your data source. For information about setting up access control, see Use data source access control.

  • Have the following Slack authentication information ready:

    • Workspace ID. For information about getting workspace IDs, see Specify the Slack source for your data store in the Slack documentation.
    • Access token. For information about creating a client app and defining scopes, see Quickstart and How to quickly get and use a Slack API token in the Slack documentation.
    • When setting OAuth 2.0 permission scopes, configure the following scopes:

The following limitation applies:

  • Slack's default behavior restricts the crawling and syncing of content from private channels, multi-party instant messages, and 1:1 instant messages.
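
Before entering the Slack access token in the console, you can check that it is valid by calling Slack's `auth.test` Web API method, which also reports the team ID for the token. This is a minimal sketch: it assumes that the workspace ID requested above corresponds to the `team_id` field of the response, and the function names are hypothetical.

```python
import json
import urllib.request

AUTH_TEST_URL = "https://slack.com/api/auth.test"


def build_auth_test_request(token: str) -> urllib.request.Request:
    """Build a POST request to auth.test that sends the token as a bearer token."""
    return urllib.request.Request(
        AUTH_TEST_URL,
        data=b"",  # empty body; the token travels in the Authorization header
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


def workspace_id_from_response(body: str) -> str:
    """Return team_id from an auth.test response body, or raise if ok is false."""
    data = json.loads(body)
    if not data.get("ok"):
        raise RuntimeError(f"Slack rejected the token: {data.get('error', 'unknown error')}")
    return data["team_id"]


def check_token(token: str, timeout: float = 10.0) -> str:
    """Call auth.test and return the workspace (team) ID for the token."""
    request = build_auth_test_request(token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return workspace_id_from_response(response.read().decode("utf-8"))
```

Avoid printing or logging the token itself; only the returned ID is needed for the setup steps.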

Console

To use the Google Cloud console to sync data from Slack to Vertex AI Search, follow these steps:

  1. In the Google Cloud console, go to the Agent Builder page.

  2. In the navigation menu, click Data Stores.

  3. Click Create data store.

  4. On the Select a data source page, go to the Third-party sources section and select Slack.

  5. Enter your Slack authentication information.

  6. Select which entities to sync and click Continue.

  7. Select a region for your data store.

  8. Enter a name for your data store.

  9. Select a synchronization frequency for your data store.

  10. Click Create. Vertex AI Search creates your data store and displays your data stores on the Data Stores page.

  11. To check the status of your ingestion, go to the Data Stores page and click your data store name to see details about it on its Data page. The Connector state changes from Creating to Running when it starts synchronizing data. When ingestion is complete, the state changes to Active to indicate that the connection to your data source is set up and awaiting the next scheduled synchronization.

    Depending on the size of your data, ingestion can take several minutes or several hours.

Connect Dropbox

Use the following procedure to sync data from Dropbox to Vertex AI Search.

After you set up your data source and import data the first time, the data store syncs data from that source at a frequency that you select during setup.

Before you begin

Before setting up your connection:

  • Set up access control for your data source. For more information, see Use data source access control.

  • Have the following Dropbox authentication information ready. For information about setting up these parameters, see the OAuth Guide in the Dropbox documentation.

    • Client ID
    • Client secret

Console

To use the Google Cloud console to sync data from Dropbox to Vertex AI Search, follow these steps:

  1. In the Google Cloud console, go to the Agent Builder page.

  2. In the navigation menu, click Data Stores.

  3. Click Create data store.

  4. On the Select a data source page, go to the Third-party sources section and select Dropbox.

  5. Enter your Dropbox authentication information and click Authenticate. A new window appears.

  6. Authenticate your account and confirm that it succeeded before returning to the Specify the Dropbox source for your data store page.

  7. Select which entities to sync and click Continue.

  8. Select a location for your data store.

  9. Enter a name for your data store.

  10. Select a synchronization frequency for your data store.

  11. Click Create. Vertex AI Search creates your data store and displays your data stores on the Data stores page.

  12. To check the status of your ingestion, go to the Data stores page and click your data store name to see details about it on its Data page. The Connector state changes from Creating to Running when it starts synchronizing data. When ingestion is complete, the state changes to Active to indicate that the connection to your data source is set up and awaiting the next scheduled synchronization. Check the Documents tab to make sure your entities have been ingested correctly.

    Depending on the size of your data, ingestion can take several minutes or several hours.

Connect Box

Use the following procedure to sync data from Box to Vertex AI Search.

After you set up your data source and import data the first time, the data store syncs data from that source at a frequency that you select during setup.

Before you begin

Before setting up your connection:

  • Set up access control for your data source. For information about setting up access control, see Use data source access control.

  • Have the following Box authentication information ready. For information about setting up these parameters, see JWT Auth in the Box developer documentation.

    • Enterprise ID
    • Client ID
    • Client secret
    • Public key ID
    • Private key
    • Passphrase
  • When creating the JWT endpoint, configure the following scopes:

The following limitations apply:

  • Incremental changes for comments might take longer to sync than the configured frequency interval.
  • If a folder containing an entity is copied or moved, then incremental changes might take longer to sync than the configured frequency interval.
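
The six Box authentication values listed earlier in this section are typically available together in the JSON configuration file that the Box developer console generates for a JWT app. The sketch below reads them from such a file. The key names (`boxAppSettings`, `appAuth`, `enterpriseID`, and so on) reflect that file's usual layout and should be treated as an assumption to verify against your own file; the class and function names are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class BoxJwtCredentials:
    """The Box values that the connector setup asks for."""
    enterprise_id: str
    client_id: str
    client_secret: str
    public_key_id: str
    private_key: str
    passphrase: str


def load_box_credentials(config_text: str) -> BoxJwtCredentials:
    """Parse the text of a Box JWT app configuration file."""
    config = json.loads(config_text)
    settings = config["boxAppSettings"]
    app_auth = settings["appAuth"]
    return BoxJwtCredentials(
        enterprise_id=config["enterpriseID"],
        client_id=settings["clientID"],
        client_secret=settings["clientSecret"],
        public_key_id=app_auth["publicKeyID"],
        private_key=app_auth["privateKey"],
        passphrase=app_auth["passphrase"],
    )
```

A missing key raises `KeyError`, which is a quick signal that the file does not have the assumed layout. Treat the parsed values as secrets and avoid writing them to logs.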

Console

To use the Google Cloud console to sync data from Box to Vertex AI Search, follow these steps:

  1. In the Google Cloud console, go to the Agent Builder page.

  2. In the navigation menu, click Data stores.

  3. Click Create data store.

  4. On the Select a data source page, go to the Third-party sources section and select Box.

  5. Enter your authentication information.

  6. Select which entities to sync and click Continue.

  7. Select a region for your data store.

  8. Enter a name for your data store.

  9. Select a synchronization frequency for your data store.

  10. Click Create. Vertex AI Search creates your data store and displays your data stores on the Data stores page.

  11. To check the status of your ingestion, go to the Data stores page and click your data store name to see details about it on its Data page. The Connector state changes from Creating to Running when it starts synchronizing data. When ingestion is complete, the state changes to Active to indicate that the connection to your data source is set up and awaiting the next scheduled synchronization.

    Depending on the size of your data, ingestion can take several minutes or several hours.

Connect OneDrive

Use the following procedure to sync data from OneDrive to Vertex AI Search.

After you set up your data source and import data the first time, the data store syncs data from that source at a frequency that you select during setup.

Before you begin

Before setting up your connection:

Console

To use the Google Cloud console to sync data from OneDrive to Vertex AI Search, follow these steps:

  1. In the Google Cloud console, go to the Agent Builder page.

  2. In the navigation menu, click Data stores.

  3. Click Create data store.

  4. On the Select a data source page, go to the Third-party sources section and select OneDrive.

  5. Enter your OneDrive authentication information.

  6. Select which entities to sync and click Continue.

  7. Select a region for your data store.

  8. Enter a name for your data store.

  9. Select a synchronization frequency for your data store.

  10. Click Create. Vertex AI Search creates your data store and displays your data stores on the Data stores page.

  11. To check the status of your ingestion, go to the Data stores page and click your data store name to see details about it on its Data page. The Connector state changes from Creating to Running when it starts synchronizing data. When ingestion is complete, the state changes to Active to indicate that the connection to your data source is set up and awaiting the next scheduled synchronization.

    Depending on the size of your data, ingestion can take several minutes or several hours.

Next steps