Connect Microsoft OneDrive with data ingestion

This page describes how to connect Microsoft OneDrive to Agentspace using data ingestion.

Use the following procedure to sync data from OneDrive.

After you set up your data source and import data the first time, the data store syncs data from that source at a frequency that you select during setup.

Limitations

Incremental sync does not detect folder-level actions like Copy, Move, or Rename.

Before you begin

To enforce data source access control and secure data in Agentspace, ensure that you have configured your identity provider.

About Microsoft Entra ID application registration

Before you can create the connector in Agentspace, you must set up a Microsoft Entra ID application registration to enable secure access to OneDrive. How you register the application depends on the authentication method that you select when you're creating the connector in Agentspace. You can choose one of the following methods:

  • Federated credentials:

    • Allows Google to securely access OneDrive using cryptographically signed tokens, avoiding the need for a real user principal.

    • Requires a subject ID to register Agentspace in Entra. This is available when you create the OneDrive connector in Agentspace.

    • When you register your app in Entra ID, you must gather the following details:

      • Instance URI:
        • For all first-level sites: https://DOMAIN_OR_SERVER.onedrive.com (for example, mydomain.onedrive.com).
        • For a single site: https://DOMAIN_OR_SERVER.onedrive.com/[sites/]WEBSITE (for example, mydomain.onedrive.com/sites/sample-site).
      • Tenant ID
      • Client ID

      These details are necessary to complete the authentication and create the OneDrive connector in Agentspace.

    • Google recommends that you use this method.

  • OAuth 2.0 refresh token:

    • Gives granular control over who connects to the OneDrive API.

    • When you register your app in Entra ID, you must gather the following details:

      • Instance URI: This is in the following form:
        • For all first-level sites: https://DOMAIN_OR_SERVER.onedrive.com (for example, mydomain.onedrive.com).
        • For a single site: https://DOMAIN_OR_SERVER.onedrive.com/[sites/]WEBSITE (for example, mydomain.onedrive.com/sites/sample-site).
      • Tenant ID
      • Client ID
      • Client secret

      These details are necessary to complete the authentication and create the OneDrive data store in Agentspace.

    • The authentication process includes signing in to your OneDrive account.

    • This method is suitable when your OneDrive setup requires two-factor authentication.

    • Requires you to create a new OneDrive user, which might add licensing costs.

  • OAuth 2.0 password grant:

    • Gives granular control over who connects to the OneDrive API.

    • When you register your app in Entra ID, you must gather the following details:

      • Instance URI:
        • For all first-level sites: https://DOMAIN_OR_SERVER.onedrive.com (for example, mydomain.onedrive.com).
        • For a single site: https://DOMAIN_OR_SERVER.onedrive.com/[sites/]WEBSITE (for example, mydomain.onedrive.com/sites/sample-site).
      • Tenant ID
      • Client ID
      • Client secret

      These details are necessary to complete the authentication and create the OneDrive data store in Agentspace.

    • The authentication process includes providing your Entra ID admin-provided username and password.

    • This method is suitable when your OneDrive setup doesn't require two-factor authentication.

    • This method requires you to create a new OneDrive user, which might add licensing costs.
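
The two Instance URI forms listed for each method can be checked before you paste a value into the connector form. The following is a minimal sketch, not part of Agentspace: the function name classify_instance_uri and the exact character set allowed in DOMAIN_OR_SERVER and WEBSITE are assumptions made for illustration.

```python
import re

# Assumed pattern: https://DOMAIN_OR_SERVER.onedrive.com, optionally followed
# by /WEBSITE or /sites/WEBSITE. The allowed characters are an assumption.
_INSTANCE_URI = re.compile(
    r"^https://(?P<domain>[A-Za-z0-9-]+)\.onedrive\.com"
    r"(?:/(?:sites/)?(?P<site>[A-Za-z0-9._-]+))?/?$"
)


def classify_instance_uri(uri: str) -> str:
    """Return 'all-sites' or 'single-site'; raise ValueError otherwise."""
    match = _INSTANCE_URI.match(uri.strip())
    if match is None:
        raise ValueError(f"not a recognized instance URI: {uri!r}")
    # A captured site segment means the URI targets a single site.
    return "single-site" if match.group("site") else "all-sites"
```

For example, classify_instance_uri("https://mydomain.onedrive.com/sites/sample-site") reports a single-site URI, while a value without the https:// scheme is rejected.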

Set up federated credentials

Use the following steps to configure the app registration, grant permissions, and establish authentication. Google recommends that you use the federated credentials method.

Some common error messages that you might encounter during this process are listed in Error messages.

  1. Obtain service account client ID:

    1. In the Google Cloud console, go to the Agentspace page.
    2. In the navigation menu, click Data Stores.
    3. Click Create Data Store.
    4. On the Select a data source page, scroll or search for OneDrive to connect your third-party source.
    5. Note the Subject identifier. Don't click Continue yet. Perform the next steps in this task and then complete the steps in the Google Cloud console by following the instructions in Create a OneDrive connector.
  2. Register app in Entra ID:

    1. Navigate to the Microsoft Entra admin center.
    2. In the menu, expand the Applications section and select App registrations.
    3. On the App registrations page, select New registration.

      Register a new app in Microsoft Entra admin center

    4. Create an app registration on the Register an application page:

      • In the Supported account types section, select Accounts in the organizational directory only.
      • In the Redirect URI section, select Web and enter the redirect URI https://vertexaisearch.cloud.google.com/console/oauth/onedrive_oauth.html
      • Keep the default values for the other settings and click Register.

      Select the account type and enter the redirect URI

    5. Note the Client ID and Tenant ID.

      App details page

  3. Add federated credentials:

    1. Go to Certificates & secrets > Federated credentials > Add credential.

      Add federated credentials in Microsoft Entra

    2. Use the following settings:

      • Federated credential scenario: Other issuer

      • Issuer: https://accounts.google.com

      • Subject identifier: Use the Subject identifier value that you noted in step 1.

      • Name: Provide a unique name.

    3. Click Add to grant access.

      Connect your Google Account to Microsoft Entra ID

  4. Set API permissions.

    Select the app to set API permissions

    1. Add and grant the following Microsoft Graph permissions. You can choose between the site control options (Sites.FullControl.All and Sites.Selected) and profile reading options (User.Read.All and User.ReadBasic.All):

      Microsoft Graph permissions for federated credentials

      Permission | Type | Description | Justification
      GroupMember.Read.All | Application | Read all group memberships | This permission allows Agentspace to understand the memberships of the user groups in the OneDrive site.
      User.Read | Delegated | Sign in and read user's profile | This is a default permission that must not be removed. When removed, OneDrive displays an error asking you to reinstate this permission.
      Files.Read.All | Application | Read files in all site collections | This permission allows Agentspace to read all files in all site collections.

      Site control options
      Option 1: Sites.FullControl.All | Application | Full control over all sites | This permission allows Agentspace to obtain the OneDrive user groups and role assignments, which aren't included in the Sites.Read.All permission. It also allows Agentspace to index documents, events, comments, attachments, and files across all OneDrive sites. If giving full control over all sites seems excessive, use Option 2: Sites.Selected to give granular control.
      Option 2: Sites.Selected | Application | Control over selected sites | This permission allows Agentspace to obtain the OneDrive user groups and role assignments, which aren't included in the Sites.Read.All permission. It also allows Agentspace to index documents, events, comments, attachments, and files across selected OneDrive sites. This permission provides more granular control than Sites.FullControl.All.

      Profile reading options
      Option 1: User.Read.All | Application | Read all users' full profiles | This permission allows Agentspace to understand the data access control for your OneDrive content.
      Option 2: User.ReadBasic.All | Application | Read all users' basic profiles | This permission allows Agentspace to understand the data access control for your OneDrive content.
    2. Add and grant the following OneDrive permissions. You can choose between Sites.FullControl.All and Sites.Selected:

      OneDrive permissions for federated credentials

      Permission | Type | Description | Justification
      Option 1: Sites.FullControl.All | Application | Full control over all sites | This permission allows Agentspace to obtain the OneDrive user groups and role assignments, which aren't included in the Sites.Read.All permission. It also allows Agentspace to index documents, events, comments, attachments, and files across all OneDrive sites. If giving full control over all sites seems excessive, use Option 2: Sites.Selected to give granular control.
      Option 2: Sites.Selected | Application | Control over selected sites | This permission allows Agentspace to obtain the OneDrive user groups and role assignments, which aren't included in the Sites.Read.All permission. It also allows Agentspace to index documents, events, comments, attachments, and files across selected OneDrive sites.
    3. For the added permissions, check that the Status column lists the permission as Granted and has a green check icon.

  5. Grant administrator consent. For information about how to grant consent, see Grant tenant-wide administrator consent to an application in the Microsoft Entra documentation.
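
The federated credential settings from step 3 correspond to the name, issuer, subject, and audiences properties of the Microsoft Graph federatedIdentityCredential resource. The sketch below only builds such a payload so that the mapping is visible; it does not call any API. The helper name federated_credential_body and the audience value are assumptions, and this page documents only the Entra admin center flow, so treat it as an illustration rather than a supported alternative.

```python
import json


def federated_credential_body(name: str, subject_identifier: str) -> dict:
    """Build a payload that mirrors the settings entered in step 3."""
    if not name or not subject_identifier:
        raise ValueError("name and subject_identifier are both required")
    return {
        "name": name,                             # Name: a unique name
        "issuer": "https://accounts.google.com",  # Issuer from step 3
        "subject": subject_identifier,            # Subject identifier from step 1
        # Assumption: this page does not state an audience. The value below
        # is the one commonly used for Entra token exchange.
        "audiences": ["api://AzureADTokenExchange"],
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(json.dumps(federated_credential_body("agentspace-onedrive", "SUBJECT_ID"), indent=2))
```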

Set up OAuth 2.0 for refresh token and password grant

You can use the OAuth 2.0 method to set up an Entra ID application registration and enable secure access to OneDrive. This method includes steps to configure the app registration, grant permissions, and establish authentication.

You can use the following process to register the application in Entra ID using OAuth 2.0 authentication for refresh token and for password grant. This method is preferred when you need granular control over OneDrive REST API permissions, allowing you to restrict resource access on the user account.

Some common error messages that you might encounter during this process are listed in Error messages.

The following table describes the OneDrive roles that are recommended for OAuth 2.0 authentication methods:

  1. Create app registration:

    1. Navigate to the Microsoft Entra admin center.

    2. Create an app registration:

      • Supported account types: Accounts in the organizational directory only.
      • Redirect URI: https://vertexaisearch.cloud.google.com/console/oauth/onedrive_oauth.html.
    3. Note the Client ID and Tenant ID.

  2. Add client secret:

    1. Go to Certificates & secrets > New client secret.
    2. Note the secret string.
  3. Set API permissions.

    1. Add and grant the following Microsoft Graph permissions. You can choose between Sites.FullControl.All and Sites.Selected:

      Microsoft Graph permissions for OAuth 2.0 authentication

      Permission | Type | Description | Justification
      GroupMember.Read.All | Application | Read all group memberships | This permission allows Agentspace to understand the memberships of the user groups in the OneDrive site.
      User.Read | Delegated | Sign in and read user's profile | This is a default permission that must not be removed. When removed, OneDrive displays an error asking you to reinstate this permission.
      Option 1: Sites.FullControl.All | Application | Full control over all sites | This permission allows Agentspace to obtain the OneDrive user groups and role assignments, which aren't included in the Sites.Read.All permission. It also allows Agentspace to index documents, events, comments, attachments, and files across all OneDrive sites.
      Option 2: Sites.Selected | Application | Control over selected sites | This permission allows Agentspace to obtain the OneDrive user groups and role assignments, which aren't included in the Sites.Read.All permission. It also allows Agentspace to index documents, events, comments, attachments, and files across selected OneDrive sites. This permission provides more granular control than Sites.FullControl.All.
      User.Read.All | Application | Read all users' full profiles | This permission allows Agentspace to understand the data access control for your OneDrive content.
    2. Add and grant the following OneDrive permissions for OAuth 2.0 authentication. You can choose between AllSites.FullControl and Sites.Selected:

      OneDrive permissions for OAuth 2.0 authentication

      Permission | Type | Description | Justification
      Option 1: AllSites.FullControl | Delegated | Full control over all sites | This permission allows Agentspace to obtain the OneDrive user groups and role assignments, which aren't included in the Sites.Read.All permission. It also allows Agentspace to index documents, events, comments, attachments, and files across all OneDrive sites.
      Option 2: Sites.Selected | Delegated | Control over selected sites | This permission allows Agentspace to obtain the OneDrive user groups and role assignments, which aren't included in the Sites.Read.All permission. It also allows Agentspace to index documents, events, comments, attachments, and files across selected OneDrive sites. This permission provides more granular control than AllSites.FullControl.

    3. For the added permissions, check that the Status column lists the permission as Granted and has a green check icon.

      Verify the API permissions

    4. Use a dedicated user account with limited access to specific sites. Verify that this account has Owner access to the selected sites.

  4. Grant administrator consent. For information about how to grant consent, see Grant tenant-wide administrator consent to an application in the Microsoft Entra ID documentation.
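
To make the role of the Tenant ID, Client ID, and Client secret concrete, the sketch below assembles (but does not send) a refresh-token request for the Microsoft identity platform v2.0 token endpoint. Agentspace performs its own token exchange, so this is only an illustration of how the values fit together; the function name refresh_token_request and the default scope are assumptions.

```python
from urllib.parse import urlencode


def refresh_token_request(
    tenant_id: str,
    client_id: str,
    client_secret: str,
    refresh_token: str,
    scope: str = "https://graph.microsoft.com/.default",  # assumed scope
) -> tuple[str, str]:
    """Return (token endpoint URL, form-encoded body) without sending it."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
        "scope": scope,
    }
    return url, urlencode(form)
```

The returned body is meant to be posted with the Content-Type application/x-www-form-urlencoded. Keep the client secret and refresh token out of logs and source control.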

Error messages

The following table describes the common error messages and their descriptions that you might encounter when connecting OneDrive with Agentspace.

Error code | Error message
ONEDRIVE_MISSING_PERMISSION_1 | Missing required REST API role (Sites.FullControl.All or Sites.Selected). For delegated permissions, missing AllSites.FullControl or Sites.Selected.
ONEDRIVE_MISSING_PERMISSION_2 | Missing required Graph API role (Sites.FullControl.All or Sites.Selected).
ONEDRIVE_MISSING_PERMISSION_3 | Missing required Graph API role GroupMember.Read.All.
ONEDRIVE_MISSING_PERMISSION_4 | Missing required Graph API role (User.Read.All or User.ReadBasic.All).
ONEDRIVE_INVALID_SITE_URI | Failed to retrieve Graph API access token. Possible causes: invalid client ID, secret value, or missing federated credentials.
ONEDRIVE_INVALID_AUTH | Failed to retrieve Graph API access token. Possible causes: invalid client ID, secret value, or missing federated credentials.
ONEDRIVE_INVALID_JSON | Failed to parse JSON content.
ONEDRIVE_TOO_MANY_REQUESTS | Too many HTTP requests sent to OneDrive; received 429 HTTP response.
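
The missing-permission rows in the preceding table follow an "any one of these alternatives" rule. As a rough local pre-check, assuming you have copied the granted permission names out of the API permissions page by hand, the rule can be sketched as follows. The function name likely_errors is hypothetical, and this is not how Agentspace itself validates permissions.

```python
# Each error code maps to a set of alternatives; holding any one of them
# satisfies the requirement. Values are taken from the table above.
REST_REQUIREMENTS = {
    "ONEDRIVE_MISSING_PERMISSION_1": {
        "Sites.FullControl.All", "Sites.Selected", "AllSites.FullControl",
    },
}
GRAPH_REQUIREMENTS = {
    "ONEDRIVE_MISSING_PERMISSION_2": {"Sites.FullControl.All", "Sites.Selected"},
    "ONEDRIVE_MISSING_PERMISSION_3": {"GroupMember.Read.All"},
    "ONEDRIVE_MISSING_PERMISSION_4": {"User.Read.All", "User.ReadBasic.All"},
}


def likely_errors(graph_roles: set, rest_roles: set) -> list:
    """Return the error codes whose requirement no granted role satisfies."""
    errors = []
    for code, alternatives in REST_REQUIREMENTS.items():
        if not alternatives & set(rest_roles):
            errors.append(code)
    for code, alternatives in GRAPH_REQUIREMENTS.items():
        if not alternatives & set(graph_roles):
            errors.append(code)
    return sorted(errors)
```

An empty result means that every requirement listed in the table has at least one matching permission in the sets you supplied.
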
  1. Manifest file:

    1. Go to the Manifest tab.
    2. Delete the contents between [ and ] under requiredResourceAccess.

      Edit the manifest file

    3. Paste the following JSON between the brackets.

      {
       "resourceAppId": "00000003-0000-0000-c000-000000000000",
       "resourceAccess": [
         {
           "id": "01d4889c-1287-42c6-ac1f-5d1e02578ef6",
           "type": "Role"
         },
         {
           "id": "5b567255-7703-4780-807c-7be8301ae99b",
           "type": "Role"
         },
         {
           "id": "df021288-bdef-4463-88db-98f22de89214",
           "type": "Role"
         }
       ]
      }
      
    4. Return to API permissions.

    5. Confirm all required permissions are present.

    6. Grant administrator consent.
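
If you keep a local copy of the manifest, the edit in steps 2 and 3 amounts to replacing the requiredResourceAccess array with the single entry shown above. A minimal sketch, assuming the manifest has been loaded into a Python dictionary; the function name replace_required_resource_access is hypothetical, and the sample manifest in the test is made up:

```python
import copy

# The entry from step 3, copied verbatim from this page.
GRAPH_ENTRY = {
    "resourceAppId": "00000003-0000-0000-c000-000000000000",
    "resourceAccess": [
        {"id": "01d4889c-1287-42c6-ac1f-5d1e02578ef6", "type": "Role"},
        {"id": "5b567255-7703-4780-807c-7be8301ae99b", "type": "Role"},
        {"id": "df021288-bdef-4463-88db-98f22de89214", "type": "Role"},
    ],
}


def replace_required_resource_access(manifest: dict) -> dict:
    """Return a copy of the manifest whose requiredResourceAccess holds only GRAPH_ENTRY."""
    updated = copy.deepcopy(manifest)  # leave the caller's manifest untouched
    updated["requiredResourceAccess"] = [copy.deepcopy(GRAPH_ENTRY)]
    return updated
```

Because the previous contents of the array are discarded, as in step 2, review the result under API permissions before granting administrator consent.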

Create a OneDrive connector

Console

To use the Google Cloud console to sync data from OneDrive to Agentspace, follow these steps:

  1. In the Google Cloud console, go to the Agentspace page.


  2. In the navigation menu, click Data Stores.

  3. Click Create Data Store.

  4. On the Select a data source page, scroll or search for OneDrive to connect your third-party source.

  5. Under Authentication settings, select the authentication method to use.

    1. Enter your authentication information.

    2. Click Continue.

      Select the authentication method and provide your authentication information.

  6. Select the following entities to sync:

    • File
  7. To filter entities out of the index or ensure that they are included in the index, click Filter.

    • fileName matches the filename only.
    • filePath must be a full Microsoft Graph API path, usually prefixed with /drive/root:. For example, if the OneDrive direct link is https://example-my.onedrive.com/personal/user_example_com/Documents/folder1/folder2, then filePath is /drive/root:/folder1/folder2.

    Specify filters to include or exclude entities

  8. Click Continue.

  9. Select the Sync frequency for Full sync and the Incremental sync frequency for Incremental data sync. For more information, see Sync schedules.

    If you want to schedule separate full syncs of entity and identity data, expand the menu under Full sync and then select Custom options.

    Setting separate schedules for full entity sync and full identity sync.
  10. Select a region for your data store.

  11. Enter a name for your data store.

  12. Click Create. Agentspace creates your data store and displays your data stores on the Data Stores page.

  13. To check the status of your ingestion, go to the Data Stores page and click your data store name to see details about it on its Data page. The Connector state changes from Creating to Running when it starts synchronizing data. When ingestion is complete, the state changes to Active to indicate that the connection to your data source is set up and awaiting the next scheduled synchronization.

    Depending on the size of your data, ingestion can take minutes or hours.
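
The filePath filter in step 7 expects a Microsoft Graph path rather than the link you see in the browser. The conversion in that step's example can be sketched as follows. The function name direct_link_to_file_path is hypothetical, and the sketch assumes that the drive root corresponds to the Documents segment of the direct link, as it does in the example.

```python
from urllib.parse import unquote, urlparse


def direct_link_to_file_path(link: str, library: str = "Documents") -> str:
    """Convert a OneDrive direct link into a /drive/root: style filePath."""
    # Split the URL path into decoded, non-empty segments.
    segments = [unquote(s) for s in urlparse(link).path.split("/") if s]
    if library not in segments:
        raise ValueError(f"{library!r} segment not found in {link!r}")
    # Everything after the library segment is relative to the drive root.
    relative = segments[segments.index(library) + 1:]
    return "/drive/root:" + "".join("/" + s for s in relative)
```

Applied to the link from step 7, the function returns /drive/root:/folder1/folder2.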

Enable real-time sync

To enable real-time sync for your data store, follow these steps.

  1. In the Google Cloud console, go to the Agentspace page.


  2. In the navigation menu, click Data Stores.

  3. Click the name of the OneDrive data store for which you want to enable real-time sync.

  4. On the data store Data page, wait until the Connector state changes to Active.

  5. In the Real-time sync field, click View/edit.

    View and edit real-time sync settings.

  6. Click the Enable real-time sync toggle to the on position.

  7. Provide a value for Client secret. This value is used to verify OneDrive webhook events. We recommend using a string of 20 characters.

    Enable real-time sync and provide a client secret.

  8. Click Save.

    Wait for the Real-time sync field to change to Running.
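
One way to produce the 20-character client secret from step 7 is with Python's secrets module, which draws from a cryptographically secure source. This page doesn't specify an allowed character set, so the alphanumeric alphabet below is an assumption, and generate_client_secret is a hypothetical helper name.

```python
import secrets
import string


def generate_client_secret(length: int = 20) -> str:
    """Return a random alphanumeric string of the requested length."""
    alphabet = string.ascii_letters + string.digits  # assumed character set
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    print(generate_client_secret())
```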

Error codes

The following table lists OneDrive error codes and descriptions.

Error code | Description
ONEDRIVE_MISSING_PERMISSION_1 | The application does not have a required Files.Read.All role for Graph API.
ONEDRIVE_MISSING_PERMISSION_2 | The application does not have a required Group.Read.All role for Graph API.
ONEDRIVE_MISSING_PERMISSION_3 | The application does not have a required User.Read.All role or User.ReadBasic.All role for Graph API.
ONEDRIVE_INVALID_SITE_URI | The instance URL is invalid.
ONEDRIVE_INVALID_AUTH | Error when retrieving Graph API access token. This may be due to an invalid client ID, secret value, or missing federated credentials.
ONEDRIVE_UNCATEGORIZED_ERROR | Invalid or no ACL is present in file.
ONEDRIVE_TOO_MANY_REQUESTS | Too many HTTP requests are sent to OneDrive. Received HTTP 429 response.
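
ONEDRIVE_TOO_MANY_REQUESTS corresponds to an HTTP 429 response. If you run your own scripts against OneDrive while troubleshooting, a common way to handle 429 responses is retrying with exponential backoff and jitter. The sketch below is a generic illustration, not a description of how the connector retries; TooManyRequests and call_with_backoff are hypothetical names, and the caller is assumed to raise TooManyRequests when it sees a 429.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TooManyRequests(Exception):
    """Raised by the caller's request function on an HTTP 429 response."""


def call_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(), retrying on TooManyRequests with growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return request()
        except TooManyRequests:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles each attempt, plus jitter to avoid synchronized retries.
            sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
    raise AssertionError("unreachable")
```

If the response carries a Retry-After header, waiting for that interval is generally preferable to a computed delay.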

Next steps

  • To attach your data store to an app, create an app and select your data store following the steps in Create an app.

  • To preview how your search results appear after your app and data store are set up, see Preview search results.