Use Dataplex Universal Catalog with MCP, Gemini, and other agents

This page explains how to connect your Dataplex Universal Catalog instance to developer tools such as the Gemini CLI. Connecting Dataplex Universal Catalog to these tools enables AI-driven data discovery and asset management directly within your IDE.

For an integrated command-line experience, we recommend using the dedicated Dataplex Universal Catalog extension for Gemini CLI. The extension bundles the underlying Model Context Protocol (MCP) server, which acts as an intermediary between Gemini CLI and Dataplex Universal Catalog, removing the need for a separate server setup.

Alternatively, you can connect other IDEs and developer tools that support MCP by using the general-purpose MCP Toolbox for Databases. You can then use AI agents in your existing IDE to discover data assets in Dataplex Universal Catalog. For more information about MCP, see Introduction to Model Context Protocol.

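This guide demonstrates the connection process for the following tools:

  • Gemini CLI Extension
  • Gemini Code Assist
  • Claude Code
  • Claude Desktop
  • Cline
  • Cursor
  • VS Code (Copilot)
  • Windsurf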

About Gemini CLI and extensions

Gemini CLI is an open-source conversational AI agent from Google that accelerates development workflows and assists with coding, debugging, data exploration, and content creation. It offers an agent-driven experience for interacting with Data Cloud services, such as Dataplex Universal Catalog, and with popular open-source databases.

For more information about Gemini CLI, see the Gemini CLI documentation.

How extensions work

Extensions expand the capabilities of Gemini CLI, letting it connect to and control specific Google Cloud services and other tools. They give Gemini context and an understanding of the relevant APIs, which enables conversational interaction. You can load Gemini CLI extensions from GitHub URLs, local directories, or registries. Each extension can add new tools, slash commands, and prompts. Gemini CLI extensions are separate from IDE extensions, such as Gemini Code Assist, which integrate using MCP Toolbox.

About the Dataplex Universal Catalog extension

Note: MCP Toolbox for Databases is in beta (pre-v1.0) and might introduce breaking changes until the first stable release (v1.0).

The Dataplex Universal Catalog extension for Gemini CLI integrates AI into your data governance and discovery tasks. You can interact with Dataplex Universal Catalog using natural language prompts in your terminal. Here are some examples:

Category: Data discovery and governance

Tool: dataplex_search_entries
Example natural language prompts:
  • Find all datasets related to sales in Europe.
  • Show me tables that contain customer PII.
  • List all BigQuery datasets in the 'marketing' lake in Dataplex Universal Catalog.

Tool: dataplex_lookup_entry
Example natural language prompts:
  • What's the schema of the 'orders' table?
  • Describe the data quality rules applied to the customer database.
  • Who is listed as the business owner for the `customer_details` table?

Tool: dataplex_search_aspect_types
Example natural language prompts:
  • Show me aspect types related to data quality rules.
  • List all aspect types used for data governance.
  • Are there any aspect types for marking PII data?

For more information about the Dataplex Universal Catalog extension, see Gemini CLI Extension - Dataplex Universal Catalog.

Required roles and permissions

To get the permissions that you need to connect to Dataplex Universal Catalog using MCP Toolbox or the Gemini CLI extension, ask your administrator to grant you the following IAM roles on your project:

For more information about granting roles, see Manage access to projects, folders, and organizations.

These predefined roles contain the permissions required to connect to Dataplex Universal Catalog using MCP Toolbox or the Gemini CLI extension. To see the exact permissions that are required, expand the Required permissions section:

Required permissions

The following permissions are required to connect to Dataplex Universal Catalog using MCP Toolbox or the Gemini CLI extension:

  • To enable APIs: serviceusage.services.enable
  • To use Dataplex Universal Catalog tools:
    • dataplex.projects.search
    • dataplex.entries.get
    • dataplex.aspectTypes.get
    • dataplex.aspectTypes.list

You might also be able to get these permissions with custom roles or other predefined roles.
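
If you already have a list of the permissions granted to your account, a short script can show which of the required permissions are still missing. The following is a minimal sketch in POSIX shell. The `missing_permissions` function and the newline-separated input format are illustrative assumptions, not part of any Google Cloud tool.

```shell
#!/bin/sh
# Permissions listed in the "Required permissions" section above.
REQUIRED_PERMISSIONS="serviceusage.services.enable
dataplex.projects.search
dataplex.entries.get
dataplex.aspectTypes.get
dataplex.aspectTypes.list"

# Hypothetical helper: print each required permission that is absent from
# the newline-separated list of granted permissions passed as $1.
missing_permissions() {
  granted="$1"
  for perm in $REQUIRED_PERMISSIONS; do
    # -x matches whole lines only, -F treats the permission as a literal string.
    if ! printf '%s\n' "$granted" | grep -qxF "$perm"; then
      echo "$perm"
    fi
  done
}

# Example usage (the granted list here is made up for illustration):
#   missing_permissions "dataplex.entries.get
#   dataplex.projects.search"
```

An empty result means every required permission is present in the list you supplied. How you obtain that list (for example, from your administrator) is outside the scope of this sketch.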

Enable the Dataplex Universal Catalog API

  1. Review the permissions required to complete the tasks in this guide.
  2. In the Google Cloud console, go to the project selector page.


  3. Select or create a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
  4. Verify that billing is enabled for your Google Cloud project.

  5. Enable the Dataplex API.

    Roles required to enable APIs

    To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.


  6. If you're using a local shell, then create local authentication credentials for your user account:

    gcloud auth application-default login

    You don't need to do this if you're using Cloud Shell.

    If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.
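
If you prefer the command line to the console, the API can also be enabled with the gcloud CLI. The following is a minimal sketch that wraps `gcloud services enable dataplex.googleapis.com` in a small function. The function name and the `DRY_RUN` variable are illustrative assumptions added so that you can preview the command before running it.

```shell
#!/bin/sh
# Enable the Dataplex API for the given project.
# Set DRY_RUN=1 to print the gcloud command instead of running it
# (DRY_RUN is a convention of this sketch, not a gcloud feature).
enable_dataplex_api() {
  project_id="$1"
  if [ -z "$project_id" ]; then
    echo "usage: enable_dataplex_api PROJECT_ID" >&2
    return 2
  fi
  # Build the command as positional parameters so it can be printed or run.
  set -- gcloud services enable dataplex.googleapis.com --project="$project_id"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Example usage:
#   DRY_RUN=1 enable_dataplex_api PROJECT_ID   # preview only
#   enable_dataplex_api PROJECT_ID             # requires serviceusage.services.enable
```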

Install MCP Toolbox

You don't need to install MCP Toolbox if you only plan to use Gemini Code Assist or the Gemini CLI extension, as they bundle the required server capabilities. For other IDEs and tools, follow the steps in this section to install MCP Toolbox.

  1. Download the latest version of MCP Toolbox as a binary. Select the binary that corresponds to your operating system (OS) and CPU architecture. You must use MCP Toolbox v0.15.0 or later.

    Linux/amd64

    curl -O https://storage.googleapis.com/genai-toolbox/VERSION/linux/amd64/toolbox
    

    Replace VERSION with the MCP Toolbox version, for example, v0.15.0.

    macOS (Darwin)/arm64

    curl -O https://storage.googleapis.com/genai-toolbox/VERSION/darwin/arm64/toolbox
    

    Replace VERSION with the MCP Toolbox version, for example, v0.15.0.

    macOS (Darwin)/amd64

    curl -O https://storage.googleapis.com/genai-toolbox/VERSION/darwin/amd64/toolbox
    

    Replace VERSION with the MCP Toolbox version, for example, v0.15.0.

    Windows/amd64

    curl -O https://storage.googleapis.com/genai-toolbox/VERSION/windows/amd64/toolbox
    

    Replace VERSION with the MCP Toolbox version, for example, v0.15.0.

  2. Make the binary executable:

    chmod +x toolbox
    
  3. Verify the installation:

    ./toolbox --version
    

    A successful installation returns the version number, for example, 0.15.0.
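
The download and verification steps above can be scripted. The following is a minimal sketch, not an official installer. It assumes that `uname -s` and `uname -m` report `Linux`/`Darwin` and `x86_64`/`arm64`, and that `./toolbox --version` prints a plain MAJOR.MINOR.PATCH string as in the example above. The function names `toolbox_url` and `version_at_least` are hypothetical, and Windows is not covered.

```shell
#!/bin/sh
# Print the download URL for a given MCP Toolbox version.
# $2 and $3 default to the output of uname, but can be passed explicitly.
toolbox_url() {
  version="$1"
  os="${2:-$(uname -s)}"
  arch="${3:-$(uname -m)}"
  case "$os/$arch" in
    Linux/x86_64)  platform="linux/amd64" ;;
    Darwin/arm64)  platform="darwin/arm64" ;;
    Darwin/x86_64) platform="darwin/amd64" ;;
    *) echo "unsupported platform: $os/$arch" >&2; return 1 ;;
  esac
  echo "https://storage.googleapis.com/genai-toolbox/$version/$platform/toolbox"
}

# Succeed if version $1 is greater than or equal to version $2.
# Both arguments must look like MAJOR.MINOR.PATCH, with an optional leading "v".
version_at_least() {
  have="${1#v}"
  want="${2#v}"
  old_ifs="$IFS"
  IFS=.
  # Split both versions on dots: $1 $2 $3 = have, $4 $5 $6 = want.
  set -- $have $want
  IFS="$old_ifs"
  [ "$1" -gt "$4" ] && return 0
  [ "$1" -lt "$4" ] && return 1
  [ "$2" -gt "$5" ] && return 0
  [ "$2" -lt "$5" ] && return 1
  [ "$3" -ge "$6" ]
}

# Example usage:
#   curl -O "$(toolbox_url v0.15.0)"
#   chmod +x toolbox
#   version_at_least "$(./toolbox --version)" 0.15.0 || echo "MCP Toolbox v0.15.0 or later is required" >&2
```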

Set up clients and connections

This section explains how to connect Dataplex Universal Catalog to your tools.

If you are using Gemini Code Assist or the standalone Gemini CLI, you don't need to install or configure MCP Toolbox, as these tools bundle the required server capabilities. For setup instructions, see the Gemini Code Assist or Gemini CLI Extension tabs.

For other MCP-compatible tools and IDEs, you must first install MCP Toolbox. The toolbox acts as an open-source Model Context Protocol (MCP) server that sits between your IDE and Dataplex Universal Catalog, providing a secure and efficient control plane for your AI tools. After installation, select the tab for your specific tool to see configuration instructions.
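
The per-tool configurations in the sections that follow share the same shape: a `dataplex` server entry that runs the toolbox binary with `--prebuilt dataplex --stdio` and sets `DATAPLEX_PROJECT`. As a rough sketch, a helper like the following could generate that file. The `write_mcp_config` name and its arguments are hypothetical. The function overwrites the target file rather than merging with existing settings, and it does not escape special characters in the values you pass.

```shell
#!/bin/sh
# Write an MCP client configuration for the Dataplex prebuilt toolbox server.
#   $1: output file, for example .cursor/mcp.json
#   $2: path to the toolbox binary
#   $3: Google Cloud project ID
#   $4: top-level key; defaults to "mcpServers" (VS Code uses "servers")
write_mcp_config() {
  out_file="$1"
  toolbox_path="$2"
  project_id="$3"
  top_key="${4:-mcpServers}"
  mkdir -p "$(dirname "$out_file")"
  # Caution: this replaces any existing content in the file.
  cat > "$out_file" <<EOF
{
  "$top_key": {
    "dataplex": {
      "command": "$toolbox_path",
      "args": ["--prebuilt", "dataplex", "--stdio"],
      "env": {
        "DATAPLEX_PROJECT": "$project_id"
      }
    }
  }
}
EOF
}

# Example usage:
#   write_mcp_config .cursor/mcp.json ./PATH/TO/toolbox PROJECT_ID
#   write_mcp_config .vscode/mcp.json ./PATH/TO/toolbox PROJECT_ID servers
```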

Gemini CLI Extension

This method uses the dedicated dataplex extension for the standalone Gemini CLI tool, and does not use MCP Toolbox.

  1. Install the Gemini CLI.
  2. Install the Dataplex Universal Catalog extension for Gemini CLI from the GitHub repository:
    gemini extensions install https://github.com/gemini-cli-extensions/dataplex
  3. Set the environment variable to connect to your Dataplex Universal Catalog project:
    export DATAPLEX_PROJECT="PROJECT_ID"

    Replace PROJECT_ID with your Google Cloud project ID.

  4. Start the Gemini CLI in interactive mode:
    gemini
    The CLI automatically loads the Dataplex Universal Catalog extension and its tools, which you can use to interact with your data assets.
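
Because the extension reads the project from the `DATAPLEX_PROJECT` environment variable, it can help to confirm that the variable is set before you start the CLI. The following is a small sketch of such a guard. The `check_dataplex_project` function is a hypothetical helper, not part of Gemini CLI or the extension.

```shell
#!/bin/sh
# Fail with a message if DATAPLEX_PROJECT is unset or empty;
# otherwise report which project will be used.
check_dataplex_project() {
  if [ -z "${DATAPLEX_PROJECT:-}" ]; then
    echo "DATAPLEX_PROJECT is not set. Run: export DATAPLEX_PROJECT=\"PROJECT_ID\"" >&2
    return 1
  fi
  echo "Using Dataplex project: $DATAPLEX_PROJECT"
}

# Example usage: start Gemini CLI only when the variable is set.
#   check_dataplex_project && gemini
```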

Gemini Code Assist

Gemini Code Assist bundles the required MCP server capabilities, so you don't need to install MCP Toolbox separately.

  1. In VS Code, install the Gemini Code Assist extension.
  2. Enable Agent Mode in Gemini Code Assist chat.
  3. In your working directory, create a folder named .gemini. Within that, create a settings.json file.
  4. Add the following configuration, replace the environment variables with your values, and save:
      {
        "mcpServers": {
          "dataplex": {
            "command": "./PATH/TO/toolbox",
            "args": ["--prebuilt","dataplex","--stdio"],
            "env": {
              "DATAPLEX_PROJECT": "PROJECT_ID"
            }
          }
        }
      }
      

Claude Code

  1. Install Claude Code.
  2. Create a .mcp.json file in your project root if it doesn't exist.
  3. Add the following configuration, replace the environment variables with your values, and save:
      {
        "mcpServers": {
          "dataplex": {
            "command": "./PATH/TO/toolbox",
            "args": ["--prebuilt","dataplex","--stdio"],
            "env": {
              "DATAPLEX_PROJECT": "PROJECT_ID"
            }
          }
        }
      }
      

Claude Desktop

  1. Open Claude Desktop and navigate to Settings.
  2. To open the configuration file, in the Developer tab, click Edit config.
  3. Add the following configuration, replace the environment variables with your values, and save:
      {
        "mcpServers": {
          "dataplex": {
            "command": "./PATH/TO/toolbox",
            "args": ["--prebuilt","dataplex","--stdio"],
            "env": {
              "DATAPLEX_PROJECT": "PROJECT_ID"
            }
          }
        }
      }
      
  4. Restart Claude Desktop.
    The new chat screen displays an MCP icon with the new MCP server.

Cline

  1. In VS Code, open the Cline extension and then click the MCP Servers icon.
  2. To open the configuration file, click Configure MCP Servers.
  3. Add the following configuration, replace the environment variables with your values, and save:
      {
        "mcpServers": {
          "dataplex": {
            "command": "./PATH/TO/toolbox",
            "args": ["--prebuilt","dataplex","--stdio"],
            "env": {
              "DATAPLEX_PROJECT": "PROJECT_ID"
            }
          }
        }
      }
      
    A green active status appears after the server connects successfully.

Cursor

  1. Create the .cursor directory in your project root if it doesn't exist.
  2. Create the .cursor/mcp.json file if it doesn't exist and open it.
  3. Add the following configuration, replace the environment variables with your values, and save:
      {
        "mcpServers": {
          "dataplex": {
            "command": "./PATH/TO/toolbox",
            "args": ["--prebuilt","dataplex","--stdio"],
            "env": {
              "DATAPLEX_PROJECT": "PROJECT_ID"
            }
          }
        }
      }
      
  4. Open Cursor and navigate to Settings > Cursor Settings > MCP. A green active status appears when the server connects.

VS Code (Copilot)

  1. Open VS Code and create the .vscode directory in your project root if it doesn't exist.
  2. Create the .vscode/mcp.json file if it doesn't exist, and open it.
  3. Add the following configuration, replace the environment variables with your values, and save:
      {
        "servers": {
          "dataplex": {
            "command": "./PATH/TO/toolbox",
            "args": ["--prebuilt","dataplex","--stdio"],
            "env": {
              "DATAPLEX_PROJECT": "PROJECT_ID"
            }
          }
        }
      }
      

Windsurf

  1. Open Windsurf and navigate to the Cascade assistant.
  2. To open the configuration file, click the MCP icon, then click Configure.
  3. Add the following configuration, replace the environment variables with your values, and save:
      {
        "mcpServers": {
          "dataplex": {
            "command": "./PATH/TO/toolbox",
            "args": ["--prebuilt","dataplex","--stdio"],
            "env": {
              "DATAPLEX_PROJECT": "PROJECT_ID"
            }
          }
        }
      }
      

Use the tools

Your AI tool is now connected to Dataplex Universal Catalog. Try asking your AI assistant to find some data assets such as BigQuery datasets, Cloud SQL instances, and others.

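The following tools are available to the LLM:

  • dataplex_search_entries: finds data assets such as datasets and tables.
  • dataplex_lookup_entry: returns details for a specific entry, such as its schema, data quality rules, or ownership.
  • dataplex_search_aspect_types: finds aspect types, such as those used for governance rules or classifications.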

Optional: Add system instructions

System instructions are a way to provide specific guidelines to the LLM, helping it to understand the context and respond more accurately. Set up system instructions based on the recommended system prompt.

For example, you can add instructions to guide the LLM on how to use the Dataplex Universal Catalog tools:

  • When asked to find datasets or tables, use the dataplex_search_entries tool.
  • If asked for table schema or metadata details like data quality rules or ownership, use the dataplex_lookup_entry tool.
  • When asked about governance rules or classifications, start by using dataplex_search_aspect_types to find relevant aspect types.
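
One way to apply the example instructions above is to append them to whichever instructions file your tool reads. The following is a minimal sketch. The `append_dataplex_instructions` function is hypothetical, and the target file path is left to you because it depends on the tool.

```shell
#!/bin/sh
# Append the example Dataplex tool-selection instructions to the file in $1.
# Appending (>>) keeps any instructions that are already in the file.
append_dataplex_instructions() {
  out_file="$1"
  cat >> "$out_file" <<'EOF'
When asked to find datasets or tables, use the dataplex_search_entries tool.
If asked for table schema or metadata details like data quality rules or ownership, use the dataplex_lookup_entry tool.
When asked about governance rules or classifications, start by using dataplex_search_aspect_types to find relevant aspect types.
EOF
}

# Example usage (replace the path with your tool's instructions file):
#   append_dataplex_instructions PATH/TO/INSTRUCTIONS_FILE
```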

For more information about how to configure instructions, see Use instructions to get AI edits that follow your coding style.

What's next