Deploy an agent

To deploy an agent on Vertex AI Agent Engine, choose between two primary methods:

  • Deploy from an agent object: Ideal for interactive development in environments like Colab, enabling deployment of an in-memory local_agent object. This method works best for agents whose structure doesn't contain complex, non-serializable components.
  • Deploy from source files: This method is well-suited for automated workflows such as CI/CD pipelines and Infrastructure as Code tools like Terraform, enabling fully declarative and automated deployments. It deploys your agent directly from local source code and does not require a Cloud Storage bucket.

To get started, use the following steps:

  1. Complete prerequisites.
  2. (Optional) Configure your agent for deployment.
  3. Create an AgentEngine instance.
  4. (Optional) Get the agent resource ID.
  5. (Optional) List the supported operations.
  6. (Optional) Grant the deployed agent permissions.

You can also use Agent Starter Pack templates for deployment.

Prerequisites

Before you deploy an agent, make sure you have completed the following tasks:

  1. Set up your environment.
  2. Develop an agent.
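
The code samples on this page assume a client that you create when you set up your environment, along the following lines (a sketch; substitute your own project ID and a supported region):

import vertexai

# Client used by the deployment calls on this page.
client = vertexai.Client(
    project="PROJECT_ID",
    location="LOCATION",
)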

(Optional) Configure your agent for deployment

You can optionally configure your agent before you deploy it, for example by setting its package requirements, environment variables, service account, or instance scaling. These options correspond to the config fields in the client.agent_engines.create calls shown in the following section.

Create an AgentEngine instance

This section describes how to create an AgentEngine instance for deploying an agent.

To deploy an agent on Vertex AI Agent Engine, you can choose between the following methods:

  • Deploying from an agent object for interactive development.
  • Deploying from source files for automated, file-based workflows.

From an agent object

To deploy the agent on Vertex AI, use client.agent_engines.create to pass in the local_agent object along with any optional configurations:

remote_agent = client.agent_engines.create(
    agent=local_agent,                                  # Optional.
    config={
        "requirements": requirements,                   # Optional.
        "extra_packages": extra_packages,               # Optional.
        "gcs_dir_name": gcs_dir_name,                   # Optional.
        "display_name": display_name,                   # Optional.
        "description": description,                     # Optional.
        "labels": labels,                               # Optional.
        "env_vars": env_vars,                           # Optional.
        "build_options": build_options,                 # Optional.
        "service_account": service_account,             # Optional.
        "min_instances": min_instances,                 # Optional.
        "max_instances": max_instances,                 # Optional.
        "resource_limits": resource_limits,             # Optional.
        "container_concurrency": container_concurrency, # Optional
        "encryption_spec": encryption_spec,             # Optional.
    },
)
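
For example, a minimal deployment that sets only a display name and the package requirements might look like the following (the values shown are illustrative):

remote_agent = client.agent_engines.create(
    agent=local_agent,
    config={
        "display_name": "my-first-agent",  # Illustrative name.
        "requirements": ["google-cloud-aiplatform[agent_engines]"],
    },
)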

Deployment takes a few minutes, during which the following steps happen in the background:

  1. A bundle of the following artifacts is generated locally:

    • *.pkl: a pickle file corresponding to local_agent.
    • requirements.txt: a text file containing the package requirements.
    • dependencies.tar.gz: a tar file containing any extra packages.
  2. The bundle is uploaded to Cloud Storage (under the corresponding folder) for staging the artifacts.

  3. The Cloud Storage URIs for the respective artifacts are specified in the PackageSpec.

  4. The Vertex AI Agent Engine service receives the request and builds containers and starts HTTP servers on the backend.

Deployment latency depends on the total time it takes to install required packages. Once deployed, remote_agent corresponds to an instance of local_agent that is running on Vertex AI and can be queried or deleted.
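
As a sketch, assuming your agent registers a query operation (operation names depend on your framework; see List the supported operations), you can exercise and then remove the deployment:

# Call a registered operation on the deployed agent.
response = remote_agent.query(input="What is Vertex AI Agent Engine?")
print(response)

# Remove the deployment when you no longer need it.
client.agent_engines.delete(name=remote_agent.api_resource.name, force=True)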

The remote_agent object is an instance of the AgentEngine class. Among its attributes is api_resource, which you can use to get the resource ID of the deployed agent (see the sections that follow).

From source files

To deploy from source files on Vertex AI, use client.agent_engines.create and provide source_packages, entrypoint_module, entrypoint_object, and class_methods in the config dictionary, along with any other optional configurations. With this method, you don't need to pass an agent object or specify a Cloud Storage bucket.

remote_agent = client.agent_engines.create(
    config={
        "source_packages": source_packages,             # Required.
        "entrypoint_module": entrypoint_module,         # Required.
        "entrypoint_object": entrypoint_object,         # Required.
        "class_methods": class_methods,                 # Required.
        "requirements_file": requirements_file,         # Optional.
        "display_name": display_name,                   # Optional.
        "description": description,                     # Optional.
        "labels": labels,                               # Optional.
        "env_vars": env_vars,                           # Optional.
        "build_options": build_options,                 # Optional.
        "service_account": service_account,             # Optional.
        "min_instances": min_instances,                 # Optional.
        "max_instances": max_instances,                 # Optional.
        "resource_limits": resource_limits,             # Optional.
        "container_concurrency": container_concurrency, # Optional
        "encryption_spec": encryption_spec,             # Optional.
    },
)

The parameters for inline source deployment are:

  • source_packages (Required, list[str]): A list of local file or directory paths to include in the deployment. The total size of the files and directories in source_packages shouldn't exceed 8MB.
  • entrypoint_module (Required, str): The fully qualified Python module name containing the agent entrypoint (for example, agent_dir.agent).
  • entrypoint_object (Required, str): The name of the callable object within the entrypoint_module that represents the agent application (for example, root_agent).
  • class_methods (Required, list[dict]): A list of dictionaries that define the agent's exposed methods. Each dictionary includes a name (Required), an api_mode (Required), and a parameters field. Refer to List the supported operations for more information about the methods for a custom agent.

    For example:

      "class_methods": [
          {
              "name": "method_name",
              "api_mode": "", # Possible options are: "", "async", "async_stream", "stream", "bidi_stream"
              "parameters": {
                  "type": "object",
                  "properties": {
                      "param1": {"type": "string", "description": "Description of param1"},
                      "param2": {"type": "integer"}
                  },
                  "required": ["param1"]
              }
          }
      ]
    
  • requirements_file (Optional, str): The path to a pip requirements file within the paths specified in source_packages. Defaults to requirements.txt at the root directory of the packaged source.

Deployment takes a few minutes, during which the following steps happen in the background:

  1. The Vertex AI SDK creates a tar.gz archive of the paths specified in source_packages.
  2. This archive is encoded and sent directly to the Vertex AI API.
  3. The Vertex AI Agent Engine service receives the archive, extracts it, installs dependencies from requirements_file (if provided), and starts the agent application using the specified entrypoint_module and entrypoint_object.

Deployment latency depends on the total time it takes to install required packages. Once deployed, remote_agent corresponds to an instance of the agent application that is running on Vertex AI and can be queried or deleted.

The remote_agent object is an instance of the AgentEngine class, with the same attributes as an agent deployed from an agent object.

The following is an example of deploying an agent from source files:

import vertexai

# client as created in Set up your environment, for example:
client = vertexai.Client(project="PROJECT_ID", location="LOCATION")

# Example file structure:
# /agent_directory
#     ├── agent.py
#     ├── requirements.txt

# Example agent_directory/agent.py:
# class MyAgent:
#     def ask(self, question: str) -> str:
#         return f"Answer to {question}"
# root_agent = MyAgent()

remote_agent = client.agent_engines.create(
  config={
      "display_name": "My Agent",
      "description": "An agent deployed from a local source.",
      "source_packages": ["agent_directory"],
      "entrypoint_module": "agent_directory.agent",
      "entrypoint_object": "root_agent",
      "requirements_file": "requirements.txt",
      "class_methods": [
          {"name": "ask", "api_mode": "", "parameters": {
              "type": "object",
              "properties": {
                  "question": {"type": "string"}
              },
              "required": ["question"]
          }},
      ],
      # Other optional configs:
      # "env_vars": {...},
      # "service_account": "...",
  }
)
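
Once this deployment completes, the ask method registered through class_methods becomes callable on the returned object (a sketch based on the example agent above):

response = remote_agent.ask(question="What is Vertex AI Agent Engine?")
print(response)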

(Optional) Get the agent resource ID

Each deployed agent has a unique identifier. You can run the following command to get the resource name for your deployed agent:

remote_agent.api_resource.name

The response should look like the following string:

"projects/PROJECT_NUMBER/locations/LOCATION/reasoningEngines/RESOURCE_ID"

where

  • PROJECT_NUMBER is the Google Cloud project number of the project where the deployed agent runs.

  • LOCATION is the region where the deployed agent runs.

  • RESOURCE_ID is the ID of the deployed agent as a reasoningEngine resource.
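
You can use this resource name to reconnect to the deployed agent later, for example from a different session (a sketch, assuming the same client setup as earlier):

resource_name = remote_agent.api_resource.name

# In a later session, retrieve the deployed agent by its resource name.
remote_agent = client.agent_engines.get(name=resource_name)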

(Optional) List the supported operations

Each deployed agent has a list of supported operations. You can run the following command to get the list of operations supported by the deployed agent:

remote_agent.operation_schemas()

The schema for each operation is a dictionary that documents a callable method of the agent. The set of supported operations depends on the framework that you used to develop your agent.
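
For example, to print the name of each supported operation (a sketch; the exact keys in each schema can vary by framework):

for schema in remote_agent.operation_schemas():
    print(schema.get("name"))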

(Optional) Grant the deployed agent permissions

If the deployed agent needs to be granted any additional permissions, follow the instructions in Set up the identity and permissions for your agent.

What's next