
Building scalable AI agents: Design patterns with Agent Engine on Google Cloud

October 20, 2025
Schneider Larbi

Principal Partner Engineer

David Peterside

Partner Engineer


AI Agents are now a reality, moving beyond chatbots to understand intent, collaborate, and execute complex workflows. This leads to increased efficiency, lower costs, and improved customer and employee experiences. This is a key opportunity for System Integrator (SI) Partners to deliver Google Cloud’s advanced AI to more customers. This post details how to build, scale, and manage enterprise-grade agentic systems using Google Cloud AI products to enable SI Partners to offer these transformative solutions to enterprise clients.

Enterprise challenges

The limitations of traditional, rule-based automation are becoming increasingly apparent in the face of today’s complex business challenges. Its inherent rigidity often leads to protracted approval processes, outdated risk models, and a critical lack of agility, thereby impeding the ability to seize new opportunities and respond effectively to operational demands.

These challenges are further compounded in modern enterprises by fragmented IT landscapes, characterized by legacy systems and siloed data, which collectively hinder seamless integration and scalable growth. Furthermore, static systems are ill-equipped to adapt instantaneously to market volatility or unforeseen "black swan" events. They also fall short in delivering the personalization and operational optimization required to manage escalating complexity—such as in cybersecurity and resource allocation—at scale. In this dynamic environment, AI agents offer the necessary paradigm shift to overcome these persistent limitations.

How SI Partners are solving business challenges with AI agents

Let's discuss how SIs are working with Google Cloud to solve these business challenges:

Deloitte: A major retail client sought to enhance inventory accuracy and streamline reconciliation across its diverse store locations. The client needed various users—Merchants, Supply Chain, Marketing, and Inventory Controls—to interact with inventory data through natural language prompts. This interaction would enable them to check inventory levels, detect anomalies, research reconciliation data, and execute automated actions.

Deloitte leveraged Google Cloud AI Agents and Gemini Enterprise to create a solution that generates insights, identifies discrepancies, and offers actionable recommendations based on inventory data. This solution utilizes Agentic AI to integrate disparate data sources and deliver real-time recommendations, ultimately aiming to foster trust and confidence in the underlying inventory data.

Quantiphi: To improve customer experience and optimize sales operations, a furniture manufacturer partnered with Quantiphi to deploy generative AI and create a dynamic, intelligent assistant on Google Cloud. The multi-agent system automates quotation response creation, significantly accelerating the process. At its core is an orchestrator, built with the Agent Development Kit (ADK) and the Agent-to-Agent (A2A) framework, that seamlessly coordinates between agents to assemble the right response - whether the user is researching market trends, asking about product details, or analyzing sales data. Leveraging the capabilities of Google Cloud's Gemini models and BigQuery, the assistant delivers unparalleled insights, transforming how users access data and make decisions.

These examples represent just a fraction of the numerous use cases spanning diverse industry verticals, including healthcare, manufacturing, and financial services, that are being deployed in the field by SIs working in close collaboration with Google Cloud. 

Architecture and design patterns used by SIs

The strong partnership between Google Cloud and SIs is instrumental in delivering true business value to customers. Let's examine the scalable architecture patterns employed by Google Cloud SIs in the field to tackle Agentic AI challenges.

To comprehend Agentic AI architectures, it's crucial to first understand what an AI agent is. An AI agent is a software entity endowed with the capacity to plan, reason, and execute complex actions for users with minimal human intervention. AI agents leverage advanced AI models for reasoning and informed decision-making, while utilizing tools to fetch data from external sources for real-time and grounded information. Agents typically operate within a compute runtime. The diagram below illustrates the basic components of an agent:

https://storage.googleapis.com/gweb-cloudblog-publish/images/image2_0O4Fuj5.max-1200x1200.png

Base AI Agent Components

The snippet below demonstrates what an agent's code looks like in the Python programming language:

https://storage.googleapis.com/gweb-cloudblog-publish/images/image1_NzKVIVZ.max-900x900.png

Code snippet of an AI Agent

This agent code snippet showcases the components depicted in the first diagram: an Agent with a Name, Large Language Model (LLM), Description, Instruction, and Tools, which together enable the agent to perform its designated functions.
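As a rough plain-Python illustration of those components (this is not the ADK API itself; the class, field, and tool names here are hypothetical stand-ins), an agent bundles a name, a model identifier, a description, an instruction, and a set of callable tools:

```python
from dataclasses import dataclass, field
from typing import Callable

def get_inventory_level(item: str) -> dict:
    """Hypothetical tool: look up stock for an item (stubbed data)."""
    stock = {"mens_backpack": 42, "womens_tote": 17}
    return {"item": item, "units": stock.get(item, 0)}

@dataclass
class AgentSpec:
    """Mirrors the components in the diagram above:
    name, model, description, instruction, and tools."""
    name: str
    model: str
    description: str
    instruction: str
    tools: list = field(default_factory=list)

    def call_tool(self, tool_name: str, **kwargs):
        # In a real agent, the LLM decides which tool to invoke and
        # with what arguments; here we simply dispatch by name.
        for tool in self.tools:
            if tool.__name__ == tool_name:
                return tool(**kwargs)
        raise ValueError(f"unknown tool: {tool_name}")

inventory_agent = AgentSpec(
    name="inventory_agent",
    model="gemini-2.5-flash",
    description="Answers questions about inventory levels.",
    instruction="Use the inventory tool to answer stock questions.",
    tools=[get_inventory_level],
)
```

With this sketch, `inventory_agent.call_tool("get_inventory_level", item="mens_backpack")` returns the stubbed stock record; in a production agent, that dispatch step is what the framework and the model handle for you.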

To build enterprise-grade agents at scale, several factors must be considered during their ground-up development. Google Cloud has collaborated closely with its Partner ecosystem to employ cutting-edge Google Cloud products to build scalable and enterprise-ready agents.

A key consideration in agent development is the framework. Without it, developers would be compelled to build everything from scratch, including state management, tool handling, and workflow orchestration. This often results in systems that are complex, difficult to debug, insecure, and ultimately unscalable. Google Cloud Agent Development Kit (ADK) provides essential scaffolding, tools, and patterns for efficient and secure enterprise agent development at scale. It offers developers the flexibility to customize agents to suit nearly every applicable use case.

Agent development with any framework, especially multi-agent architectures in enterprises, necessitates robust compute resources and scalable infrastructure. This includes strong security measures, comprehensive tracing, logging, and monitoring capabilities, as well as rigorous evaluation of the agent’s decisions and output.

Furthermore, agents typically lack inherent memory, meaning they cannot recall past interactions or maintain context for effective operation. While frameworks like ADK offer ephemeral memory storage for agents, enterprise-grade agents demand persistent memory. This persistent memory is vital for equipping agents with the necessary context to enhance their performance and the quality of their output.
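To make the ephemeral-versus-persistent distinction concrete, here is a minimal sketch (the class and method names are illustrative, not ADK's or Agent Engine's session API): a persistent store writes each session turn through to durable storage, so a restarted agent process still has the conversation context. A JSON file stands in for the managed backing store here.

```python
import json
import tempfile
from pathlib import Path

class PersistentSessionStore:
    """Toy session store: appends each turn to a JSON file so context
    survives process restarts. A real deployment would use a managed
    backing service rather than a local file."""

    def __init__(self, path: Path):
        self.path = path

    def append(self, session_id: str, role: str, text: str) -> None:
        sessions = self._load()
        sessions.setdefault(session_id, []).append({"role": role, "text": text})
        self.path.write_text(json.dumps(sessions))

    def history(self, session_id: str) -> list:
        return self._load().get(session_id, [])

    def _load(self) -> dict:
        if self.path.exists():
            return json.loads(self.path.read_text())
        return {}

store_path = Path(tempfile.mkdtemp()) / "sessions.json"
store = PersistentSessionStore(store_path)
store.append("user-1", "user", "Show inventory for men's backpacks")
store.append("user-1", "agent", "There are 42 units in stock")

# A fresh store instance (simulating a process restart) still sees the history,
# which is exactly what ephemeral in-process memory cannot provide.
restarted = PersistentSessionStore(store_path)
```

The point of the sketch is the contrast: with in-process memory, `restarted.history("user-1")` would come back empty after a restart; with a persistent store, the agent can pick up the session where it left off.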

Google Cloud’s Vertex AI Agent Engine provides a secure runtime for agents that manages their lifecycle, orchestrates tools, and drives reasoning. It features built-in security, observability, and critical building blocks such as a memory bank, session service, and sandbox. Agent Engine is accessible to SIs and customers on Google Cloud. Alternative options for running agents at scale include Cloud Run or GKE.

Customers often opt for these alternatives when they already have existing investments in Cloud Run or GKE infrastructure on Google Cloud, or when they require configuration flexibility concerning compute, storage, and networking, as well as flexible cost management. However, when choosing Cloud Run or GKE, functions like memory and session management must be built and managed from the ground up.

Model Context Protocol (MCP) is a crucial element for modern AI agent architectures. This open protocol standardizes how applications provide context to LLMs, thereby improving agent responses by connecting agents and underlying AI models to various data sources and tools. It's important to note that Agents also communicate with enterprise systems using APIs, which are referred to as Tools when employed with agents. MCP enables agents to access fresh external data.

When developing enterprise agents at scale, it is recommended to deploy the MCP servers separately on a serverless platform like Cloud Run or GKE on Google Cloud, with agents running on Agent Engine configured as clients. The sample architecture below illustrates the recommended deployment model for MCP integration with ADK agents:

https://storage.googleapis.com/gweb-cloudblog-publish/images/image4_jzjerKh.max-1100x1100.png

AI agent tool integration with MCP

The reference architecture demonstrates how ADK-built agents can integrate with MCP to connect data sources and provide context to the underlying LLMs. MCP utilizes Get, Invoke, List, and Call functions to enable tools to connect agents to external data sources. In this scenario, the agent can interact with a graph database through application APIs using MCP, allowing the agent and the underlying LLM to access up-to-date data for generating meaningful responses.
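The list-then-call pattern can be sketched in a few lines of plain Python (a stand-in for a real MCP client/server pair rather than the protocol itself; the tool registry and the graph lookup are hypothetical): the client first discovers which tools the server exposes, then calls one by name with arguments and hands the structured result back to the model.

```python
class ToyMCPServer:
    """Stand-in for an MCP server: exposes a registry of named tools.
    The graph lookup below returns canned data."""

    def __init__(self):
        self._tools = {
            "query_graph": {
                "description": "Run a lookup against the graph database.",
                "fn": lambda node: {"node": node,
                                    "neighbors": ["supplier_a", "supplier_b"]},
            }
        }

    def list_tools(self) -> list:
        return sorted(self._tools)

    def call_tool(self, name: str, **kwargs):
        return self._tools[name]["fn"](**kwargs)

class ToyMCPClient:
    """The agent side: discovers tools, then invokes them for fresh data."""

    def __init__(self, server: ToyMCPServer):
        self.server = server

    def fetch(self, tool: str, **kwargs):
        # Discover before calling, mirroring MCP's list/call flow.
        if tool not in self.server.list_tools():
            raise KeyError(f"server does not expose {tool!r}")
        return self.server.call_tool(tool, **kwargs)

client = ToyMCPClient(ToyMCPServer())
result = client.fetch("query_graph", node="mens_backpack")
```

In the deployment model described above, the server side of this sketch would run on Cloud Run or GKE and the client side inside the agent on Agent Engine, with the protocol handling transport and schema rather than direct method calls.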

Furthermore, when building multi-agent architectures that demand interoperability and communication among agents from different systems, a key consideration is how to facilitate Agent-to-Agent communication. This addresses complex use cases that require workflow execution across various agents from different domains. 

Google Cloud launched the Agent-to-Agent Protocol (A2A) with native support within Agent Engine to tackle the challenge of inter-agent communication at scale. Learn how to implement A2A in this blog.

Google Cloud has collaborated with SIs on agentic architecture and design considerations to build multiple agents, assisting clients in addressing various use cases across industry domains such as Retail, Manufacturing, Healthcare, Automotive, and Financial Services. The reference architecture below consolidates these considerations.

https://storage.googleapis.com/gweb-cloudblog-publish/images/image3_9qMCmk5.max-1000x1000.png

Reference architecture - Agentic AI system with ADK, MCP, A2A and Agent Engine

This reference architecture depicts an enterprise-grade Agent built on Google Cloud to address a supply chain use case. In this architecture, all agents are built with the ADK framework and deployed on Agent Engine. Agent Engine provides a secure compute runtime with authentication, context management using managed sessions, memory, and quality assurance through Example Store and Evaluation Services, while also offering observability into the deployed agents. Agent Engine delivers all these features and many more as a managed service at scale on GCP. 

This architecture outlines an Agentic supply chain featuring an orchestration agent (Root) and three dedicated sub-agents: Tracking, Distributor, and Order Agents. Each of these agents is powered by Gemini. For optimal performance and tailored responses, especially in specific use cases, we recommend tuning your model with domain-specific data before integration with an agent. Model tuning can also help optimize responses for conciseness, potentially leading to reduced token size and lower operational costs.

For instance, a user might send a request such as "show me the inventory levels for men’s backpack." The Root agent receives this request and routes it to the Order agent, which is responsible for inventory and order operations. This routing is seamless because the A2A protocol utilizes agent cards to advertise the capabilities of each respective agent. A2A can be configured in a few steps as a wrapper around your agents before deployment to Agent Engine.
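Card-based routing can be sketched as follows (a simplified illustration of the idea, not the A2A wire protocol; the card fields and capability keywords are hypothetical): each sub-agent advertises its capabilities in a card, and the Root agent matches the incoming request against those cards to pick a destination.

```python
from dataclasses import dataclass

@dataclass
class AgentCard:
    """Simplified A2A-style card: advertises what an agent can do."""
    name: str
    capabilities: list

CARDS = [
    AgentCard("tracking_agent", ["tracking", "delivery", "shipment"]),
    AgentCard("distributor_agent", ["supplier", "restock", "distributor"]),
    AgentCard("order_agent", ["inventory", "order", "stock"]),
]

def route(request: str, cards=CARDS) -> str:
    """Root agent: pick the sub-agent whose advertised capabilities
    appear in the request. A real system would let the LLM reason over
    the cards rather than keyword-match."""
    words = request.lower()
    for card in cards:
        if any(cap in words for cap in card.capabilities):
            return card.name
    return "root_agent"  # fall back to handling the request directly

target = route("show me the inventory levels for men's backpack")
```

Here the sample request matches the Order agent's card, which is the routing decision the reference architecture describes; requests about shipments would land on the Tracking agent instead.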

In this example, inventory and order details are stored in BigQuery. Therefore, the agent uses its tool configuration to leverage the MCP server to fetch the inventory details from the BigQuery data warehouse. The response is then returned to the underlying LLM, which generates a formatted natural language response and provides the inventory details for men’s backpacks to the Root agent and subsequently to the user. Based on this response, the user can, for example, place an order to replenish the inventory. 
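That fetch step can be sketched as a tool function (stubbed with in-memory rows standing in for a BigQuery table; the table, column, and function names are made up for illustration): the agent calls the tool, the tool runs the equivalent of a parameterized query, and the structured rows are handed back to the LLM to phrase the natural language answer.

```python
# Rows standing in for a BigQuery `inventory` table.
INVENTORY_ROWS = [
    {"sku": "BP-M-001", "product": "men's backpack", "units": 42, "warehouse": "east"},
    {"sku": "BP-M-001", "product": "men's backpack", "units": 13, "warehouse": "west"},
    {"sku": "TT-W-002", "product": "women's tote",   "units": 7,  "warehouse": "east"},
]

def get_inventory(product: str, rows=INVENTORY_ROWS) -> dict:
    """Tool the Order agent would expose. A real version would issue a
    parameterized SQL query through the MCP server, along the lines of:
    SELECT warehouse, SUM(units) FROM inventory WHERE product = @product
    GROUP BY warehouse."""
    matches = [r for r in rows if r["product"] == product.lower()]
    return {
        "product": product,
        "total_units": sum(r["units"] for r in matches),
        "by_warehouse": {r["warehouse"]: r["units"] for r in matches},
    }

answer = get_inventory("men's backpack")
```

The structured result (totals per warehouse) is what the underlying LLM receives; generating the friendly sentence the user sees is the model's job, not the tool's.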

When such a request is made, the Root agent routes it to the Distributor agent. This agent possesses knowledge of all suppliers who provide stock to the business. Depending on the item being requested, the agent will use its tools to initiate an MCP server connection to the correct external API endpoints for the respective supplier to place the order. If the suppliers have agents configured, the A2A protocol can also be utilized to send the request to the supplier's agent for processing. Any acknowledgment of the order is then sent back to the Distributor agent.

In this reference architecture, when the Distributor agent receives acknowledgment, A2A enables the agent to detect the presence of a Tracking agent that monitors new orders until delivery. The Distributor agent will pass the order details to the Tracking agent and also send updates back to the user. The Tracking agent will then send order updates to the user via messaging, utilizing the public API endpoint of the supplier. This is merely one example of a workflow that could be built with this reference architecture.

This modular architecture can be adapted to solve various use cases with Agentic AI built with ADK and deployed to Agent Engine.

The reference architecture allows this multi-agent system to be consumed via a chat interface through a website or a custom-built user interface. It is also possible to integrate this agentic AI architecture with Google Cloud Gemini Enterprise.

Learn how enterprises can start by using Gemini Enterprise as the front door to Google Cloud AI in this blog from Alphabet CEO Sundar Pichai. This approach helps enterprises start small with low-code, out-of-the-box agents. As they mature, they can implement complex use cases with advanced, high-code AI agents using this reference architecture.

Getting started

This blog post has explored the design patterns for building intelligent enterprise AI agents. For enterprise decision makers, the 5 essential elements for implementing agentic solutions can help guide your strategy and decision making when it comes to running enterprise agents at scale.

We encourage you to embark on this journey today by collaborating with the Google Cloud Partner ecosystem to understand your enterprise landscape and identify complex use cases that can be effectively addressed with AI agents. Utilize these design patterns as your guide and leverage the ADK to transform your enterprise use case into a powerful, scalable solution that delivers tangible business value on Agent Engine with Google Cloud.
