Beyond the pilot: Five hard-won lessons from Google Cloud’s AI transformation

Amy Liu
Head of AI Solutions, Value Creation
Raymond Peng
Senior Principal, AI Value Creation, Google Cloud
Here's a five-step framework for scaling AI from experimental prototypes to high-value, integrated enterprise production.
According to our 2025 ROI of AI Report, the era of AI experimentation is rapidly shifting into production: more than half (52%) of executives report that their organizations are now actively using AI agents, with 39% having already launched ten or more. At the same time, organizations are discovering that while prototyping is a relatively easy lift, it’s much harder to scale sustained usage and AI fluency across multiple functions, such as sales, marketing, human resources, and operations. In particular, many of them are now hitting a wall of AI Sprawl, with uncoordinated strategies leading to fragmented workflows, governance risks, and a struggle to ground models in true enterprise context.
At Google Cloud, we are navigating this transition within our own Go-To-Market (GTM) organization, moving past the initial stages of generative AI and into the grit of enterprise-scale deployment. To help leaders and executives navigate these innovation-stage growing pains, we’ve distilled our internal journey into five core lessons for maturing experimentation into sustainable, high-value AI integration.
1. Curate data and context for AI
AI agents and applications can only be as good as the context and data they are given — that’s why models need to be grounded in what we call “enterprise truth.” You might build a brilliant model on day one, but relying on static documents like PDFs or Drive folders to provide context can inadvertently trap your team in a never-ending cycle of manual uploads to refresh data as the business evolves.
The new approach: From static to dynamic
We shifted our focus to dynamic streams of data. Instead of just attaching raw files and documents, we use routine automated jobs, where multimodal Gemini models search, curate, and extract granular pieces of data from across repositories that provide the exact context needed. Using the Agent2Agent (A2A) Protocol, our GTM AI agents can also retrieve information from agents that other teams are building across Google Cloud, trained on their specialized data and expertise. This approach allows us to instantly access curated datasets from multiple teams without having to build data pipelines or figure out how to unify data.
Our top takeaway
Performance hinges on selecting the right data — not the most. We have found that smaller, targeted datasets of relevant information consistently outperform massive, comprehensive data dumps. Even when connecting additional sources to AI agents, we implement interim layers of automation to filter out irrelevant or redundant information.
Putting this strategy into action relies on the following key enablers:
- People: Data stewards who can identify trusted enterprise data sources, and prompt engineers to help automate data extraction pipelines.
- Process: Automated workflows that provide complete, accurate, and real-time data, minimizing the risks of AI hallucinations.
- Tooling: Unified platforms where AI and data reside together, such as BigQuery or Gemini, to create a self-sustaining “flywheel” that can continuously feed AI models with real-time data and context.
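As a minimal illustration of the interim filtering layer described above, a curation step can score candidate records for relevance and keep only a small, targeted subset before grounding a model. The record structure, keywords, and scoring logic here are hypothetical stand-ins, not Google Cloud's internal implementation:

```python
from dataclasses import dataclass

@dataclass
class Record:
    source: str
    text: str

def curate(records, keywords, max_records=5):
    """Score each record by keyword overlap and keep only the
    most relevant few: a targeted subset beats a data dump."""
    def score(rec):
        text = rec.text.lower()
        return sum(1 for kw in keywords if kw.lower() in text)
    ranked = sorted(records, key=score, reverse=True)
    # Drop records with no relevance signal at all.
    return [r for r in ranked[:max_records] if score(r) > 0]

records = [
    Record("crm", "Acme Corp renewal discussion, Q3 pipeline"),
    Record("wiki", "Office holiday party planning notes"),
    Record("crm", "Acme Corp support escalation history"),
]
context = curate(records, keywords=["acme", "pipeline"])
print([r.text for r in context])
```

In practice the scoring step could itself be a Gemini model call, but the shape of the pipeline is the same: rank, cap, and discard before anything reaches the agent's context window.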
2. Design the optimal user experience
Within the industry, a common misconception is that AI just means building a chatbot. However, open-ended chat interfaces often lack a clear starting point for enterprise users, limiting their ability to get consistent, high-quality results that can be used in their daily workflows. They don’t want to spend 20 minutes on prompt engineering — they want instructions on what to fill out and a button to click that executes an existing, expertly crafted prompt.
The new approach: Guided task completion
Instead of chat-based prompting, we pivoted to a guided agent experience in a web application that requires the fewest possible inputs to deliver customized output, ready for immediate use in existing workflows. For example, we built an internal AI agent with a pre-defined system prompt that generates tailored presentation slides and documents for our GTM sellers — using just a company name and a few account details. In addition to the context from our sellers, the prompt also uses Grounding with Google Search to retrieve accurate, fresh company research and connects to Google Workspace to automatically generate templated assets that adhere to our official branding guidelines.
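The pattern above can be sketched in a few lines: the seller supplies only a company name and a few account details, and everything else lives in a pre-defined, expert-written prompt behind a button. The prompt text and function names here are illustrative assumptions, not the actual internal agent:

```python
from string import Template

# Expert-crafted system prompt maintained by the tool's builders,
# not typed by the seller.
SYSTEM_PROMPT = Template(
    "You are a sales-enablement assistant. Using current research on "
    "$company, draft a briefing deck outline for the account team. "
    "Known account details: $details. "
    "Follow official branding guidelines and keep it to one page."
)

def build_prompt(company: str, details: str) -> str:
    """The user fills in two fields; the guided experience
    supplies the rest of the expertly crafted prompt."""
    return SYSTEM_PROMPT.substitute(company=company, details=details)

prompt = build_prompt("Acme Corp", "renewal in Q3; new CTO hired")
print(prompt)
```

The design choice worth noting is that prompt quality becomes a maintained asset owned by the tool team, rather than a skill every user must develop.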
Our top takeaway
Most employees simply want to complete a task. Our responsibility is to design the most optimal experience to achieve that goal without complicating their daily work. When we removed the guesswork of conversational prompting, adoption rocketed among sellers. We suggest partnering closely with business users to map their current processes and identify the most impactful areas for automation, targeting high-frequency use cases first.
Putting this strategy into action relies on the following key enablers:
- People: Business subject matter experts who can design workflows based on how they actually work, not how IT and development teams think they work.
- Process: Collaborative design cycles that focus on ready-to-use outputs (e.g., emails, slides, and documents) that fit into existing habits.
- Tooling: Simple user experiences that reduce common user friction points, such as prompting and manual content uploads.
3. Prioritize results over perfection
Another core challenge is that AI requires a different mindset from traditional IT, which typically demands perfection before a launch. However, AI agents are non-deterministic, meaning they might produce different results on different occasions — even when given the same prompt. In addition, the value and appeal of new AI capabilities are still relatively unproven, and many of these answers can only be gained through real-world use.
The new approach: Launch and iterate
Trying to address every one-off error significantly slowed our time to market, so we adopted an incremental deployment model. To move faster, we reduced our initial scope and opted for manual workarounds to launch as quickly as possible. For example, we vibe-coded a simple web interface for the first version of our internal GTM AI agent using Gemini models and our rapid application development platform, Apps Script. Native integration with Workspace allowed us to automate slide generation and use Google Sheets as a basic database. This approach allowed us to bypass engineering bottlenecks and put tools in sellers' hands for testing immediately. We then used star ratings and gathered feedback via Google Forms and user interviews to identify patterns. With a simple, integrated stack, we could refine the prompts and code to deliver updates fast, either on the same day or within days.
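A feedback loop like the one described can start very simply: aggregate the 1–5 star ratings per feature and flag anything that falls below a threshold for the next iteration. The sample rows and threshold below are hypothetical, standing in for data that might land in a Sheets-backed store:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical feedback rows: (feature name, star rating 1-5).
ratings = [
    ("slide_generator", 5), ("slide_generator", 4),
    ("account_research", 2), ("account_research", 1),
    ("slide_generator", 5), ("account_research", 2),
]

def flag_features(rows, threshold=3.0):
    """Average ratings per feature and flag anything below the
    threshold as a candidate for the next release cycle."""
    by_feature = defaultdict(list)
    for feature, stars in rows:
        by_feature[feature].append(stars)
    return sorted(
        feature for feature, scores in by_feature.items()
        if mean(scores) < threshold
    )

print(flag_features(ratings))  # → ['account_research']
```

Even this crude signal is enough to decide where the next same-day prompt or code refinement should go.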
Our top takeaway
Waiting for “perfect” often means never launching at all and missing the chance to truly understand if your project can scale. By focusing on a core AI capability, we eliminated potential data dependencies and complex integrations that would have slowed us down, drastically shortening our development cycles. With AI, the primary focus should be on getting use cases live immediately, even if you have to adopt workarounds until your data and integrations are ready.
Putting this strategy into action relies on the following key enablers:
- People: AI champions and expert reviewers who can rigorously assess AI outputs and create evaluation standards to train, refine, and guide AI models.
- Process: Intuitive, easy-to-use feedback loops (e.g., a 1–5 star rating scale and focus groups) to identify emerging patterns, catch recurring errors, and understand how people may use your tools.
- Tooling: Low-code environments (e.g., Apps Script, Antigravity, or Vertex AI Studio) and native integrations to deploy prototypes without complex, costly development cycles while enabling business users to function like engineers.
4. Thrive in agent sprawl
As adoption grows, you will inevitably see multiple teams building overlapping capabilities, potentially creating confusion for users. Yet, trying to be a strict gatekeeper for every AI project can stifle innovation. We found that taking a centralized approach often meant the majority of ideas from the field were not funded or prioritized, dramatically slowing down organizational AI fluency and adoption.
The new approach: Interoperable "bodyless" agents
To minimize rework and overlapping functionality, we embraced an "atomic" agent model to maximize interoperability with other development teams. We designed our agents to be deployable wherever users are using the A2A Protocol and Google’s Agent Development Kit (ADK) — whether that’s in Gemini Enterprise, Workspace, or a custom web application. We design atomic AI agents around reusable functions, such as finding and retrieving information, which can be called or embedded into any AI application. We are also moving towards a federated data model, where the owners of source data, such as a product team, can expose that data through their own specific agents.
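The "atomic" agent idea can be sketched as a single-purpose function registered once and reused from every surface. This toy registry is an assumption for illustration only; real deployments would use ADK agents discovered and called over the A2A Protocol rather than an in-process dictionary:

```python
from typing import Callable, Dict

# Hypothetical catalog standing in for an agent registry.
AGENT_REGISTRY: Dict[str, Callable[[str], str]] = {}

def atomic_agent(name: str):
    """Register a single-purpose function so any host application
    can call it instead of rebuilding the capability."""
    def decorator(fn):
        AGENT_REGISTRY[name] = fn
        return fn
    return decorator

@atomic_agent("find_account_info")
def find_account_info(query: str) -> str:
    # Placeholder for retrieval grounded in the owning team's data.
    return f"summary for {query}"

# Two different surfaces (say, a chat app and a web app) reuse
# the same agent rather than each building their own.
chat_result = AGENT_REGISTRY["find_account_info"]("Acme Corp")
web_result = AGENT_REGISTRY["find_account_info"]("Acme Corp")
print(chat_result == web_result)  # → True
```

The point of the pattern is that the capability has one owner and one implementation, while the surfaces that embed it can multiply freely.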
Our top takeaway
Build for where your users already are, not where you want them to be. Multiple teams should be able to use AI agents wherever they need them without having to build them from scratch. Instead of separate teams developing the same AI agent, our vision is to provide core agents that teams can reuse for different purposes wherever they need them. In other words, we want to create an ecosystem of interoperable agents and shared access to source data — all underpinned by responsible AI principles and practices — allowing for flexibility and innovation at the team level while ensuring standardization and reusability across the entire organization.
Putting this strategy into action relies on the following key enablers:
- People: Centralized technical teams that can step in to standardize high-traction tools for enterprise use once they prove their value.
- Process: Quarterly synchronization meetings to align roadmaps and reduce redundant work.
- Tooling: Standards and frameworks like ADK and the A2A Protocol to ensure that agents are built to communicate and work together, along with robust APIs and development environments that make it easy to discover existing AI components.
5. Measure outcomes — not just activity
Measuring the performance and return on investment (ROI) of AI strategies can present new challenges, as key performance indicators (KPIs) for AI don’t necessarily clearly correlate with tangible business results. It’s easy to track how many people used a tool, but harder to prove that it shortened a sales cycle. If a tool saves an employee two hours a week, where does that time go? Is it being reinvested into high-value work or just filling the gap with more administrative tasks?
The new approach: Adoption, sentiment, and impact
To capture more actionable insights, we adopted a three-pronged approach to measure the impact of our AI tools. First, we built tracking to provide visibility into which AI features were used the most, by whom, and for what activities. Second, we provided easy access to feedback channels, including star ratings and chat groups, and regularly conducted focus groups and interviews for more in-depth feedback. Lastly, we tied usage to specific entities and efforts, such as specific customer accounts or sales opportunities. Together, these metrics deliver more granular clarity around AI adoption and user sentiment while making it easier to correlate AI use with downstream business outcomes.
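The first and third prongs can be sketched with simple event aggregation: each usage event records who used which feature against which entity, so feature popularity and per-account usage both fall out of the same log. The event tuples and identifiers below are hypothetical:

```python
from collections import Counter

# Hypothetical usage events: (user, feature, account), as the
# tracking described above might record them.
events = [
    ("seller_a", "slide_generator", "acct-001"),
    ("seller_a", "account_research", "acct-001"),
    ("seller_b", "slide_generator", "acct-002"),
    ("seller_b", "slide_generator", "acct-002"),
]

# Prong 1: which features are used the most, and by whom.
feature_counts = Counter(feature for _, feature, _ in events)

# Prong 3: usage tied to specific accounts, so it can later be
# correlated with downstream outcomes like won opportunities.
account_usage = Counter(account for _, _, account in events)

print(feature_counts.most_common(1))  # → [('slide_generator', 3)]
print(account_usage["acct-002"])      # → 2
```

Tying every event to an entity from day one is the cheap decision that later makes the correlation with business outcomes possible at all.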
Our top takeaway
Usage data is key for driving adoption. Even if you can’t connect AI tool usage directly to business value, adoption volume and user sentiment can serve as effective proxies for how helpful an AI tool is for employees. Start with the easy-to-track KPIs while you build more sophisticated impact analytics. Accepting that the tool you launch today is the worst it will ever be can help you focus on making tangible, measurable progress towards your objectives.
Putting this strategy into action relies on the following key enablers:
- People: Executive sponsorship for a culture of psychological safety that encourages experimentation even while ROI is still maturing.
- Process: Mechanisms for tracking adoption and user sentiment to measure tool quality and effectiveness.
- Tooling: Advanced analytics dashboards that slice and present AI usage by feature and user group, with the ability to tie it back to core KPIs (e.g., revenue, CSAT, and churn rate) when possible.
Unlock the full potential of your AI investments in 2026
As you plan your next phase of AI initiatives, the goal isn't just to "have AI" — it's to achieve long-term, enterprise-grade success. Many organizations are doing all the right things but don’t have a system in place for transforming their early wins into organized, structured, and repeatable success. The critical learnings outlined above can serve as a practical framework for operationalizing and sustaining AI value, not just this year but well into the future.



