Cuts infrastructure costs by ~80% with Vertex AI
Achieves <1% error rate using Gemini in Vertex AI
Reduces call escalations to <15% with Vertex AI
Maintains sub-2-second latency through Google Kubernetes Engine (GKE)
Scales instantly to 3x traffic spikes with Google Kubernetes Engine (GKE)
Trillet resolves 85% of customer service calls with human-like voice AI, cutting costs by 80% and errors to under 1%.

We set the temperature to zero—the lowest possible value for free will—and we still ran into issues. One out of 20 calls, the AI would say a 4 PM slot was available for absolutely no reason. With Gemini in Vertex AI we gained the strictness, large context window, and predictability we needed for sensitive industries such as healthcare.
Ming Xu
COO, Trillet.ai
Trillet.ai connects raw generative AI models to the operational systems of businesses of all sizes. From rescheduling medical appointments to conducting government housing follow-ups, the company enables organizations to automate high-stakes voice interactions. These are sophisticated agents required to navigate complex workflows, retrieve real-time data, and execute tasks with precision.
High latency in the platform's early iterations caused the AI to speak over customers, creating an uncanny delay that frustrated callers. The early LLMs also disregarded hard business constraints: when a dental clinic had no availability at a specific time, the AI booked a patient into that slot anyway. To catch such hallucinations, the engineering team had to manually audit thousands of call logs.
The founders found themselves trapped working in the business, managing a 5% error rate, rather than building the infrastructure needed to fulfill their vision.
Trillet sought a platform capable of strict adherence to "negative constraints"—ensuring the model would respect what it could not do. The ideal solution required a large context window to process extensive business rules and the processing speed to handle natural interruptions, turning a script into a natural, human-like conversation.
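To make the idea of a "negative constraint" concrete, the sketch below shows a deterministic check that rejects a booking for a slot that is not actually open, mirroring the dental clinic example. It is purely illustrative: the source does not describe how Trillet encodes its constraints, and the names `BookingProposal`, `validate_proposal`, and `open_slots` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BookingProposal:
    """A slot the model proposes to book (hypothetical structure)."""
    patient_id: str
    start: datetime


def validate_proposal(proposal: BookingProposal, open_slots: set[datetime]) -> bool:
    """Return True only if the proposed start time is a genuinely open slot."""
    return proposal.start in open_slots


# Hypothetical clinic calendar: only the 2 PM and 3 PM slots are open.
open_slots = {datetime(2025, 1, 6, 14, 0), datetime(2025, 1, 6, 15, 0)}

# A 4 PM proposal violates the constraint; a 3 PM proposal respects it.
hallucinated = BookingProposal("p-001", datetime(2025, 1, 6, 16, 0))
valid = BookingProposal("p-001", datetime(2025, 1, 6, 15, 0))

print(validate_proposal(hallucinated, open_slots))  # False
print(validate_proposal(valid, open_slots))  # True
```

A negative constraint in this sense is a rule about what the agent must never do (book a slot that is not open), as opposed to a rule about what it should do.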
Trillet migrated its infrastructure to Google Cloud to gain tighter control over model behavior and system latency. The company integrated Gemini in Vertex AI as its primary reasoning engine, utilizing the model's large context window to ingest hundreds of pages of specific business logic and strict constraints for every client.
To handle the high-volume, real-time demands of voice traffic, Trillet deployed its core application on Google Kubernetes Engine (GKE). This architecture allows the platform to scale automatically during morning peaks—when call volumes often triple—ensuring that every interaction remains responsive.
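As a rough illustration of what automatic scaling means for a tripling of call volume, the sketch below applies the replica formula documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). The source does not say which autoscaling mechanism Trillet uses, so the use of this formula, the metric (concurrent calls per pod), and all the numbers are assumptions. The real autoscaler also applies a tolerance band and stabilization windows that are omitted here.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """Core Horizontal Pod Autoscaler formula, without tolerance or stabilization."""
    return math.ceil(current_replicas * current_metric / target_metric)


# Hypothetical numbers: 10 pods, each targeted at 50 concurrent calls.
# During a morning peak the observed load triples to 150 calls per pod.
print(desired_replicas(10, 150.0, 50.0))  # 30

# At the target load, the replica count stays where it is.
print(desired_replicas(10, 50.0, 50.0))  # 10
```

Under these assumptions, a 3x spike in per-pod load translates directly into a 3x replica count, which is what keeps per-pod load (and therefore latency) near its target.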
Moving to Google Cloud gave us the unified infrastructure we needed. We went from managing a fragmented 'black box' to a scalable, integrated environment where Google Kubernetes Engine and Vertex AI work together to handle thousands of calls simultaneously without a hitch.
Ming Xu
COO, Trillet.ai
Now, when a patient calls a clinic to reschedule an appointment:
With Vertex AI, engineers can deploy complex new workflows knowing the model will adhere to their "zero-freedom" logic. The unified Google Cloud ecosystem allows the team to consolidate their stack, reducing the operational noise that previously distracted from R&D.

We reduced costs by 80% and built a system where people feel supported. In the legal space, users found the AI provided a more consistent and attentive experience than traditional systems. This is possible because the technology now follows our instructions perfectly.
Ming Xu
COO, Trillet.ai
The transition to Google Cloud moved Trillet's error rate from 5% to well below 1%, ending the issue of hallucinated appointments. The engineering team's workload shifted from manual call auditing to system optimization, while the efficiencies introduced by GKE brought infrastructure costs down by 80%.
With latency reduced to sub-two-second levels, the uncanny friction of early voice agents was replaced by fluid, natural conversations. In high-stakes environments, such as legal aid, Trillet's agents now resolve 85% of complex calls without human intervention. Post-call feedback revealed that many users felt heard, reporting that the AI provided a consistent, attentive interaction during stressful situations. For legal service clients, this shift in confidence resulted in a 10% increase in conversion rates.
The stability of the Google Cloud ecosystem allowed the small team to pivot from daily troubleshooting to high-level R&D. Future plans include exploring multimodal capabilities, such as agents that can "see" and discuss a patient's medical documents in real time, and deeper personalization that allows the AI to remember a caller's previous interactions. By building on this foundation, Trillet is positioning itself as the primary voice application layer for enterprises across the globe.
Trillet.ai provides an advanced voice application layer for enterprise, using generative AI to automate complex, high-stakes customer interactions with human-like empathy and technical precision.
Industry: Technology
Location: Australia
Products: Gemini in Vertex AI, Vertex AI, Google Kubernetes Engine (GKE), Cloud Load Balancing, Speech-to-Text, Text-to-Speech