20 million+ requests processed per day with 300-millisecond latencies on NVIDIA GPUs on Google Cloud
Scaled to over a petabyte of data and 5,000+ end users in two years
1,000+ AI applications de-risked with Galileo running on Google Cloud
Galileo's unique LLM reliability platform helps customers develop effective, trustworthy AI applications. The startup builds highly accurate "evaluation agents" with Gemini and quickly scaled its technology to support thousands of users and millions of daily requests with very low latency using NVIDIA GPUs running on Google Cloud.
Galileo uses Vertex AI, Gemini, GKE, and other Google Cloud tools to help developers build, ship, and scale reliable LLM-based generative and agentic AI applications on Google Cloud.
Yash Sheth
Co-founder and Chief Operating Officer, Galileo
"It takes a village to launch an AI application," says Yash Sheth with a gleam in his eye, paraphrasing the proverb usually quoted in reference to raising children. But he's only half-joking.
While he may not think of them as his kids, Sheth cares deeply about large language models (LLMs) — so deeply that, together with Atindriyo Sanyal and Vikram Chatterji, he co-founded a company whose platform helps developers build the best AI applications possible (without a whole village getting in on the act).
A software engineer and data scientist who worked on speech recognition technology and conversational AI, Sheth saw firsthand how hard it was to ensure that LLMs got things right.
By their very nature, LLMs are unpredictable, or "non-deterministic" in developer-speak: their output can vary even when the inputs stay the same. This makes it difficult for engineers to "de-risk" them, that is, to ensure that an LLM works correctly.
But "getting it right" involves a lot of rigorous, time-consuming experimentation and evaluation. And every stakeholder in the village, from the product manager to the subject matter expert, wants to make sure that the application meets their business requirements and accuracy expectations, as well as standards for safety, security, and compliance.
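To make the non-determinism described above concrete, here is a minimal Python sketch of sampling-based text generation. It is a toy model, not Galileo's code or any real LLM API: the token list, the scores, and the function names are all hypothetical, and real models have other sources of variation as well. At a temperature of zero the same prompt always yields the same token; at a temperature of 1.0 the output varies from run to run.

```python
import math
import random

# Hypothetical next-token candidates and scores for one fixed prompt.
TOKENS = ["Paris", "London", "Rome"]
LOGITS = [2.0, 1.5, 0.5]


def sample_token(logits, temperature, rng):
    """Pick one token index from softmax(logits / temperature)."""
    if temperature == 0:
        # Greedy decoding: always take the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def generate(n_runs, temperature, seed=None):
    """Answer the same prompt n_runs times and return the chosen tokens."""
    rng = random.Random(seed)
    return [TOKENS[sample_token(LOGITS, temperature, rng)] for _ in range(n_runs)]


greedy = generate(20, temperature=0)
sampled = generate(200, temperature=1.0, seed=0)

print("temperature 0:", set(greedy))     # a single distinct answer
print("temperature 1:", set(sampled))    # more than one distinct answer
```

Because identical inputs can produce different outputs, a single passing test says little; evaluation has to look at behavior across many runs.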
So Galileo set about building a "trust layer" using what Sheth dubbed "evaluative intelligence": the metrics and infrastructure needed to measure holistically how AI applications are performing.
Thus was born Galileo: a platform that helps developers build, ship, and scale reliable LLM-based generative and agentic AI applications by letting them easily observe and measure behavior, evaluate models against application-specific benchmarks, automatically surface insights, and quickly mitigate any problems, such as hallucination, so that the applications behave as intended.
Sheth, now Galileo's chief operating officer, estimates that customers in North America, Europe, and Asia have de-risked more than 1,000 AI applications using his company's platform since its launch in 2023.
From the very start, Sheth and Galileo's engineers chose to build the internal technology infrastructure on Google Cloud.
"We launched our platform on NVIDIA GPUs on Google Cloud in just a few months, and in two years we've scaled Galileo to handle a petabyte of data, more than 5,000 concurrent end-users, and over 20 million requests per day at 300-millisecond latencies," Sheth reports. "We couldn't do that without the power of the Google Cloud. It gives us a set of interlocking tools that work in harmony."
Among those tools are Vertex AI and Gemini. Many of Galileo's "evaluation agents" leverage LLMs to expedite and automate the large-scale experimentation and measurement Galileo performs, and Sheth reports that many of his customers use Galileo to de-risk agents built with Vertex AI and Gemini.
We've quickly scaled Galileo for over 5,000 concurrent users and more than 20 million requests per day at 300-millisecond latencies. We couldn't do that without the power of Google Cloud.
Yash Sheth
Co-founder and Chief Operating Officer, Galileo
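For a rough sense of scale, the figures Sheth quotes can be turned into an average request rate. The short Python calculation below is a back-of-the-envelope sketch only. It assumes traffic is spread evenly across the day and treats 300 milliseconds as the average latency; the source states neither, and real traffic will have peaks well above the average. It applies Little's law (average requests in flight = arrival rate x average latency).

```python
SECONDS_PER_DAY = 24 * 60 * 60

requests_per_day = 20_000_000   # "over 20 million requests per day"
latency_seconds = 0.300         # "300-millisecond latencies", assumed to be the mean

# Average arrival rate, assuming perfectly even traffic (an assumption).
avg_requests_per_second = requests_per_day / SECONDS_PER_DAY

# Little's law: mean number of requests being served at any instant.
avg_in_flight = avg_requests_per_second * latency_seconds

print(f"average rate: {avg_requests_per_second:.0f} requests/second")  # about 231
print(f"average in flight: {avg_in_flight:.0f} requests")              # about 69
```

Under these assumptions the platform is serving roughly 231 requests every second, with about 69 in progress at any moment, as a floor rather than a peak.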
Galileo also makes use of Google Kubernetes Engine (GKE), Cloud SQL, and Cloud Storage. "We have a global client base, so a multi-region deployment is essential for minimizing latency and complying with data storage regulations," Sheth explains. "GKE, SQL, and Storage were very easy to set up and scale, and they keep latency very low by allowing data to be housed close to each customer while also ensuring its safety and security."
Sheth's team uses Vector Search for retrieval-augmented generation, or RAG, to build new algorithms and evaluation analytics by accessing and incorporating information from external databases. And Galileo's platform integrates seamlessly with BigQuery, which many of its customers use to store and create a lineage for the data underlying the AI applications they're developing.
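The retrieval step in RAG can be illustrated with a small Python sketch. This is not the Vector Search API or Galileo's implementation: the documents, the three-number "embeddings," and the function names are hypothetical stand-ins, and a production system would use a learned embedding model and a managed index. The idea is simply to find the stored text closest to the question and place it in the prompt.

```python
import math

# Hypothetical knowledge base: (text, hand-made embedding vector).
DOCS = [
    ("Refund policy: refunds are issued within 30 days.", [0.9, 0.1, 0.0]),
    ("Shipping: orders ship in 2 business days.", [0.1, 0.9, 0.0]),
    ("Warranty: hardware is covered for one year.", [0.0, 0.2, 0.9]),
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vec, k=1):
    """Return the k document texts most similar to the query vector."""
    ranked = sorted(DOCS, key=lambda doc: cosine(query_vec, doc[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question, query_vec, k=1):
    """Augment the question with retrieved context before calling a model."""
    context = "\n".join(retrieve(query_vec, k))
    return f"Context:\n{context}\n\nQuestion: {question}"


# Hypothetical embedding of the question "How long do refunds take?"
query_vec = [0.8, 0.2, 0.1]
prompt = build_prompt("How long do refunds take?", query_vec)
print(prompt)
```

The model then answers from the retrieved context rather than from memory alone, which is also what makes the answer checkable against a source.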
I think of the Google team as partners, not vendors. As a startup, the go-to-market support we get from Google is invaluable.
Yash Sheth
Co-founder and Chief Operating Officer, Galileo
Sheth began his career at Google, where he worked on an early version of Gemini right out of graduate school. "Being an ex-Googler, I truly value its culture, and although I'm an entrepreneur at heart, I miss it," he admits. "Working on Galileo with the team at Google has been like a reunion. I think of them as partners, not vendors. They've been welcoming and supportive, and their programs for startups have been especially helpful."
The Google Cloud ISV Startup Springboard, for example, has helped Galileo elevate its market presence, as has being fast-tracked into the Google Cloud Marketplace.
And Sheth and his team have networked with other tech entrepreneurs at Google Cloud's annual "Next" conference, which, among many other things, hosts content and events specifically for startups.
"As a startup, the go-to-market support we get from Google is invaluable," Sheth notes, "and we expect it to unlock more opportunities going forward."
The partnership between the Google for Startups Cloud Program and NVIDIA Inception for Startups has also been of benefit to Galileo, which receives credits from Google to use toward NVIDIA GPUs. The two companies also promote Galileo to their own customers.
"Google's strategic partnership with NVIDIA is of tremendous help to Galileo," Sheth reports. "Because Galileo can function like a bridge between Google Cloud and NVIDIA's NeMo, and we can work closely with both companies, Galileo can help all our customers go into production with their AI applications much faster."
Galileo's engineers also leverage the interoperability between Google's and NVIDIA's technologies for Galileo itself. "Google and NVIDIA give us a comprehensive, scalable software stack and the hardware and managed services we need to develop the highly efficient inference engine at the heart of Galileo."
Galileo's roadmap includes the development of tools for de-risking multi-agent systems, in which two or more AI agents cooperate to solve complex problems more quickly through reinforcement learning. The company is also adding support for multimodal AI applications, which process not only text but also images, audio, and video.
Galileo has made its enterprise-grade platform available to individual developers and small teams for free to enable more engineers to de-risk their AI applications and launch them with confidence.
"We've spoken to many developers who've told us how hard it is to build reliable AI applications with open-source evaluation tools, so we decided to make it easy for anyone to try our platform," Sheth says. "And we'll be able to handle the increased volume of users thanks to the scalability of Google Cloud."
We're making Galileo available for free to small development teams, and we'll be able to handle the increased volume of users thanks to the scalability of Google Cloud.
Yash Sheth
Co-founder and Chief Operating Officer, Galileo
These and other efforts will be steered by an 80-person-and-growing team at this Series B startup, which has doubled in size in just a year and has offices in New York and San Francisco, as well as a development hub in Bangalore, India.
"When companies use LLM-based apps to assist customers, it's important that those apps perform as expected. Hallucinations, errors, and vulnerabilities can erode trust quickly. Google Cloud helps Galileo ensure that our customers' AI applications are reliable."
Galileo is a reliability and evaluation platform that helps teams of any size build AI apps they can trust.
Industries: Startup, Technology
Location: United States
Products: BigQuery, Gemini, Cloud SQL, Cloud GPUs, Cloud Storage, Google Cloud, Google Kubernetes Engine, Vector Search, Vertex AI
About Google Cloud partner — NVIDIA
NVIDIA’s (NASDAQ: NVDA) invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. NVIDIA partners with Google Cloud on NVIDIA GPU platforms including NVIDIA T4 and NVIDIA V100.