Containers & Kubernetes

LiveX AI reduces customer support costs by up to 85% with AI agents trained and served on GKE and NVIDIA AI

July 22, 2024

Jia Li

Co-Founder, President & Chief AI Officer, LiveX AI

Shubhika Taneja

Product Marketing Manager, Google Kubernetes Engine, Google Cloud

Providing a satisfying customer experience is a critical competitive advantage for consumer companies, but delivering it comes with multiple challenges. Even after attracting visitors to a website, companies can struggle to convert them into customers if the site lacks a personal touch. Call centers are costly to operate, and when call volume is high, customers get frustrated with lengthy wait times. Traditional chatbots are more scalable, but fall far short of a real human-to-human experience.

LiveX AI stands at the cutting edge of generative AI technology, building custom, multimodal AI agents that can see, hear, chat, and show to deliver truly human-like customer experiences. Founded by a team of experienced entrepreneurs and distinguished tech leaders, LiveX AI provides businesses with trusted AI agents that deliver strong customer engagement across a variety of platforms.

LiveX AI's generative AI agents provide a real-time, immersive, human-like customer experience, offering helpful solutions to customer questions and concerns in a familiar, conversational manner. To give users a good experience, the agents need to be robust as well as fast. Creating that experience requires a highly performant and scalable platform capable of eliminating the response lag typical of many AI agents, especially during peak volume periods like Black Friday.

GKE provides a robust foundation for advanced generative AI applications

From the start, Google Cloud and LiveX AI collaborated to help jumpstart LiveX AI’s development, using Google Kubernetes Engine (GKE) and the NVIDIA AI platform. In a matter of three weeks, Google Cloud had helped LiveX AI deliver a custom solution for its client. Additionally, by participating in the Google for Startups Cloud Program and NVIDIA Inception program, LiveX AI had its cloud costs covered while it got up and running, and received access to additional business and technical resources.

The LiveX AI team wanted a robust solution that would help it ramp up quickly, so it chose GKE, which lets the team deploy and operate containerized applications at scale on a secure and performant global infrastructure. GKE’s platform orchestration capabilities make it easy to train and serve optimized AI workloads on NVIDIA GPUs while taking advantage of GKE’s flexible integration with distributed computing and data processing frameworks.

In particular, GKE Autopilot makes it easy to scale applications to different clients, especially when building multimodal AI agents for brands with high volumes of real-time customer interactions. GKE Autopilot manages the underlying compute of a Kubernetes cluster without LiveX AI needing to configure or monitor it. With the help of GKE Autopilot, LiveX AI has achieved over 50% lower TCO, 25% faster time-to-market and 66% lower operational cost, helping them focus on delivering client value rather than configuring or monitoring the system.

One of these clients is Zepp Health, a direct-to-consumer (D2C) manufacturer of wellness devices, which engaged with LiveX AI to build an AI customer agent for the U.S. e-commerce site for their Amazfit smartwatches and smart rings. The agent needed to smoothly process high volumes of customer interactions and deliver personalized experiences for customers in real time.

For the Amazfit project, GKE was paired with A2 Ultra VMs powered by NVIDIA A100 80GB Tensor Core GPUs, as well as NVIDIA NIM inference microservices. Part of the NVIDIA AI Enterprise software platform, NIM provides a set of easy-to-use microservices designed for the secure, reliable deployment of high-performance AI model inference.

Deploying NVIDIA NIM Docker containers on GKE using Infrastructure as Code (IaC) practices sped up the initial deployment and streamlined application upgrades once they were in production. NVIDIA hardware acceleration technologies significantly helped the development and deployment processes, maximizing the impact from hardware optimization.
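
As a rough illustration of what treating such a deployment as code can look like, the sketch below builds a Kubernetes Deployment manifest in Python and prints it as JSON. The article does not say which IaC tooling LiveX AI used, so this is only one possible approach. The function name `build_nim_deployment`, the deployment name, the image reference, and the container port are all hypothetical placeholders. The `cloud.google.com/gke-accelerator` node selector and the `nvidia.com/gpu` resource limit are the standard way to request A100 80GB GPUs on GKE.

```python
import json


def build_nim_deployment(name: str, image: str, gpu_count: int = 1) -> dict:
    """Return a Kubernetes Deployment manifest (as a dict) for one GPU-backed container.

    All arguments are illustrative; nothing here is taken from LiveX AI's setup.
    """
    labels = {"app": name}
    container = {
        "name": name,
        "image": image,  # placeholder image reference, not a real NIM image
        "ports": [{"containerPort": 8000}],  # assumed serving port
        "resources": {"limits": {"nvidia.com/gpu": gpu_count}},
    }
    pod_spec = {
        # GKE node label that selects nodes with NVIDIA A100 80GB GPUs.
        "nodeSelector": {"cloud.google.com/gke-accelerator": "nvidia-a100-80gb"},
        "containers": [container],
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {"metadata": {"labels": labels}, "spec": pod_spec},
        },
    }


if __name__ == "__main__":
    manifest = build_nim_deployment(
        "nim-agent", "example.invalid/nim/placeholder-model:latest"
    )
    # The JSON output can be kept in version control and applied with kubectl.
    print(json.dumps(manifest, indent=2))
```

Keeping the manifest in a version-controlled file means an upgrade becomes a reviewed change to the image tag followed by a re-apply, rather than a manual edit on the cluster.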

All told, by leveraging GKE with NVIDIA NIM and NVIDIA A100 GPUs, LiveX AI was able to achieve a remarkable 6.1x acceleration in average answer/response generation speed for the Amazfit AI agent compared to running it on another popular inference platform. Better yet, the project was delivered in just three weeks.
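
To put the 6.1x figure in terms of waiting time, the short sketch below converts a speedup factor into the fraction of the original time per response that remains. It assumes that "generation speed" means responses produced per unit of time, so time per response scales inversely with the speedup. The article does not publish absolute latencies, so none are used here, and the function names are illustrative.

```python
def remaining_time_fraction(speedup: float) -> float:
    """Fraction of the baseline time per response left after a given speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


def time_reduction_percent(speedup: float) -> float:
    """Percentage reduction in time per response implied by a given speedup."""
    return (1.0 - remaining_time_fraction(speedup)) * 100.0


if __name__ == "__main__":
    speedup = 6.1  # figure reported in the article
    print(f"Remaining time: {remaining_time_fraction(speedup):.1%}")  # 16.4%
    print(f"Time reduction: {time_reduction_percent(speedup):.1f}%")  # 83.6%
```

Under that assumption, a 6.1x speedup means an average response takes about 16.4% of the time it did before, a reduction of roughly 83.6%.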

For LiveX AI customers, this means:

Up to an 85% reduction in customer support costs given efficient AI-driven resolutions
Significant improvements in first-response times, from industry standards of hours to mere seconds
Enhanced customer satisfaction, reducing returns by approximately 15% due to more accurate and timely resolutions
Improved lead conversion by 5x with an intelligent, actionable AI agent

“At Zepp Health, we believe in delivering a personal touch in every customer interaction,” says Wayne Huang, CEO of Zepp Health. “LiveX AI brings that ethos to life by providing a seamless and engaging shopping experience for our customers shopping for Amazfit. By embedding LiveX AI, we're ensuring that every visitor to our website receives personalized assistance tailored to their needs and preferences.”

Collaboration that fuels AI innovation

At the end of the day, GKE has allowed LiveX AI to ramp up quickly and deliver innovative generative AI solutions to customers that offer immediate value. As a secure, scalable, and cost-effective platform for deploying and managing containerized applications, GKE provides a robust foundation for the development and deployment of advanced generative AI applications. It simplifies the development process by allowing for easier creation and management of clusters, accelerating developer productivity and increasing application reliability through automated scaling, load balancing, and self-healing functionalities.

At the same time, Google Cloud’s partnership with NVIDIA signifies its commitment to making generative AI more accessible. With this combination, LiveX AI can build custom AI agents that give its customers a competitive advantage, helping them deliver personalized experiences in real time, including seamless customer support, instant product recommendations, and reduced returns.

Want to explore how Google Cloud and NVIDIA can support your startup? Learn more about the Google for Startups Cloud Program and NVIDIA Inception program and apply today.
