Multimodal gen AI models automate telecom field ops and enhance customer experience
Krishnamurthy Srinivasan
Head of Telecom Analytics and AI
Vijay Jayapalan
Cloud Principal Architect, Telecommunications
Field operations can be one of the most complex aspects of a telecommunications business. Whether it’s because of process inefficiencies, inaccurate information, lack of proper equipment, or inexperience with an underlying issue, telco field service technicians often face delays when dealing with customer issues, forcing them to seek help from senior technical professionals. As a result, rather than servicing new customers or supporting existing ones, technicians spend time and resources on non-revenue-generating activities, which can reinforce an already less-than-perfect customer experience.
Envisioning solutions for field technicians
At Google Cloud, working to understand the constraints that technicians face in carrying out their work has helped us identify possible ways to increase their effectiveness.
For example, when troubleshooting a device, it’s often easier for a technician to upload a picture of the device and its surroundings than to describe the conditions in writing. In addition, if the technician is away from their vehicle, it can be hard for them to use a keyboard. Accordingly, audio, video, and images may be their preferred modalities for looking up or updating information. This is common when identifying the root cause of an issue, which often requires combining the information in the images with complex troubleshooting logic from field technician manuals or device vendor documents.
At the same time, technicians don’t always have time to manually capture all the details of the steps they took to fix a problem. Generative AI can automatically generate a draft transcript of their work, capturing customer incidents and their resolutions for future reference.
A bot that helps technicians with their current assignments could also significantly reduce errors and the time they spend waiting for help from senior technical professionals.
Google Cloud’s multimodal Gemini family of models offers exciting possibilities for improving field operations, which could in turn help Communications Service Providers (CSPs) increase customer satisfaction. Gemini can understand and reason about text, images, video, and code. It has multimodal reasoning capabilities and can extract insights from documents by reading, filtering, and understanding information. We believe that Gemini, built into a multimodal AI assistant, could help address the challenges faced by field technicians, increasing their effectiveness, the quality of their work, and customer satisfaction, while reducing the cost of redundant work and mistakes.
The illustration below highlights potential cross-functional benefits of a multimodal AI assistant to a technician, dispatcher/operator, and customer:
Multimodal, semantic large language model intelligence
So, how could we provide an assistant for field technicians with these sorts of multimodal capabilities? Google Cloud has offered AI tools that work across a variety of modalities, including Vision AI and Natural Language AI. Now, Gemini helps make it possible to rapidly develop multimodal search applications for a variety of purposes, such as identifying defects to be remedied based on complex multi-step reasoning using verbal and visual input, retrieving troubleshooting information for a specific device model based on its image, and generating rich instructions or postmortem documents.
Google Cloud’s Vertex AI Search and Conversation can leverage Gemini’s multimodal capabilities, as well as retrieve or update information from other systems (e.g., inventory management or customer relationship management systems). For technicians, an app with such capabilities could reduce manual effort and improve the consistency and quality of their work.
For example, with support for Gemini, Vertex AI Conversation could be used to:
- Identify the correct procedure a technician should follow based on the task at hand (e.g., providing a new home internet connection or troubleshooting poor performance), and dynamically guide the technician through the procedure via bidirectional multimodal interactions.
- Auto-generate comprehensive and accurate postmortem documentation of a completed job from voice, text, and visual input on the various steps taken.
- Retrieve, summarize, and generate voice or video instructions to technicians based on the images they upload and their voice requests.
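As a rough illustration of the first use case in the list above, the sketch below shows how an assistant might assemble a multimodal prompt (procedure text, a technician note, and photos) before sending it to a model. It is a minimal sketch under stated assumptions, not the implementation described in this post: the `ImagePart` type, the `PROCEDURES` catalog, and the `build_prompt` and `assist` functions are all hypothetical names, and the model call is replaced by a local stand-in (`fake_model`) so the example runs without any cloud dependency. In a real application, that callable would wrap a call to a multimodal model such as Gemini.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union


@dataclass
class ImagePart:
    """Hypothetical container for one uploaded photo."""
    data: bytes
    mime_type: str = "image/jpeg"


# A prompt is an ordered list of text strings and image parts.
Part = Union[str, ImagePart]

# Hypothetical procedure catalog; real steps would come from
# field technician manuals or device vendor documents.
PROCEDURES: Dict[str, List[str]] = {
    "new_connection": [
        "Verify the service address",
        "Install and power on the modem",
        "Run a speed test",
    ],
    "poor_performance": [
        "Inspect cabling and connectors",
        "Check signal levels",
        "Restart the modem and re-test",
    ],
}


def build_prompt(task: str, note: str, photos: List[ImagePart]) -> List[Part]:
    """Combine the procedure for a task with the technician's note and photos."""
    if task not in PROCEDURES:
        raise ValueError(f"unknown task: {task}")
    steps = "\n".join(f"{i}. {s}" for i, s in enumerate(PROCEDURES[task], start=1))
    parts: List[Part] = [
        f"Task: {task}\nProcedure:\n{steps}",
        f"Technician note: {note}",
    ]
    parts.extend(photos)
    parts.append("Based on the photos and note, say which step to perform next and why.")
    return parts


def assist(task: str, note: str, photos: List[ImagePart],
           generate: Callable[[List[Part]], str]) -> str:
    """Send the assembled prompt to whatever model callable is supplied."""
    return generate(build_prompt(task, note, photos))


def fake_model(parts: List[Part]) -> str:
    """Stand-in for a real multimodal model call (assumption for this sketch)."""
    images = sum(isinstance(p, ImagePart) for p in parts)
    return f"received {len(parts)} parts, {images} image(s)"


photo = ImagePart(data=b"\xff\xd8")  # placeholder bytes, not a real image
print(assist("poor_performance", "Modem LED is blinking red", [photo], fake_model))
# prints: received 4 parts, 1 image(s)
```

Passing the model in as a callable keeps the prompt-assembly logic testable on its own and leaves the choice of model and SDK open.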
Transforming the way we work and operate, while increasing customer satisfaction
Multimodal analysis combined with orchestration of complex tasks is a powerful new technology that has the potential to revolutionize the way we interact with information. By combining the power of Gemini with the richness of visual data, multimodal search can help us find information more quickly and easily, and understand the world around us in new and exciting ways.
For the telco community, multimodal analysis offers exciting opportunities to enhance field technicians’ effectiveness, increase their morale, and improve customer satisfaction — while also offering savings to CSPs by improving “first-time-right” metrics, reducing mean-time-to-repair, and reducing the number of truck rolls required to fix an issue.
Start creating exciting solutions with Gemini today. Learn how Google Cloud’s unified AI stack, which includes super-scalable AI infrastructure, 130+ curated world-class models, and a rich set of development tools, can help you rapidly develop, deploy, and scale rich generative AI applications. Gemini in Vertex AI benefits from Google Cloud features for enterprise security, safety, privacy, data governance, and compliance. Google AI Studio, a free, web-based developer tool, allows you to experience Gemini’s capabilities (even if you do not have a Google Cloud account!) and move your work to Vertex AI when you’re ready. Be inspired by these examples of what Gemini’s complex reasoning capabilities can do. If you’re a developer, you will find this repository, which includes some industry-specific solutions, valuable.
Learn more about how we are helping CSPs transform with AI here, and download our latest study on using AI to win the customer experience battle in telecommunications.