
Multimodal gen AI models automate telecom field ops and enhance customer experience

January 24, 2024
Krishnamurthy Srinivasan

Head of Telecom Analytics and AI

Vijay Jayapalan

Cloud Principal Architect, Telecommunications

Field operations can be one of the most complex aspects of a telecommunications business. Whether it’s because of process inefficiencies, inaccurate information, lack of proper equipment, or inexperience with an underlying issue, telco field service technicians often face delays when dealing with customer issues, forcing them to seek help from senior technical professionals. As a result, rather than servicing new customers or supporting existing ones, technicians spend time and resources on non-revenue-generating activities, which can reinforce an already less-than-perfect customer experience.

Envisioning solutions for field technicians

At Google Cloud, working to understand the constraints that technicians face in carrying out their work has helped us identify possible ways to increase their effectiveness.

For example, when troubleshooting a device, it’s often easier for a technician to upload a picture of the device and its surroundings than to describe the conditions in writing. In addition, if the technician is away from their vehicle, it can be hard for them to use a keyboard. Accordingly, audio, video, and images may be their preferred modalities for looking up or updating information. One common case is identifying the root cause of an issue, which often requires combining the information in the images with the complex troubleshooting logic in field technician manuals or device vendor documents.

At the same time, technicians don’t always have time to manually capture all the details of the steps they took to fix a problem. Generative AI can automatically generate a draft transcript of their work, capturing customer incidents and their resolutions for future reference.

A bot that helps technicians with their current assignments could also significantly reduce errors and the time they spend waiting for help from senior technical professionals.

Google Cloud’s multimodal Gemini family of models offers exciting possibilities for improving field operations, which could in turn help Communications Service Providers (CSPs) increase customer satisfaction. Gemini can understand and reason about text, images, video, and code. It has multimodal reasoning capabilities and the ability to extract insights from documents through reading, filtering, and understanding information. We believe that Gemini, built into a multimodal AI assistant, could help address the challenges faced by field technicians, increasing their effectiveness, the quality of their work, and customer satisfaction, while reducing the cost of redundant work and mistakes.

The illustration below highlights potential cross-functional benefits of a multimodal AI assistant to a technician, dispatcher/operator, and customer:

https://storage.googleapis.com/gweb-cloudblog-publish/images/2._Blog_diagram.max-1200x1200.jpg

Multimodal, semantic large language model intelligence

So, how could we provide an assistant for field technicians with these sorts of multimodal capabilities? Google Cloud has offered AI tools that work across a variety of modalities, including Vision AI and Natural Language AI. Now, Gemini helps make it possible to rapidly develop multimodal search applications for a variety of purposes, such as identifying defects to be remedied based on complex multi-step reasoning using verbal and visual input, retrieving troubleshooting information for a specific device model based on its image, and generating rich instructions or postmortem documents.

Google Cloud’s Vertex AI Search and Conversation can leverage Gemini’s multimodal capabilities, as well as retrieve or update information from other systems (e.g., inventory management or customer relationship management systems). For technicians, an app with such capabilities could reduce manual effort and improve the consistency and quality of their work.

For example, with support for Gemini, Vertex AI Conversation could be used to:

  • Identify the correct procedure a technician should follow based on the task at hand (e.g., providing a new home internet connection or troubleshooting poor performance), and dynamically guide the technician through it using bidirectional multimodal interactions.
  • Auto-generate comprehensive and accurate postmortem documentation of a completed job from voice, text, and visual input on the various steps taken.
  • Retrieve, summarize, and generate voice or video instructions to technicians based on the images they upload and their voice requests.
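
As a rough illustration of the first two items, the sketch below selects a procedure by task, records the technician’s notes and image references against each step, and assembles a prompt from which a model could draft a postmortem. Everything in it is an assumption made for illustration: the `PROCEDURES` catalog, the `JobSession` and `Observation` classes, the example bucket URI, and the `ask_model` callable, which stands in for a real multimodal model call and is not a Vertex AI API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical procedure catalog. A real assistant would retrieve these
# steps from technician manuals or device vendor documents.
PROCEDURES: Dict[str, List[str]] = {
    "new_connection": [
        "Verify the service order and customer address.",
        "Install and power on the modem.",
        "Confirm signal levels and run a speed test.",
    ],
    "poor_performance": [
        "Photograph the device and its cabling.",
        "Check indicator lights against the vendor guide.",
        "Replace faulty cabling or escalate.",
    ],
}


@dataclass
class Observation:
    """One technician input: a note, plus an optional image reference."""
    step: str
    note: str
    image_uri: Optional[str] = None


@dataclass
class JobSession:
    """Tracks one job: the chosen task and what was observed at each step."""
    task: str
    observations: List[Observation] = field(default_factory=list)

    def steps(self) -> List[str]:
        # Look up the procedure for the task at hand.
        return PROCEDURES[self.task]

    def record(self, step_index: int, note: str,
               image_uri: Optional[str] = None) -> None:
        # Attach the technician's input to the step it belongs to.
        step = self.steps()[step_index]
        self.observations.append(Observation(step, note, image_uri))

    def postmortem_prompt(self) -> str:
        """Assemble the text a model would receive to draft a report."""
        lines = [f"Draft a postmortem for task '{self.task}'."]
        for i, obs in enumerate(self.observations, start=1):
            image = f" [image: {obs.image_uri}]" if obs.image_uri else ""
            lines.append(f"{i}. {obs.step} -> {obs.note}{image}")
        return "\n".join(lines)


def draft_postmortem(session: JobSession,
                     ask_model: Callable[[str], str]) -> str:
    # ask_model is a placeholder for a multimodal model call (assumption).
    return ask_model(session.postmortem_prompt())
```

For example, after `session = JobSession("poor_performance")` and a `session.record(0, "Router in closed cabinet", image_uri="gs://example-bucket/router.jpg")` call, passing the session to `draft_postmortem` with any function that accepts a prompt string returns that function’s draft. Keeping the model behind a plain callable lets the surrounding logic be tested without network access.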

Transforming the way we work and operate, while increasing customer satisfaction

Multimodal analysis combined with orchestration of complex tasks is a powerful new technology that has the potential to revolutionize the way we interact with information. By combining the power of Gemini with the richness of visual data, multimodal search can help us find information more quickly and easily, and understand the world around us in new and exciting ways.

For the telco community, multimodal analysis offers exciting opportunities to enhance field technicians’ effectiveness, increase their morale, and improve customer satisfaction — while also offering savings to CSPs by improving “first-time-right” metrics, reducing mean-time-to-repair, and reducing the number of truck rolls required to fix an issue.
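
To make these metrics concrete, here is a minimal sketch of how they might be computed from job records. The `Issue` record and the definitions used are assumptions for illustration: “first-time-right” is taken as the share of issues resolved in a single visit, mean-time-to-repair as the average hours from report to resolution, and truck rolls as the number of visits per issue. A CSP’s own definitions may differ.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class Issue:
    """Hypothetical record of one resolved customer issue."""
    issue_id: str
    visits: int             # truck rolls needed to resolve the issue
    hours_to_repair: float  # elapsed hours from report to resolution


def field_metrics(issues: List[Issue]) -> Dict[str, float]:
    """Summarize field performance under the assumed definitions above."""
    if not issues:
        raise ValueError("no issues to summarize")
    total = len(issues)
    return {
        # Share of issues fixed on the first visit.
        "first_time_right_rate": sum(1 for i in issues if i.visits == 1) / total,
        # Average elapsed time to resolution, in hours.
        "mean_time_to_repair_hours": mean(i.hours_to_repair for i in issues),
        # Average number of visits per issue.
        "truck_rolls_per_issue": sum(i.visits for i in issues) / total,
    }
```

Comparing these figures for jobs completed with and without an assistant would be one way to check whether the expected savings materialize.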

Start creating exciting solutions with Gemini today. Learn how Google Cloud’s unified AI stack, which includes super-scalable AI infrastructure, 130+ curated world-class models, and a rich set of development tools, can help you rapidly develop, deploy, and scale rich generative AI applications. Gemini in Vertex AI benefits from Google Cloud features for enterprise security, safety, privacy, data governance, and compliance. Google AI Studio, a free, web-based developer tool, allows you to experience Gemini’s capabilities (even if you do not have a Google Cloud account!) and move your work to Vertex AI when you’re ready. Be inspired by these examples of what Gemini’s complex reasoning capabilities can do. If you’re a developer, you will find this repository, which includes some industry-specific solutions, valuable.

Learn more about how we are helping CSPs transform with AI here, and download our latest study on using AI to win the customer experience battle in telecommunications.
