Google Cloud supercharges NLP with large language models
Senior Software Engineering Manager
Natural language understanding (NLU) is steadily getting better at solving complex problems, and these language breakthroughs are creating big waves in artificial intelligence. For example, new language models are enabling Everyday Robots to create more helpful robots that can break down user instructions, and they have even enabled people to generate imaginative visuals from complex text prompts.
These leaps in NLU are powered by neural networks trained to understand human language. This technology has advanced greatly since Google introduced the Transformer architecture in 2017, first with large models trained on massive amounts of data, such as GPT-3, and more recently with GLaM, LaMDA, and PaLM. This latest generation of models is called Large Language Models (LLMs) because of their sheer size and the vast volumes of data on which they are trained. They can be applied to a range of tasks to create more powerful digital assistants, generate better search results and product recommendations, enforce smarter platform curation and safety features, and much more.
For these reasons, we’re pleased to announce we’ve updated the Google Cloud Natural Language (NL) API with a new LLM-based model for Content Classification.
With an expansive pre-trained classification taxonomy, the newest version of Content Classification from the Natural Language API leverages the latest Google research to improve customer use cases ranging from actionable insights on user trends to ad targeting and content-based filtering. In this article, we’ll explore the NL API’s new capabilities, which are the first of many efforts we’ll be making to bring the power of LLMs to Google Cloud.
How LLMs help machines understand human language
As Andrew Moore, Google Cloud VP and General Manager of AI and Industry Solutions, has argued, if computer systems become more conversant with natural human languages, they become a foundation for more sophisticated use cases, able not only to understand user intent but also to create complex bespoke solutions. Google has been a leading research force in this space, with LLM projects like LaMDA, PaLM, and T5 contributing to the Cloud NL API’s improved v2 classification model.
Parsing language is difficult for machines, due in part to the contextual and individual interpretation of words or phrases. The word “server,” for example, could refer to a computer, a restaurant employee, or a tennis player. To understand the word, a model needs to be trained not only on a basic definition but also on the context and position of the word within a sentence or conversation, as well as its evolving connotations. Because they process voluminous training data via Transformers, LLMs are well-suited to this type of work.
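To make the “server” example concrete, here is a deliberately simple, non-neural sketch of context-based disambiguation. It is not how LLMs or the NL API work; it only illustrates why the surrounding words matter. The sense inventory and cue words below are hypothetical and chosen purely for illustration.

```python
import re

# Hypothetical mini sense inventory: each sense of "server" maps to a few
# cue words that tend to appear near it. Invented for this illustration.
SENSES = {
    "computer": {"network", "database", "rack", "reboot", "request", "cloud"},
    "restaurant employee": {"table", "menu", "tip", "order", "restaurant", "waiter"},
    "tennis player": {"ace", "serve", "court", "match", "baseline", "racket"},
}


def disambiguate(sentence, senses=SENSES):
    """Return the sense whose cue words overlap most with the sentence.

    Returns None when no cue word appears, i.e. the context is uninformative.
    """
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {sense: len(tokens & cues) for sense, cues in senses.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(disambiguate("The server brought the menu to our table"))
    print(disambiguate("We had to reboot the server after the database crashed"))
```

A hand-written word list like this breaks as soon as the context uses vocabulary it has not seen, which is one reason models trained on large volumes of text handle this task better.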
Thanks to the integration of Google’s latest language modeling technology and an updated, expanded training data set, the next generation of the Content Classification API not only has over 1,000 labels (up from around 600 previously) but also supports 11 languages, with Chinese, French, German, Italian, Japanese, Korean, Portuguese, Russian, Spanish, and Dutch joining previously available English. It does all of this with improved accuracy.
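As a rough sketch of what a call to the updated classifier might look like, the Python below builds a JSON request that selects the v2 model and posts it over REST using only the standard library. The endpoint URL, the JSON field names (classificationModelOptions, v2Model, contentCategoriesVersion), and the response shape are assumptions based on the public v1 API and should be checked against the NL API documentation. The helper names and the access_token parameter are hypothetical.

```python
import json
import urllib.request

# Assumed REST endpoint for the v1 classifyText method; verify in the docs.
CLASSIFY_URL = "https://language.googleapis.com/v1/documents:classifyText"


def build_classify_request(text, language=None):
    """Build the JSON body for a classification request using the v2 model."""
    body = {
        "document": {"type": "PLAIN_TEXT", "content": text},
        # Assumed option names for selecting the LLM-based v2 classifier.
        "classificationModelOptions": {
            "v2Model": {"contentCategoriesVersion": "V2"}
        },
    }
    if language:
        # Optional language hint, e.g. "es"; omit it to let the API detect.
        body["document"]["language"] = language
    return body


def classify(text, access_token, language=None):
    """POST the request and return a list of (category name, confidence)."""
    request = urllib.request.Request(
        CLASSIFY_URL,
        data=json.dumps(build_classify_request(text, language)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    # Assumed response shape: {"categories": [{"name": ..., "confidence": ...}]}
    return [(c["name"], c["confidence"]) for c in result.get("categories", [])]


if __name__ == "__main__":
    # Prints the payload only; no network call is made here.
    print(json.dumps(build_classify_request("Hola, mundo", language="es"), indent=2))
```

Calling classify requires a valid OAuth access token for a project with the Natural Language API enabled. In practice, the official client libraries handle authentication and request construction, so this sketch is mainly useful for seeing what the request contains.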
AI raises questions about the best way to build fairness, interpretability, privacy, and security into these new systems in order to benefit people and society. At Google, we prioritize the responsible development of AI and take steps to offer products where a responsible approach is built in by design. For Content Classification, we limited use of sensitive labels and conducted performance evaluations. See our Responsible AI page for more information about our commitments to responsible innovation.
Today’s announcement is just the first step in bringing LLM capabilities to Google Cloud AI products, and we’re excited to see how our more powerful Natural Language API helps developers, analysts, and data scientists generate insights and offer superior experiences. Our early adopters are implementing the API to improve user recommendations and display ad targeting, and to surface insights about new trends.
If you’re ready to get started with this major leap in Google Cloud language services, visit our NL API documentation. To learn more about Google Cloud’s AI services, visit our AI and machine learning products page.