Google Cloud advances generative AI at I/O: new foundation models, embeddings, and tuning tools in Vertex AI
VP, Cloud AI & Industry Solutions
Generative AI has unleashed a new breed of digital assistants, content creation tools, and applications, changing how apps are built, who can build them, and the capabilities end users expect from them.
Google is a leader in this field, from the creation of Google’s Transformer architecture that makes generative AI possible, to today’s announcement of PaLM 2, our next-generation language model with improved multilingual, reasoning, and coding capabilities. At Google Cloud, we’re committed to bringing the power of these transformational foundation models to our customers and empowering developers to innovate in entirely new ways.
We took a big step in this journey in March with our first two major announcements: Gen App Builder, which lets developers, even those with limited machine learning experience, quickly and easily create generative chat and search apps; and Generative AI support in Vertex AI, which expands our machine learning development platform with access to both foundation models and APIs in the new Model Garden, as well as a variety of tools to customize and experiment with models in Generative AI Studio.
Today, at Google I/O 2023, we’re excited to build on these offerings with a variety of announcements that give customers access to new generative modalities and expanded ways to leverage and tune models, including:
Three new foundation models are available in Vertex AI, where they can be accessed via API, tuned through a simple UI in Generative AI Studio, or deployed to a data science notebook.
Codey, our text-to-code foundation model, can be embedded in an SDK or application to help improve developer velocity with code generation and code completion, and to improve code quality.
Imagen, our text-to-image foundation model, lets organizations generate and customize studio-grade images at scale for any business need.
Chirp, our speech-to-text foundation model, helps organizations to more deeply and inclusively engage with their customers in their native languages with captioning and voice assistance.
Embeddings APIs for text and images help developers build recommendation engines, classifiers, question-answering systems, and other sophisticated applications based on semantic understanding of text or images.
Reinforcement Learning from Human Feedback (RLHF) extends Vertex AI’s tuning and prompt design capabilities by letting organizations incorporate human feedback to customize and improve model performance.
Generative AI Studio, Model Garden, and PaLM 2 for Text and Chat are moving from trusted tester availability to preview, meaning everyone with a Google Cloud account has access.
These announcements are the next step in our journey to help developers build boldly and responsibly with generative AI technology, backed by enterprise-grade safety, security, and privacy. Let’s get into the details of the news.
New foundation models give developers and data scientists more capabilities to build generative AI applications
The first of these models is Codey, our text-to-code foundation model, which supports three capabilities:
Code completion: Codey suggests the next few lines based on the context of code entered into the prompt.
Code generation: Codey generates code based on natural language prompts from a developer.
Code chat: Codey lets developers converse with a bot to get help with debugging, documentation, learning new concepts, and other code-related questions.
The second foundation model is Imagen, which lets customers generate and edit high-quality images for any business need. This text-to-image model makes it easy to create and edit high-quality images at scale with low latency and enterprise-grade data governance. With Vertex AI, organizations can customize and adapt Imagen to their business needs by generating images with their own content, such as existing products or logos. Leveraging mask-free editing, image upscaling, and image captioning in over 300 languages, customers can quickly generate production-ready images.
With Imagen on Vertex AI, creating studio-grade images is now as simple as typing a few words as a prompt—and modifying the image, such as changing an object’s color, takes only a few more words. Imagen also includes the ability to caption and classify the image with the perfect description, and built-in content moderation is supported by best practices for safety. Moreover, any image generated on Vertex AI is the customer’s data and can be used by the organization for things like marketing collateral.
To generate new images of its own products, an organization can upload existing images, with the security and governance controls already built into Vertex AI to keep data safe. Generated images can be infinitely iterated, upscaled to the required resolution, and easily augmented with captions and metadata.
The third foundation model we are introducing is Chirp, which helps organizations engage with customers and constituents more inclusively in their native languages. Whether it’s connecting with contact center virtual agents in Spanish, captioning videos spoken in Xhosa, or offering voice assistance in Balinese, Chirp brings the power of large models to speech tasks ranging from voice control to captioning to voice assistance.
Trained on millions of hours of audio, Chirp is a version of our 2 billion-parameter speech model that supports over 100 languages and brings the model quality of the world’s most widely spoken languages to scores of additional languages and dialects. Chirp achieves 98% accuracy on English and relative improvement of up to 300% in languages with fewer than 10 million speakers.
Embeddings API: Find new relationships in data and fuel sophisticated generative AI applications
Embeddings APIs for text and images are now available in Vertex AI, letting developers create more compelling apps and user experiences. Embeddings convert text and image data into multi-dimensional numerical vectors that map semantic relationships, can be processed by large models, and are particularly useful for longer inputs, such as texts with thousands of tokens.
With these APIs, developers can build powerful semantic search and text classification functionality, create Q&A chatbots based on an organization’s data, and improve clustering, anomaly detection, sentiment analysis, and more.
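To make the idea concrete, here is a minimal sketch of how embedding vectors enable semantic search. The vectors below are illustrative placeholders, not real output from the Embeddings API; in practice, each document and the query would first be embedded by a call to the API, and the vectors would have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Illustrative 4-dimensional embeddings (made up for this sketch);
# semantically similar texts map to nearby vectors.
docs = {
    "refund policy":  [0.9, 0.1, 0.0, 0.1],
    "shipping times": [0.1, 0.8, 0.2, 0.0],
    "account login":  [0.0, 0.1, 0.9, 0.2],
}
# Hypothetical embedding of the query "how do I get my money back?"
query = [0.85, 0.15, 0.05, 0.1]

# Rank documents by similarity to the query embedding.
ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
print(ranked[0])  # prints "refund policy", the most semantically related document
```

Because similarity is computed on meaning rather than keyword overlap, the refund document ranks first even though the query shares no words with its title.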
The Embeddings API for text is available in preview, and trusted testers can also access the Embeddings API for images.
Get more value from foundation models with RLHF
Vertex AI is the first end-to-end machine learning platform among the hyperscalers to offer RLHF as a managed service, helping organizations cost-efficiently maintain model performance over time and deploy safer, more accurate, and more useful models to production.
This tuning feature lets organizations incorporate human feedback to train a reward model, which is then used to fine-tune foundation models. This is particularly useful in industries where accuracy is crucial, such as healthcare, or where customer satisfaction is critical, such as finance and e-commerce, because it ultimately leads to higher satisfaction and engagement. It also lets human reviewers more accurately evaluate model responses for bias, toxic content, or other dimensions, teaching the model to avoid inappropriate outputs.
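To illustrate the reward-model idea at the heart of RLHF, here is a toy Bradley-Terry sketch. This is a conceptual illustration only, not Vertex AI's managed implementation: each candidate response is reduced to a single made-up feature score, and the preference pairs are hypothetical.

```python
import math

def train_reward_weight(pairs, steps=200, lr=0.5):
    """Fit a weight w so that sigmoid(w * (x_pref - x_rej)) -> 1 for each
    human-labeled (preferred, rejected) pair, i.e. preferred responses
    receive higher reward."""
    w = 0.0
    for _ in range(steps):
        for x_pref, x_rej in pairs:
            diff = x_pref - x_rej
            p = 1.0 / (1.0 + math.exp(-w * diff))  # P(preferred beats rejected)
            w += lr * (1.0 - p) * diff             # gradient ascent on log-likelihood
    return w

# Hypothetical feature scores for response pairs rated by humans:
# (score of preferred response, score of rejected response).
pairs = [(0.9, 0.2), (0.7, 0.4), (0.8, 0.1)]
w = train_reward_weight(pairs)

def reward(x):
    """Learned reward: higher for responses humans preferred."""
    return w * x

assert reward(0.9) > reward(0.2)
```

In full RLHF, the reward model is a neural network over entire responses, and its scores then steer fine-tuning of the foundation model; the managed service handles these steps for you.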
With our new foundation models available in Vertex AI and our expanding toolset for customizing and leveraging those models, we’re continuing to transform how organizations across all industries and levels of technical expertise build and interact with AI in the cloud.
Codey, Imagen, the Embeddings API for images, and RLHF are available in Vertex AI through our trusted tester program, and Chirp, PaLM 2, the Embeddings API for text, and Generative AI Studio are available in preview in Vertex AI to everyone with a Google Cloud account.
We look forward to continuing this exciting journey with our customers—to learn about some of our customer conversations to date, and to keep pace with all the latest AI news from Google and Google Cloud, be sure to check out The Prompt on Transform with Google Cloud.