Unleash the power of generative AI with BigQuery and Vertex AI

February 29, 2024
Gerrit Kazmaier

VP & GM of Data Analytics, Google Cloud

Organizations dream of unlocking new insights and efficiencies with AI. To do this, they need a data and AI platform that makes it easy and seamless to access all enterprise data, both structured and unstructured, in a secure and governed way.

To help customers accomplish this, we are announcing innovations that further connect data and AI with increased scale and efficiency using BigQuery and Vertex AI, allowing you to:

  • Simplify multimodal generative AI for enterprise data by making Gemini models available through BigQuery ML
  • Unlock value from unstructured data by expanding BigQuery integration with Vertex AI’s document processing and speech-to-text APIs
  • Build and unleash AI-powered search of your business data with vector search in BigQuery

Bringing AI directly to your data through first-party model integration between BigQuery and Vertex AI puts generative AI within reach of every data team and lets you activate your enterprise data with large language models. This makes building AI-driven analytics simpler, faster and more secure, while taking advantage of BigQuery’s serverless architecture for scale and efficiency.

Simplify generative AI use cases with Gemini models

BigQuery ML lets you create, train and execute machine learning models in BigQuery using familiar SQL. With customers running hundreds of millions of prediction and training queries every year, usage of built-in ML in BigQuery grew 250% YoY¹.

Today, we are taking BigQuery one step further with Gemini 1.0 Pro integration via Vertex AI. The Gemini 1.0 Pro model is designed for higher input/output scale and better result quality across a wide range of tasks like text summarization and sentiment analysis. You can now access it using simple SQL statements or BigQuery’s embedded DataFrame API from right inside the BigQuery console.

This enables you to build data pipelines that blend structured data, unstructured data and generative AI models to create a new class of analytical applications. For example, you can analyze customer reviews in real time and combine them with purchase history and current product availability to generate personalized messages and offers, all inside BigQuery. To learn more, see the BigQuery documentation on Gemini model integration.
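As a rough illustration of the SQL access described above, the sketch below builds a query that calls the BigQuery ML `ML.GENERATE_TEXT` function over a table of reviews. It assumes a remote model backed by a Gemini endpoint has already been created in your dataset; the model name, table name, column names (`review_id`, `review_text`), and prompt wording are all hypothetical placeholders, not values from this announcement.

```python
def build_review_summary_query(
    model: str, reviews_table: str, max_output_tokens: int = 256
) -> str:
    """Return a SQL statement that asks a remote model to summarize each review.

    `model` and `reviews_table` are fully qualified names supplied by the
    caller; the ones used below are hypothetical.
    """
    return f"""
SELECT
  review_id,
  ml_generate_text_llm_result AS summary
FROM ML.GENERATE_TEXT(
  MODEL `{model}`,
  (
    SELECT
      review_id,
      CONCAT(
        'Summarize this review in one sentence and label its sentiment: ',
        review_text
      ) AS prompt
    FROM `{reviews_table}`
  ),
  STRUCT(
    0.2 AS temperature,
    {max_output_tokens} AS max_output_tokens,
    TRUE AS flatten_json_output
  )
)
""".strip()


# Hypothetical model and table names, for illustration only.
sql = build_review_summary_query(
    "my_project.my_dataset.gemini_pro",
    "my_project.my_dataset.customer_reviews",
)
print(sql)
```

The resulting string can be pasted into the BigQuery console or submitted with a client library. Joining the output with purchase history or inventory tables would then be ordinary SQL on top of this result.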

In the coming months, we plan to help customers unlock multimodal generative AI use cases by expanding support to the Gemini 1.0 Pro Vision model. This will let you analyze images, videos, and other complex data using familiar SQL queries. For example, if you are working with a large image dataset in BigQuery, you will be able to use the Gemini 1.0 Pro Vision model to generate image descriptions, categorize images for better search, annotate key features, colors and aesthetics, and much more.

Unlocking value from unstructured data with AI

Unstructured data such as images, documents, and videos represents a large portion of untapped enterprise data. However, unstructured data can be challenging to interpret, which makes it difficult to extract meaningful insights.

BigLake unifies data lakes and warehouses under a single management framework, enabling you to analyze, search, secure, govern and share unstructured data. With increasing data volumes, customer use of BigLake has grown to hundreds of petabytes. Leveraging the power of BigLake, customers are already analyzing images using a broad range of AI models including Vertex AI’s vision APIs, open-source TensorFlow Hub models, or their own custom models.

We are now expanding these capabilities to help you easily extract insights from documents and audio files using Vertex AI’s document processing and speech-to-text APIs. With these new capabilities, you can create generative AI applications for content generation, classification, sentiment analysis, entity extraction, summarization, embeddings generation, and more.

For example, you can perform deeper financial performance analysis by deriving information like revenue, profit and assets from financial reports and combining it with a BigQuery dataset that contains historical stock performance. Similarly, you can improve customer service by analyzing customer support call recordings for sentiment, identifying common issues, and correlating the call insights with purchase history.

Improve vector search with your unstructured data

Earlier this month, we announced the preview of BigQuery vector search integrated with Vertex AI to enable vector similarity search on your BigQuery data. This functionality, also commonly referred to as approximate nearest-neighbor search, is key to empowering numerous new data and AI use cases such as semantic search, similarity detection, and retrieval-augmented generation (RAG) with a large language model (LLM). Vector search can also enhance the quality of your AI models by improving context understanding, reducing ambiguity, ensuring factual accuracy, and allowing adaptability to different tasks and domains.
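To make the idea of vector similarity search concrete, here is a minimal sketch in plain Python. It is not how BigQuery implements the feature: it performs an exact, brute-force scan using cosine similarity, whereas approximate nearest-neighbor search relies on an index to avoid comparing against every row. The three-dimensional embeddings and item names are made up for illustration; real embeddings come from an embedding model and have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, items, k=2):
    """Return the k (name, score) pairs most similar to the query vector."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings, invented for this example.
catalog = {
    "red cocktail dress": [0.9, 0.1, 0.0],
    "crimson evening gown": [0.8, 0.2, 0.1],
    "blue denim jacket": [0.1, 0.9, 0.2],
}

results = top_k([1.0, 0.0, 0.0], catalog, k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

The two dresses rank above the jacket even though their names share few words, because the ranking depends on the embedding vectors rather than on the text itself.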

For example, vector search can help retailers improve product recommendations. Imagine a shopper looking at a picture of a red dress on the retailer’s e-commerce website. With vector search, the shopper can search by stylistic preference, such as color, cut, or even occasion, and the retailer can automatically suggest similar dresses even if their descriptions are not identical. This way, shoppers find what they’re looking for more easily, and retailers can show items shoppers are more likely to buy.

Built on our text embeddings capabilities, and adhering to your AI governance policies and access controls, BigQuery vector search unlocks new data and AI use cases such as:

  • Retrieval-augmented generation (RAG): Retrieve data relevant to a question or task and provide it as context to an LLM. For example, use a support ticket to find ten closely related previous cases, and pass them to an LLM as context to summarize and suggest a resolution.
  • Semantic search: Find semantically similar documents to a given query, even if the documents do not contain the exact same words. This is useful for tasks such as finding related articles, similar products, or answers to questions.
  • Text clustering: Cluster documents into groups of similar documents. This is useful for tasks such as organizing documents, finding duplicate documents, or identifying trends in a corpus of documents.
  • Summarization: Summarize documents by finding the most similar documents to the original document and extracting the main points. This is useful for tasks such as generating executive summaries, creating abstracts, or summarizing news articles.
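The retrieval-augmented generation flow in the first bullet can be sketched as two steps: retrieve the most similar past cases, then assemble them into a prompt. The example below is a simplified illustration in plain Python, not BigQuery's API. The case IDs, texts, and two-dimensional embeddings are invented, the retrieval is an exact scan rather than an indexed vector search, and the final call to an LLM is left out.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, cases, k):
    """Return the k past cases whose embeddings are closest to the query."""
    ranked = sorted(cases, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return ranked[:k]


def build_prompt(ticket_text, retrieved):
    """Combine the new ticket and the retrieved cases into one LLM prompt."""
    context = "\n".join(f"- [{c['id']}] {c['text']}" for c in retrieved)
    return (
        "Previous related cases:\n"
        f"{context}\n\n"
        f"New ticket: {ticket_text}\n"
        "Summarize the common issue and suggest a resolution."
    )


# Hypothetical past cases with toy embeddings.
past_cases = [
    {"id": "C-1", "text": "Login fails after password reset", "vec": [0.9, 0.1]},
    {"id": "C-2", "text": "Invoice shows wrong currency", "vec": [0.1, 0.9]},
    {"id": "C-3", "text": "Cannot sign in on mobile app", "vec": [0.8, 0.3]},
]

ticket = "User locked out after changing password"
ticket_vec = [1.0, 0.2]  # would come from an embedding model in practice

related = retrieve(ticket_vec, past_cases, k=2)
prompt = build_prompt(ticket, related)
print(prompt)
```

In a real pipeline the prompt would be sent to an LLM, and `k` would be tuned to balance context quality against prompt length; the bullet above uses ten.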

Join us for the future of data and generative AI

When it comes to augmenting your business data with generative AI, we’re just getting started. To learn more, sign up for the upcoming Data Cloud Innovation Live webcast on March 7, 2024, 9 - 10 AM PST. And be sure to join us at Next ’24 to get the inside track on all the latest product news and innovations to accelerate your transformation journey this year.


1. Usage of built-in ML in BigQuery grew 250% YoY between July 2022 and 2023.
