
Harnessing the power of PaLM in BigQuery

August 30, 2023
Abhinav Khushraj

Product Manager, Google Cloud

Xi Cheng

Engineering Manager

IDC estimates that by 2025, there will be 175 zettabytes of data in the world, and 80% of that data will be unstructured. However, 90% of unstructured data is never analyzed. That’s because extracting and transforming unstructured data can be cumbersome, expensive and risky, and typically requires multiple tools. As such, it’s rarely used in organizations’ data pipelines.

Google Cloud’s recent innovations in generative AI, including foundation models for text and vision, open up various avenues for data teams to harness this untapped unstructured data. Object tables, a new table type in BigQuery, provide a structured record interface for unstructured data stored in Cloud Storage, unlocking additional possibilities.

Today, we are taking it one step further by integrating BigQuery with Vertex AI foundation models, making it simple for you to analyze unstructured data from right inside BigQuery. This brings generative AI directly to where your data resides, an approach with numerous benefits:

  • Eliminates the need to build and manage data pipelines between BigQuery and generative AI model APIs

  • Streamlines governance and helps reduce the risk of data loss by avoiding data movement 

  • Reduces the need to write and manage custom Python code to call AI models

  • Enables you to analyze data at petabyte-scale without compromising on performance

  • Can lower your total cost of ownership with a simplified architecture 

All this is made possible by the BigQuery ML inference engine, which offers machine learning capabilities right inside BigQuery and recently became generally available. For each of the last two years, BigQuery ML has seen over 250% YoY query growth. This year, customers have run over 300 million prediction and training queries in BigQuery ML.

The first supported foundation model is PaLM 2 for text (text-bison). With it, you can now write just a few lines of SQL in BigQuery ML to run advanced text processing tasks such as summarization or sentiment analysis on unstructured data, retrieve the results in a structured format, and use them with other data for further analysis.

How does it work?

Under the hood, BigQuery ML’s inference engine uses the ML.GENERATE_TEXT function to call Vertex AI text-bison models from the Model Garden. Using this feature takes two simple steps:

1. Register the model as a remote model

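
As a minimal sketch of step 1, the Python helper below only assembles the CREATE MODEL statement as text; it does not connect to BigQuery. The project, dataset, model and connection names are hypothetical placeholders, and the remote_service_type value is an assumption based on the syntax documented around this launch, so check the current BigQuery ML documentation before relying on it.

```python
def create_remote_model_sql(model: str, connection: str) -> str:
    """Build the DDL that registers a Vertex AI LLM as a BigQuery ML remote model.

    `model` is written as project.dataset.model_name and `connection` as
    project.location.connection_id (a Cloud resource connection).
    Both values are placeholders chosen by the caller.
    """
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"REMOTE WITH CONNECTION `{connection}`\n"
        # Assumed option value for the PaLM 2 text model; newer releases may
        # select the model through a different option.
        "OPTIONS (remote_service_type = 'CLOUD_AI_LARGE_LANGUAGE_MODEL_V1');"
    )


# Hypothetical names, for illustration only.
print(create_remote_model_sql(
    "my_project.my_dataset.llm_model",
    "my_project.us.my_connection",
))
```

The printed statement can be pasted into the BigQuery console or passed to any BigQuery client once the connection exists and has permission to call Vertex AI.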

2. Run inference. Here’s an example where users can do data enrichment by obtaining the country name for a given city name. Note that “city” is a column in the “example_table”.

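
The city-to-country enrichment described in step 2 might look like the sketch below. Again, the Python helper only builds the query text and does not run it. The model and table identifiers, the prompt wording, and the temperature and token limits are illustrative assumptions; ml_generate_text_llm_result is the output column ML.GENERATE_TEXT returns when flatten_json_output is TRUE.

```python
def generate_text_sql(model: str, table: str, temperature: float = 0.2,
                      max_output_tokens: int = 50) -> str:
    """Build a query that asks the remote model for the country of each city.

    `table` must contain a `city` column, as `example_table` does in the text.
    """
    return (
        "SELECT city, ml_generate_text_llm_result AS country\n"
        "FROM ML.GENERATE_TEXT(\n"
        f"  MODEL `{model}`,\n"
        # ML.GENERATE_TEXT reads its input from a column named `prompt`.
        "  (SELECT city,\n"
        "          CONCAT('What is the country name for the city ', city, '?') AS prompt\n"
        f"   FROM `{table}`),\n"
        f"  STRUCT({temperature} AS temperature,\n"
        f"         {max_output_tokens} AS max_output_tokens,\n"
        "         TRUE AS flatten_json_output)\n"
        ");"
    )


# Hypothetical identifiers, for illustration only.
print(generate_text_sql(
    "my_project.my_dataset.llm_model",
    "my_project.my_dataset.example_table",
))
```

A low temperature is used here because the task has one expected answer per row; more open-ended tasks such as content generation may benefit from a higher value.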

How customers are leveraging PaLM in BigQuery

Early users of the BigQuery and Vertex AI foundation model integration have expressed tremendous interest in applying it to use cases across industries. For instance, ML.GENERATE_TEXT can simplify advanced data processing tasks:

  • Content generation: Analyze customer feedback and generate personalized email content right inside BigQuery without the need for complex tools

  • Summarization: Summarize text stored in BigQuery columns such as online reviews or chat transcripts

  • Data enhancement: Obtain a country name for a given city name

  • Rephrasing: Correct spelling and grammar in textual content such as voice-to-text transcriptions

  • Feature extraction: Extract key information or words from large bodies of text such as online reviews and call transcripts

  • Sentiment analysis: Understand human sentiment about specific subjects in a text
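
Most of the tasks above differ only in the instruction placed in front of the text column. The sketch below shows one way to build the prompt expression for the input subquery of ML.GENERATE_TEXT. The instruction wordings, the review_text column and the table name are hypothetical, and the helper only produces SQL text.

```python
# Illustrative instruction prefixes; the exact wording is an assumption and
# usually needs tuning for a given dataset.
PROMPT_PREFIXES = {
    "summarization": "Summarize the following text in one sentence: ",
    "sentiment": "Classify the sentiment of the following text as positive, negative, or neutral: ",
    "rephrasing": "Correct the spelling and grammar of the following text: ",
}


def sql_string_literal(text: str) -> str:
    """Quote text as a single-quoted SQL string, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def prompt_expression(task: str, column: str) -> str:
    """Return a CONCAT expression that prepends the task instruction to a column.

    `column` is inserted as-is, so it must be a trusted identifier.
    """
    return f"CONCAT({sql_string_literal(PROMPT_PREFIXES[task])}, {column})"


# Input subquery for a sentiment task over a hypothetical reviews table.
print(
    f"SELECT {prompt_expression('sentiment', 'review_text')} AS prompt\n"
    "FROM `my_project.my_dataset.reviews`"
)
```

Keeping the instructions in one mapping makes it easy to run several tasks over the same column and compare the structured results side by side.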

Faraday, a leading customer prediction platform, previously had to build data pipelines and join multiple datasets. Now they can not only simplify sentiment analysis but also take customer sentiment, join it with additional first-party customer data, and feed it back into the LLMs to generate hyper-personalized content, all within BigQuery. Watch this demo video to learn more.

“Faraday's clients already get the benefit of predictions made from structured data. Now that Google has integrated BigQuery and Vertex AI foundation models, we can scalably predict business outcomes using unstructured data too.” - Seamus Abshere, CTO, Faraday

Getting started

To learn more, visit the documentation page, or try out this tutorial to extract keywords from text.
