Generative AI

Documentation and resources for building and implementing generative AI applications with Google Cloud tools and products.

Get started for free

Start your proof of concept with $300 in free credit

Get access to Gemini 2.0 Flash Thinking
Free monthly usage of popular products, including AI APIs and BigQuery
No automatic charges, no commitment

View free product offers

Keep exploring with 20+ always-free products

Access 20+ free products for common use cases, including AI APIs, VMs, data warehouses, and more.

Learn about building generative AI applications

Generative AI on Vertex AI

Access Google's large generative AI models so you can test, tune, and deploy them for use in your AI-powered applications.

Gemini Quickstart

See what it's like to send requests to the Gemini API through Google Cloud's AI-ML platform, Vertex AI.

AI/ML orchestration on GKE

Leverage the power of GKE as a customizable AI/ML platform featuring high performance, cost effective serving and training with industry-leading scale and flexible infrastructure options.

When to use generative AI

Identify whether generative AI, traditional AI, or a combination of both might suit your business use case.

Develop a generative AI application

Learn how to address the challenges in each stage of developing a generative AI application.

Code samples and sample applications

View code samples for popular use cases and deploy examples of generative AI applications that are secure, efficient, resilient, high-performing, and cost-effective.

Generative AI glossary

Learn about specific terms that are associated with generative AI.

Gen AI tools

Gen AI development flow

Model exploration and hosting

Google Cloud provides a set of state-of-the-art foundation models through Vertex AI, including Gemini. You can also deploy a third-party model to either Vertex AI Model Garden or self-host on GKE or Compute Engine.

Google Models on Vertex AI (Gemini, Imagen)

Discover test, customize, and deploy Google models and assets from an ML model library.

Other models in the Vertex AI Model Garden

Discover, test, customize, and deploy select OSS models and assets from an ML model library.

Text generation models via HuggingFace

Learn how to deploy HuggingFace text generation models to Vertex AI or Google Kubernetes Engine (GKE).

GPUs on Compute Engine

Attach GPUs to VM instances to accelerate generative AI workloads on Compute Engine.

Prompt design and engineering

Prompt design is the process of authoring prompt and response pairs to give language models additional context and instructions. After you author prompts, you feed them to the model as a prompt dataset for pretraining. When a model serves predictions, it responds with your instructions built in.

Vertex AI Studio

Design, test, and customize your prompts sent to Google's Gemini and PaLM 2 large language models (LLM).

Overview of Prompting Strategies

Learn the prompt-engineering workflow and common strategies that you can use to affect model responses.

Prompt Gallery

View example prompts and responses for specific use cases.

Grounding and RAG

Grounding connects AI models to data sources to improve the accuracy of responses and reduce hallucinations. RAG, a common grounding technique, searches for relevant information and adds it to the model's prompt, ensuring output is based on facts and up-to-date information.

Vertex AI grounding

You can ground Vertex AI models with Google Search or with your own data stored in Vertex AI Search.

Ground with Google Search

Use Grounding with Google Search to connect the model to the up-to-date knowledge available on the internet.

Vector embeddings in AlloyDB

Use AlloyDB to generate and store vector embeddings, then index and query the embeddings using the pgvector extension.

Cloud SQL and pgvector

Store vector embeddings in Postgres SQL, then index and query the embeddings using the pgvector extension.

Integrating BigQuery data into your LangChain application

Use LangChain to extract data from BigQuery and enrich and ground your model's responses.

Vector embeddings in Firestore

Create vector embeddings from your Firestore data, then index and query the embeddings.

Vector embeddings in Memorystore (Redis)

Use LangChain to extract data from Memorystore and enrich and ground your model's responses.

Agents and function calling

Agents make it easy to design and integrate a conversational user interface into your mobile app, while function calling extends the capabilities of a model.

AI Applications

Leverage Google's foundation models, search expertise, and conversational AI technologies for enterprise-grade generative AI applications.

Vertex AI Function calling

Add function calling to your model to enable actions like booking a reservation based on extracted calendar information.

Model customization and training

Specialized tasks, such as training a language model on specific terminology, might require more training than you can do with prompt design or grounding alone. In that scenario, you can use model tuning to improve performance, or train your own model.

Start building

Set up your development environment for Google Cloud

Set up LangChain

LangChain is an open source framework for generative AI apps that allows you to build context into your prompts, and take action based on the model's response.

View code samples and deploy sample applications

View code samples for popular use cases and deploy examples of generative AI applications that are secure, efficient, resilient, high-performing, and cost-effective.