Google Cloud offers a range of products and tools for the complete life cycle of building generative AI applications.

Learn about building generative AI applications

Generative AI on Vertex AI

Access Google's large generative AI models so you can test, tune, and deploy them for use in your AI-powered applications.

Gemini Quickstart

See what it's like to send requests to the Gemini API through Google Cloud's AI/ML platform, Vertex AI.
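To make that concrete, here is a minimal sketch of a Gemini request through the Vertex AI SDK for Python. The project ID, region, and model name are placeholders; substitute your own values.

```python
# Minimal Gemini request through Vertex AI. Assumes the SDK is installed
# (pip install google-cloud-aiplatform) and application default credentials
# are configured; the project and location below are placeholders.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="your-project-id", location="us-central1")

model = GenerativeModel("gemini-1.5-flash")
response = model.generate_content("Explain what Vertex AI is in one sentence.")
print(response.text)
```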

Choose infrastructure for your generative AI application

Choose the best products and tools for your use case and access the documentation you need to get started.

When to use generative AI

Identify whether generative AI, traditional AI, or a combination of both might suit your business use case.

Develop a generative AI application

Learn how to address the challenges in each stage of developing a generative AI application.

Code samples and sample applications

View code samples for popular use cases and deploy examples of generative AI applications that are secure, efficient, resilient, high-performing, and cost-effective.

Model exploration and hosting

Google Cloud provides a set of state-of-the-art foundation models through Vertex AI, including Gemini. You can also deploy third-party models through Vertex AI Model Garden or self-host them on GKE or Compute Engine.

Google Models on Vertex AI (Gemini, Imagen)

Discover, test, customize, and deploy Google models and assets from an ML model library.

Other models in the Vertex AI Model Garden

Discover, test, customize, and deploy select OSS models and assets from an ML model library.

Text generation models via Hugging Face

Learn how to deploy Hugging Face text generation models to Vertex AI or Google Kubernetes Engine (GKE).
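Before packaging a model for Vertex AI or GKE serving, it can help to smoke-test it locally. This sketch assumes the Hugging Face transformers library and uses a small placeholder model; it illustrates local inference only, not the deployment itself.

```python
# Local smoke test of a Hugging Face text generation model. The model name
# is a small placeholder; swap in the model you intend to deploy.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Generative AI on Google Cloud", max_new_tokens=30)
print(result[0]["generated_text"])
```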

AI/ML orchestration on GKE

GKE efficiently orchestrates AI/ML workloads, supporting GPUs and TPUs for scalable generative AI training and serving.

GPUs on Compute Engine

Attach GPUs to VM instances to accelerate generative AI workloads on Compute Engine.

Prompt design and engineering

Prompt design is the process of authoring prompt and response pairs to give language models additional context and instructions. After you author prompts, you can feed them to the model as a prompt dataset for tuning. When the tuned model serves predictions, it responds with your instructions built in.
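As an illustration, prompt and response pairs are often collected as one JSON object per line (JSONL). The field names below are illustrative only; check the Vertex AI tuning documentation for the exact schema your model version expects.

```python
# Sketch of authoring a small prompt dataset as JSONL. The keys
# "input_text" and "output_text" are illustrative, not a guaranteed schema.
import json

pairs = [
    {"input_text": "Summarize: The meeting covered Q3 budget overruns.",
     "output_text": "Q3 budget overruns were discussed."},
    {"input_text": "Summarize: The team shipped the new onboarding flow.",
     "output_text": "The new onboarding flow launched."},
]

with open("prompt_dataset.jsonl", "w") as f:
    for pair in pairs:
        f.write(json.dumps(pair) + "\n")
```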

Vertex AI Studio

Design, test, and customize the prompts you send to Google's Gemini and PaLM 2 large language models (LLMs).

Overview of prompting strategies

Learn the prompt-engineering workflow and common strategies that you can use to affect model responses.
View example prompts and responses for specific use cases.
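One common strategy is few-shot prompting, where example pairs in the prompt steer the model toward the format you want. A sketch using the Vertex AI SDK (the model name is a placeholder):

```python
# Few-shot prompting: the two labeled examples show the model the expected
# output format before it sees the unlabeled review.
from vertexai.generative_models import GenerativeModel

prompt = """Classify the sentiment of each review as POSITIVE or NEGATIVE.

Review: The checkout flow was fast and painless.
Sentiment: POSITIVE

Review: The app crashed twice during setup.
Sentiment: NEGATIVE

Review: Support resolved my issue within minutes.
Sentiment:"""

model = GenerativeModel("gemini-1.5-flash")
print(model.generate_content(prompt).text)
```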

Grounding and RAG

Grounding connects AI models to data sources to improve the accuracy of responses and reduce hallucinations. RAG, a common grounding technique, searches for relevant information and adds it to the model's prompt, which helps keep output factual and up to date.
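The RAG loop in miniature, as a hedged sketch: embed a small corpus, retrieve the document closest to the question, and prepend it to the prompt. The embedding and generation model names assume the Vertex AI SDK; in practice you would use a vector database rather than in-memory arrays.

```python
# Miniature RAG: embed documents, retrieve the nearest one by dot product
# (a rough similarity proxy), and ground the prompt with it.
import numpy as np
from vertexai.generative_models import GenerativeModel
from vertexai.language_models import TextEmbeddingModel

embedder = TextEmbeddingModel.from_pretrained("text-embedding-004")
docs = [
    "Our return policy allows refunds within 30 days.",
    "Shipping takes 3-5 business days within the US.",
]
doc_vecs = [np.array(e.values) for e in embedder.get_embeddings(docs)]

question = "How long do I have to return an item?"
q_vec = np.array(embedder.get_embeddings([question])[0].values)
best = docs[int(np.argmax([v @ q_vec for v in doc_vecs]))]

answer = GenerativeModel("gemini-1.5-flash").generate_content(
    f"Answer using only this context:\n{best}\n\nQuestion: {question}"
)
print(answer.text)
```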

Vertex AI grounding

You can ground Vertex AI models with Google Search or with your own data stored in Vertex AI Search.
Use Grounding with Google Search to connect the model to the up-to-date knowledge available on the internet.
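A sketch of Grounding with Google Search through the Vertex AI SDK; the exact tool classes have shifted across SDK versions, so treat this as a starting point rather than a definitive API reference.

```python
# Grounding a Gemini request with Google Search results via a retrieval tool.
from vertexai.generative_models import GenerativeModel, Tool, grounding

search_tool = Tool.from_google_search_retrieval(grounding.GoogleSearchRetrieval())

model = GenerativeModel("gemini-1.5-flash")
response = model.generate_content(
    "What is the current Google Cloud region count?",
    tools=[search_tool],
)
print(response.text)
```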

Vector embeddings in AlloyDB

Use AlloyDB to generate and store vector embeddings, then index and query the embeddings using the pgvector extension.

Cloud SQL and pgvector

Store vector embeddings in PostgreSQL, then index and query the embeddings using the pgvector extension.
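Because both AlloyDB and Cloud SQL for PostgreSQL expose pgvector, the same SQL works against either. A sketch using the psycopg driver; the connection string, table, and three-dimensional vectors are placeholders (real embeddings have hundreds of dimensions).

```python
# pgvector basics: create the extension, store embeddings, and run a
# nearest-neighbor query with the <-> (L2 distance) operator.
import psycopg

with psycopg.connect("host=10.0.0.1 dbname=app user=postgres password=secret") as conn:
    conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs (id serial PRIMARY KEY, embedding vector(3))"
    )
    conn.execute("INSERT INTO docs (embedding) VALUES ('[1,2,3]')")
    row = conn.execute(
        "SELECT id FROM docs ORDER BY embedding <-> '[1,1,1]' LIMIT 1"
    ).fetchone()
    print(row)
```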

Integrating BigQuery data into your LangChain application

Use LangChain to extract data from BigQuery and enrich and ground your model's responses.
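A hedged sketch of the BigQuery-to-LangChain step, assuming the langchain-community package and placeholder project, dataset, and column names:

```python
# Load BigQuery rows as LangChain documents; body text becomes page content
# and the title column becomes metadata.
from langchain_community.document_loaders import BigQueryLoader

loader = BigQueryLoader(
    query="SELECT title, body FROM `your-project.your_dataset.articles` LIMIT 10",
    page_content_columns=["body"],
    metadata_columns=["title"],
)
docs = loader.load()
print(docs[0].metadata["title"], docs[0].page_content[:80])
```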

Vector embeddings in Firestore

Create vector embeddings from your Firestore data, then index and query the embeddings.
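A sketch of a Firestore nearest-neighbor query, assuming a recent google-cloud-firestore client with vector search support, an existing embedding field, and a vector index already created; field names and vectors are placeholders.

```python
# Find the five documents whose "embedding" field is closest to the query
# vector by Euclidean distance.
from google.cloud import firestore
from google.cloud.firestore_v1.vector import Vector
from google.cloud.firestore_v1.base_vector_query import DistanceMeasure

db = firestore.Client()
results = (
    db.collection("documents")
    .find_nearest(
        vector_field="embedding",
        query_vector=Vector([0.1, 0.2, 0.3]),
        distance_measure=DistanceMeasure.EUCLIDEAN,
        limit=5,
    )
    .get()
)
for doc in results:
    print(doc.id)
```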

Vector embeddings in Memorystore (Redis)

Use LangChain to extract data from Memorystore and enrich and ground your model's responses.

Agents and function calling

Agents make it easy to design and integrate a conversational user interface into your mobile app, while function calling extends the capabilities of a model.

Vertex AI Agent Builder

Leverage Google's foundation models, search expertise, and conversational AI technologies for enterprise-grade generative AI applications.

Vertex AI function calling

Add function calling to your model to enable actions like booking a reservation based on extracted calendar information.
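For example, a reservation function can be declared with an OpenAPI-style schema so the model returns a structured call instead of free text. The book_reservation function below is hypothetical:

```python
# Function calling sketch: the model may respond with a structured
# function_call naming book_reservation and its arguments.
from vertexai.generative_models import FunctionDeclaration, GenerativeModel, Tool

book_reservation = FunctionDeclaration(
    name="book_reservation",  # hypothetical function for illustration
    description="Book a restaurant reservation",
    parameters={
        "type": "object",
        "properties": {
            "datetime": {"type": "string", "description": "ISO 8601 date and time"},
            "party_size": {"type": "integer"},
        },
        "required": ["datetime", "party_size"],
    },
)

model = GenerativeModel("gemini-1.5-flash")
response = model.generate_content(
    "Book a table for four tomorrow at 7pm.",
    tools=[Tool(function_declarations=[book_reservation])],
)
print(response.candidates[0].content.parts[0].function_call)
```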

Model customization and training

Specialized tasks, such as training a language model on specific terminology, might require more training than you can do with prompt design or grounding alone. In that scenario, you can use model tuning to improve performance, or train your own model.

Evaluate models in Vertex AI

Evaluate the performance of foundation models and your tuned generative AI models on Vertex AI.

Tune Vertex AI models

General purpose foundation models can benefit from tuning to improve their performance on specific tasks.
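A hedged sketch of kicking off supervised tuning with the Vertex AI SDK; the sft module's surface has changed across releases, and the model name and Cloud Storage path are placeholders.

```python
# Start a supervised fine-tuning job from a JSONL prompt dataset in
# Cloud Storage; the returned job object can be polled for status.
import vertexai
from vertexai.tuning import sft

vertexai.init(project="your-project-id", location="us-central1")

tuning_job = sft.train(
    source_model="gemini-1.5-flash-002",
    train_dataset="gs://your-bucket/prompt_dataset.jsonl",
)
print(tuning_job.resource_name)
```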

Cloud TPU

TPUs are Google's custom-developed application-specific integrated circuits (ASICs), used to accelerate machine learning workloads such as training an LLM.

Start building

LangChain

LangChain is an open source framework for generative AI apps that lets you build context into your prompts and take action based on the model's response.

Code samples and sample applications

View code samples for popular use cases and deploy examples of generative AI applications that are secure, efficient, resilient, high-performing, and cost-effective.
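To make the LangChain entry above concrete, here is a minimal sketch that pipes a prompt template into a Vertex AI chat model; it assumes the langchain-google-vertexai package and default credentials.

```python
# A two-step LangChain chain: fill a prompt template with context, then
# send it to Gemini through Vertex AI.
from langchain_core.prompts import ChatPromptTemplate
from langchain_google_vertexai import ChatVertexAI

prompt = ChatPromptTemplate.from_template(
    "Answer using only this context:\n{context}\n\nQuestion: {question}"
)
llm = ChatVertexAI(model_name="gemini-1.5-flash")
chain = prompt | llm

reply = chain.invoke(
    {"context": "Vertex AI hosts Gemini models.", "question": "Where is Gemini hosted?"}
)
print(reply.content)
```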