Google Cloud offers a range of products and tools for the complete life cycle of building generative AI applications.
Access Google's large generative AI models so you can test, tune, and deploy them for use in your AI-powered applications.
See what it's like to send requests to the Gemini API through Google Cloud's AI-ML platform, Vertex AI.
Choose the best products and tools for your use case and access the documentation you need to get started.

Gen AI tools

List of generative AI tools including Vertex AI Studio, Colab Enterprise/Notebooks, and Workbench listed under Cloud Console, and SDK/APIs listed as a separate item.

Gen AI development flow

Diagram of the generative AI development flow with six stages: model selection (including Model Garden), prompt engineering (including prompt gallery, Vertex AI Studio, compare prompts, and optimize prompt), tuning (including training and tuning), optimization (including distillation), deployment (including Model Registry, online prediction, and batch prediction), and monitoring. The model selection, prompt engineering, tuning, and optimization stages are part of a looping subcycle labeled Evaluation.

Model exploration and hosting

Google Cloud provides a set of state-of-the-art foundation models through Vertex AI, including Gemini. You can also deploy a third-party model to either Vertex AI Model Garden or self-host on GKE or Compute Engine.

Google Models on Vertex AI (Gemini, Imagen)

Discover, test, customize, and deploy Google models and assets from an ML model library.

Other models in the Vertex AI Model Garden

Discover, test, customize, and deploy select open source models and assets from an ML model library.

Text generation models via Hugging Face

Learn how to deploy Hugging Face text generation models to Vertex AI or Google Kubernetes Engine (GKE).

Prompt design and engineering

Prompt design is the process of authoring prompts, often with example prompt and response pairs, to give language models additional context and instructions. You send this context to the model with each request; it is not used for pretraining and does not change the model's weights. When the model serves predictions, it responds with your instructions taken into account.
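As an illustration of authoring prompt and response pairs, the following minimal Python sketch assembles instructions, a few example pairs, and a new input into a single prompt string. The function name `build_few_shot_prompt` and the `Input:`/`Output:` labels are assumptions made for this example; they are not part of any Vertex AI API, and the resulting string would still need to be sent to a model separately.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    instructions: str,
    examples: List[Tuple[str, str]],
    new_input: str,
) -> str:
    """Join instructions, example pairs, and a new input into one prompt string.

    Hypothetical helper for illustration only; the labels are arbitrary.
    """
    parts = [instructions.strip(), ""]
    for example_input, example_output in examples:
        parts.append(f"Input: {example_input}")
        parts.append(f"Output: {example_output}")
        parts.append("")
    # End with an open "Output:" label so the model continues the pattern.
    parts.append(f"Input: {new_input}")
    parts.append("Output:")
    return "\n".join(parts)


prompt = build_few_shot_prompt(
    "Classify the sentiment of each review as positive or negative.",
    [
        ("Great battery life.", "positive"),
        ("Stopped working after a week.", "negative"),
    ],
    "The screen is sharp and bright.",
)
print(prompt)
```

Keeping the examples in a list makes it easy to swap or reorder them while testing how the model's responses change.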

Vertex AI Studio

Design, test, and customize the prompts that you send to Google's Gemini and PaLM 2 large language models (LLMs).

Overview of prompting strategies

Learn the prompt-engineering workflow and common strategies that you can use to affect model responses.
View example prompts and responses for specific use cases.

Grounding and RAG

Grounding connects AI models to data sources to improve the accuracy of responses and reduce hallucinations. Retrieval-augmented generation (RAG), a common grounding technique, searches for relevant information and adds it to the model's prompt, which helps keep the output based on facts and up-to-date information.
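The retrieve-then-augment pattern can be sketched in a few lines of Python. This is a toy under stated assumptions: the corpus is a hypothetical in-memory list, and retrieval uses simple word overlap in place of a real search service or embedding index. It does not call Vertex AI; it only shows how retrieved text ends up in the prompt.

```python
from typing import List, Set

# Hypothetical in-memory corpus standing in for a real data source.
DOCUMENTS: List[str] = [
    "Vertex AI is Google Cloud's AI-ML platform.",
    "AlloyDB can store vector embeddings and query them with pgvector.",
    "Cloud TPU accelerates machine learning workloads such as LLM training.",
]


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of punctuation-trimmed words."""
    return {word.strip(".,?!").lower() for word in text.split()}


def retrieve(query: str, documents: List[str], top_k: int = 1) -> List[str]:
    """Rank documents by how many words they share with the query."""
    query_terms = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_terms & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_grounded_prompt(query: str, documents: List[str], top_k: int = 1) -> str:
    """Prepend the retrieved passages to the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, top_k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


question = "What can store vector embeddings?"
grounded_prompt = build_grounded_prompt(question, DOCUMENTS)
print(grounded_prompt)
```

In a production setup, the `retrieve` step would typically be replaced by a search service or a vector index, while the prompt-assembly step stays much the same.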

Vertex AI grounding

You can ground Vertex AI models with Google Search or with your own data stored in Vertex AI Search.
Use Grounding with Google Search to connect the model to the up-to-date knowledge available on the internet.

Vector embeddings in AlloyDB

Use AlloyDB to generate and store vector embeddings, then index and query the embeddings using the pgvector extension.
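To make the idea of querying embeddings concrete, here is a small Python sketch of a nearest-neighbor lookup by cosine distance, the same measure that pgvector exposes through its `<=>` operator. The three-dimensional vectors and the row IDs are made up for illustration; real embeddings come from an embedding model and have far more dimensions, and a database index would replace this linear scan.

```python
import math
from typing import Dict, List, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(
    query: Sequence[float],
    rows: Dict[str, Sequence[float]],
    limit: int = 2,
) -> List[str]:
    """Return the IDs of the `limit` rows closest to the query vector."""
    return sorted(rows, key=lambda row_id: cosine_distance(query, rows[row_id]))[:limit]


# Hypothetical stored embeddings, keyed by row ID.
EMBEDDINGS: Dict[str, Sequence[float]] = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}

query_vector = [1.0, 0.05, 0.0]
closest = nearest(query_vector, EMBEDDINGS, limit=2)
print(closest)
```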

Agents and function calling

Agents make it easy to design and integrate a conversational user interface into your mobile app, while function calling extends the capabilities of a model.

Vertex AI Agent Builder

Leverage Google's foundation models, search expertise, and conversational AI technologies for enterprise-grade generative AI applications.

Vertex AI function calling

Add function calling to your model to enable actions like booking a reservation based on extracted calendar information.
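The application side of function calling can be sketched as a small dispatcher: the model proposes a function name and arguments, and your code looks up and runs the matching function. Everything below is an assumption made for illustration, including the `book_reservation` function, the registry, and the shape of the simulated call. It does not reproduce the Vertex AI response format, so adapt the parsing to the actual response you receive.

```python
import json
from typing import Any, Callable, Dict


def book_reservation(restaurant: str, date: str, party_size: int) -> Dict[str, Any]:
    """Hypothetical application function; a real one would call a booking system."""
    return {
        "status": "confirmed",
        "restaurant": restaurant,
        "date": date,
        "party_size": party_size,
    }


# Map the function names the model may request to real Python callables.
REGISTRY: Dict[str, Callable[..., Dict[str, Any]]] = {
    "book_reservation": book_reservation,
}


def dispatch(
    function_call: Dict[str, Any],
    registry: Dict[str, Callable[..., Dict[str, Any]]],
) -> Dict[str, Any]:
    """Run the requested function with the supplied arguments."""
    name = function_call["name"]
    if name not in registry:
        # Reject anything the application did not explicitly register.
        raise ValueError(f"Unknown function requested: {name}")
    return registry[name](**function_call["args"])


# Simulated structured output; the dictionary shape is an assumption.
simulated_call = {
    "name": "book_reservation",
    "args": {"restaurant": "Example Bistro", "date": "2025-01-15", "party_size": 4},
}

result = dispatch(simulated_call, REGISTRY)
print(json.dumps(result))
```

Restricting execution to an explicit registry means the model can only trigger functions that the application has chosen to expose.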

Model customization and training

Specialized tasks, such as training a language model on specific terminology, might require more training than you can do with prompt design or grounding alone. In that scenario, you can use model tuning to improve performance, or train your own model.

Evaluate models in Vertex AI

Evaluate the performance of foundation models and your tuned generative AI models on Vertex AI.

Tune Vertex AI models

General purpose foundation models can benefit from tuning to improve their performance on specific tasks.

Cloud TPU

TPUs are Google's custom-developed ASICs used to accelerate machine learning workloads, such as training an LLM.

Start building

LangChain is an open source framework for generative AI apps that lets you build context into your prompts and take action based on the model's response.
View code samples for popular use cases and deploy examples of generative AI applications that are secure, efficient, resilient, high-performing, and cost-effective.