Last reviewed 2025-09-22 UTC
The following reference architectures show how to deploy a generative AI application with retrieval-augmented generation (RAG) in Google Cloud.
Reference architecture | Description |
---|---|
RAG infrastructure for generative AI using Google Agentspace and Vertex AI | An agent-driven architecture that uses Google Agentspace as a unified platform to orchestrate an end-to-end RAG dataflow for enterprise applications that require real-time data availability and enriched contextual search. |
RAG infrastructure for generative AI using Vertex AI and Vector Search | A fully managed, serverless architecture that provides optimized, high-performance vector search for large-scale applications. |
RAG infrastructure for generative AI using Vertex AI and AlloyDB for PostgreSQL | An architecture that stores vector embeddings alongside your operational data in a fully managed database such as Cloud SQL or AlloyDB for PostgreSQL. |
RAG infrastructure for generative AI using GKE and Cloud SQL | A flexible, container-based architecture that provides maximum control to build custom applications with open source tools such as Ray, Hugging Face, and LangChain. |
GraphRAG infrastructure for generative AI using Vertex AI and Spanner Graph | An advanced RAG architecture that combines vector search with knowledge graph queries to retrieve interconnected, contextual data, which results in more detailed and relevant generative AI responses. |
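
The retrieval pattern that these architectures share (vector search, optionally followed by a knowledge graph lookup, then prompt assembly) can be illustrated with a minimal, in-memory sketch. Everything in it is an assumption made for illustration: the toy corpus `DOCS`, the edge map `EDGES`, and the helper functions `vector_search`, `graph_expand`, and `build_prompt` are hypothetical and don't call any Google Cloud API. In a real deployment, a managed vector index, a graph database, and an embedding model would take their place.

```python
import math

# Hypothetical toy corpus: document ID -> (embedding, text).
# Real embeddings would come from an embedding model.
DOCS = {
    "a": ([1.0, 0.0], "Spanner is a globally distributed database."),
    "b": ([0.0, 1.0], "GKE runs containerized workloads."),
    "c": ([0.9, 0.1], "Spanner Graph adds graph queries to Spanner."),
}

# Hypothetical knowledge graph: document ID -> related document IDs.
EDGES = {"a": ["c"], "b": [], "c": ["a"]}


def cosine(u, v):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def vector_search(query_vec, k=1):
    """Return the IDs of the k documents most similar to query_vec."""
    ranked = sorted(DOCS, key=lambda d: cosine(query_vec, DOCS[d][0]), reverse=True)
    return ranked[:k]


def graph_expand(doc_ids):
    """Append the direct graph neighbors of each hit, preserving order."""
    expanded = list(doc_ids)
    for doc_id in doc_ids:
        for neighbor in EDGES.get(doc_id, []):
            if neighbor not in expanded:
                expanded.append(neighbor)
    return expanded


def build_prompt(question, doc_ids):
    """Assemble the retrieved text and the question into a single prompt."""
    context = "\n".join(DOCS[d][1] for d in doc_ids)
    return f"Context:\n{context}\n\nQuestion: {question}"


hits = vector_search([1.0, 0.0], k=1)   # plain RAG: vector search only
hits = graph_expand(hits)               # GraphRAG-style: add related documents
prompt = build_prompt("What is Spanner Graph?", hits)
```

The resulting `prompt` is what the application would send to the generative model. Skipping the `graph_expand` step gives the plain vector search flow; including it approximates the graph-augmented flow, in which related documents are retrieved even when their embeddings are not the closest match to the query.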