LangChain on Vertex AI

LangChain on Vertex AI (Preview) lets you leverage the LangChain open source library to build custom Generative AI applications and use Vertex AI for models, tools and deployment. With LangChain on Vertex AI (Preview), you can do the following:

  • Select the large language model (LLM) that you want to work with.
  • Define tools to access external APIs.
  • Structure the interface between the user and the system components in an orchestration framework.
  • Deploy the framework to a managed runtime.

System components

Building and deploying a custom generative AI application using OSS LangChain and Vertex AI involves four components: the LLM, tools, an orchestration framework, and a managed runtime.


LLM

When you submit a query to your custom application, the LLM processes the query and provides a response.

You can also define a set of tools that communicate with external APIs and provide them to the model. While processing a query, the model delegates certain tasks to the tools, which implies one or more model calls to foundation or fine-tuned models.

To learn more, see Model versions and lifecycle.


Tool

You can choose to define a set of tools that communicate with external APIs (for example, a database) and provide them to the model. While processing a query, the model can delegate certain tasks to the tools.

Deployment through Vertex AI's managed runtime is optimized to use tools based on Gemini Function Calling, but supports LangChain Tool/Function Calling. To learn more about Gemini Function Calling, see Function calling.
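
As a minimal sketch of what a tool can look like, the following Python function has type hints and a docstring, and a small helper summarizes its name, description, and parameters. The function get_exchange_rate, the static rate table, and the helper describe_tool are hypothetical names for illustration only. The table stands in for a call to an external API, and the summary only approximates the kind of declaration a model might receive.

```python
import inspect
from typing import Any, Callable, Dict

# Hypothetical static rates standing in for a call to an external API.
_RATES = {("USD", "EUR"): 0.92, ("USD", "SEK"): 10.5}


def get_exchange_rate(currency_from: str = "USD", currency_to: str = "EUR") -> Dict[str, Any]:
    """Return the exchange rate between two currencies.

    Args:
        currency_from: ISO 4217 code of the base currency.
        currency_to: ISO 4217 code of the target currency.
    """
    rate = _RATES.get((currency_from, currency_to))
    return {"base": currency_from, "target": currency_to, "rate": rate}


def describe_tool(func: Callable[..., Any]) -> Dict[str, Any]:
    """Summarize a function by name, first docstring line, and parameter types."""
    signature = inspect.signature(func)
    doc = inspect.getdoc(func) or ""
    return {
        "name": func.__name__,
        "description": doc.splitlines()[0] if doc else "",
        "parameters": {
            name: getattr(param.annotation, "__name__", str(param.annotation))
            for name, param in signature.parameters.items()
        },
    }


if __name__ == "__main__":
    print(describe_tool(get_exchange_rate))
    print(get_exchange_rate("USD", "SEK"))
```

A clear docstring and precise type hints matter in a sketch like this because they are the only information the model would have when deciding whether and how to call the tool.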

Orchestration framework

LangChain on Vertex AI lets you leverage the LangChain orchestration framework in Vertex AI. Use LangChain to decide how deterministic your application should be.

If you already use LangChain, you can use your existing LangChain code to deploy your application on Vertex AI. Otherwise, you can create your own application code and structure it in an orchestration framework that leverages Vertex AI's LangChain templates.

To learn more, see Develop an application.

Managed runtime

LangChain on Vertex AI lets you deploy your application to a Reasoning Engine managed runtime. This runtime is a Vertex AI service that has all the benefits of Vertex AI integration: security, privacy, observability, and scalability. You can productionize and scale your application with a simple API call, quickly turning locally tested prototypes into enterprise-ready deployments. To learn more, see Deploy an application.

There are many different ways to prototype and build custom Generative AI applications that leverage agentic capabilities by layering tools and custom functions on top of models like Gemini. When it's time to move your application to production, you need to consider how to deploy and manage your agent and its underlying components.

With the components of LangChain on Vertex AI, the goal is to help you focus on and customize the aspects of the agent functionality that you care about most, such as custom functions, agent behavior, and model parameters, while Google takes care of deployment, scaling, packaging, versions, and so on. If you work at a lower level in the stack, you might need to manage more than you want to. If you work at a higher level in the stack, you might not have as much developer control as you'd like.

System flow at runtime

When the user makes a query, the defined agent formats it into a prompt for the LLM. The LLM processes the prompt and determines whether it wants to use any of the tools.

If the LLM chooses to use a tool, it generates a FunctionCall with the name and parameters that the tool should be called with. The agent invokes the tool with the FunctionCall and provides the results from the tool back to the LLM. If the LLM chooses not to use any tools, it generates content that is relayed by the agent back to the user.
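
The flow described above can be sketched in plain Python. This is an illustration under stated assumptions, not the Reasoning Engine implementation: fake_llm is a hypothetical stand-in for a real model call, get_weather is a hypothetical tool that returns fixed data, and the FunctionCall dataclass only mimics the name-and-parameters shape described in the text.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional, Union


@dataclass
class FunctionCall:
    """Name and parameters that a tool should be called with."""
    name: str
    args: Dict[str, Any] = field(default_factory=dict)


def get_weather(city: str) -> Dict[str, Any]:
    """Hypothetical tool; a real one would call an external API."""
    return {"city": city, "temp_c": 18}


TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {"get_weather": get_weather}


def fake_llm(prompt: str, tool_result: Optional[Dict[str, Any]] = None) -> Union[FunctionCall, str]:
    """Stand-in for a model call: request a tool once, then answer with text."""
    if tool_result is not None:
        return f"It is {tool_result['temp_c']} degrees Celsius in {tool_result['city']}."
    if "weather" in prompt.lower():
        return FunctionCall(name="get_weather", args={"city": "Paris"})
    return "I can answer that without a tool."


def run_agent(query: str) -> str:
    """Format the query into a prompt, then loop until the model returns content."""
    prompt = f"User question: {query}"
    response = fake_llm(prompt)
    while isinstance(response, FunctionCall):
        # The agent, not the model, invokes the tool and feeds the result back.
        result = TOOLS[response.name](**response.args)
        response = fake_llm(prompt, tool_result=result)
    return response


if __name__ == "__main__":
    print(run_agent("What is the weather in Paris?"))
    print(run_agent("Say hello"))
```

The point of the loop is the division of labor: the model only decides which tool to call and with which arguments, while the agent executes the call and relays the final content to the user.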

The following diagram illustrates the system flow at runtime:

[Diagram: System flow at runtime]

Create and deploy a generative AI application

The workflow for building a generative AI application is:

Step                                             Description
1. Set up the environment                        Set up your Google project and install the latest version of the Vertex AI SDK for Python.
2. Develop an application                        Develop a LangChain application that can be deployed on Reasoning Engine.
3. Deploy the application                        Deploy the application on Reasoning Engine.
4. Use the application                           Query Reasoning Engine for a response.
5. Manage the deployed application               Manage and delete applications that you have deployed to Reasoning Engine.
6. (Optional) Customize an application template  Customize a template for new applications.
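
The workflow above might look roughly like the following sketch. GreetingApp is a hypothetical application written for illustration. It assumes that a deployable application is a Python class with a query method and an optional set_up method. The deploy function shows calls from the Vertex AI SDK for Python (vertexai.init and reasoning_engines.ReasoningEngine.create) as I understand them. It is not executed here because it needs a Google Cloud project, credentials, and a staging bucket, so treat its arguments as assumptions to check against the SDK reference.

```python
from typing import Any, Dict


class GreetingApp:
    """Hypothetical application with the set_up/query shape assumed above."""

    def __init__(self, greeting: str = "Hello") -> None:
        self.greeting = greeting
        self.ready = False

    def set_up(self) -> None:
        # Heavy initialization (clients, models) would go here; this sketch sets a flag.
        self.ready = True

    def query(self, input: str) -> Dict[str, Any]:
        if not self.ready:
            raise RuntimeError("call set_up() before query()")
        return {"input": input, "output": f"{self.greeting}, {input}!"}


def deploy(app: GreetingApp, project: str, location: str, staging_bucket: str):
    """Step 3 (not run in this sketch): deploy the application to Reasoning Engine."""
    # Imported lazily so that the local part of the sketch runs without the SDK.
    import vertexai
    from vertexai.preview import reasoning_engines

    vertexai.init(project=project, location=location, staging_bucket=staging_bucket)
    return reasoning_engines.ReasoningEngine.create(
        app,
        requirements=["google-cloud-aiplatform[reasoningengine,langchain]"],
    )


if __name__ == "__main__":
    # Step 2: develop and test the application locally before deploying it.
    local_app = GreetingApp()
    local_app.set_up()
    print(local_app.query(input="world"))
```

Under the same assumptions, steps 4 and 5 would call query and delete on the object returned by deploy, for example remote_app.query(input="world") followed by remote_app.delete() when the application is no longer needed.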

The steps are illustrated by the following diagram:

[Diagram: Create and deploy a generative AI application]


Benefits

  • Customizable: By using LangChain's standardized interfaces, LangChain on Vertex AI can be adopted to build different kinds of applications. You can customize the logic of your application and incorporate any framework, providing a high degree of flexibility.
  • Simplifies deployment: LangChain on Vertex AI uses the same APIs as LangChain to interact with LLMs and build applications. LangChain on Vertex AI simplifies and speeds up deployment with Vertex AI LLMs because the Reasoning Engine runtime supports single-click deployment to generate a compliant API based on your library.
  • Integration with the Vertex AI ecosystem: Reasoning Engine for LangChain on Vertex AI uses Vertex AI's infrastructure and prebuilt containers to help you deploy your LLM application. You can use the Vertex AI API to integrate with Gemini models, Function Calling, and Extensions.
  • Secure, private, and scalable: You can use a single SDK call instead of managing the deployment process on your own. The Reasoning Engine managed runtime frees you from tasks such as application server development, container creation, and configuration of authentication, IAM, and scaling. Vertex AI handles autoscaling, regional expansion, and container vulnerabilities.

Use cases

You can use LangChain on Vertex AI for the following tasks:

  • Extract entities from natural language stories: Extract lists of characters, relationships, things, and places from a story.
    Vertex AI SDK for Python notebook - Structured data extraction using function calling
  • Query and understand SQL databases using natural language: Ask the model to convert questions such as "What percentage of orders are returned?" into SQL queries and create functions that submit these queries to BigQuery.
    Blog post - Building an AI-powered BigQuery Data Exploration App using Function Calling in Gemini
  • Help customers interact with businesses: Create functions that connect to a business's API, allowing the model to provide accurate answers to queries such as "Do you have the Pixel 8 Pro in stock?" or "Is there a store in Mountain View, CA that I can visit to try it out?"
    Vertex AI SDK for Python notebook - Function Calling with the Vertex AI Gemini API & Python SDK
  • Build generative AI applications by connecting to public APIs.
  • Interpret voice commands: Create functions that correspond with in-vehicle tasks. For example, you can create functions that turn on the radio or activate the air conditioning. Send audio files of the user's voice commands to the model, and ask the model to convert the audio into text and identify the function that the user wants to call.
  • Automate workflows based on environmental triggers: Create functions to represent processes that can be automated. Provide the model with data from environmental sensors and ask it to parse and process the data to determine whether one or more of the workflows should be activated. For example, a model could process temperature data in a warehouse and choose to activate a sprinkler function.
  • Automate the assignment of support tickets: Provide the model with support tickets, logs, and context-aware rules. Ask the model to process all of this information to determine who the ticket should be assigned to. Call a function to assign the ticket to the person suggested by the model.
  • Retrieve information from a knowledge base: Create functions that retrieve academic articles on a given subject and summarize them. Enable the model to answer questions about academic subjects and provide citations for its answers.
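
As a small illustration of the SQL use case in the list above, the following sketch defines a function that submits a query and returns rows. It makes several assumptions: an in-memory sqlite3 database stands in for BigQuery, the orders table and its data are invented for the example, and GENERATED_SQL is a hand-written string standing in for SQL that a model would produce from the question "What percentage of orders are returned?".

```python
import sqlite3
from typing import Any, List, Tuple


def make_demo_db() -> sqlite3.Connection:
    """Create a hypothetical orders table with four rows, one of them returned."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, returned INTEGER)")
    conn.executemany("INSERT INTO orders (returned) VALUES (?)", [(0,), (1,), (0,), (0,)])
    return conn


def run_query(conn: sqlite3.Connection, sql: str) -> List[Tuple[Any, ...]]:
    """Tool function: run a read-only query and return all rows."""
    # Model-generated SQL is untrusted input, so this sketch only accepts SELECT.
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    return conn.execute(sql).fetchall()


# Stand-in for SQL generated by the model from the natural language question.
GENERATED_SQL = "SELECT 100.0 * SUM(returned) / COUNT(*) FROM orders"

if __name__ == "__main__":
    connection = make_demo_db()
    print(run_query(connection, GENERATED_SQL))
```

The SELECT-only check is a deliberately crude guard. A real deployment would more likely rely on read-only credentials and query limits on the database side.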

What's next