LangChain on Vertex AI (Preview) lets you use the LangChain open source library to build custom generative AI applications and use Vertex AI for models, tools, and deployment. With LangChain on Vertex AI (Preview), you can do the following:
- Select the large language model (LLM) that you want to work with.
- Define tools to access external APIs.
- Structure the interface between the user and the system components in an orchestration framework.
- Deploy the framework to a managed runtime.
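For example, a minimal sketch of these steps with the Vertex AI SDK for Python might look like the following. The project, model name, and the `get_exchange_rate` tool are illustrative placeholders, and the Preview API may change.

```python
# A minimal sketch, assuming the Vertex AI SDK for Python with the
# reasoning_engines Preview module. Project, model, and tool are placeholders.
import vertexai
from vertexai.preview import reasoning_engines

vertexai.init(project="my-project", location="us-central1")

def get_exchange_rate(currency_from: str = "USD", currency_to: str = "EUR") -> dict:
    """Illustrative tool: fetches an exchange rate from an external API."""
    import requests
    response = requests.get(
        "https://api.frankfurter.app/latest",
        params={"from": currency_from, "to": currency_to},
    )
    return response.json()

# Select the LLM, define tools, and let the LangChain template orchestrate them.
agent = reasoning_engines.LangchainAgent(
    model="gemini-1.5-pro",
    tools=[get_exchange_rate],
)

# Test locally before deploying to the managed runtime.
print(agent.query(input="What's the exchange rate from US dollars to euros?"))
```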
Benefits
- Customizable: By using LangChain's standardized interfaces, LangChain on Vertex AI can be adapted to build different kinds of applications. You can customize the logic of your application and incorporate any framework, providing a high degree of flexibility.
- Simplifies deployment: LangChain on Vertex AI uses the same APIs as LangChain to interact with LLMs and build applications. LangChain on Vertex AI simplifies and speeds up deployment with Vertex AI LLMs, because the Reasoning Engine runtime supports single-click deployment to generate a compliant API based on your library.
- Integration with Vertex AI ecosystems: Reasoning Engine for LangChain on Vertex AI uses Vertex AI's infrastructure and prebuilt containers to help you deploy your LLM application. You can use the Vertex AI API to integrate with Gemini models, Function Calling, and Extensions.
- Secure, private, and scalable: You can use a single SDK call instead of managing the development process on your own. The Reasoning Engine managed runtime frees you from tasks such as application server development, container creation, and configuration of authentication, IAM, and scaling. Vertex AI handles autoscaling, regional expansion, and container vulnerabilities.
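As an illustration of the single-SDK-call point above, a hedged sketch of deploying a locally defined agent might look like the following (it continues the earlier sketch; the requirements list and staging bucket are assumptions, and the Preview API may change).

```python
# Minimal deployment sketch (Preview API; names are illustrative).
import vertexai
from vertexai.preview import reasoning_engines

# Deployment needs a staging bucket in addition to project and location.
vertexai.init(
    project="my-project",
    location="us-central1",
    staging_bucket="gs://my-staging-bucket",
)

# One call packages the locally tested agent and deploys it to the
# Reasoning Engine managed runtime.
remote_agent = reasoning_engines.ReasoningEngine.create(
    agent,  # the LangchainAgent from the earlier sketch
    requirements=["google-cloud-aiplatform[langchain,reasoningengine]"],
)
```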
Use cases
To learn about LangChain on Vertex AI with end-to-end examples, see the following resources:
System components
Building and deploying a custom generative AI application using OSS LangChain and Vertex AI involves four components:
Component | Description |
---|---|
LLM | When you submit a query to your custom application, the LLM processes the query and provides a response. You can choose to define a set of tools that communicate with external APIs and provide them to the model. While processing a query, the model can delegate certain tasks to the tools. This implies one or more model calls to foundation or fine-tuned models. To learn more, see Model versions and lifecycle. |
Tool | You can choose to define a set of tools that communicate with external APIs (for example, a database) and provide them to the model. While processing a query, the model can delegate certain tasks to the tools. Deployment through Vertex AI's managed runtime is optimized to use tools based on Gemini Function Calling, but it also supports LangChain Tool/Function Calling. To learn more about Gemini Function Calling, see Function calling. A sketch of both tool styles follows this table. |
Orchestration framework | LangChain on Vertex AI lets you use the LangChain orchestration framework in Vertex AI. Use LangChain to decide how deterministic your application should be. If you already use LangChain, you can use your existing LangChain code to deploy your application on Vertex AI. Otherwise, you can create your own application code and structure it in an orchestration framework that leverages Vertex AI's LangChain templates. To learn more, see Develop an application. |
Managed runtime | LangChain on Vertex AI lets you deploy your application to a Reasoning Engine managed runtime. This runtime is a Vertex AI service that has all the benefits of Vertex AI integration: security, privacy, observability, and scalability. You can productionize and scale your application with an API call, quickly turning locally-tested prototypes into enterprise-ready deployments. To learn more, see Deploy an application. |
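To make the Tool row concrete, the following is a hedged sketch of the two tool styles mentioned above: a plain Python function used through Gemini Function Calling, and a LangChain tool. The function names and bodies are illustrative placeholders.

```python
# Sketch only: either style can be passed in the `tools` list of a LangchainAgent.
from langchain_core.tools import tool

# Style 1: a plain Python function. Its name, signature, and docstring become
# the function declaration used for Gemini Function Calling.
def get_order_status(order_id: str) -> dict:
    """Illustrative example: look up an order in an external system."""
    return {"order_id": order_id, "status": "shipped"}

# Style 2: a LangChain tool created with the @tool decorator.
@tool
def search_products(query: str) -> str:
    """Illustrative example: search a product catalog."""
    return f"Results for {query}"
```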
There are many different ways to prototype and build custom generative AI applications that use agentic capabilities by layering tools and custom functions on top of models like Gemini. When it's time to move your application to production, you need to consider how to deploy and manage your agent and its underlying components.
With the components of LangChain on Vertex AI, the goal is to help you focus on and customize the aspects of the agent capabilities that you care about most, such as custom functions, agent behavior, and model parameters, while Google takes care of deployment, scaling, packaging, and versions. If you work at a lower level in the stack, you might need to manage more than you want to. If you work at a higher level in the stack, you might not have as much developer control as you'd like.
System flow at runtime
When the user makes a query, the defined agent formats it into a prompt for the LLM. The LLM processes the prompt and determines whether it wants to use any of the tools.
If the LLM chooses to use a tool, it generates a `FunctionCall` with the name and parameters that the tool should be called with. The agent invokes the tool with the `FunctionCall` and provides the results from the tool back to the LLM.
If the LLM chooses not to use any tools, it generates content that the agent relays back to the user.
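For reference, the following is a hedged sketch of what one `FunctionCall` round trip looks like if you drive the Gemini API in the Vertex AI SDK directly, which the agent otherwise automates for you. The `get_weather` function, its schema, and the placeholder tool result are illustrative assumptions.

```python
# Sketch of the FunctionCall round trip the agent automates (illustrative names).
from vertexai.generative_models import (
    FunctionDeclaration, GenerativeModel, Part, Tool,
)

get_weather = FunctionDeclaration(
    name="get_weather",
    description="Get the current weather for a city.",
    parameters={"type": "object", "properties": {"city": {"type": "string"}}},
)
model = GenerativeModel(
    "gemini-1.5-pro",
    tools=[Tool(function_declarations=[get_weather])],
)
chat = model.start_chat()

response = chat.send_message("What's the weather in Paris?")
function_calls = response.candidates[0].function_calls

if function_calls:
    # The model chose the tool: run it and return the result to the model.
    result = {"temperature_c": 18, "conditions": "cloudy"}  # placeholder tool output
    response = chat.send_message(
        Part.from_function_response(
            name=function_calls[0].name,
            response={"content": result},
        )
    )

print(response.text)  # final content relayed back to the user
```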
The following diagram illustrates the system flow at runtime:
Create and deploy a generative AI application
The workflow for building a generative AI application is:
Steps | Description |
---|---|
1. Set up the environment | Set up your Google project and install the latest version of the Vertex AI SDK for Python. |
2. Develop an application | Develop a LangChain application that can be deployed on Reasoning Engine. |
3. Deploy the application | Deploy the application on Reasoning Engine. |
4. Use the application | Query Reasoning Engine for a response. |
5. Manage the deployed application | Manage and delete applications that you have deployed to Reasoning Engine. |
6. (Optional) Customize an application template | Customize a template for new applications. |
The steps are illustrated by the following diagram:
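Continuing the earlier sketches, steps 4 and 5 roughly map to calls like the following. The resource name is an illustrative placeholder, and the Preview API may change.

```python
# Use and manage a deployed application (sketch; Preview API).
from vertexai.preview import reasoning_engines

# 4. Use the application: query the deployed engine by its resource name.
remote_agent = reasoning_engines.ReasoningEngine(
    "projects/PROJECT_ID/locations/us-central1/reasoningEngines/ENGINE_ID"  # placeholder
)
print(remote_agent.query(input="What's the exchange rate from US dollars to euros?"))

# 5. Manage deployed applications: list them and delete ones you no longer need.
for engine in reasoning_engines.ReasoningEngine.list():
    print(engine.resource_name)
remote_agent.delete()
```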