This document describes how to install and configure the Vertex AI SDK for ABAP on an SAP host system running on Compute Engine, other cloud virtual machines, RISE with SAP S/4HANA Cloud Private Edition, or on-premises instances.
Installation
When you install the latest version of the on-premises or any cloud edition of ABAP SDK for Google Cloud, the Vertex AI SDK for ABAP is installed for you. For information about the installation steps, see Install and configure the on-premises or any cloud edition of ABAP SDK for Google Cloud.
If you're using version 1.7 or earlier of the on-premises or any cloud edition of ABAP SDK for Google Cloud, then update your SDK to the latest version to get the Vertex AI SDK for ABAP. For more information, see Update ABAP SDK for Google Cloud.
We understand that access to Vertex AI and cloud resources might be limited for some developers. To enable prototyping and experimentation with minimal setup, see Quick prototyping with Gemini.
Enable APIs
To use the Vertex AI SDK for ABAP, you need to enable the necessary APIs in your Google Cloud project.
For information about how to enable Google Cloud APIs, see Enabling APIs.
Vertex AI API
Enable the Vertex AI API in your Google Cloud project.
APIs for Gemma models on Cloud Run
If you want to use Gemma models on Cloud Run, then enable the following APIs:
Deploy models in Vertex AI Model Garden
If you want to use an open model or a partner model, then you need to deploy the required model in Vertex AI Model Garden.
For information about how to deploy models in Model Garden, see Use models in Model Garden.
Deploy Gemma models in Vertex AI
In the Google Cloud console, go to the Model Garden page, find a supported Gemma model and deploy the model. For more information, see Deploy an open model.
Deploy partner models in Vertex AI
If you want to use a partner model, then you need to deploy the required model in Vertex AI.
The Vertex AI SDK for ABAP supports the following Anthropic Claude models:
To enable a Claude model, go to the appropriate Model Garden model card, and then click Enable:
- Go to the Claude Opus 4 model card
- Go to the Claude Sonnet 4 model card
- Go to the Claude 3.7 Sonnet model card
Authentication
After you set up authentication to access Google Cloud APIs in your on-premises or any cloud edition of ABAP SDK for Google Cloud, the Vertex AI SDK for ABAP uses the same authentication method to access the Vertex AI API.
The authentication method you use depends on the specific model and the platform where you deploy it. While most Vertex AI services use JWTs or access tokens, certain models deployed on specialized platforms require a different credential.
The following table summarizes the required authentication method based on the model and its deployment platform:
Models and services | Required credential | Description |
---|---|---|
Gemini, Claude, Gemma (Deployed on Vertex AI Model Garden), embeddings, Vector Search, Vertex AI Feature Store | JWT or Access Token | Used for standard IAM authentication with Vertex AI's managed services. For more information, see Authentication overview. |
Gemma (Deployed on Cloud Run) | ID Token | Required to authenticate with the deployed Cloud Run endpoint. For information about setting up authentication to invoke Cloud Run, see Authenticate with ID tokens for Cloud Run. |
Gemma (Accessed through Gemini API) | API Key | Required when using the Gemini API path for this model. For information about setting up API keys, see Authenticate with API keys. |
For general information about how to set up authentication depending on the environment where you host your SAP system, see Authentication overview.
Note the name of the client key that you create during the authentication setup. You use this client key when you configure AI model generation parameters and search parameters.
IAM permissions
Ensure that the dedicated service account for API access that you've configured in the client key table has access to the Vertex AI resources.
Vertex AI
To use the Vertex AI resources, you must grant the Vertex AI User (roles/aiplatform.user) role to the dedicated service account to which you've granted permissions to access the Vertex AI API.
If you need specific permissions to create, modify, or deploy artifacts, then grant the appropriate Vertex AI IAM permissions.
Vertex AI Feature Store
To use the Vertex AI Feature Store, you must grant the following roles to the service account:
AI capability | Required IAM roles |
---|---|
Vertex AI Feature Store |
Gemma models on Cloud Run
To use Gemma models, you must grant the following roles to the service account based on the deployment platform:
Deployment platform | Required IAM roles |
---|---|
Gemma models deployed on Vertex AI Model Garden | Vertex AI User (roles/aiplatform.user) |
Gemma models on Cloud Run | Cloud Run Invoker (roles/run.invoker) |
Configure the model generation parameters
Large language models (LLMs) are deep learning models trained on massive amounts of text data. A model includes parameter values that control how the model generates a response. You can get different results from the model by changing the parameter values.
To define the generation parameters for a model, the Vertex AI SDK for ABAP uses the table /GOOG/AI_CONFIG.
To configure the generation parameters for a model, perform the following steps:
In SAP GUI, execute the transaction code /GOOG/SDK_IMG. Alternatively, execute the transaction code SPRO, and then click SAP Reference IMG.
Click ABAP SDK for Google Cloud > Basic Settings > Vertex AI SDK: Configure Model Generation Parameters.
Click New Entries.
Choose your model family, and enter values as appropriate:
Gemini
Field | Data type | Description |
---|---|---|
Model Key | String | A unique name that you specify to identify the model configuration, such as Gemini. You use this model key when you instantiate the generative model class or the embeddings class, to specify which generation configuration to apply. |
Model ID | String | Model ID of the LLM, such as gemini-1.5-flash-001. For information about Vertex AI model versions, see Model versions and lifecycle. |
Google Cloud Key Name | String | The client key that you've configured for authentication to Google Cloud during the authentication setup. |
Google Cloud Region Location ID | String | The location ID of the Google Cloud region where the Vertex AI features that you want to use are available. Typically, you use the region closest to your physical location or the physical location of your intended users. For more information, see Vertex AI locations. |
Publisher ID of the LLM | String | The publisher of the LLM, such as google. |
Response MIME type | String | Optional. Output response MIME type of the generated candidate text. Supported MIME types: text/plain (default): text output; application/json: JSON response in the candidates. |
Randomness temperature | String | Optional. Controls the randomness of predictions. Range: [0.0, 1.0]. For more information, see Temperature. |
Top-K Sampling | Float | Optional. Top-K changes how the model selects tokens for output. Specify a lower value for less random responses and a higher value for more random responses. Range: [1, 40]. For more information, see Top-K. |
Top-P Sampling | Float | Optional. Top-P changes how the model selects tokens for output. Specify a lower value for less random responses and a higher value for more random responses. Range: [0.0, 1.0]. For more information, see Top-P. |
Maximum number of output tokens per msg | Integer | Optional. Maximum number of tokens that can be generated in the response. A token is approximately four characters; 100 tokens correspond to roughly 60-80 words. Specify a lower value for shorter responses and a higher value for potentially longer responses. |
Positive Penalties | Float | Optional. Positive values penalize tokens that have already appeared in the generated text, increasing the likelihood of generating more diverse topics. Range: [-2.0, 2.0]. |
Frequency Penalties | Float | Optional. Positive values penalize tokens that repeatedly appear in the generated text, decreasing the likelihood of repeating the same content. Range: [-2.0, 2.0]. |
Claude
Field | Data type | Description |
---|---|---|
Model Key | String | A unique name that you specify to identify the model configuration, such as Claude. You use this model key when you instantiate the generative model class, to specify which generation configuration to apply. |
Model ID | String | Model ID of the supported Claude model, such as claude-sonnet-4@20250514. For the list of supported Claude models, see Deploy partner models in Vertex AI. |
Google Cloud Key Name | String | The client key that you've configured for authentication to Google Cloud during the authentication setup. |
Google Cloud Region Location ID | String | The location ID of the Google Cloud region where the Claude model that you want to use is available. Typically, you use the region closest to your physical location or the physical location of your intended users. For more information, see Anthropic Claude quotas and region availability. |
Publisher ID of the LLM | String | The publisher of the LLM, such as anthropic. |
Response MIME type | String | Optional. Output response MIME type of the generated candidate text. Supported MIME types: text/plain (default): text output; application/json: JSON response in the candidates. |
Randomness temperature | String | Optional. Controls the randomness of predictions. Range: [0.0, 1.0]. For more information, see Temperature. |
Top-K Sampling | Float | Optional. Top-K changes how the model selects tokens for output. Specify a lower value for less random responses and a higher value for more random responses. Range: [1, 40]. For more information, see Top-K. |
Top-P Sampling | Float | Optional. Top-P changes how the model selects tokens for output. Specify a lower value for less random responses and a higher value for more random responses. Range: [0.0, 1.0]. For more information, see Top-P. |
Maximum number of output tokens per msg | Integer | Optional. Maximum number of tokens that can be generated in the response. A token is approximately four characters; 100 tokens correspond to roughly 60-80 words. Specify a lower value for shorter responses and a higher value for potentially longer responses. |
Positive Penalties | Float | Not applicable |
Frequency Penalties | Float | Not applicable |
Gemma
Field | Data type | Description |
---|---|---|
Model Key | String | A unique name that you specify to identify the model configuration, such as Gemma. You use this model key when you instantiate the generative model class or the embeddings class, to specify which generation configuration to apply. |
Model ID | String | Model ID of the LLM, such as gemma-3n-e4b-it. For information about Gemma model versions, see Use Gemma open models. |
Google Cloud Key Name | String | The client key that you've configured for authentication to Google Cloud during the authentication setup. |
Google Cloud Region Location ID | String | The location ID of the Google Cloud region where the Gemma model is deployed. Typically, you use the region closest to your physical location or the physical location of your intended users. |
Publisher ID of the LLM | String | The publisher of the LLM, such as google. |
Response MIME type | String | Optional. Output response MIME type of the generated candidate text. Supported MIME types: text/plain (default): text output; application/json: JSON response in the candidates. |
Randomness temperature | String | Optional. Controls the randomness of predictions. Range: [0.0, 1.0]. For more information, see Temperature. |
Top-K Sampling | Float | Optional. Top-K changes how the model selects tokens for output. Specify a lower value for less random responses and a higher value for more random responses. Range: [1, 40]. For more information, see Top-K. |
Top-P Sampling | Float | Optional. Top-P changes how the model selects tokens for output. Specify a lower value for less random responses and a higher value for more random responses. Range: [0.0, 1.0]. For more information, see Top-P. |
Maximum number of output tokens per msg | Integer | Optional. Maximum number of tokens that can be generated in the response. A token is approximately four characters; 100 tokens correspond to roughly 60-80 words. Specify a lower value for shorter responses and a higher value for potentially longer responses. |
Positive Penalties | Float | Optional. Positive values penalize tokens that have already appeared in the generated text, increasing the likelihood of generating more diverse topics. Range: [-2.0, 2.0]. |
Frequency Penalties | Float | Optional. Positive values penalize tokens that repeatedly appear in the generated text, decreasing the likelihood of repeating the same content. Range: [-2.0, 2.0]. |
Save the new entry.
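After you save the configuration, the model key is what your ABAP code passes to the SDK. The following is a minimal, illustrative sketch: the model key Gemini, the prompt text, and the output handling are placeholders, and the class and method names follow the SDK's documented invocation pattern — verify them against the SDK class reference for your installed version.

```abap
" Illustrative sketch: invoke a configured model by its model key.
" 'Gemini' is a Model Key saved in table /GOOG/AI_CONFIG.
TRY.
    DATA(lo_model) = NEW /goog/cl_generative_model( iv_model_key = 'Gemini' ).

    " Send a prompt; generation parameters (temperature, Top-K, Top-P,
    " maximum output tokens) are read from the saved configuration.
    DATA(lv_response) = lo_model->generate_content(
                          'Summarize ABAP in one sentence.' )->get_text( ).
    IF lv_response IS NOT INITIAL.
      cl_demo_output=>display( lv_response ).
    ENDIF.
  CATCH /goog/cx_sdk INTO DATA(lo_cx_sdk).
    " SDK exceptions carry the error text returned by the API.
    cl_demo_output=>display( lo_cx_sdk->get_text( ) ).
ENDTRY.
```

Because the generation parameters live in the configuration table rather than in code, you can tune temperature or token limits per model key without transporting a code change.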
Configure the Vector Search parameters
To define Vector Search configurations, the Vertex AI SDK for ABAP uses the table /GOOG/SEARCHCONF.
To configure the Vector Search parameters, perform the following steps:
In SAP GUI, execute the transaction code /GOOG/SDK_IMG. Alternatively, execute the transaction code SPRO, and then click SAP Reference IMG.
Click ABAP SDK for Google Cloud > Basic Settings > Vertex AI SDK: Configure Vector Search Parameters.
Click New Entries.
Enter values for the following fields:
Field | Data type | Description |
---|---|---|
Search Key | String | A unique name that you specify to identify the search configuration. |
Google Cloud Key Name | String | The client key that you've configured for authentication to Google Cloud during the authentication setup. |
Google Cloud Region Location ID | String | The location ID of the Google Cloud region where the Vertex AI features that you want to use are available. Typically, you use the region closest to your physical location or the physical location of your intended users. For more information, see Vertex AI locations. |
Deployment ID of Vector Index | String | The deployment ID of an index. When you deploy an index to an index endpoint, you assign it a unique deployment ID. For information about index deployment, see Deploy a vector index to an index endpoint. |
Vector Index Endpoint ID | String | The ID of the index endpoint to which the index is deployed. For information about index endpoints, see Create a vector index endpoint. |
Save the new entry.
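After you save the configuration, the search key is what your ABAP code passes when querying the index. The following is a minimal sketch, assuming the SDK's vector search class and its string-based nearest-neighbor lookup; the search key, query string, embeddings model key, and method names shown here are illustrative assumptions — verify them against the SDK class reference for your installed version.

```abap
" Illustrative sketch: query a configured vector index by its search key.
" 'VSEARCH_DEMO' is a Search Key saved in table /GOOG/SEARCHCONF.
TRY.
    DATA(lo_search) = NEW /goog/cl_vector_search( iv_search_key = 'VSEARCH_DEMO' ).

    " Find nearest neighbors for a free-text query; the search string is
    " embedded using the model configured under the given model key.
    DATA(lo_response) = lo_search->find_neighbors_by_string(
                          iv_search_string        = 'order delayed in transit'
                          iv_embeddings_model_key = 'Gemini' ).
    DATA(lt_neighbors) = lo_response->get_nearest_neighbors( ).
  CATCH /goog/cx_sdk INTO DATA(lo_cx_sdk).
    cl_demo_output=>display( lo_cx_sdk->get_text( ) ).
ENDTRY.
```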
What's next
- Explore the built-in Generative AI demos for SAP.
- Learn about Generative AI on Vertex AI for SAP.