Vertex AI GenAI API

Service: aiplatform.googleapis.com

To call this service, we recommend that you use the Google-provided client libraries. If your application needs to use your own libraries to call this service, use the following information when you make the API requests.

Discovery document

A Discovery Document is a machine-readable specification for describing and consuming REST APIs. It is used to build client libraries, IDE plugins, and other tools that interact with Google APIs. One service may provide multiple discovery documents. This service provides the following discovery documents:

https://aiplatform.googleapis.com/$discovery/rest?version=v1
https://aiplatform.googleapis.com/$discovery/rest?version=v1beta1

Service endpoint

A service endpoint is a base URL that specifies the network address of an API service. One service might have multiple service endpoints. This service has the following service endpoint, and all URIs below are relative to this service endpoint:

https://aiplatform.googleapis.com
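
For example, the method listed below as POST /v1/{model}:generateContent is called by appending its URI to this endpoint (regional endpoints of the form https://LOCATION-aiplatform.googleapis.com are also commonly used with Vertex AI). A minimal Python sketch of building such a URL; the project, location, and model values are placeholders:

    # Build a full request URL from the service endpoint and a relative URI
    # such as "POST /v1/{model}:generateContent". All values are placeholders.
    PROJECT_ID = "my-project"
    LOCATION = "us-central1"
    ENDPOINT = "https://aiplatform.googleapis.com"  # or f"https://{LOCATION}-aiplatform.googleapis.com"
    model = f"projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/gemini-2.0-flash"
    url = f"{ENDPOINT}/v1/{model}:generateContent"
    print(url)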

REST Resource: v1.media

Methods
upload POST /v1/{parent}/ragFiles:upload
POST /upload/v1/{parent}/ragFiles:upload
Upload a file into a RagCorpus.
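
A sketch of the upload variant that uses the /upload/ URI prefix. It assumes the common Google multipart-upload convention (an X-Goog-Upload-Protocol: multipart header with a JSON metadata part and a file part) and an assumed rag_file metadata field; all IDs and paths are placeholders.

    import os
    import requests

    PROJECT = "my-project"    # placeholder
    LOCATION = "us-central1"  # placeholder
    CORPUS = "1234567890"     # placeholder RagCorpus ID
    TOKEN = os.environ["ACCESS_TOKEN"]  # e.g. from `gcloud auth print-access-token`

    parent = f"projects/{PROJECT}/locations/{LOCATION}/ragCorpora/{CORPUS}"
    url = f"https://{LOCATION}-aiplatform.googleapis.com/upload/v1/{parent}/ragFiles:upload"

    # Assumption: a JSON "metadata" part describing the RagFile plus a "file" part.
    files = {
        "metadata": (None, '{"rag_file": {"display_name": "my-doc"}}', "application/json"),
        "file": ("my-doc.txt", open("path/to/my-doc.txt", "rb"), "text/plain"),
    }
    resp = requests.post(
        url,
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "X-Goog-Upload-Protocol": "multipart",  # assumption: multipart upload protocol
        },
        files=files,
    )
    resp.raise_for_status()
    print(resp.json())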

REST Resource: v1.projects

Methods
getCacheConfig GET /v1/{name}
Gets a GenAI cache config.
updateCacheConfig PATCH /v1/{cacheConfig.name}
Updates a cache config.

REST Resource: v1.projects.locations

Methods
augmentPrompt POST /v1/{parent}:augmentPrompt
Given an input prompt, returns an augmented prompt from the Vertex RAG store to guide the LLM toward generating grounded responses.
corroborateContent POST /v1/{parent}:corroborateContent
Given input text, returns a score that evaluates its factuality.
evaluateInstances POST /v1/{location}:evaluateInstances
Evaluates instances based on a given metric.
retrieveContexts POST /v1/{parent}:retrieveContexts
Retrieves relevant contexts for a query.
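
A sketch of retrieveContexts against a RagCorpus. The request-body shape shown (a vertex_rag_store naming the corpora to search plus a query text) is an assumption, not a verified schema; all IDs are placeholders.

    import os
    import requests

    PROJECT = "my-project"    # placeholder
    LOCATION = "us-central1"  # placeholder
    CORPUS = "1234567890"     # placeholder RagCorpus ID
    TOKEN = os.environ["ACCESS_TOKEN"]  # e.g. from `gcloud auth print-access-token`

    parent = f"projects/{PROJECT}/locations/{LOCATION}"
    url = f"https://{LOCATION}-aiplatform.googleapis.com/v1/{parent}:retrieveContexts"

    # Assumed request shape: which RAG corpus to search and the query text.
    body = {
        "vertex_rag_store": {
            "rag_resources": [{"rag_corpus": f"{parent}/ragCorpora/{CORPUS}"}]
        },
        "query": {"text": "What is our refund policy?"},
    }
    resp = requests.post(url, headers={"Authorization": f"Bearer {TOKEN}"}, json=body)
    resp.raise_for_status()
    print(resp.json())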

REST Resource: v1.projects.locations.cachedContents

Methods
create POST /v1/{parent}/cachedContents
Creates cached content. This call initializes the cached content in data storage, and users pay for the cached data storage.
delete DELETE /v1/{name}
Deletes cached content.
get GET /v1/{name}
Gets cached content configurations.
list GET /v1/{parent}/cachedContents
Lists cached contents in a project.
patch PATCH /v1/{cachedContent.name}
Updates cached content configurations.
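
A sketch of create for cached content bound to a publisher model. The model, contents, and ttl fields are assumptions about the CachedContent resource, and all IDs are placeholders.

    import os
    import requests

    PROJECT = "my-project"    # placeholder
    LOCATION = "us-central1"  # placeholder
    TOKEN = os.environ["ACCESS_TOKEN"]  # e.g. from `gcloud auth print-access-token`

    parent = f"projects/{PROJECT}/locations/{LOCATION}"
    url = f"https://{LOCATION}-aiplatform.googleapis.com/v1/{parent}/cachedContents"

    # Assumed CachedContent fields: the model the cache is bound to, the content
    # to cache, and a time-to-live after which cache storage (and billing) ends.
    body = {
        "model": f"{parent}/publishers/google/models/gemini-2.0-flash",
        "contents": [
            {"role": "user", "parts": [{"text": "<large document to cache>"}]}
        ],
        "ttl": "3600s",
    }
    resp = requests.post(url, headers={"Authorization": f"Bearer {TOKEN}"}, json=body)
    resp.raise_for_status()
    print(resp.json())  # the response includes the cachedContent resource name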

REST Resource: v1.projects.locations.endpoints

Methods
fetchPredictOperation POST /v1/{endpoint}:fetchPredictOperation
Fetch an asynchronous online prediction operation.
generateContent POST /v1/{model}:generateContent
Generate content with multimodal inputs.
predict POST /v1/{endpoint}:predict
Perform an online prediction.
predictLongRunning POST /v1/{endpoint}:predictLongRunning
serverStreamingPredict POST /v1/{endpoint}:serverStreamingPredict
Perform a server-side streaming online prediction request for Vertex LLM streaming.
streamGenerateContent POST /v1/{model}:streamGenerateContent
Generate content with multimodal inputs with streaming support.
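
For a deployed endpoint, predict uses the standard instances/parameters request body. A minimal sketch; the endpoint ID is a placeholder, and the instance shape must match the schema of the model deployed to that endpoint.

    import os
    import requests

    PROJECT = "my-project"      # placeholder
    LOCATION = "us-central1"    # placeholder
    ENDPOINT_ID = "1234567890"  # placeholder deployed-endpoint ID
    TOKEN = os.environ["ACCESS_TOKEN"]  # e.g. from `gcloud auth print-access-token`

    endpoint = f"projects/{PROJECT}/locations/{LOCATION}/endpoints/{ENDPOINT_ID}"
    url = f"https://{LOCATION}-aiplatform.googleapis.com/v1/{endpoint}:predict"

    # "instances" must match the deployed model's input schema; this prompt-style
    # instance is only illustrative.
    body = {
        "instances": [{"prompt": "Write a haiku about the sea."}],
        "parameters": {"temperature": 0.2},
    }
    resp = requests.post(url, headers={"Authorization": f"Bearer {TOKEN}"}, json=body)
    resp.raise_for_status()
    print(resp.json()["predictions"])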

REST Resource: v1.projects.locations.endpoints.chat

Methods
completions POST /v1/{endpoint}/chat/completions
Exposes an OpenAI-compatible endpoint for chat completions.
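
A sketch of the OpenAI-compatible surface. It assumes the commonly documented pattern of addressing a special "openapi" endpoint with an OpenAI-style messages payload; the endpoint segment, the model string, and the response shape are assumptions or placeholders.

    import os
    import requests

    PROJECT = "my-project"    # placeholder
    LOCATION = "us-central1"  # placeholder
    TOKEN = os.environ["ACCESS_TOKEN"]  # e.g. from `gcloud auth print-access-token`

    # Assumption: the OpenAI-compatible surface is exposed on an "openapi" endpoint.
    endpoint = f"projects/{PROJECT}/locations/{LOCATION}/endpoints/openapi"
    url = f"https://{LOCATION}-aiplatform.googleapis.com/v1/{endpoint}/chat/completions"

    # OpenAI-style request body; the model string is a placeholder.
    body = {
        "model": "google/gemini-2.0-flash",
        "messages": [{"role": "user", "content": "Say hello in French."}],
    }
    resp = requests.post(url, headers={"Authorization": f"Bearer {TOKEN}"}, json=body)
    resp.raise_for_status()
    print(resp.json()["choices"][0]["message"]["content"])  # OpenAI-style response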

REST Resource: v1.projects.locations.models

Methods
getIamPolicy POST /v1/{resource}:getIamPolicy
Gets the access control policy for a resource.
setIamPolicy POST /v1/{resource}:setIamPolicy
Sets the access control policy on the specified resource.
testIamPermissions POST /v1/{resource}:testIamPermissions
Returns permissions that a caller has on the specified resource.

REST Resource: v1.projects.locations.operations

Methods
cancel POST /v1/{name}:cancel
Starts asynchronous cancellation on a long-running operation.
delete DELETE /v1/{name}
Deletes a long-running operation.
get GET /v1/{name}
Gets the latest state of a long-running operation.
list GET /v1/{name}/operations
Lists operations that match the specified filter in the request.
wait POST /v1/{name}:wait
Waits until the specified long-running operation is done or reaches at most a specified timeout, returning the latest state.
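
Long-running methods such as predictLongRunning and ragFiles:import return an Operation resource; a minimal polling sketch, where the operation name is a placeholder taken from the initiating call's response:

    import os
    import time
    import requests

    LOCATION = "us-central1"  # placeholder; must match the operation's location
    TOKEN = os.environ["ACCESS_TOKEN"]  # e.g. from `gcloud auth print-access-token`

    # Placeholder operation name as returned by the initiating call.
    operation = "projects/my-project/locations/us-central1/operations/1234567890"
    url = f"https://{LOCATION}-aiplatform.googleapis.com/v1/{operation}"

    while True:
        resp = requests.get(url, headers={"Authorization": f"Bearer {TOKEN}"})
        resp.raise_for_status()
        op = resp.json()
        if op.get("done"):
            # google.longrunning.Operation: "response" on success, "error" on failure.
            print(op.get("response") or op.get("error"))
            break
        time.sleep(10)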

REST Resource: v1.projects.locations.publishers.models

Methods
fetchPredictOperation POST /v1/{endpoint}:fetchPredictOperation
Fetch an asynchronous online prediction operation.
generateContent POST /v1/{model}:generateContent
Generate content with multimodal inputs.
predict POST /v1/{endpoint}:predict
Perform an online prediction.
predictLongRunning POST /v1/{endpoint}:predictLongRunning
serverStreamingPredict POST /v1/{endpoint}:serverStreamingPredict
Perform a server-side streaming online prediction request for Vertex LLM streaming.
streamGenerateContent POST /v1/{model}:streamGenerateContent
Generate content with multimodal inputs with streaming support.
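
A minimal generateContent sketch against a Google publisher model; the project, location, model ID, and prompt are placeholders.

    import os
    import requests

    PROJECT = "my-project"      # placeholder
    LOCATION = "us-central1"    # placeholder
    MODEL = "gemini-2.0-flash"  # placeholder publisher model ID
    TOKEN = os.environ["ACCESS_TOKEN"]  # e.g. from `gcloud auth print-access-token`

    model = f"projects/{PROJECT}/locations/{LOCATION}/publishers/google/models/{MODEL}"
    url = f"https://{LOCATION}-aiplatform.googleapis.com/v1/{model}:generateContent"

    body = {
        "contents": [
            {"role": "user", "parts": [{"text": "Summarize the plot of Hamlet in two sentences."}]}
        ],
        "generationConfig": {"temperature": 0.2},
    }
    resp = requests.post(url, headers={"Authorization": f"Bearer {TOKEN}"}, json=body)
    resp.raise_for_status()
    print(resp.json()["candidates"][0]["content"]["parts"][0]["text"])

The same request body works with streamGenerateContent, which returns candidates incrementally instead of in a single response.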

REST Resource: v1.projects.locations.ragCorpora

Methods
create POST /v1/{parent}/ragCorpora
Creates a RagCorpus.
delete DELETE /v1/{name}
Deletes a RagCorpus.
get GET /v1/{name}
Gets a RagCorpus.
list GET /v1/{parent}/ragCorpora
Lists RagCorpora in a Location.
patch PATCH /v1/{ragCorpus.name}
Updates a RagCorpus.

REST Resource: v1.projects.locations.ragCorpora.ragFiles

Methods
delete DELETE /v1/{name}
Deletes a RagFile.
get GET /v1/{name}
Gets a RagFile.
import POST /v1/{parent}/ragFiles:import
Import files from Google Cloud Storage or Google Drive into a RagCorpus.
list GET /v1/{parent}/ragFiles
Lists RagFiles in a RagCorpus.
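
A sketch of importing files from Cloud Storage into an existing RagCorpus; the importRagFilesConfig/gcsSource field names are assumptions about the request body, and all IDs and URIs are placeholders.

    import os
    import requests

    PROJECT = "my-project"    # placeholder
    LOCATION = "us-central1"  # placeholder
    CORPUS = "1234567890"     # placeholder RagCorpus ID
    TOKEN = os.environ["ACCESS_TOKEN"]  # e.g. from `gcloud auth print-access-token`

    parent = f"projects/{PROJECT}/locations/{LOCATION}/ragCorpora/{CORPUS}"
    url = f"https://{LOCATION}-aiplatform.googleapis.com/v1/{parent}/ragFiles:import"

    # Assumed request shape: a config pointing at the Cloud Storage URIs to ingest.
    body = {
        "importRagFilesConfig": {
            "gcsSource": {"uris": ["gs://my-bucket/docs/*"]}
        }
    }
    resp = requests.post(url, headers={"Authorization": f"Bearer {TOKEN}"}, json=body)
    resp.raise_for_status()
    print(resp.json())  # returns a long-running operation to poll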

REST Resource: v1.projects.locations.reasoningEngines

Methods
create POST /v1/{parent}/reasoningEngines
Creates a reasoning engine.
delete DELETE /v1/{name}
Deletes a reasoning engine.
get GET /v1/{name}
Gets a reasoning engine.
list GET /v1/{parent}/reasoningEngines
Lists reasoning engines in a location.
patch PATCH /v1/{reasoningEngine.name}
Updates a reasoning engine.
query POST /v1/{name}:query
Queries using a reasoning engine.
streamQuery POST /v1/{name}:streamQuery
Streams queries using a reasoning engine.

REST Resource: v1.projects.locations.tuningJobs

Methods
cancel POST /v1/{name}:cancel
Cancels a TuningJob.
create POST /v1/{parent}/tuningJobs
Creates a TuningJob.
get GET /v1/{name}
Gets a TuningJob.
list GET /v1/{parent}/tuningJobs
Lists TuningJobs in a Location.
rebaseTunedModel POST /v1/{parent}/tuningJobs:rebaseTunedModel
Rebase a TunedModel.
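
A sketch of creating a supervised TuningJob; the baseModel and supervisedTuningSpec.trainingDatasetUri fields reflect the common supervised-tuning request shape but should be verified against the TuningJob reference, and all values are placeholders.

    import os
    import requests

    PROJECT = "my-project"    # placeholder
    LOCATION = "us-central1"  # placeholder
    TOKEN = os.environ["ACCESS_TOKEN"]  # e.g. from `gcloud auth print-access-token`

    parent = f"projects/{PROJECT}/locations/{LOCATION}"
    url = f"https://{LOCATION}-aiplatform.googleapis.com/v1/{parent}/tuningJobs"

    # Assumed TuningJob fields for supervised tuning.
    body = {
        "baseModel": "gemini-2.0-flash-001",  # placeholder base model
        "supervisedTuningSpec": {
            "trainingDatasetUri": "gs://my-bucket/tuning/train.jsonl"
        },
        "tunedModelDisplayName": "my-tuned-model",
    }
    resp = requests.post(url, headers={"Authorization": f"Bearer {TOKEN}"}, json=body)
    resp.raise_for_status()
    print(resp.json()["name"])  # resource name of the new TuningJob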

REST Resource: v1beta1.media

Methods
upload POST /v1beta1/{parent}/ragFiles:upload
POST /upload/v1beta1/{parent}/ragFiles:upload
Upload a file into a RagCorpus.

REST Resource: v1beta1.projects

Methods
getCacheConfig GET /v1beta1/{name}
Gets a GenAI cache config.
updateCacheConfig PATCH /v1beta1/{cacheConfig.name}
Updates a cache config.

REST Resource: v1beta1.projects.locations

Methods
augmentPrompt POST /v1beta1/{parent}:augmentPrompt
Given an input prompt, returns an augmented prompt from the Vertex RAG store to guide the LLM toward generating grounded responses.
corroborateContent POST /v1beta1/{parent}:corroborateContent
Given input text, returns a score that evaluates its factuality.
evaluateInstances POST /v1beta1/{location}:evaluateInstances
Evaluates instances based on a given metric.
retrieveContexts POST /v1beta1/{parent}:retrieveContexts
Retrieves relevant contexts for a query.

REST Resource: v1beta1.projects.locations.cachedContents

Methods
create POST /v1beta1/{parent}/cachedContents
Creates cached content. This call initializes the cached content in data storage, and users pay for the cached data storage.
delete DELETE /v1beta1/{name}
Deletes cached content.
get GET /v1beta1/{name}
Gets cached content configurations.
list GET /v1beta1/{parent}/cachedContents
Lists cached contents in a project.
patch PATCH /v1beta1/{cachedContent.name}
Updates cached content configurations.

REST Resource: v1beta1.projects.locations.endpoints

Methods
countTokens POST /v1beta1/{endpoint}:countTokens
Performs token counting.
fetchPredictOperation POST /v1beta1/{endpoint}:fetchPredictOperation
Fetch an asynchronous online prediction operation.
generateContent POST /v1beta1/{model}:generateContent
Generate content with multimodal inputs.
getIamPolicy POST /v1beta1/{resource}:getIamPolicy
Gets the access control policy for a resource.
predict POST /v1beta1/{endpoint}:predict
Perform an online prediction.
predictLongRunning POST /v1beta1/{endpoint}:predictLongRunning
serverStreamingPredict POST /v1beta1/{endpoint}:serverStreamingPredict
Perform a server-side streaming online prediction request for Vertex LLM streaming.
setIamPolicy POST /v1beta1/{resource}:setIamPolicy
Sets the access control policy on the specified resource.
streamGenerateContent POST /v1beta1/{model}:streamGenerateContent
Generate content with multimodal inputs with streaming support.
testIamPermissions POST /v1beta1/{resource}:testIamPermissions
Returns permissions that a caller has on the specified resource.

REST Resource: v1beta1.projects.locations.endpoints.chat

Methods
completions POST /v1beta1/{endpoint}/chat/completions
Exposes an OpenAI-compatible endpoint for chat completions.

REST Resource: v1beta1.projects.locations.extensions

Methods
delete DELETE /v1beta1/{name}
Deletes an Extension.
execute POST /v1beta1/{name}:execute
Executes the request against a given extension.
get GET /v1beta1/{name}
Gets an Extension.
import POST /v1beta1/{parent}/extensions:import
Imports an Extension.
list GET /v1beta1/{parent}/extensions
Lists Extensions in a location.
patch PATCH /v1beta1/{extension.name}
Updates an Extension.
query POST /v1beta1/{name}:query
Queries an extension with a default controller.
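
A sketch of execute against an imported Extension (available in v1beta1 only). The operationId/operationParams request fields and their values are assumptions; in practice the operation ID comes from the extension's API spec.

    import os
    import requests

    PROJECT = "my-project"    # placeholder
    LOCATION = "us-central1"  # placeholder
    EXTENSION = "1234567890"  # placeholder Extension ID
    TOKEN = os.environ["ACCESS_TOKEN"]  # e.g. from `gcloud auth print-access-token`

    name = f"projects/{PROJECT}/locations/{LOCATION}/extensions/{EXTENSION}"
    url = f"https://{LOCATION}-aiplatform.googleapis.com/v1beta1/{name}:execute"

    # Assumed request shape: the operation to run (from the extension's API spec)
    # plus its parameters.
    body = {
        "operationId": "search",                          # placeholder operation ID
        "operationParams": {"query": "weather in Paris"},
    }
    resp = requests.post(url, headers={"Authorization": f"Bearer {TOKEN}"}, json=body)
    resp.raise_for_status()
    print(resp.json())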

REST Resource: v1beta1.projects.locations.models

Methods
getIamPolicy POST /v1beta1/{resource}:getIamPolicy
Gets the access control policy for a resource.
setIamPolicy POST /v1beta1/{resource}:setIamPolicy
Sets the access control policy on the specified resource.
testIamPermissions POST /v1beta1/{resource}:testIamPermissions
Returns permissions that a caller has on the specified resource.

REST Resource: v1beta1.projects.locations.operations

Methods
cancel POST /v1beta1/{name}:cancel
Starts asynchronous cancellation on a long-running operation.
delete DELETE /v1beta1/{name}
Deletes a long-running operation.
get GET /v1beta1/{name}
Gets the latest state of a long-running operation.
list GET /v1beta1/{name}/operations
Lists operations that match the specified filter in the request.
wait POST /v1beta1/{name}:wait
Waits until the specified long-running operation is done or reaches at most a specified timeout, returning the latest state.

REST Resource: v1beta1.projects.locations.publishers

Methods
getIamPolicy POST /v1beta1/{resource}:getIamPolicy
Gets the access control policy for a resource.

REST Resource: v1beta1.projects.locations.publishers.models

Methods
countTokens POST /v1beta1/{endpoint}:countTokens
Performs token counting.
fetchPredictOperation POST /v1beta1/{endpoint}:fetchPredictOperation
Fetch an asynchronous online prediction operation.
generateContent POST /v1beta1/{model}:generateContent
Generate content with multimodal inputs.
getIamPolicy POST /v1beta1/{resource}:getIamPolicy
Gets the access control policy for a resource.
predict POST /v1beta1/{endpoint}:predict
Perform an online prediction.
predictLongRunning POST /v1beta1/{endpoint}:predictLongRunning
serverStreamingPredict POST /v1beta1/{endpoint}:serverStreamingPredict
Perform a server-side streaming online prediction request for Vertex LLM streaming.
streamGenerateContent POST /v1beta1/{model}:streamGenerateContent
Generate content with multimodal inputs with streaming support.
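
countTokens is exposed in v1beta1 alongside the generation methods; a minimal sketch counting tokens for a prompt, with the model ID and prompt as placeholders:

    import os
    import requests

    PROJECT = "my-project"      # placeholder
    LOCATION = "us-central1"    # placeholder
    MODEL = "gemini-2.0-flash"  # placeholder publisher model ID
    TOKEN = os.environ["ACCESS_TOKEN"]  # e.g. from `gcloud auth print-access-token`

    model = f"projects/{PROJECT}/locations/{LOCATION}/publishers/google/models/{MODEL}"
    url = f"https://{LOCATION}-aiplatform.googleapis.com/v1beta1/{model}:countTokens"

    body = {
        "contents": [{"role": "user", "parts": [{"text": "How many tokens is this?"}]}]
    }
    resp = requests.post(url, headers={"Authorization": f"Bearer {TOKEN}"}, json=body)
    resp.raise_for_status()
    print(resp.json())  # e.g. a total token count for the request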

REST Resource: v1beta1.projects.locations.ragCorpora

Methods
create POST /v1beta1/{parent}/ragCorpora
Creates a RagCorpus.
delete DELETE /v1beta1/{name}
Deletes a RagCorpus.
get GET /v1beta1/{name}
Gets a RagCorpus.
list GET /v1beta1/{parent}/ragCorpora
Lists RagCorpora in a Location.
patch PATCH /v1beta1/{ragCorpus.name}
Updates a RagCorpus.

REST Resource: v1beta1.projects.locations.ragCorpora.ragFiles

Methods
delete DELETE /v1beta1/{name}
Deletes a RagFile.
get GET /v1beta1/{name}
Gets a RagFile.
import POST /v1beta1/{parent}/ragFiles:import
Import files from Google Cloud Storage or Google Drive into a RagCorpus.
list GET /v1beta1/{parent}/ragFiles
Lists RagFiles in a RagCorpus.

REST Resource: v1beta1.projects.locations.reasoningEngines

Methods
create POST /v1beta1/{parent}/reasoningEngines
Creates a reasoning engine.
delete DELETE /v1beta1/{name}
Deletes a reasoning engine.
get GET /v1beta1/{name}
Gets a reasoning engine.
list GET /v1beta1/{parent}/reasoningEngines
Lists reasoning engines in a location.
patch PATCH /v1beta1/{reasoningEngine.name}
Updates a reasoning engine.
query POST /v1beta1/{name}:query
Queries using a reasoning engine.
streamQuery POST /v1beta1/{name}:streamQuery
Streams queries using a reasoning engine.

REST Resource: v1beta1.projects.locations.tuningJobs

Methods
cancel POST /v1beta1/{name}:cancel
Cancels a TuningJob.
create POST /v1beta1/{parent}/tuningJobs
Creates a TuningJob.
get GET /v1beta1/{name}
Gets a TuningJob.
list GET /v1beta1/{parent}/tuningJobs
Lists TuningJobs in a Location.
rebaseTunedModel POST /v1beta1/{parent}/tuningJobs:rebaseTunedModel
Rebase a TunedModel.