Codey for Code Completion (code-gecko) is the name of the model that supports code completion. It's a
foundation model that generates code based on code being written. Codey for Code Completion
completes code that was recently typed by a user. Codey for Code Completion is supported
by the code generation API. Codey APIs are in the PaLM API family.
To learn more about creating prompts for code completion, see Create prompts for code completion.
To explore this model in the console, see the Codey for Code Completion model card in the Model Garden.
Use cases
Some common use cases for code completion are:
- Write code faster: Use the code-gecko model to write code faster by taking advantage of code suggested for you.
- Minimize bugs in code: Use code suggestions that you know are syntactically correct to avoid errors. Code completion helps you minimize the risk of accidentally introducing bugs that can occur when you write code quickly.
HTTP request
POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/code-gecko:predict
Model versions
To use the latest model version, specify the model name without a version number, for example code-gecko.
To use a stable model version, specify the model version number, for example code-gecko@002. Each
stable version is available for six months after the release date of the
subsequent stable version.
The following table contains the available stable model versions:
code-gecko model | Release date | Discontinuation date |
---|---|---|
code-gecko@002 | December 6, 2023 | April 9, 2025 |
For more information, see Model versions and lifecycle.
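As a minimal sketch of the versioning rule above, the following Python helper builds the predict URL for either the latest model or a pinned stable version. The function name `predict_url` and the project ID `my-project` are hypothetical, and the host and location are assumed to be the us-central1 values used in the sample request on this page.

```python
from typing import Optional

# Assumed values, taken from the sample request on this page.
API_HOST = "us-central1-aiplatform.googleapis.com"
LOCATION = "us-central1"


def predict_url(project_id: str, version: Optional[str] = None) -> str:
    """Return the :predict URL for code-gecko.

    With no version, the model name is used as-is (latest version).
    With a version such as "002", the name becomes code-gecko@002.
    """
    model = "code-gecko" if version is None else f"code-gecko@{version}"
    return (
        f"https://{API_HOST}/v1/projects/{project_id}"
        f"/locations/{LOCATION}/publishers/google/models/{model}:predict"
    )


# "my-project" is a placeholder project ID.
print(predict_url("my-project"))
print(predict_url("my-project", "002"))
```

Pinning a version keeps behavior stable until that version's discontinuation date, while the unversioned name follows whatever the latest version is.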
Request body
{
  "instances": [
    {
      "prefix": string,
      "suffix": string
    }
  ],
  "parameters": {
    "temperature": number,
    "maxOutputTokens": integer,
    "candidateCount": integer,
    "stopSequences": [ string ],
    "logprobs": integer,
    "presencePenalty": float,
    "frequencyPenalty": float,
    "echo": boolean,
    "seed": integer
  }
}
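The schema above could be assembled in Python along these lines. This is a sketch rather than an official client: `build_request_body` is a hypothetical helper, and always sending a `parameters` object (even an empty one) and rejecting an empty prefix are assumptions, not documented requirements.

```python
import json
from typing import Any, Dict, Optional

# Parameter names copied from the request body schema in this section.
KNOWN_PARAMETERS = {
    "temperature", "maxOutputTokens", "candidateCount", "stopSequences",
    "logprobs", "presencePenalty", "frequencyPenalty", "echo", "seed",
}


def build_request_body(
    prefix: str, suffix: Optional[str] = None, **parameters: Any
) -> Dict[str, Any]:
    """Build a request body; parameters passed as None are left out."""
    if not prefix:
        raise ValueError("prefix is required")
    unknown = set(parameters) - KNOWN_PARAMETERS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    instance: Dict[str, Any] = {"prefix": prefix}
    if suffix is not None:
        instance["suffix"] = suffix

    set_parameters = {
        name: value for name, value in parameters.items() if value is not None
    }
    return {"instances": [instance], "parameters": set_parameters}


# Illustrative values only.
body = build_request_body(
    "def reverse_string(s):", suffix="", temperature=0.2, maxOutputTokens=64
)
print(json.dumps(body, indent=2))
```

Rejecting unknown names locally catches typos such as `max_output_tokens` before the request is sent.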
The following are the parameters for the code completion model named code-gecko. The code-gecko
model is one of the Codey models. You can use
these parameters to help optimize your code completion prompt. For more
information, see Code models overview and Create prompts for code completion.
Parameter | Description | Acceptable values |
---|---|---|
prefix (required) | For code models, prefix represents the beginning of a piece of meaningful programming code or a natural language prompt that describes code to be generated. The model attempts to fill in the code in between the prefix and suffix. | A valid text string |
suffix (optional) | For code completion, suffix represents the end of a piece of meaningful programming code. The model attempts to fill in the code in between the prefix and suffix. | A valid text string |
temperature | The temperature is used for sampling during response generation. Temperature controls the degree of randomness in token selection. Lower temperatures are good for prompts that require a less open-ended or creative response, while higher temperatures can lead to more diverse or creative results. A temperature of 0 means that the highest probability tokens are always selected. In this case, responses for a given prompt are mostly deterministic, but a small amount of variation is still possible. | |
maxOutputTokens | Maximum number of tokens that can be generated in the response. A token is approximately four characters. 100 tokens correspond to roughly 60-80 words. Specify a lower value for shorter responses and a higher value for potentially longer responses. | |
candidateCount (optional) | The number of response variations to return. For each request, you're charged for the output tokens of all candidates, but are only charged once for the input tokens. Specifying multiple candidates is a Preview feature. | |
stopSequences (optional) | Specifies a list of strings that tells the model to stop generating text if one of the strings is encountered in the response. If a string appears multiple times in the response, then the response truncates where it's first encountered. The strings are case-sensitive. For example, if the returned response when stopSequences isn't specified is public static string reverse(string myString), then the returned response with stopSequences set to ["Str", "reverse"] is public static string. | A list of strings |
logprobs (optional) | Returns the log probabilities of the top candidate tokens at each generation step. The model's chosen token and its log probability are always returned at each step, even if that token does not appear in the list of top candidates. Specify the number of candidates to return by using an integer value in the range of 1-5. | |
frequencyPenalty (optional) | Positive values penalize tokens that repeatedly appear in the generated text, decreasing the probability of repeating content. Acceptable values are -2.0 to 2.0. | |
presencePenalty (optional) | Positive values penalize tokens that already appear in the generated text, increasing the probability of generating more diverse content. Acceptable values are -2.0 to 2.0. | |
echo (optional) | If true, the prompt is echoed in the generated text. | |
seed | When seed is fixed to a specific value, the model makes a best effort to provide the same response for repeated requests. Deterministic output isn't guaranteed. Also, changing the model or parameter settings, such as the temperature, can cause variations in the response even when you use the same seed value. By default, a random seed value is used. This is a preview feature. | |
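The stopSequences behavior described above can be reproduced locally to see where a response would be cut. The following is only a client-side simulation of the documented rule (case-sensitive match, truncate at the first occurrence); `truncate_at_stop` is a hypothetical helper, not how the service itself is implemented.

```python
from typing import Sequence


def truncate_at_stop(text: str, stop_sequences: Sequence[str]) -> str:
    """Cut text at the earliest case-sensitive match of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue  # Ignore empty strings, which would match at index 0.
        index = text.find(stop)  # str.find is case-sensitive.
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


full = "public static string reverse(string myString)"
# "Str" only matches inside "myString"; "reverse" matches earlier, so it wins.
print(repr(truncate_at_stop(full, ["Str", "reverse"])))
```

Note that "Str" does not match the lowercase "string", which is why the text before "reverse" survives intact. The local result keeps the trailing space that precedes "reverse".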
Sample request
REST
To test a text prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID. For other fields, see the Request body table.
HTTP method and URL:
POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/code-gecko:predict
Request JSON body:
{
  "instances": [
    { "prefix": "PREFIX", "suffix": "SUFFIX" }
  ],
  "parameters": {
    "temperature": TEMPERATURE,
    "maxOutputTokens": MAX_OUTPUT_TOKENS,
    "candidateCount": CANDIDATE_COUNT
  }
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/code-gecko:predict"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/code-gecko:predict" | Select-Object -Expand Content
You should receive a JSON response similar to the sample response.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
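As an alternative to the SDK, the REST request shown earlier can be sent from Python with only the standard library. The sketch below mirrors the curl sample. The helper names (`get_access_token`, `build_predict_request`, `predict`) are hypothetical, and it assumes the gcloud CLI is installed and authenticated.

```python
import json
import subprocess
import urllib.request

# Same endpoint as the curl sample; {project_id} is filled in at call time.
URL_TEMPLATE = (
    "https://us-central1-aiplatform.googleapis.com/v1/projects/{project_id}"
    "/locations/us-central1/publishers/google/models/code-gecko:predict"
)


def get_access_token() -> str:
    """Fetch a token the same way the curl sample does (requires gcloud)."""
    result = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def build_predict_request(
    project_id: str, body: dict, token: str
) -> urllib.request.Request:
    """Build the POST request without sending it."""
    return urllib.request.Request(
        URL_TEMPLATE.format(project_id=project_id),
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


def predict(project_id: str, body: dict) -> dict:
    """Send the request and return the parsed JSON response."""
    request = build_predict_request(project_id, body, get_access_token())
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call such as `predict("PROJECT_ID", {"instances": [{"prefix": "def reverse_string(s):"}], "parameters": {"maxOutputTokens": 64}})` would then return a dictionary shaped like the response body. Keeping request construction separate from sending makes the URL and headers easy to inspect without network access.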
Node.js
Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Response body
{
  "predictions": [
    {
      "content": string,
      "citationMetadata": {
        "citations": [
          {
            "startIndex": integer,
            "endIndex": integer,
            "url": string,
            "title": string,
            "license": string,
            "publicationDate": string
          }
        ]
      },
      "logprobs": {
        "tokenLogProbs": [ float ],
        "tokens": [ string ],
        "topLogProbs": [ { map<string, float> } ]
      },
      "safetyAttributes": {
        "categories": [ string ],
        "blocked": boolean,
        "scores": [ float ],
        "errors": [ int ]
      },
      "score": float
    }
  ]
}
Response element | Description |
---|---|
blocked | A boolean flag associated with a safety attribute that indicates if the model's input or output was blocked. If blocked is true, then the errors field in the response contains one or more error codes. If blocked is false, then the response doesn't include the errors field. |
categories | A list of the safety attribute category names that are associated with the generated content. The order of the scores in the scores parameter matches the order of the categories. For example, the first score in the scores parameter indicates the likelihood that the response violates the first category in the categories list. |
citationMetadata | An element that contains an array of citations. |
citations | An array of citations. Each citation contains its metadata. |
content | The result generated by the model using the input text. |
endIndex | An integer that specifies where a citation ends in the content. |
errors | An array of error codes. The errors response field is included in the response only when the blocked field in the response is true. For information about understanding error codes, see Safety errors. |
license | The license associated with a citation. |
publicationDate | The date a citation was published. Its valid formats are YYYY, YYYY-MM, and YYYY-MM-DD. |
score | A float value that's less than zero. The higher the value for score, the greater confidence the model has in its response. |
startIndex | An integer that specifies where a citation starts in the content. |
title | The title of a citation source. Examples of source titles might be that of a news article or a book. |
url | The URL of a citation source. Examples of a URL source might be a news website or a GitHub repository. |
tokens | The sampled tokens. |
tokenLogProbs | The sampled tokens' log probabilities. |
topLogProbs | The most likely candidate tokens and their log probabilities at each step. |
logprobs | Results of the `logprobs` parameter. Maps one-to-one to `candidates`. |
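To make the tokens and tokenLogProbs elements more concrete, the sketch below pairs each sampled token with its log probability and converts it to a plain probability. It assumes the values are natural logarithms, which this page does not state. `token_probabilities` is a hypothetical helper and the sample values are made up for illustration.

```python
import math
from typing import Dict, List, Tuple


def token_probabilities(logprobs: Dict) -> List[Tuple[str, float]]:
    """Pair each sampled token with exp(log probability).

    Assumes natural-log values; the two lists are matched by position.
    """
    tokens = logprobs.get("tokens", [])
    token_logprobs = logprobs.get("tokenLogProbs", [])
    return [(token, math.exp(lp)) for token, lp in zip(tokens, token_logprobs)]


# Made-up values shaped like the logprobs element of a prediction.
example = {
    "tokens": [" return", " s", "[::-1]"],
    "tokenLogProbs": [-0.05, -0.4, -1.2],
}
for token, probability in token_probabilities(example):
    print(f"{token!r}: {probability:.3f}")
```

Low probabilities can flag tokens where the model was less certain, which may be useful when deciding whether to surface a suggestion.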
Sample response
{
  "predictions": [
    {
      "safetyAttributes": {
        "blocked": false,
        "categories": [],
        "scores": []
      },
      "content": " reverses a string",
      "citationMetadata": {
        "citations": []
      },
      "score": -1.1161688566207886
    }
  ]
}
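A response like this one might be handled in Python as sketched below: skip predictions whose safetyAttributes mark them as blocked, then keep the one with the highest score, since a higher score indicates greater model confidence. The helper `best_completion` is hypothetical, and the embedded sample places score inside the prediction object, as the Response body schema does.

```python
import json
from typing import Optional


def best_completion(response: dict) -> Optional[str]:
    """Return the content of the highest-scoring unblocked prediction."""
    candidates = [
        prediction
        for prediction in response.get("predictions", [])
        if not prediction.get("safetyAttributes", {}).get("blocked", False)
    ]
    if not candidates:
        return None
    # Scores are below zero; the one closest to zero is the most confident.
    best = max(candidates, key=lambda p: p.get("score", float("-inf")))
    return best.get("content")


sample = json.loads("""
{
  "predictions": [
    {
      "safetyAttributes": {"blocked": false, "categories": [], "scores": []},
      "content": " reverses a string",
      "citationMetadata": {"citations": []},
      "score": -1.1161688566207886
    }
  ]
}
""")
print(repr(best_completion(sample)))
```

Returning None when every prediction is blocked lets the caller decide whether to retry or show nothing.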
Stream response from Generative AI models
The parameters are the same for streaming and non-streaming requests to the APIs.
To view sample code requests and responses using the REST API, see Examples using the streaming REST API.
To view sample code requests and responses using the Vertex AI SDK for Python, see Examples using Vertex AI SDK for Python for streaming.