Code completion

Codey for Code Completion (code-gecko) is the foundation model that supports code completion. It generates code based on the code being written, completing what a user recently typed. Codey for Code Completion is supported by the code generation API. Codey APIs are in the PaLM API family.

To learn more about creating prompts for code completion, see Create prompts for code completion.

To explore this model in the console, see the Codey for Code Completion model card in the Model Garden.

Use cases

Some common use cases for code completion are:

  • Write code faster: Use the code-gecko model to write code faster by taking advantage of code suggested for you.

  • Minimize bugs in code: Use code suggestions that you know are syntactically correct to avoid errors. Code completion helps you minimize the risk of accidentally introducing bugs that can occur when you write code quickly.

HTTP request

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/code-gecko:predict

Model versions

To use the latest model version, specify the model name without a version number, for example code-gecko.

To use a stable model version, specify the model version number, for example code-gecko@001. Each stable version is available for six months after the release date of the subsequent stable version.

The following table contains the available stable model versions:

code-gecko model    Release date        Discontinuation date
code-gecko@002      December 6, 2023    October 9, 2024
code-gecko@001      June 29, 2023       July 6, 2024

For more information, see Model versions and lifecycle.
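
As a minimal sketch of how version pinning changes the request URL, the following Python helper composes the predict URL from the pieces shown in this section. The function name predict_url and its default arguments are hypothetical and not part of any SDK.

```python
from typing import Optional


def predict_url(
    project_id: str,
    version: Optional[str] = None,
    model: str = "code-gecko",
    location: str = "us-central1",
) -> str:
    """Build the publisher-model predict URL (hypothetical helper).

    With no version, the URL targets the latest model version.
    With a version such as "002", it targets that stable version.
    """
    model_id = f"{model}@{version}" if version else model
    return (
        f"https://{location}-aiplatform.googleapis.com/v1"
        f"/projects/{project_id}/locations/{location}"
        f"/publishers/google/models/{model_id}:predict"
    )


print(predict_url("PROJECT_ID"))          # latest version
print(predict_url("PROJECT_ID", "002"))   # pinned stable version
```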

Request body

{
  "instances":[
    {
      "prefix": string,
      "suffix": string
    }
  ],
  "parameters": {
    "temperature": number,
    "maxOutputTokens": integer,
    "candidateCount": integer,
    "stopSequences": [ string ],
    "logprobs": integer,
    "presencePenalty": float,
    "frequencyPenalty": float,
    "echo": boolean,
    "seed": integer
  }
}
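
The schema above can be assembled programmatically. The following Python sketch builds a request body and rejects parameter values outside the ranges documented in this section. The helper name build_request_body and the _RANGES table are illustrative assumptions; the service performs its own validation, which may differ.

```python
import json
from typing import Any, Dict, Optional

# Acceptable ranges as listed in the parameter table of this section.
_RANGES = {
    "temperature": (0.0, 1.0),
    "maxOutputTokens": (1, 64),
    "candidateCount": (1, 4),
    "logprobs": (0, 5),
    "presencePenalty": (-2.0, 2.0),
    "frequencyPenalty": (-2.0, 2.0),
}


def build_request_body(
    prefix: str, suffix: Optional[str] = None, **parameters: Any
) -> Dict[str, Any]:
    """Return a dict shaped like the request body (hypothetical helper)."""
    for name, value in parameters.items():
        if name in _RANGES:
            low, high = _RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside {low}..{high}")
    instance = {"prefix": prefix}
    if suffix is not None:
        # suffix is optional, so include it only when provided.
        instance["suffix"] = suffix
    return {"instances": [instance], "parameters": parameters}


body = build_request_body(
    "def reverse_string(s):", temperature=0.2, maxOutputTokens=64
)
print(json.dumps(body, indent=2))
```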

The following are the parameters for the code completion model named code-gecko. The code-gecko model is one of the Codey models. You can use these parameters to help optimize your code completion prompt. For more information, see Code models overview and Create prompts for code completion.

Parameter Description Acceptable values

prefix

(required)

For code models, prefix represents the beginning of a piece of meaningful programming code or a natural language prompt that describes code to be generated. The model attempts to fill in the code in between the prefix and suffix. A valid text string

suffix

(optional)

For code completion, suffix represents the end of a piece of meaningful programming code. The model attempts to fill in the code in between the prefix and suffix. A valid text string

temperature

The temperature is used for sampling during response generation. Temperature controls the degree of randomness in token selection. Lower temperatures are good for prompts that require a less open-ended or creative response, while higher temperatures can lead to more diverse or creative results. A temperature of 0 means that the highest probability tokens are always selected. In this case, responses for a given prompt are mostly deterministic, but a small amount of variation is still possible.

0.0–1.0

Default: 0.2

maxOutputTokens

Maximum number of tokens that can be generated in the response. A token is approximately four characters. 100 tokens correspond to roughly 60-80 words.

Specify a lower value for shorter responses and a higher value for potentially longer responses.

1-64

Default: 64

candidateCount

(optional)

The number of response variations to return.

1-4

Default: 1

stopSequences

(optional)

Specifies a list of strings that tells the model to stop generating text if one of the strings is encountered in the response. If a string appears multiple times in the response, then the response truncates where it's first encountered. The strings are case-sensitive.

For example, if the following is the returned response when stopSequences isn't specified:

public static string reverse(string myString)

Then the returned response with stopSequences set to ["Str", "reverse"] is:

public static string
A list of strings

logprobs

(optional)

Returns the top logprobs most likely candidate tokens with their log probabilities at each generation step. The chosen tokens and their log probabilities at each step are always returned. The chosen token may or may not be in the top logprobs most likely candidates.

0-5

frequencyPenalty

(optional)

Positive values penalize tokens that repeatedly appear in the generated text, decreasing the probability of repeating content. Acceptable values are -2.0 to 2.0.

Minimum value: -2.0 Maximum value: 2.0

presencePenalty

(optional)

Positive values penalize tokens that already appear in the generated text, increasing the probability of generating more diverse content. Acceptable values are -2.0 to 2.0.

Minimum value: -2.0 Maximum value: 2.0

echo

(optional)

If true, the prompt is echoed in the generated text.

Optional

seed

The decoder generates random noise with a pseudo-random number generator (PRNG) and adds temperature * noise to the logits before sampling. The PRNG takes a seed as input and generates the same output for the same seed.

If seed isn't set, the seed used by the decoder isn't deterministic, so the generated random noise isn't deterministic either. If seed is set, the generated random noise is deterministic.

Optional
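
The interaction between seed and temperature described above can be illustrated with a toy sampler. This is a sketch only, not the service's decoder: the function sample_token is hypothetical, and the choice of Gumbel noise is an assumption, because this page doesn't state which noise distribution is used.

```python
import math
import random
from typing import List, Optional


def sample_token(
    logits: List[float], temperature: float, seed: Optional[int] = None
) -> int:
    """Return the index of the token picked after adding scaled noise."""
    # A fixed seed makes the PRNG, and therefore the noise, reproducible.
    # With seed=None, Python seeds the PRNG from a non-deterministic source.
    rng = random.Random(seed)
    noisy = []
    for logit in logits:
        # Assumed noise distribution: Gumbel(0, 1). Clamp u to avoid log(0).
        u = max(rng.random(), 1e-12)
        noise = -math.log(-math.log(u))
        noisy.append(logit + temperature * noise)
    # Pick the token with the highest noisy logit.
    return max(range(len(noisy)), key=noisy.__getitem__)


logits = [1.0, 3.0, 2.0]
first = sample_token(logits, temperature=0.8, seed=42)
second = sample_token(logits, temperature=0.8, seed=42)
print(first == second)  # the same seed gives the same pick
```

In this sketch, a temperature of 0 removes the noise term entirely, so the highest-logit token is always chosen regardless of the seed.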

Sample request

REST

To test a code completion prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • For other fields, see the Request body table.

    HTTP method and URL:

    POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/code-gecko:predict

    Request JSON body:

    {
      "instances": [
        { "prefix": "PREFIX",
          "suffix": "SUFFIX"}
      ],
      "parameters": {
        "temperature": TEMPERATURE,
        "maxOutputTokens": MAX_OUTPUT_TOKENS,
        "candidateCount": CANDIDATE_COUNT
      }
    }
    

    To send your request, choose one of these options:

    curl

    Save the request body in a file named request.json, and execute the following command:

    curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d @request.json \
    "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/code-gecko:predict"

    PowerShell

    Save the request body in a file named request.json, and execute the following command:

    $cred = gcloud auth print-access-token
    $headers = @{ "Authorization" = "Bearer $cred" }

    Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/code-gecko:predict" | Select-Object -Expand Content

    You should receive a JSON response similar to the sample response.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.

from vertexai.language_models import CodeGenerationModel

# TODO developer - override these parameters as needed:
parameters = {
    "temperature": 0.2,  # Temperature controls the degree of randomness in token selection.
    "max_output_tokens": 64,  # Token limit determines the maximum amount of text output.
}

code_completion_model = CodeGenerationModel.from_pretrained("code-gecko@001")
response = code_completion_model.predict(
    prefix="def reverse_string(s):", **parameters
)

print(f"Response from Model: {response.text}")

Node.js

Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

/**
 * TODO(developer): Uncomment these variables before running the sample.
 * (Not necessary if passing values as arguments)
 */
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';
const aiplatform = require('@google-cloud/aiplatform');

// Imports the Google Cloud Prediction service client
const {PredictionServiceClient} = aiplatform.v1;

// Import the helper module for converting arbitrary protobuf.Value objects.
const {helpers} = aiplatform;

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};
const publisher = 'google';
const model = 'code-gecko@001';

// Instantiates a client
const predictionServiceClient = new PredictionServiceClient(clientOptions);

async function callPredict() {
  // Configure the parent resource
  const endpoint = `projects/${project}/locations/${location}/publishers/${publisher}/models/${model}`;

  const prompt = {
    // Use explicit newlines so the Python snippet keeps its line structure.
    prefix:
      'def reverse_string(s):\n' +
      '  return s[::-1]\n' +
      '#This function',
  };
  const instanceValue = helpers.toValue(prompt);
  const instances = [instanceValue];

  const parameter = {
    temperature: 0.2,
    maxOutputTokens: 64,
  };
  const parameters = helpers.toValue(parameter);

  const request = {
    endpoint,
    instances,
    parameters,
  };

  // Predict request
  const [response] = await predictionServiceClient.predict(request);
  console.log('Get code completion response');
  const predictions = response.predictions;
  console.log('\tPredictions :');
  for (const prediction of predictions) {
    console.log(`\t\tPrediction : ${JSON.stringify(prediction)}`);
  }
}

callPredict();

Java

Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.


import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.cloud.aiplatform.v1.PredictionServiceClient;
import com.google.cloud.aiplatform.v1.PredictionServiceSettings;
import com.google.protobuf.InvalidProtocolBufferException;
import com.google.protobuf.Value;
import com.google.protobuf.util.JsonFormat;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class PredictCodeCompletionCommentSample {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace this variable before running the sample.
    String project = "YOUR_PROJECT_ID";

    // Learn how to create prompts to work with a code model to create code completion suggestions:
    // https://cloud.google.com/vertex-ai/docs/generative-ai/code/code-completion-prompts
    String instance =
        "{ \"prefix\": \""
            + "def reverse_string(s):\n"
            + "  return s[::-1]\n"
            + "#This function"
            + "\"}";
    String parameters = "{\n" + "  \"temperature\": 0.2,\n" + "  \"maxOutputTokens\": 64\n" + "}";
    String location = "us-central1";
    String publisher = "google";
    String model = "code-gecko@001";

    predictComment(instance, parameters, project, location, publisher, model);
  }

  // Use Codey for Code Completion to complete a code comment
  public static void predictComment(
      String instance,
      String parameters,
      String project,
      String location,
      String publisher,
      String model)
      throws IOException {
    final String endpoint = String.format("%s-aiplatform.googleapis.com:443", location);
    PredictionServiceSettings predictionServiceSettings =
        PredictionServiceSettings.newBuilder().setEndpoint(endpoint).build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (PredictionServiceClient predictionServiceClient =
        PredictionServiceClient.create(predictionServiceSettings)) {
      final EndpointName endpointName =
          EndpointName.ofProjectLocationPublisherModelName(project, location, publisher, model);

      Value instanceValue = stringToValue(instance);
      List<Value> instances = new ArrayList<>();
      instances.add(instanceValue);

      Value parameterValue = stringToValue(parameters);

      PredictResponse predictResponse =
          predictionServiceClient.predict(endpointName, instances, parameterValue);
      System.out.println("Predict Response");
      System.out.println(predictResponse);
    }
  }

  // Convert a Json string to a protobuf.Value
  static Value stringToValue(String value) throws InvalidProtocolBufferException {
    Value.Builder builder = Value.newBuilder();
    JsonFormat.parser().merge(value, builder);
    return builder.build();
  }
}

Response body

{
  "predictions": [
    {
      "content": string,
      "citationMetadata": {
        "citations": [
          {
            "startIndex": integer,
            "endIndex": integer,
            "url": string,
            "title": string,
            "license": string,
            "publicationDate": string
          }
        ]
      },
      "logprobs": {
        "tokenLogProbs": [ float ],
        "tokens": [ string ],
        "topLogProbs": [ { map<string, float> } ]
      },
      "safetyAttributes":{
        "categories": [ string ],
        "blocked": boolean,
        "scores": [ float ],
        "errors": [ int ]
      },
      "score": float
    }
  ]
}
Response element Description
blocked A boolean flag associated with a safety attribute that indicates if the model's input or output was blocked. If blocked is true, then the errors field in the response contains one or more error codes. If blocked is false, then the response doesn't include the errors field.
categories A list of the safety attribute category names that are associated with the generated content. The order of the scores in the scores parameter matches the order of the categories. For example, the first score in the scores parameter indicates the likelihood that the response violates the first category in the categories list.
citationMetadata An element that contains an array of citations.
citations An array of citations. Each citation contains its metadata.
content The result generated by the model using the input text.
endIndex An integer that specifies where a citation ends in the content.
errors An array of error codes. The errors response field is included in the response only when the blocked field in the response is true. For information about understanding error codes, see Safety errors.
license The license associated with a citation.
publicationDate The date a citation was published. Its valid formats are YYYY, YYYY-MM, and YYYY-MM-DD.
score A float value that's less than zero. The higher the value for score, the greater confidence the model has in its response.
startIndex An integer that specifies where a citation starts in the content.
title The title of a citation source. Examples of source titles might be that of a news article or a book.
url The URL of a citation source. Examples of a URL source might be a news website or a GitHub repository.
tokens The sampled tokens.
tokenLogProbs The sampled tokens' log probabilities.
topLogProbs The most likely candidate tokens and their log probabilities at each step.
logprobs The results of the logprobs parameter, with a one-to-one mapping to candidates.

Sample response

{
  "predictions": [
    {
      "safetyAttributes": {
        "blocked": false,
        "categories": [],
        "scores": []
      },
      "content": " reverses a string",
      "citationMetadata": {
        "citations": []
      },
      "score": -1.1161688566207886
    }
  ]
}
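
A response of this shape can be consumed as in the following Python sketch. It skips predictions whose safetyAttributes.blocked flag is true and returns the content with the highest score, since a higher score indicates greater model confidence. The helper name best_completion is hypothetical, and the second prediction in the sample data is invented for illustration.

```python
from typing import Any, Dict, Optional


def best_completion(response: Dict[str, Any]) -> Optional[str]:
    """Return the content of the highest-scoring unblocked prediction."""
    candidates = [
        p
        for p in response.get("predictions", [])
        if not p.get("safetyAttributes", {}).get("blocked", False)
    ]
    if not candidates:
        return None
    # Scores are negative; the value closest to zero is the most confident.
    best = max(candidates, key=lambda p: p.get("score", float("-inf")))
    return best["content"]


response = {
    "predictions": [
        {
            "safetyAttributes": {"blocked": False, "categories": [], "scores": []},
            "content": " reverses a string",
            "citationMetadata": {"citations": []},
            "score": -1.1161688566207886,
        },
        {
            # Invented second candidate, as if candidateCount were 2.
            "safetyAttributes": {"blocked": False, "categories": [], "scores": []},
            "content": " returns s",
            "score": -2.5,
        },
    ]
}
print(repr(best_completion(response)))
```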

Stream response from Generative AI models

The parameters are the same for streaming and non-streaming requests to the APIs.

To view sample code requests and responses using the REST API, see Examples using the streaming REST API.

To view sample code requests and responses using the Vertex AI SDK for Python, see Examples using Vertex AI SDK for Python for streaming.