Tune Translation LLM models by using supervised fine-tuning

This guide shows you how to tune a translation-llm-002 model by using supervised fine-tuning. This page covers the following topics:

The following diagram summarizes the overall workflow:

Before you begin

Before you begin, you must prepare a supervised fine-tuning dataset. Depending on your use case, there are different requirements.

Supported models

The following model supports supervised fine-tuning for translation tasks:

  • translation-llm-002 (In preview, only supports text tuning)

Create a tuning job

You can create a supervised fine-tuning job by using the REST API or the Vertex AI SDK for Python.

REST

To create a model tuning job, send a POST request by using the tuningJobs.create method. In your request, include only the parameters that apply to the model you're tuning.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • TUNING_JOB_REGION: The region where the tuning job runs. This is also the default region for where the tuned model is uploaded. Supported region: us-central1.
  • BASE_MODEL: Name of the translation model to tune. Supported values: translation-llm-002.
  • TRAINING_DATASET_URI: Cloud Storage URI of your training dataset. The dataset must be formatted as a JSONL file. For best results, provide at least 100 to 500 examples. For more information, see About supervised tuning dataset .
  • VALIDATION_DATASET_URIOptional: The Cloud Storage URI of your validation dataset file.
  • TUNED_MODEL_DISPLAYNAMEOptional: A display name for the tuned model. If not set, a random name is generated.

HTTP method and URL:

POST https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs

Request JSON body:

{
  "baseModel": "BASE_MODEL",
  "supervisedTuningSpec" : {
      "trainingDatasetUri": "TRAINING_DATASET_URI",
      "validationDatasetUri": "VALIDATION_DATASET_URI",
  },
  "tunedModelDisplayName": "TUNED_MODEL_DISPLAYNAME"
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs" | Select-Object -Expand Content

You should receive a JSON response similar to the following.

Example curl command

PROJECT_ID=myproject
LOCATION=us-central1
curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
"https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/tuningJobs" \
-d \
$'{
   "baseModel": "translation-llm-002",
   "supervisedTuningSpec" : {
      "training_dataset_uri": "gs://cloud-samples-data/ai-platform/generative_ai/gemini-2_0/text/sft_train_data.jsonl",
      "validation_dataset_uri": "gs://cloud-samples-data/ai-platform/generative_ai/gemini-2_0/text/sft_validation_data.jsonl"
   },
   "tunedModelDisplayName": "tuned_translation_llm"
}'

Python

from vertexai.generative_models import GenerativeModel

sft_tuning_job = sft.SupervisedTuningJob("projects/<PROJECT_ID>/locations/<TUNING_JOB_REGION>/tuningJobs/<TUNING_JOB_ID>")
tuned_model = GenerativeModel(sft_tuning_job.tuned_model_endpoint_name)
print(tuned_model.generate_content(content))

import time

import vertexai
from vertexai.tuning import sft

# TODO(developer): Update and un-comment below line.
# PROJECT_ID = os.environ["GOOGLE_CLOUD_PROJECT"]
vertexai.init(project=PROJECT_ID, location="us-central1")

sft_tuning_job = sft.train(
  source_model="translation-llm-002",
    train_dataset="gs://cloud-samples-data/ai-platform/generative_ai/gemini-2_0/text/sft_train_data.jsonl",
    # The following parameters are optional
    validation_dataset="gs://cloud-samples-data/ai-platform/generative_ai/gemini-2_0/text/sft_validation_data.jsonl",
    tuned_model_display_name="tuned_translation_llm_002",
)

# Polling for job completion
while not sft_tuning_job.has_ended:
  time.sleep(60)
  sft_tuning_job.refresh()

print(sft_tuning_job.tuned_model_name)
print(sft_tuning_job.tuned_model_endpoint_name)
print(sft_tuning_job.experiment)
# Example response:
# projects/123456789012/locations/us-central1/models/1234567890@1
# projects/123456789012/locations/us-central1/endpoints/123456789012345
# <google.cloud.aiplatform.metadata.experiment_resources.Experiment object at 0x7b5b4ae07af0>

View a list of tuning jobs

You can view a list of tuning jobs in your current project by using the Google Cloud console, the REST API, or the Vertex AI SDK for Python.

Console

To view your tuning jobs in the Google Cloud console, go to the Vertex AI Studio page.

Go to Vertex AI Studio

Your Translation LLM tuning jobs are listed in the Translation LLM tuned models table.

REST

To view a list of model tuning jobs, send a GET request by using the tuningJobs.list method.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • TUNING_JOB_REGION: The region where the tuning job runs. This is also the default region for where the tuned model is uploaded.

HTTP method and URL:

GET https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs" | Select-Object -Expand Content

You should receive a JSON response similar to the following.

Python

import vertexai
from vertexai.tuning import sft

# TODO(developer): Update and un-comment below line
# PROJECT_ID = "your-project-id"
vertexai.init(project=PROJECT_ID, location="us-central1")

responses = sft.SupervisedTuningJob.list()

for response in responses:
    print(response)
# Example response:
# <vertexai.tuning._supervised_tuning.SupervisedTuningJob object at 0x7c85287b2680>
# resource name: projects/12345678/locations/us-central1/tuningJobs/123456789012345

Get details of a tuning job

You can get the details of a specific tuning job by using the Google Cloud console, the REST API, or the Vertex AI SDK for Python.

Console

  1. In the Google Cloud console, go to the Vertex AI Studio page.

    Go to Vertex AI Studio

  2. In the Translation LLM tuned models table, click Details for the model that you want to view.

REST

To get details of a model tuning job, send a GET request by using the tuningJobs.get method and specify the TuningJob_ID.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: .
  • TUNING_JOB_REGION: The region where the tuning job runs. This is also the default region for where the tuned model is uploaded.
  • TUNING_JOB_ID: The ID of the tuning job.

HTTP method and URL:

GET https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs/TUNING_JOB_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs/TUNING_JOB_ID"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs/TUNING_JOB_ID" | Select-Object -Expand Content

You should receive a JSON response similar to the following.

Python

import vertexai
from vertexai.tuning import sft

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# LOCATION = "us-central1"
vertexai.init(project=PROJECT_ID, location=LOCATION)

tuning_job_id = "4982013113894174720"
response = sft.SupervisedTuningJob(
    f"projects/{PROJECT_ID}/locations/{LOCATION}/tuningJobs/{tuning_job_id}"
)

print(response)
# Example response:
# <vertexai.tuning._supervised_tuning.SupervisedTuningJob object at 0x7cc4bb20baf0>
# resource name: projects/1234567890/locations/us-central1/tuningJobs/4982013113894174720

Cancel a tuning job

You can cancel a running tuning job by using the Google Cloud console, the REST API, or the Vertex AI SDK for Python.

Console

  1. In the Google Cloud console, go to the Vertex AI Studio page.

    Go to Vertex AI Studio

  2. In the Translation LLM tuned models table, find the job that you want to cancel and click Manage run.

  3. Click Cancel.

REST

To cancel a model tuning job, send a POST request by using the tuningJobs.cancel method and specify the TuningJob_ID.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: .
  • TUNING_JOB_REGION: The region where the tuning job runs. This is also the default region for where the tuned model is uploaded.
  • TUNING_JOB_ID: The ID of the tuning job.

HTTP method and URL:

POST https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs/TUNING_JOB_ID:cancel

To send your request, choose one of these options:

curl

Execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d "" \
"https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs/TUNING_JOB_ID:cancel"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-Uri "https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/tuningJobs/TUNING_JOB_ID:cancel" | Select-Object -Expand Content

You should receive a JSON response similar to the following.

Python

import vertexai
from vertexai.tuning import sft

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# LOCATION = "us-central1"
vertexai.init(project=PROJECT_ID, location=LOCATION)

tuning_job_id = "4982013113894174720"
job = sft.SupervisedTuningJob(
    f"projects/{PROJECT_ID}/locations/{LOCATION}/tuningJobs/{tuning_job_id}"
)
job.cancel()

Test the tuned model with a prompt

After your tuning job is complete, you can test the tuned model by using the REST API or the Vertex AI SDK for Python. The following example prompts a model with the question "Why is the sky blue?".

REST

To test your tuned model, send a POST request to the predict method. In your request, specify the TUNED_ENDPOINT_ID of your model.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: .
  • TUNING_JOB_REGION: The region where the tuning job runs. This is also the default region for where the tuned model is uploaded.
  • ENDPOINT_ID: The tuned model endpoint ID from the GET API.

HTTP method and URL:

POST https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/endpoints/ENDPOINT_ID:generateContent

Request JSON body:

{
    "contents": [
        {
            "role": "USER",
            "parts": {
                "text" : "English: Hello. Spanish:"
            }
        }
    ],
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/endpoints/ENDPOINT_ID:generateContent"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://TUNING_JOB_REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/TUNING_JOB_REGION/endpoints/ENDPOINT_ID:generateContent" | Select-Object -Expand Content

You should receive a JSON response similar to the following.

Python

from vertexai.generative_models import GenerativeModel

sft_tuning_job = sft.SupervisedTuningJob("projects/<PROJECT_ID>/locations/<TUNING_JOB_REGION>/tuningJobs/<TUNING_JOB_ID>")
tuned_model = GenerativeModel(sft_tuning_job.tuned_model_endpoint_name)
print(tuned_model.generate_content(content))

Tuning and validation metrics

You can configure a model tuning job to collect and report tuning and evaluation metrics, which you can then visualize in Vertex AI Studio.

To view the metrics visualizations in the Google Cloud console:

  1. Go to the Vertex AI Studio page.

    Go to Vertex AI Studio

  2. In the Translation LLM tuned models table, click the name of the tuned model to view its metrics.

You can find the tuning metrics on the Monitor tab. Visualizations for these metrics are available after the tuning job starts and update in real time. If you don't provide a validation dataset when you create the tuning job, only the tuning metrics are displayed.

Tuning metrics

The model tuning job automatically collects the following tuning metrics for translation-llm-002:

  • /train_total_loss: Loss for the tuning dataset at a training step.
  • /train_fraction_of_correct_next_step_preds: The token accuracy at a training step. A single prediction consists of a sequence of tokens. This metric measures the accuracy of the predicted tokens when compared to the ground truth in the tuning dataset.
  • /train_num_predictions: Number of predicted tokens at a training step.

Validation metrics

You can configure a model tuning job to collect the following validation metrics for translation-llm-002:

  • /eval_total_loss: Loss for the validation dataset at a validation step.
  • /eval_fraction_of_correct_next_step_preds: The token accuracy at a validation step. A single prediction consists of a sequence of tokens. This metric measures the accuracy of the predicted tokens when compared to the ground truth in the validation dataset.
  • /eval_num_predictions: Number of predicted tokens at a validation step.

What's next