Tune open models

This page describes how to perform supervised fine-tuning on open models such as Llama 3.1.

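Supported tuning methods

Full fine-tuning
Low-Rank Adaptation (LoRA): LoRA is a parameter-efficient tuning method that adjusts only a subset of parameters. It is more cost-efficient and requires less training data than full fine-tuning. Full fine-tuning, on the other hand, adjusts all parameters and therefore has higher quality potential.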
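To give a feel for why LoRA is called parameter-efficient, the sketch below compares trainable parameter counts for a single weight matrix. It uses the general LoRA formulation (a rank-r update B @ A added to a frozen weight), not anything specific to how Vertex AI implements tuning. The layer dimensions and rank are hypothetical values chosen only for illustration.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a rank-`rank` LoRA update B @ A,
    where B is d_out x rank and A is rank x d_in."""
    return rank * (d_out + d_in)


# Hypothetical layer size and rank, for illustration only.
d_out, d_in, rank = 4096, 4096, 16
full = full_finetune_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)
print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (rank {rank}): {lora:,} trainable parameters")
```

Under these assumed sizes, the LoRA update trains well under 1% of the parameters that full fine-tuning would touch for the same layer, which is where the cost savings come from.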

Supported models

meta/llama3_1@llama-3.1-8b
meta/llama3_1@llama-3.1-8b-instruct
meta/llama3-2@llama-3.2-1b-instruct: supports full fine-tuning only
meta/llama3-2@llama-3.2-3b-instruct: supports full fine-tuning only
meta/llama3-3@llama-3.3-70b-instruct
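Because not every model supports both methods, a small client-side check before submitting a job can catch a mismatch early. The following is a minimal sketch: the table is transcribed from the list above, `is_supported` is a hypothetical helper (not part of the SDK), and it assumes that the `PEFT_ADAPTER` tuning mode used later on this page corresponds to LoRA and that models listed without a restriction accept both modes.

```python
# Tuning modes per model, transcribed from the supported-models list.
# Assumption: "PEFT_ADAPTER" is the tuning_mode value that selects LoRA.
SUPPORTED_TUNING_MODES = {
    "meta/llama3_1@llama-3.1-8b": {"FULL", "PEFT_ADAPTER"},
    "meta/llama3_1@llama-3.1-8b-instruct": {"FULL", "PEFT_ADAPTER"},
    "meta/llama3-2@llama-3.2-1b-instruct": {"FULL"},
    "meta/llama3-2@llama-3.2-3b-instruct": {"FULL"},
    "meta/llama3-3@llama-3.3-70b-instruct": {"FULL", "PEFT_ADAPTER"},
}


def is_supported(base_model: str, tuning_mode: str) -> bool:
    """Return True if the model/mode pair appears in the table above."""
    return tuning_mode in SUPPORTED_TUNING_MODES.get(base_model, set())


print(is_supported("meta/llama3_1@llama-3.1-8b", "PEFT_ADAPTER"))
print(is_supported("meta/llama3-2@llama-3.2-1b-instruct", "PEFT_ADAPTER"))
```

The table is a snapshot of this page and would need updating whenever the supported-models list changes.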

Before you begin

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Go to project selector

Verify that billing is enabled for your Google Cloud project.

Enable the Vertex AI and Cloud Storage APIs.

Enable the APIs

Install and initialize the Vertex AI SDK for Python

Import the following libraries and initialize the SDK:

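import os
import time
import uuid

import vertexai
from google.cloud import aiplatform
from vertexai.preview.tuning import sft, SourceModel

# Placeholders: replace with your own project ID and region.
PROJECT_ID = "your-project-id"
REGION = "your-region"

vertexai.init(project=PROJECT_ID, location=REGION)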

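Prepare the dataset for tuning

A training dataset is required for tuning. We also recommend preparing an optional validation dataset so that you can evaluate the performance of the tuned model.

The dataset must be in one of the following supported JSON Lines (JSONL) formats, where each line contains a single tuning example.

Turn-based chat format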

{"messages": [
  {"content": "You are a chatbot that helps with scientific literature and generates state-of-the-art abstracts from articles.",
    "role": "system"},
  {"content": "Summarize the paper in one paragraph.",
    "role": "user"},
  {"content": " Here is a one paragraph summary of the paper:\n\nThe paper describes PaLM, ...",
    "role": "assistant"}
]}

Upload your JSONL files to Cloud Storage.
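Before uploading, it can help to generate the file programmatically and sanity-check each example. The following is a minimal sketch using only the Python standard library. The checks (a non-empty `messages` list, roles limited to those shown in the example above, string content) are assumptions based on the sample format, and the tuning service may apply different or stricter validation.

```python
import json

# Roles that appear in the turn-based chat example above.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_example(example: dict) -> None:
    """Raise ValueError if the example does not look like the chat format."""
    messages = example.get("messages")
    if not isinstance(messages, list) or not messages:
        raise ValueError("example needs a non-empty 'messages' list")
    for message in messages:
        if not isinstance(message, dict):
            raise ValueError("each message must be a JSON object")
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unexpected role: {message.get('role')!r}")
        if not isinstance(message.get("content"), str):
            raise ValueError("message 'content' must be a string")


def to_jsonl(examples: list) -> str:
    """Serialize examples as JSONL: exactly one JSON object per line."""
    lines = []
    for example in examples:
        validate_example(example)
        # json.dumps escapes embedded newlines, so each example stays on one line.
        lines.append(json.dumps(example, ensure_ascii=False))
    return "\n".join(lines) + "\n"


examples = [
    {"messages": [
        {"role": "system", "content": "You summarize scientific papers."},
        {"role": "user", "content": "Summarize the paper in one paragraph."},
        {"role": "assistant", "content": "The paper describes ...\n\nIt shows ..."},
    ]},
]
jsonl_text = to_jsonl(examples)
print(jsonl_text, end="")
```

The resulting text can be written to a local file and copied to a bucket, for example with `gcloud storage cp`.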

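Create a tuning job

You can tune from either of the following:

A supported base model, such as Llama 3.1
A model that has the same architecture as one of the supported base models. This can be a custom model checkpoint from a repository such as Hugging Face, or a previously tuned model from a Vertex AI tuning job. The second option lets you continue tuning a model that has already been tuned.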

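Google Cloud console

You can start tuning in either of the following ways:
- Go to the model card, click "Fine-tune", and then select "Managed tuning".
  
  Go to the Llama 3.1 model card
  
  or
- Go to the "Tuning" page and click "Create tuned model".
  
  Go to Tuning

Fill in the parameters, and then click "Start tuning".

This starts a tuning job, which you can see on the "Tuning" page under the "Managed tuning" tab.

After the tuning job finishes, you can view information about the tuned model in the "Details" tab.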

Vertex AI SDK for Python

Replace the parameter values with your own, and then run the following code to create a tuning job:

sft_tuning_job = sft.preview_train(
    source_model=SourceModel(
        base_model="meta/llama3_1@llama-3.1-8b",
        # Optional: folder that contains either a custom model checkpoint
        # or a previously tuned model.
        custom_base_model="gs://{STORAGE-URI}",
    ),
    tuning_mode="FULL",  # FULL or PEFT_ADAPTER
    epochs=3,
    train_dataset="gs://{STORAGE-URI}",  # JSONL file
    validation_dataset="gs://{STORAGE-URI}",  # JSONL file
    output_uri="gs://{STORAGE-URI}",
)

After the job finishes, the model artifacts for the tuned model are stored in the <output_uri>/postprocess/node-0/checkpoints/final folder.
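Since the deployment step needs this exact folder, deriving it from the output URI in code avoids typos. The helper below is a minimal sketch: `tuned_model_artifacts_uri` is a hypothetical name, and the bucket path in the example is made up for illustration.

```python
# Sub-path under output_uri where the final checkpoint is written,
# as stated in the text above.
FINAL_CHECKPOINT_SUFFIX = "postprocess/node-0/checkpoints/final"


def tuned_model_artifacts_uri(output_uri: str) -> str:
    """Return the Cloud Storage folder that holds the tuned model artifacts."""
    if not output_uri.startswith("gs://"):
        raise ValueError("output_uri must be a gs:// URI")
    # Strip any trailing slash so the joined path has no empty segment.
    return f"{output_uri.rstrip('/')}/{FINAL_CHECKPOINT_SUFFIX}"


# Hypothetical bucket and folder, for illustration only.
print(tuned_model_artifacts_uri("gs://my-bucket/tuning-output/"))
# prints gs://my-bucket/tuning-output/postprocess/node-0/checkpoints/final
```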

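Deploy the tuned model

You can deploy the tuned model to a Vertex AI endpoint. You can also export the tuned model from Cloud Storage and deploy it elsewhere.

To deploy the tuned model to a Vertex AI endpoint, follow these steps:

Google Cloud console

Go to the Model Garden page and click "Deploy model with custom weights".

Go to Model Garden

Fill in the parameters, and then click "Deploy".

Vertex AI SDK for Python

Deploy the model on a G2 machine using a prebuilt container: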

from vertexai.preview import model_garden

MODEL_ARTIFACTS_STORAGE_URI = "gs://{STORAGE-URI}/postprocess/node-0/checkpoints/final"

model = model_garden.CustomModel(
    gcs_uri=MODEL_ARTIFACTS_STORAGE_URI,
)

# Deploy the model to an endpoint backed by GPUs. The deployment incurs costs.
endpoint = model.deploy(
    machine_type="g2-standard-12",
    accelerator_type="NVIDIA_L4",
    accelerator_count=1,
)

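Get an inference

After the deployment succeeds, you can send requests to the endpoint with text prompts. Note that the first few prompts take longer to execute.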

# Load the deployed endpoint.
endpoint = aiplatform.Endpoint("projects/{PROJECT_ID}/locations/{REGION}/endpoints/{endpoint_name}")

prompt = "Summarize the following article. Article: Preparing a perfect risotto requires patience and attention to detail. Begin by heating butter in a large, heavy-bottomed pot over medium heat. Add finely chopped onions and minced garlic to the pot, and cook until they're soft and translucent, about 5 minutes. Next, add Arborio rice to the pot and cook, stirring constantly, until the grains are coated with the butter and begin to toast slightly. Pour in a splash of white wine and cook until it's absorbed. From there, gradually add hot chicken or vegetable broth to the rice, stirring frequently, until the risotto is creamy and the rice is tender with a slight bite. Summary:"

# Define the input to the prediction call.
instances = [
    {
        "prompt": prompt,
        "max_tokens": 200,
        "temperature": 1.0,
        "top_p": 1.0,
        "top_k": 1,
        "raw_response": True,
    },
]

# Request the prediction.
response = endpoint.predict(
    instances=instances
)

for prediction in response.predictions:
    print(prediction)

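To learn more about getting inferences from a deployed model, see "Get an online inference".

Note that managed open models use the chat.completions method, whereas deployed models use the predict method. To learn more about getting inferences from managed models, see "Make a call to a Llama model".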
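To make the difference between the two request shapes concrete, the sketch below converts a predict-style instance into a chat-completions-style request body. It assumes the widely used chat completions field names (`model`, `messages`, `max_tokens`, `temperature`, `top_p`); the exact fields a managed Llama model accepts are covered in the linked documentation. `predict_instance_to_chat_request` is a hypothetical helper and `MODEL_ID` is a placeholder, not a real model name.

```python
def predict_instance_to_chat_request(instance: dict, model: str) -> dict:
    """Build a chat-completions-style body from a predict-style instance.

    The prompt becomes a single user message; sampling parameters that
    exist under the same name in both shapes are copied across.
    """
    body = {
        "model": model,
        "messages": [{"role": "user", "content": instance["prompt"]}],
    }
    for key in ("max_tokens", "temperature", "top_p"):
        if key in instance:
            body[key] = instance[key]
    return body


# Same shape as the instance sent to endpoint.predict in the example above.
instance = {"prompt": "What is a car?", "max_tokens": 200, "temperature": 1.0}

# "MODEL_ID" is a placeholder; substitute the managed model's identifier.
chat_request = predict_instance_to_chat_request(instance, model="MODEL_ID")
print(chat_request)
```

Keys without a counterpart in the assumed chat schema, such as `raw_response`, are deliberately dropped by this sketch.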

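Limits and quotas

Quota is enforced on the number of concurrent tuning jobs. Every project comes with a default quota that allows at least one tuning job to run. This is a global quota, shared across all available regions and supported models. To run more jobs concurrently, request additional quota for "Global concurrent managed OSS model fine-tuning jobs per project".

Pricing

You are billed for tuning based on the pricing for model tuning.

You are also billed for related services, such as Cloud Storage and Vertex AI Prediction.

See Vertex AI pricing and Cloud Storage pricing, and use the Pricing Calculator to generate a cost estimate based on your projected usage.

What's next

Evaluate the tuned model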
