Tune open models

This page describes how to perform supervised fine-tuning on open models such as Llama 3.1.

Supported tuning modes

Full fine-tuning
Low-Rank Adaptation (LoRA): LoRA is a parameter-efficient tuning mode that trains only a small subset of parameters. It is more cost-efficient and requires less training data than full fine-tuning. Full fine-tuning, on the other hand, adjusts all parameters and therefore has the potential for higher quality.
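
To make the cost difference concrete, the following sketch compares trainable-parameter counts for a single weight matrix. It assumes the standard LoRA formulation, in which the original matrix stays frozen and only two low-rank factors are trained. The layer size and rank are hypothetical values chosen for illustration; this page does not state which rank or layers the managed service uses.

```python
def full_param_count(d_out: int, d_in: int) -> int:
    """Trainable weights when the whole d_out x d_in matrix is updated."""
    return d_out * d_in


def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Trainable weights for a rank-`rank` LoRA update B @ A.

    B has shape (d_out, rank) and A has shape (rank, d_in).
    The original matrix itself is frozen and contributes nothing here.
    """
    return rank * (d_out + d_in)


# Hypothetical layer size and rank, for illustration only.
D_OUT, D_IN, RANK = 4096, 4096, 16

full = full_param_count(D_OUT, D_IN)
lora = lora_param_count(D_OUT, D_IN, RANK)
print(f"full: {full:,}  LoRA: {lora:,}  ratio: {lora / full:.4%}")
```

Under these assumed dimensions, the LoRA update trains well under 1% of the weights that full fine-tuning would train for the same layer.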

Supported models

Gemma 3 27B IT^** (google/gemma-3-27b-it)
Llama 3.1 8B (meta/llama3_1@llama-3.1-8b)
Llama 3.1 8B Instruct (meta/llama3_1@llama-3.1-8b-instruct)
Llama 3.2 1B Instruct^* (meta/llama3-2@llama-3.2-1b-instruct)
Llama 3.2 3B Instruct^* (meta/llama3-2@llama-3.2-3b-instruct)
Llama 3.3 70B Instruct (meta/llama3-3@llama-3.3-70b-instruct)
Qwen 3 32B^** (qwen/qwen3@qwen3-32b)

^* Supports full fine-tuning only

^** Supports parameter-efficient fine-tuning only
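
The list and footnotes above can be captured in a small lookup table that rejects an unsupported combination before a job is submitted. This is only a sketch: the mode names `FULL` and `PEFT_ADAPTER` are taken from the SDK sample later on this page, the helper `check_tuning_mode` is hypothetical, and models without a footnote marker are assumed to support both modes.

```python
FULL = "FULL"
PEFT_ADAPTER = "PEFT_ADAPTER"

# Model ID -> tuning modes, transcribed from the list and footnotes above.
# Assumption: models without a footnote marker support both modes.
SUPPORTED_TUNING_MODES = {
    "google/gemma-3-27b-it": {PEFT_ADAPTER},
    "meta/llama3_1@llama-3.1-8b": {FULL, PEFT_ADAPTER},
    "meta/llama3_1@llama-3.1-8b-instruct": {FULL, PEFT_ADAPTER},
    "meta/llama3-2@llama-3.2-1b-instruct": {FULL},
    "meta/llama3-2@llama-3.2-3b-instruct": {FULL},
    "meta/llama3-3@llama-3.3-70b-instruct": {FULL, PEFT_ADAPTER},
    "qwen/qwen3@qwen3-32b": {PEFT_ADAPTER},
}


def check_tuning_mode(base_model: str, tuning_mode: str) -> None:
    """Raise ValueError if the model or the mode is not in the table."""
    modes = SUPPORTED_TUNING_MODES.get(base_model)
    if modes is None:
        raise ValueError(f"Unsupported base model: {base_model}")
    if tuning_mode not in modes:
        raise ValueError(
            f"{base_model} does not support {tuning_mode}; "
            f"supported modes: {sorted(modes)}"
        )
```

For example, `check_tuning_mode("qwen/qwen3@qwen3-32b", "FULL")` raises a `ValueError`, while the same call with `"PEFT_ADAPTER"` returns without error.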

Before you begin

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Roles required to select or create a project

Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

Go to project selector

Verify that billing is enabled for your Google Cloud project.

Enable the Vertex AI and Cloud Storage APIs.

Roles required to enable APIs

To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

Enable the APIs

Install and initialize the Vertex AI SDK for Python

Import the following libraries and initialize the SDK:

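import os
import time
import uuid

import vertexai
from google.cloud import aiplatform
from vertexai.preview.tuning import sft, SourceModel

# Placeholders: replace with your own project ID and region.
PROJECT_ID = "your-project-id"
REGION = "your-region"

vertexai.init(project=PROJECT_ID, location=REGION)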

Prepare a dataset for tuning

A training dataset is required for tuning. We recommend that you also prepare a validation dataset, which is optional, if you want to evaluate the performance of your tuned model.

Your dataset must be in one of the following supported JSON Lines (JSONL) formats, where each line contains a single tuning example.

Prompt completion

{"prompt": "<prompt text>", "completion": "<ideal generated text>"}

Turn-based chat format

{"messages": [
  {"content": "You are a chatbot that helps with scientific literature and generates state-of-the-art abstracts from articles.",
    "role": "system"},
  {"content": "Summarize the paper in one paragraph.",
    "role": "user"},
  {"content": " Here is a one paragraph summary of the paper:\n\nThe paper describes PaLM, ...",
    "role": "assistant"}
]}
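
Before uploading, it can help to check each line locally. The following is a minimal sketch, not the service's own validator: the helper names are hypothetical, and the set of accepted roles is an assumption based only on the roles that appear in the chat example above. Note that the chat example is shown across several lines for readability, whereas a JSONL file needs each example on a single line.

```python
import json

# Assumption: only the roles shown in the example above are accepted.
VALID_ROLES = {"system", "user", "assistant"}


def validate_example(line: str) -> str:
    """Return "chat" or "completion" for a valid line, else raise ValueError."""
    example = json.loads(line)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(example, dict):
        raise ValueError("each line must be a JSON object")
    if "messages" in example:
        messages = example["messages"]
        if not isinstance(messages, list) or not messages:
            raise ValueError("'messages' must be a non-empty list")
        for message in messages:
            if (
                not isinstance(message, dict)
                or message.get("role") not in VALID_ROLES
                or not isinstance(message.get("content"), str)
            ):
                raise ValueError(f"invalid message: {message!r}")
        return "chat"
    if isinstance(example.get("prompt"), str) and isinstance(
        example.get("completion"), str
    ):
        return "completion"
    raise ValueError("expected 'messages' or 'prompt' plus 'completion'")


def to_jsonl(examples) -> str:
    """Serialize each example onto exactly one line."""
    return "".join(json.dumps(e, ensure_ascii=False) + "\n" for e in examples)
```

A typical use is to build the examples as Python dictionaries, write `to_jsonl(examples)` to a file, and run `validate_example` over every line of that file before uploading it.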

Upload your JSONL files to Cloud Storage.
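
One way to do the upload from Python is with the google-cloud-storage client library. The sketch below assumes that library is installed and that application default credentials are configured; the helper names and the example URI are hypothetical. The client library is imported inside the upload function so that the URI parser can be used on its own.

```python
from typing import Tuple


def split_gcs_uri(uri: str) -> Tuple[str, str]:
    """Split "gs://bucket/path/to/file" into (bucket, object name)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a Cloud Storage URI: {uri}")
    bucket, _, blob_name = uri[len(prefix):].partition("/")
    if not bucket or not blob_name:
        raise ValueError(f"URI must include a bucket and an object name: {uri}")
    return bucket, blob_name


def upload_jsonl(local_path: str, gcs_uri: str) -> None:
    """Upload a local JSONL file to the given gs:// location."""
    # Imported here so split_gcs_uri works without the library installed.
    from google.cloud import storage

    bucket_name, blob_name = split_gcs_uri(gcs_uri)
    client = storage.Client()
    client.bucket(bucket_name).blob(blob_name).upload_from_filename(local_path)
```

For example, `upload_jsonl("train.jsonl", "gs://my-bucket/tuning/train.jsonl")` would copy a local file to a bucket named `my-bucket`, which is a made-up name here.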

Create a tuning job

You can tune from:

A supported base model, such as Llama 3.1
A model that has the same architecture as one of the supported base models. This can be either a custom model checkpoint from a repository such as Hugging Face, or a previously tuned model from a Vertex AI tuning job. The second option lets you continue tuning a model that has already been tuned.

Cloud Console

You can start fine-tuning in either of the following ways:
- Go to the model card, click Fine-tune, and then select Managed tuning.
  
  Go to the Llama 3.1 model card
  
  or
- Go to the Tuning page, and then click Create tuned model.
  
  Go to Tuning
Fill in the parameters, and then click Start tuning.

This starts a tuning job, which you can see on the Tuning page under the Managed tuning tab.

After the tuning job finishes, you can view information about the tuned model on the Details tab.

Vertex AI SDK for Python

Replace the parameter values with your own, and then run the following code to create a tuning job:

sft_tuning_job = sft.preview_train(
    source_model=SourceModel(
        base_model="meta/llama3_1@llama-3.1-8b",
        # Optional: folder that contains either a custom model checkpoint
        # or a previously tuned model.
        custom_base_model="gs://{STORAGE-URI}",
    ),
    tuning_mode="FULL",  # FULL or PEFT_ADAPTER
    epochs=3,
    train_dataset="gs://{STORAGE-URI}",  # JSONL file
    validation_dataset="gs://{STORAGE-URI}",  # JSONL file
    output_uri="gs://{STORAGE-URI}",
)

After the job finishes, the model artifacts for the tuned model are stored in the <output_uri>/postprocess/node-0/checkpoints/final folder.
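
The following sketch shows one way to wait for the job and then derive the artifact location. It assumes that the job object returned by the SDK exposes a `has_ended` property and a `refresh()` method, as the SDK's tuning job class does for other supervised tuning jobs; verify this against the SDK version you use. The helper names are hypothetical.

```python
import time

# Path suffix taken from the sentence above.
FINAL_CHECKPOINT_SUFFIX = "postprocess/node-0/checkpoints/final"


def final_checkpoint_uri(output_uri: str) -> str:
    """Return the folder that holds the tuned model artifacts."""
    return f"{output_uri.rstrip('/')}/{FINAL_CHECKPOINT_SUFFIX}"


def wait_for_job(job, poll_seconds: float = 60.0, sleep=time.sleep) -> None:
    """Block until `job.has_ended` is true, refreshing between polls.

    Assumption: `job` has a `has_ended` property and a `refresh()` method.
    The `sleep` argument exists only so that tests can skip real waiting.
    """
    while not job.has_ended:
        sleep(poll_seconds)
        job.refresh()
```

With the job created above, `wait_for_job(sft_tuning_job)` followed by `final_checkpoint_uri("gs://{STORAGE-URI}")` yields the path to pass to the deployment step.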

Deploy the tuned model

You can deploy the tuned model to a Vertex AI endpoint. You can also export the tuned model from Cloud Storage and deploy it elsewhere.

To deploy the tuned model to a Vertex AI endpoint:

Cloud Console

Go to the Model Garden page, and then click Deploy model with custom weights.

Go to Model Garden
Fill in the parameters, and then click Deploy.

Vertex AI SDK for Python

Deploy to a G2 machine using a prebuilt container:

from vertexai.preview import model_garden

MODEL_ARTIFACTS_STORAGE_URI = "gs://{STORAGE-URI}/postprocess/node-0/checkpoints/final"

model = model_garden.CustomModel(
    gcs_uri=MODEL_ARTIFACTS_STORAGE_URI,
)

# Deploy the model to a GPU-backed endpoint. The deployment incurs costs.
endpoint = model.deploy(
    machine_type="g2-standard-12",
    accelerator_type="NVIDIA_L4",
    accelerator_count=1,
)

Get an inference

After the deployment succeeds, you can send requests to the endpoint with text prompts. Note that the first few prompts take longer to run.

# Loads the deployed endpoint
endpoint = aiplatform.Endpoint("projects/{PROJECT_ID}/locations/{REGION}/endpoints/{endpoint_name}")

prompt = "Summarize the following article. Article: Preparing a perfect risotto requires patience and attention to detail. Begin by heating butter in a large, heavy-bottomed pot over medium heat. Add finely chopped onions and minced garlic to the pot, and cook until they're soft and translucent, about 5 minutes. Next, add Arborio rice to the pot and cook, stirring constantly, until the grains are coated with the butter and begin to toast slightly. Pour in a splash of white wine and cook until it's absorbed. From there, gradually add hot chicken or vegetable broth to the rice, stirring frequently, until the risotto is creamy and the rice is tender with a slight bite. Summary:"

# Define input to the prediction call
instances = [
    {
        "prompt": prompt,
        "max_tokens": 200,
        "temperature": 1.0,
        "top_p": 1.0,
        "top_k": 1,
        "raw_response": True,
    },
]

# Request the prediction
response = endpoint.predict(
    instances=instances
)

for prediction in response.predictions:
    print(prediction)

For more details about getting inferences from a deployed model, see Get online inferences.
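
When you send several prompts with the same sampling settings, a small helper keeps the request payload consistent. This is a sketch only: the field names mirror the sample request above, the helper `build_instances` is hypothetical, and the range checks reflect common sampling conventions rather than limits documented on this page.

```python
from typing import Dict, List


def build_instances(
    prompts: List[str],
    max_tokens: int = 200,
    temperature: float = 1.0,
    top_p: float = 1.0,
    top_k: int = 1,
    raw_response: bool = True,
) -> List[Dict[str, object]]:
    """Build one prediction instance per prompt with shared settings."""
    # Assumed sanity checks; the serving container may enforce other limits.
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")
    return [
        {
            "prompt": prompt,
            "max_tokens": max_tokens,
            "temperature": temperature,
            "top_p": top_p,
            "top_k": top_k,
            "raw_response": raw_response,
        }
        for prompt in prompts
    ]
```

The result can be passed directly as the `instances` argument of `endpoint.predict`.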

Note that managed open models use the chat.completions method, not the predict method that deployed models use. For more information about getting inferences from managed models, see Make calls to Llama models.

Limits and quotas

Quota is enforced on the number of concurrent tuning jobs. Every project comes with a default quota to run at least one tuning job. This is a global quota, shared across all available regions and supported models. If you want to run more jobs concurrently, you need to request additional quota for Global concurrent managed OSS model fine-tuning jobs per project.

Pricing

You are billed for tuning based on the pricing for Model tuning.

You are also billed for related services, such as Cloud Storage and Vertex AI Prediction.

Learn about Vertex AI pricing and Cloud Storage pricing, and use the Pricing Calculator to estimate costs based on your projected usage.

What's next

Evaluate a tuned model