Deploy models with custom weights

Deploying models with custom weights is a Preview offering. You can fine-tune models based on a predefined set of base models, and deploy your customized models on Vertex AI Model Garden. To deploy a custom model with the custom weights import, upload the model artifacts to a Cloud Storage bucket in your project; from there, deployment is a one-click experience in Vertex AI.

Supported models

The public preview of Deploy models with custom weights supports the following base models:

Model name	Versions
Llama	Llama-2: 7B, 13B; Llama-3.1: 8B, 70B; Llama-3.2: 1B, 3B; Llama-4: Scout-17B, Maverick-17B; CodeLlama-13B
Gemma	Gemma-2: 27B; Gemma-3: 1B, 4B, 12B, 27B; Medgemma: 4B, 27B-text
Qwen	Qwen2: 1.5B; Qwen2.5: 0.5B, 1.5B, 7B, 32B; Qwen3: 0.6B, 1.7B, 8B, 32B, Qwen3-Coder-480B-A35B-Instruct
Deepseek	Deepseek-R1; Deepseek-V3
Mistral and Mixtral	Mistral-7B-v0.1; Mixtral-8x7B-v0.1; Mistral-Nemo-Base-2407
Phi-4	Phi-4-reasoning
OpenAI OSS	gpt-oss: 20B, 120B

Limitations

Custom weights don't support importing quantized models.

Model files

You must supply the model files in the Hugging Face weights format. For more information about the Hugging Face weights format, see Use Hugging Face Models.

If the required files aren't provided, the model deployment might fail.

This table lists the types of model files, which depend on the model's architecture:

Model file content	File type
Model configuration	`config.json`
Model weights	`.safetensors` `.bin`
Weights index	`*.index.json`
Tokenizer files	`tokenizer.model` `tokenizer.json` `tokenizer_config.json`
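
Because a missing file can cause the deployment to fail, it can help to check the artifact folder before you upload it. The following is a minimal sketch, assuming the file names are already available as a flat list and that the table above is a reasonable checklist. The helper name `missing_model_files` is hypothetical, and the exact set of required files depends on the model's architecture, so treat the result as a hint rather than a guarantee.

```python
from typing import Iterable, List

# File names and suffixes taken from the table above.
TOKENIZER_FILES = {"tokenizer.model", "tokenizer.json", "tokenizer_config.json"}
WEIGHT_SUFFIXES = (".safetensors", ".bin")


def missing_model_files(filenames: Iterable[str]) -> List[str]:
    """Return descriptions of file categories that appear to be missing.

    Hypothetical helper: an empty list means nothing obvious is missing,
    not that the deployment is guaranteed to succeed.
    """
    names = set(filenames)
    problems = []
    if "config.json" not in names:
        problems.append("config.json")
    if not any(name.endswith(WEIGHT_SUFFIXES) for name in names):
        problems.append("model weights (.safetensors or .bin)")
    if not names & TOKENIZER_FILES:
        problems.append("tokenizer file")
    return problems


# Example file listing (illustrative names only).
example = ["config.json", "model.safetensors", "tokenizer.json"]
print(missing_model_files(example))  # prints []
```

The weights index (`*.index.json`) is deliberately not checked here, because whether it is needed depends on how the weights are stored.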

Locations

You can deploy custom models in all regions where Model Garden is available.

Prerequisites

This section describes what you need to set up before you deploy your custom model.

Before you begin

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Go to project selector

Verify that billing is enabled for your Google Cloud project.

Enable the Vertex AI API.

Enable the API

In the Google Cloud console, activate Cloud Shell.

Activate Cloud Shell

At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

This tutorial assumes that you are using Cloud Shell to interact with Google Cloud. If you want to use a different shell instead of Cloud Shell, perform the following additional configuration:

Install the Google Cloud CLI.
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
To initialize the gcloud CLI, run the following command:
```
gcloud init
```

Deploy the custom model

This section demonstrates how to deploy your custom model.

If you're using the gcloud CLI, Python, or curl, replace the following variables with your own values so that the code samples work:

REGION: Your region. For example, us-central1.
MODEL_GCS: The Cloud Storage URI of your model. For example, gs://custom-weights-fishfooding/meta-llama/Llama-3.2-1B-Instruct
PROJECT_ID: Your project ID.
MODEL_ID: Your model ID.
MACHINE_TYPE: Your machine type. For example, g2-standard-12.
ACCELERATOR_TYPE: Your accelerator type. For example, NVIDIA_L4.
ACCELERATOR_COUNT: Your accelerator count.
PROMPT: Your text prompt.
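
A malformed value for one of these variables often surfaces only when the deployment request is rejected. As a rough sketch, the values could be gathered and sanity-checked up front. The `DeployParams` class and its checks are hypothetical helpers, not part of any Google Cloud library, and `my-project` is a made-up project ID; the other example values come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeployParams:
    """Hypothetical container for the placeholder values listed above."""

    region: str
    model_gcs: str
    project_id: str
    machine_type: str = ""
    accelerator_type: str = ""
    accelerator_count: int = 0

    def __post_init__(self) -> None:
        # Basic sanity checks only; the service performs the real validation.
        if not self.model_gcs.startswith("gs://") or not self.bucket:
            raise ValueError("MODEL_GCS must look like gs://BUCKET/PATH")
        if not self.region or not self.project_id:
            raise ValueError("REGION and PROJECT_ID must not be empty")
        if self.accelerator_count < 0:
            raise ValueError("ACCELERATOR_COUNT must not be negative")

    @property
    def bucket(self) -> str:
        """Bucket name portion of the Cloud Storage URI."""
        return self.model_gcs[len("gs://"):].split("/", 1)[0]


params = DeployParams(
    region="us-central1",
    model_gcs="gs://custom-weights-fishfooding/meta-llama/Llama-3.2-1B-Instruct",
    project_id="my-project",  # made-up project ID
    machine_type="g2-standard-12",
    accelerator_type="NVIDIA_L4",
    accelerator_count=1,
)
print(params.bucket)  # prints custom-weights-fishfooding
```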

Console

The following steps show you how to use the Google Cloud console to deploy your model with custom weights.

In the Google Cloud console, go to the Model Garden page.

Go to Model Garden
Click Deploy model with custom weights. The Deploy a model with custom weights on Vertex AI pane appears.
In the Model source section, do the following:
1. Click Browse, choose the bucket where your model is stored, and then click Select.
2. Optional: Enter your model's name in the Model name field.
In the Deployment settings section, do the following:
1. From the Region field, select your region, and then click OK.
2. In the Machine Spec field, select the machine specification that is used to deploy your model.
3. Optional: The Endpoint name field shows a default endpoint name for your model. To use a different name, enter it in the field.
Click Deploy model with custom weights.

gcloud CLI

This command demonstrates how to deploy the model to a specific region.

gcloud ai model-garden models deploy --model=${MODEL_GCS} --region ${REGION}

This command demonstrates how to deploy the model to a specific region with a machine type, accelerator type, and accelerator count. If you want to select a specific machine configuration, you must set all three fields.

gcloud ai model-garden models deploy --model=${MODEL_GCS} --machine-type=${MACHINE_TYPE} --accelerator-type=${ACCELERATOR_TYPE} --accelerator-count=${ACCELERATOR_COUNT} --region ${REGION}
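
The all-or-nothing rule for the machine flags is easy to get wrong in scripts. The sketch below assembles the argument list for either form of the command and rejects a partial machine configuration. `build_deploy_command` is a hypothetical helper; it uses only the flags shown in the commands above and doesn't run anything.

```python
from typing import List, Optional


def build_deploy_command(
    model_gcs: str,
    region: str,
    machine_type: Optional[str] = None,
    accelerator_type: Optional[str] = None,
    accelerator_count: Optional[int] = None,
) -> List[str]:
    """Build the gcloud argument list; hypothetical helper for illustration."""
    machine_fields = (machine_type, accelerator_type, accelerator_count)
    provided = [field is not None for field in machine_fields]
    # Either all three machine fields are set, or none of them.
    if any(provided) and not all(provided):
        raise ValueError(
            "Set machine_type, accelerator_type, and accelerator_count together"
        )

    cmd = ["gcloud", "ai", "model-garden", "models", "deploy", f"--model={model_gcs}"]
    if all(provided):
        cmd += [
            f"--machine-type={machine_type}",
            f"--accelerator-type={accelerator_type}",
            f"--accelerator-count={accelerator_count}",
        ]
    cmd += ["--region", region]
    return cmd


# Example values taken from the variable list earlier on this page.
print(" ".join(build_deploy_command(
    "gs://custom-weights-fishfooding/meta-llama/Llama-3.2-1B-Instruct",
    "us-central1",
    machine_type="g2-standard-12",
    accelerator_type="NVIDIA_L4",
    accelerator_count=1,
)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` in an environment where the gcloud CLI is installed and authenticated.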

Python

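import vertexai
from vertexai.preview import model_garden

# Replace these placeholder values with your own.
PROJECT_ID = "PROJECT_ID"
REGION = "REGION"
MODEL_GCS = "MODEL_GCS"  # Cloud Storage URI of the model artifacts
MACHINE_TYPE = "MACHINE_TYPE"
ACCELERATOR_TYPE = "ACCELERATOR_TYPE"
ACCELERATOR_COUNT = 1  # example value; must be an integer, not a string
PROMPT = "PROMPT"

vertexai.init(project=PROJECT_ID, location=REGION)
custom_model = model_garden.CustomModel(
  gcs_uri=MODEL_GCS,
)
# Deploy with an explicit machine configuration.
endpoint = custom_model.deploy(
  machine_type=MACHINE_TYPE,
  accelerator_type=ACCELERATOR_TYPE,
  accelerator_count=ACCELERATOR_COUNT,
  model_display_name="custom-model",
  endpoint_display_name="custom-model-endpoint")

# Send a test prompt to the deployed model.
endpoint.predict(instances=[{"prompt": PROMPT}], use_dedicated_endpoint=True)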

Alternatively, you can call the custom_model.deploy() method without passing any parameters.

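import vertexai
from vertexai.preview import model_garden

# Replace these placeholder values with your own.
PROJECT_ID = "PROJECT_ID"
REGION = "REGION"
MODEL_GCS = "MODEL_GCS"  # Cloud Storage URI of the model artifacts
PROMPT = "PROMPT"

vertexai.init(project=PROJECT_ID, location=REGION)
custom_model = model_garden.CustomModel(
  gcs_uri=MODEL_GCS,
)
# Deploy without specifying a machine configuration.
endpoint = custom_model.deploy()

# Send a test prompt to the deployed model.
endpoint.predict(instances=[{"prompt": PROMPT}], use_dedicated_endpoint=True)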

curl


curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://${REGION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${REGION}:deploy" \
  -d '{
    "custom_model": {
      "gcs_uri": "'"${MODEL_GCS}"'"
    },
    "destination": "projects/'"${PROJECT_ID}"'/locations/'"${REGION}"'",
    "model_config": {
      "model_user_id": "'"${MODEL_ID}"'"
    }
  }'

Alternatively, you can use the API to explicitly set the machine type.


curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://${REGION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${REGION}:deploy" \
  -d '{
    "custom_model": {
      "gcs_uri": "'"${MODEL_GCS}"'"
    },
    "destination": "projects/'"${PROJECT_ID}"'/locations/'"${REGION}"'",
    "model_config": {
      "model_user_id": "'"${MODEL_ID}"'"
    },
    "deploy_config": {
      "dedicated_resources": {
        "machine_spec": {
          "machine_type": "'"${MACHINE_TYPE}"'",
          "accelerator_type": "'"${ACCELERATOR_TYPE}"'",
          "accelerator_count": '"${ACCELERATOR_COUNT}"'
        },
        "min_replica_count": 1
      }
    }
  }'
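
Hand-written JSON inside a shell string is fragile, since a single stray trailing comma makes the body invalid. One option is to build the body from a dictionary and serialize it. The sketch below mirrors the URL and field names used in the curl requests above. `build_deploy_request` is a hypothetical helper that doesn't send anything, and `my-project` and `my-model` are made-up values.

```python
import json
from typing import Optional, Tuple


def build_deploy_request(
    project_id: str,
    region: str,
    model_gcs: str,
    model_id: str,
    machine_spec: Optional[dict] = None,
) -> Tuple[str, str]:
    """Return (url, json_body) for the deploy call; hypothetical helper."""
    parent = f"projects/{project_id}/locations/{region}"
    url = f"https://{region}-aiplatform.googleapis.com/v1beta1/{parent}:deploy"
    body = {
        "custom_model": {"gcs_uri": model_gcs},
        "destination": parent,
        "model_config": {"model_user_id": model_id},
    }
    if machine_spec is not None:
        # Same nesting as the second curl request above.
        body["deploy_config"] = {
            "dedicated_resources": {
                "machine_spec": machine_spec,
                "min_replica_count": 1,
            }
        }
    # json.dumps always produces valid JSON (no trailing commas).
    return url, json.dumps(body, indent=2)


url, payload = build_deploy_request(
    project_id="my-project",  # made-up project ID
    region="us-central1",
    model_gcs="gs://custom-weights-fishfooding/meta-llama/Llama-3.2-1B-Instruct",
    model_id="my-model",  # made-up model ID
    machine_spec={
        "machine_type": "g2-standard-12",
        "accelerator_type": "NVIDIA_L4",
        "accelerator_count": 1,
    },
)
print(url)
print(payload)
```

The printed body can be saved to a file and passed to curl with `-d @FILE`, or sent with any HTTP client together with the same `Authorization` header as in the curl requests.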

Learn more about self-deployed models in Vertex AI

For more information about self-deployed models, see Overview of self-deployed models.
For more information about Model Garden, see Overview of Model Garden.
For more information about deploying models, see Use models in Model Garden.
Use Gemma open models
Use Llama open models
Use Hugging Face open models
