Orchestrate Multislice workloads using JobSet and Kueue

Autopilot Standard

This tutorial shows how to orchestrate multiple Multislice workloads on Google Kubernetes Engine (GKE) to improve resource utilization. You deploy a Jax workload as an example, run it on TPU Multislice, and implement Job queueing with JobSet and Kueue. Kueue determines when Jobs should run based on available resources, quotas, and a hierarchy for fair sharing among teams.

This tutorial is intended for Machine Learning (ML) engineers and Platform admins and operators who are interested in the container orchestration capabilities of Kubernetes for training LLMs. To learn more about common roles and example tasks that we reference in Google Cloud content, see Common GKE Enterprise user roles and tasks.

Before reading this page, make sure that you're familiar with the following:

Current TPU version availability with the Cloud TPU system architecture
TPU Multislice in GKE

Objectives

Prepare your environment with a GKE cluster that has three TPU v5e slices. Each TPU slice has a 2x4 topology with 8 chips, so the cluster has 24 TPU v5e chips in total.
Create the Kueue resources to ensure that quotas are shared fairly between workloads.
Run your Multislice workload.
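
The chip counts in these objectives follow from the slice topology. The following Python sketch reproduces the arithmetic. The figure of 4 chips per node is an assumption taken from the settings used later in this tutorial (two nodes per slice and google.com/tpu: 4 per Pod); the helper name chips_in_topology is hypothetical.

```python
# Capacity arithmetic for the tutorial cluster.
def chips_in_topology(topology: str) -> int:
    """Return the number of chips in a topology string such as "2x4"."""
    count = 1
    for dim in topology.split("x"):
        count *= int(dim)
    return count


SLICES = 3
TOPOLOGY = "2x4"
CHIPS_PER_NODE = 4  # assumption: matches google.com/tpu: 4 per Pod in this tutorial

chips_per_slice = chips_in_topology(TOPOLOGY)        # 2 * 4
nodes_per_slice = chips_per_slice // CHIPS_PER_NODE  # matches --num-nodes=2
total_chips = SLICES * chips_per_slice

print(chips_per_slice, nodes_per_slice, total_chips)  # prints: 8 2 24
```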

Before you begin

Before you start, make sure that you have performed the following tasks:

Enable the Google Kubernetes Engine API.

If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
Note: For existing gcloud CLI installations, make sure to set the compute/region and compute/zone properties. By setting default locations, you can avoid errors in the gcloud CLI like the following: One of [--zone, --region] must be supplied: Please specify location.

Install JobSet v0.2.3 or later.
Install Kueue v0.4.1 or later.

Prepare the environment

In the Google Cloud console, start a Cloud Shell instance:
Open Cloud Shell
Set the default environment variables:
```
gcloud config set project PROJECT_ID
gcloud config set compute/region COMPUTE_REGION
```
Replace the following values:
- PROJECT_ID: your Google Cloud project ID.
- COMPUTE_REGION: the Compute Engine region.

Autopilot clusters that run version 1.29.2-gke.1521000 or later enable TPUs by default. TPUs on Autopilot clusters are configured in the workload specification. For more information, see the Define your Multislice workloads with JobSets section.

Create a GKE cluster

In Cloud Shell, create a GKE cluster:

Autopilot

gcloud container clusters create-auto multislice-cluster \
    --location=LOCATION \
    --cluster-version 1.29.2-gke.1521000 \
    --release-channel rapid

Standard

gcloud container clusters create multislice-cluster \
    --location=LOCATION

Replace LOCATION with the location where you want to create your cluster. Make sure that the location has capacity for the ct5lp-hightpu-4t machine type. Cluster creation might take several minutes.

If you use GKE Autopilot mode, skip to the Create Kueue resources section. Autopilot clusters that run version 1.29.2-gke.1521000 or later enable TPUs by default.

Create three Standard mode TPU slice node pools

Create the first node pool named nodepool1:

gcloud beta container node-pools create nodepool1 \
    --location=LOCATION \
    --cluster=multislice-cluster \
    --node-locations=NODE_LOCATION \
    --machine-type=ct5lp-hightpu-4t \
    --tpu-topology=2x4 \
    --num-nodes=2 \
    --project=PROJECT_ID

Replace NODE_LOCATION with one or more zones in the cluster region in which you want to create the nodes.

Create the second node pool named nodepool2:

gcloud beta container node-pools create nodepool2 \
    --location=LOCATION \
    --cluster=multislice-cluster \
    --node-locations=NODE_LOCATION \
    --machine-type=ct5lp-hightpu-4t \
    --tpu-topology=2x4 \
    --num-nodes=2 \
    --project=PROJECT_ID

Create the third node pool named nodepool3:

gcloud beta container node-pools create nodepool3 \
    --location=LOCATION \
    --cluster=multislice-cluster \
    --node-locations=NODE_LOCATION \
    --machine-type=ct5lp-hightpu-4t \
    --tpu-topology=2x4 \
    --num-nodes=2 \
    --project=PROJECT_ID

GKE creates three node pools. Each node pool is a separate TPU slice.
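
Because the three commands differ only in the node pool name, you could generate them with a small script instead of typing each one. The following Python sketch only builds and prints the command strings from the flags shown above; it doesn't call gcloud. The helper name node_pool_command is hypothetical, and the placeholders are left for you to replace.

```python
import shlex


def node_pool_command(name, location, node_location, project_id):
    """Build the node pool creation command used in this tutorial as one string."""
    args = [
        "gcloud", "beta", "container", "node-pools", "create", name,
        f"--location={location}",
        "--cluster=multislice-cluster",
        f"--node-locations={node_location}",
        "--machine-type=ct5lp-hightpu-4t",
        "--tpu-topology=2x4",
        "--num-nodes=2",
        f"--project={project_id}",
    ]
    return shlex.join(args)


# One command per TPU slice: nodepool1, nodepool2, nodepool3.
commands = [
    node_pool_command(f"nodepool{i}", "LOCATION", "NODE_LOCATION", "PROJECT_ID")
    for i in range(1, 4)
]
for command in commands:
    print(command)
```

Printing the commands first, rather than executing them, lets you review the placeholders before anything is created.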

Create Kueue resources

Create the following kueue.yaml manifest:

apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: "vlp-24"
spec:
  nodeLabels:
    cloud.google.com/gke-tpu-accelerator: tpu-v5-lite-podslice
    cloud.google.com/gke-tpu-topology: 2x4
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: "cluster-queue"
spec:
  namespaceSelector: {}
  queueingStrategy: BestEffortFIFO
  resourceGroups:
  - coveredResources: ["google.com/tpu"]
    flavors:
    - name: "vlp-24"
      resources:
      - name: "google.com/tpu"
        nominalQuota: 24

---
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  namespace: default
  name: multislice-queue
spec:
  clusterQueue: cluster-queue

Apply the kueue.yaml manifest:
```
kubectl apply -f kueue.yaml
```
GKE creates the following Kueue resources:

ResourceFlavor: An abstraction of the resources in a cluster. In this example, GKE creates three TPU slices with 2x4 topology. Each TPU slice has a 2x4 topology with 8 chips (24 TPU chips in total).
ClusterQueue: A global queue that manages workloads and cluster resources.
LocalQueue: Groups closely related workloads that are typically run by a single tenant (user). Each LocalQueue points to a ClusterQueue from which resources are allocated to run its workloads. A Kueue Workload is an abstraction that represents a batch workload. In this case, each workload is a JobSet.
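
To illustrate how these three objects reference each other, the following Python sketch mirrors the names from kueue.yaml in plain dictionaries and follows the chain from a LocalQueue to the quota that it can draw on. This is a simplified model for illustration only, not the Kueue API; the function name quota_for_local_queue is hypothetical.

```python
# Plain-dict mirror of kueue.yaml (illustrative model, not the Kueue API).
resource_flavors = {"vlp-24"}
cluster_queues = {
    "cluster-queue": {
        # flavor name -> resource name -> nominalQuota
        "vlp-24": {"google.com/tpu": 24},
    },
}
local_queues = {
    # (namespace, name) -> name of the ClusterQueue it points to
    ("default", "multislice-queue"): "cluster-queue",
}


def quota_for_local_queue(namespace, name, resource):
    """Sum the nominal quota of `resource` reachable from a LocalQueue."""
    cluster_queue = cluster_queues[local_queues[(namespace, name)]]
    total = 0
    for flavor, resources in cluster_queue.items():
        if flavor not in resource_flavors:
            raise ValueError(f"ClusterQueue references unknown flavor {flavor!r}")
        total += resources.get(resource, 0)
    return total


print(quota_for_local_queue("default", "multislice-queue", "google.com/tpu"))  # prints: 24
```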

Define your Multislice workloads with JobSets

In this section, you create three JobSets. A JobSet is a workload API that lets you manage a group of Kubernetes Jobs as a unit. The most common use case for a JobSet is distributed training, but you can also use it to run batch workloads.

The following JobSets run a Jax workload that outputs the global number of TPU chips in the slice, then sleeps for 60 seconds to simulate some model training time, and then exits.

Create the following jobsets-multislice.yaml manifest:

Autopilot

apiVersion: jobset.x-k8s.io/v1alpha2
kind: JobSet
metadata:
  name: multislice-1slice
  labels:
    kueue.x-k8s.io/queue-name: multislice-queue
  annotations:
    alpha.jobset.sigs.k8s.io/exclusive-topology: cloud.google.com/gke-nodepool
spec:
  failurePolicy:
    maxRestarts: 4
  replicatedJobs:
    - name: slice
      replicas: 1
      template:
        spec:
          parallelism: 2
          completions: 2
          backoffLimit: 0
          template:
            spec:
              nodeSelector:
                cloud.google.com/gke-tpu-accelerator: tpu-v5-lite-podslice
                cloud.google.com/gke-tpu-topology: 2x4
              containers:
              - name: jax-tpu
                image: python:3.8
                ports:
                - containerPort: 8471
                - containerPort: 8080
                command:
                - bash
                - -c
                - |
                  pip install "jax[tpu]" -f https://storage.googleapis.com/jax-releases/libtpu_releases.html
                  python -c 'import jax; print("Global device count:", jax.device_count())'
                  sleep 60
                resources:
                  limits:
                    google.com/tpu: 4

---
apiVersion: jobset.x-k8s.io/v1alpha2
kind: JobSet
metadata:
  name: multislice-2slice
  labels:
    kueue.x-k8s.io/queue-name: multislice-queue
  annotations:
    alpha.jobset.sigs.k8s.io/exclusive-topology: cloud.google.com/gke-nodepool
spec:
  failurePolicy:
    maxRestarts: 4
  replicatedJobs:
    - name: slice
      replicas: 2
      template:
        spec:
          parallelism: 2
          completions: 2
          backoffLimit: 0
          template:
            spec:
              nodeSelector:
                cloud.google.com/gke-tpu-accelerator: tpu-v5-lite-podslice
                cloud.google.com/gke-tpu-topology: 2x4
              containers:
              - name: jax-tpu
                image: python:3.8
                ports:
                - containerPort: 8471
                - containerPort: 8080
                command:
                - bash
                - -c
                - |
                  pip install "jax[tpu]" -f https://storage.googleapis.com/jax-releases/libtpu_releases.html
                  python -c 'import jax; print("Global device count:", jax.device_count())'
                  sleep 60
                resources:
                  limits:
                    google.com/tpu: 4
---
apiVersion: jobset.x-k8s.io/v1alpha2
kind: JobSet
metadata:
  name: multislice-3slice
  labels:
    kueue.x-k8s.io/queue-name: multislice-queue
  annotations:
    alpha.jobset.sigs.k8s.io/exclusive-topology: cloud.google.com/gke-nodepool
spec:
  failurePolicy:
    maxRestarts: 4
  replicatedJobs:
    - name: slice
      replicas: 3
      template:
        spec:
          parallelism: 2
          completions: 2
          backoffLimit: 0
          template:
            spec:
              nodeSelector:
                cloud.google.com/gke-tpu-accelerator: tpu-v5-lite-podslice
                cloud.google.com/gke-tpu-topology: 2x4
              containers:
              - name: jax-tpu
                image: python:3.8
                ports:
                - containerPort: 8471
                - containerPort: 8080
                command:
                - bash
                - -c
                - |
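                  pip install "jax[tpu]" -f https://storage.googleapis.com/jax-releases/libtpu_releases.html
                  python -c 'import jax; print("Global device count:", jax.device_count())'
                  sleep 60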
                resources:
                  limits:
                    google.com/tpu: 4

Standard

apiVersion: jobset.x-k8s.io/v1alpha2
kind: JobSet
metadata:
  name: multislice-1slice
  labels:
    kueue.x-k8s.io/queue-name: multislice-queue
  annotations:
    alpha.jobset.sigs.k8s.io/exclusive-topology: cloud.google.com/gke-nodepool
spec:
  failurePolicy:
    maxRestarts: 4
  replicatedJobs:
    - name: slice
      replicas: 1
      template:
        spec:
          parallelism: 2
          completions: 2
          backoffLimit: 0
          template:
            spec:
              hostNetwork: true
              dnsPolicy: ClusterFirstWithHostNet
              nodeSelector:
                cloud.google.com/gke-tpu-accelerator: tpu-v5-lite-podslice
                cloud.google.com/gke-tpu-topology: 2x4
              containers:
              - name: jax-tpu
                image: python:3.8
                ports:
                - containerPort: 8471
                - containerPort: 8080
                securityContext:
                  privileged: true
                command:
                - bash
                - -c
                - |
                  pip install "jax[tpu]" -f https://storage.googleapis.com/jax-releases/libtpu_releases.html
                  python -c 'import jax; print("Global device count:", jax.device_count())'
                  sleep 60
                resources:
                  limits:
                    google.com/tpu: 4

---
apiVersion: jobset.x-k8s.io/v1alpha2
kind: JobSet
metadata:
  name: multislice-2slice
  labels:
    kueue.x-k8s.io/queue-name: multislice-queue
  annotations:
    alpha.jobset.sigs.k8s.io/exclusive-topology: cloud.google.com/gke-nodepool
spec:
  failurePolicy:
    maxRestarts: 4
  replicatedJobs:
    - name: slice
      replicas: 2
      template:
        spec:
          parallelism: 2
          completions: 2
          backoffLimit: 0
          template:
            spec:
              hostNetwork: true
              dnsPolicy: ClusterFirstWithHostNet
              nodeSelector:
                cloud.google.com/gke-tpu-accelerator: tpu-v5-lite-podslice
                cloud.google.com/gke-tpu-topology: 2x4
              containers:
              - name: jax-tpu
                image: python:3.8
                ports:
                - containerPort: 8471
                - containerPort: 8080
                securityContext:
                  privileged: true
                command:
                - bash
                - -c
                - |
                  pip install "jax[tpu]" -f https://storage.googleapis.com/jax-releases/libtpu_releases.html
                  python -c 'import jax; print("Global device count:", jax.device_count())'
                  sleep 60
                resources:
                  limits:
                    google.com/tpu: 4
---
apiVersion: jobset.x-k8s.io/v1alpha2
kind: JobSet
metadata:
  name: multislice-3slice
  labels:
    kueue.x-k8s.io/queue-name: multislice-queue
  annotations:
    alpha.jobset.sigs.k8s.io/exclusive-topology: cloud.google.com/gke-nodepool
spec:
  failurePolicy:
    maxRestarts: 4
  replicatedJobs:
    - name: slice
      replicas: 3
      template:
        spec:
          parallelism: 2
          completions: 2
          backoffLimit: 0
          template:
            spec:
              hostNetwork: true
              dnsPolicy: ClusterFirstWithHostNet
              nodeSelector:
                cloud.google.com/gke-tpu-accelerator: tpu-v5-lite-podslice
                cloud.google.com/gke-tpu-topology: 2x4
              containers:
              - name: jax-tpu
                image: python:3.8
                ports:
                - containerPort: 8471
                - containerPort: 8080
                securityContext:
                  privileged: true
                command:
                - bash
                - -c
                - |
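                  pip install "jax[tpu]" -f https://storage.googleapis.com/jax-releases/libtpu_releases.html
                  python -c 'import jax; print("Global device count:", jax.device_count())'
                  sleep 60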
                resources:
                  limits:
                    google.com/tpu: 4

Apply the jobsets-multislice.yaml manifest:

kubectl apply -f jobsets-multislice.yaml

GKE creates the Jobs with the following resource requests:

The multislice-1slice JobSet creates one Job that requires one TPU slice in total.
The multislice-2slice JobSet creates two Jobs that require two TPU slices in total.
The multislice-3slice JobSet creates three Jobs that require three TPU slices in total.

Because the cluster only has three TPU slices, not all JobSets can run at once. When Kueue admits the multislice-3slice JobSet, its three Jobs occupy all three slices and run alone to completion. The multislice-1slice and multislice-2slice JobSets wait and then run together afterwards, because together they need exactly three slices.
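
The ordering described above can be reproduced with a simplified admission model. The following Python sketch is an illustration under stated assumptions, not Kueue's actual algorithm: it assumes that multislice-3slice is the oldest workload in the queue, that each admission round admits every waiting workload that still fits in the remaining quota, and that a round finishes before the next one starts. The function names are hypothetical.

```python
# Simplified admission model (an assumption for illustration, not Kueue's algorithm).
NOMINAL_QUOTA = 24  # google.com/tpu nominalQuota in cluster-queue
PODS_PER_JOB = 2    # parallelism of each replicated Job
CHIPS_PER_POD = 4   # google.com/tpu limit of each Pod


def chips_requested(replicas):
    """TPU chips that a JobSet with `replicas` replicated Jobs requests."""
    return replicas * PODS_PER_JOB * CHIPS_PER_POD


# Assumed queue order: multislice-3slice is the oldest workload.
pending = [
    ("multislice-3slice", chips_requested(3)),
    ("multislice-1slice", chips_requested(1)),
    ("multislice-2slice", chips_requested(2)),
]


def admission_rounds(pending, quota):
    """Group workloads into rounds; each round admits whatever fits, in order."""
    rounds = []
    remaining = list(pending)
    while remaining:
        free = quota
        admitted, waiting = [], []
        for name, chips in remaining:
            if chips <= free:
                admitted.append(name)
                free -= chips
            else:
                waiting.append((name, chips))
        if not admitted:
            raise ValueError("a workload requests more than the total quota")
        rounds.append(admitted)
        remaining = waiting
    return rounds


rounds = admission_rounds(pending, NOMINAL_QUOTA)
print(rounds)  # prints: [['multislice-3slice'], ['multislice-1slice', 'multislice-2slice']]
```

If the queue order were different, for example with multislice-1slice first, this model would admit the two smaller JobSets in the first round instead.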

Verify that Kueue admitted the workloads

Check the enqueued workloads in Kueue:

kubectl get workloads

The output is similar to the following:

NAME                             QUEUE              ADMITTED BY     AGE
jobset-multislice-1slice-2530a   multislice-queue                   3s
jobset-multislice-2slice-ffb02   multislice-queue                   4s
jobset-multislice-3slice-8c695   multislice-queue   cluster-queue   10s

Kueue enqueues one or more workloads, depending on the TPU resources that they require.
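
If you want to separate admitted workloads from waiting ones in a script, a minimal Python sketch follows. It assumes the exact column layout of the sample output above (an empty ADMITTED BY column for workloads that are still waiting); other Kueue versions may print different columns, so treat the parsing rule as an assumption. The function name split_by_admission is hypothetical.

```python
SAMPLE_OUTPUT = """\
NAME                             QUEUE              ADMITTED BY     AGE
jobset-multislice-1slice-2530a   multislice-queue                   3s
jobset-multislice-2slice-ffb02   multislice-queue                   4s
jobset-multislice-3slice-8c695   multislice-queue   cluster-queue   10s
"""


def split_by_admission(table):
    """Return (admitted, pending) workload names from the sample table layout."""
    admitted, pending = [], []
    for line in table.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if not fields:
            continue
        # NAME, QUEUE, ADMITTED BY, AGE: four fields mean ADMITTED BY is filled in.
        if len(fields) == 4:
            admitted.append(fields[0])
        else:
            pending.append(fields[0])
    return admitted, pending


admitted, pending = split_by_admission(SAMPLE_OUTPUT)
print("admitted:", admitted)
print("pending:", pending)
```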

Monitor the workloads

Monitor which Pods are running:

kubectl get pods

The output is similar to the following:

NAME                                READY   STATUS      RESTARTS   AGE
multislice-1slice-slice-0-0-pf2ll   1/1     Running     0          1s
multislice-1slice-slice-0-1-55g62   1/1     Running     0          1s
multislice-2slice-slice-0-0-f4hf7   1/1     Running     0          3s
multislice-2slice-slice-0-1-c8kv7   1/1     Running     0          3s
multislice-2slice-slice-1-0-7h46t   1/1     Running     0          3s
multislice-2slice-slice-1-1-lj9hb   1/1     Running     0          3s
multislice-3slice-slice-0-0-wzq9t   0/1     Completed   0          2m31s
multislice-3slice-slice-0-1-zf4dp   0/1     Completed   0          2m30s
multislice-3slice-slice-1-0-hbfn5   0/1     Completed   0          2m31s
multislice-3slice-slice-1-1-45fgl   0/1     Completed   0          2m30s
multislice-3slice-slice-2-0-wjbp4   0/1     Completed   0          2m30s
multislice-3slice-slice-2-1-lwnvs   0/1     Completed   0          2m30s

Verify that GKE scheduled, created, and ran the Pods for multislice-3slice first. Then, GKE ran the Pods from the multislice-1slice and multislice-2slice JobSets.
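
To check this ordering at a glance, you could count Pod statuses per JobSet. The following Python sketch does so for the sample output above. It assumes that Pod names follow the pattern seen in that sample (JobSet name, replicated Job name, Job index, Pod index, random suffix); the function name status_by_jobset is hypothetical.

```python
from collections import Counter

SAMPLE_PODS = """\
NAME                                READY   STATUS      RESTARTS   AGE
multislice-1slice-slice-0-0-pf2ll   1/1     Running     0          1s
multislice-1slice-slice-0-1-55g62   1/1     Running     0          1s
multislice-2slice-slice-0-0-f4hf7   1/1     Running     0          3s
multislice-2slice-slice-0-1-c8kv7   1/1     Running     0          3s
multislice-2slice-slice-1-0-7h46t   1/1     Running     0          3s
multislice-2slice-slice-1-1-lj9hb   1/1     Running     0          3s
multislice-3slice-slice-0-0-wzq9t   0/1     Completed   0          2m31s
multislice-3slice-slice-0-1-zf4dp   0/1     Completed   0          2m30s
multislice-3slice-slice-1-0-hbfn5   0/1     Completed   0          2m31s
multislice-3slice-slice-1-1-45fgl   0/1     Completed   0          2m30s
multislice-3slice-slice-2-0-wjbp4   0/1     Completed   0          2m30s
multislice-3slice-slice-2-1-lwnvs   0/1     Completed   0          2m30s
"""


def status_by_jobset(table):
    """Count Pod statuses per JobSet, based on the sample naming pattern."""
    counts = {}
    for line in table.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue
        pod_name, status = fields[0], fields[2]
        # Strip "-<replicatedJob>-<jobIndex>-<podIndex>-<suffix>" from the right.
        jobset = pod_name.rsplit("-", 4)[0]
        counts.setdefault(jobset, Counter())[status] += 1
    return counts


counts = status_by_jobset(SAMPLE_PODS)
for jobset, statuses in counts.items():
    print(jobset, dict(statuses))
```

Each JobSet shows replicas times parallelism Pods: 2, 4, and 6 Pods respectively.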

Enable Kueue workload priorities and preemption

Optionally, you can assign priorities to Kueue workloads. The priorities determine the order in which Kueue admits enqueued workloads.

Update your ClusterQueue to have a preemption policy:

apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: "vlp-24"
spec:
  nodeLabels:
    cloud.google.com/gke-tpu-accelerator: tpu-v5-lite-podslice
    cloud.google.com/gke-tpu-topology: 2x4
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: "cluster-queue"
spec:
  namespaceSelector: {}
  resourceGroups:
  - coveredResources: ["google.com/tpu"]
    flavors:
    - name: "vlp-24"
      resources:
      - name: "google.com/tpu"
        nominalQuota: 24
  preemption:
    reclaimWithinCohort: Any
    withinClusterQueue: LowerPriority
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  namespace: default
  name: multislice-queue
spec:
  clusterQueue: cluster-queue

Create a PriorityClass for each distinct priority level that you want to assign to workloads:

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low-priority
value: 100
globalDefault: false
description: "This low priority class should be used for some Pods only."
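
With withinClusterQueue: LowerPriority, a pending workload can preempt admitted workloads of lower priority in the same ClusterQueue. The following Python sketch illustrates that idea in a heavily simplified form; Kueue's real victim selection is more involved. The workload names low-a, low-b, low-c, and peer, and the high-priority workload with value 1000, are hypothetical and not part of this tutorial's manifests.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    priority: int  # PriorityClass value
    chips: int     # google.com/tpu requested


def pick_victims(admitted, incoming, quota):
    """Return the admitted workloads to evict so that `incoming` fits.

    Returns [] if it already fits, or None if evicting every strictly
    lower-priority workload would still not free enough quota.
    """
    free = quota - sum(w.chips for w in admitted)
    if incoming.chips <= free:
        return []
    victims = []
    # Only strictly lower-priority workloads are candidates, lowest first.
    candidates = sorted(
        (w for w in admitted if w.priority < incoming.priority),
        key=lambda w: w.priority,
    )
    for workload in candidates:
        victims.append(workload)
        free += workload.chips
        if incoming.chips <= free:
            return victims
    return None


QUOTA = 24
admitted = [Workload("low-a", 100, 8), Workload("low-b", 100, 8), Workload("low-c", 100, 8)]
incoming = Workload("high-priority", 1000, 16)  # hypothetical higher PriorityClass

victims = pick_victims(admitted, incoming, QUOTA)
print([w.name for w in victims])  # prints: ['low-a', 'low-b']

# A workload of equal priority cannot preempt anything under LowerPriority.
print(pick_victims(admitted, Workload("peer", 100, 8), QUOTA))  # prints: None
```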

Assign the priorityClassName to your JobSet:

Autopilot

apiVersion: jobset.x-k8s.io/v1alpha2
kind: JobSet
metadata:
  name: low-priority
  labels:
    kueue.x-k8s.io/queue-name: multislice-queue
  annotations:
    alpha.jobset.sigs.k8s.io/exclusive-topology: cloud.google.com/gke-nodepool
spec:
  failurePolicy:
    maxRestarts: 4
  replicatedJobs:
    - name: slice
      replicas: 1
      template:
        spec:
          parallelism: 2
          completions: 2
          backoffLimit: 0
          template:
            spec:
              nodeSelector:
                cloud.google.com/gke-tpu-accelerator: tpu-v5-lite-podslice
                cloud.google.com/gke-tpu-topology: 2x4
              priorityClassName: low-priority
              containers:
              - name: jax-tpu
                image: python:3.8
                ports:
                - containerPort: 8471
                - containerPort: 8080
                command:
                - bash
                - -c
                - |
                  sleep 60
                resources:
                  limits:
                    google.com/tpu: 4 # Number of TPU chips per worker

Standard

apiVersion: jobset.x-k8s.io/v1alpha2
kind: JobSet
metadata:
  name: low-priority
  labels:
    kueue.x-k8s.io/queue-name: multislice-queue
  annotations:
    alpha.jobset.sigs.k8s.io/exclusive-topology: cloud.google.com/gke-nodepool
spec:
  failurePolicy:
    maxRestarts: 4
  replicatedJobs:
    - name: slice
      replicas: 1
      template:
        spec:
          parallelism: 2
          completions: 2
          backoffLimit: 0
          template:
            spec:
              hostNetwork: true
              dnsPolicy: ClusterFirstWithHostNet
              nodeSelector:
                cloud.google.com/gke-tpu-accelerator: tpu-v5-lite-podslice
                cloud.google.com/gke-tpu-topology: 2x4
              priorityClassName: low-priority
              containers:
              - name: jax-tpu
                image: python:3.8
                ports:
                - containerPort: 8471
                - containerPort: 8080
                securityContext:
                  privileged: true
                command:
                - bash
                - -c
                - |
                  sleep 60
                resources:
                  limits:
                    google.com/tpu: 4 # Number of TPU chips per worker

Clean up

To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.

Delete the project

In the Google Cloud console, go to the Manage resources page.
Go to Manage resources
In the project list, select the project that you want to delete, and then click Delete.
In the dialog, type the project ID, and then click Shut down to delete the project.

Delete individual resources

Delete the Kueue quota system:

kubectl delete -n default localqueue multislice-queue
kubectl delete clusterqueue cluster-queue
kubectl delete resourceflavor vlp-24

Delete the Kueue manifest:

VERSION=v0.4.1  # set this to the Kueue release that you installed
kubectl delete -f \
    https://github.com/kubernetes-sigs/kueue/releases/download/$VERSION/manifests.yaml

Delete the cluster:

gcloud container clusters delete multislice-cluster --location=LOCATION

What's next

Learn more about Kueue.
Learn how to Implement a Job queuing system with quota sharing between namespaces on GKE.