The following section provides an example of fine-tuning a BERT model for sequence classification using the Hugging Face transformers library with TensorFlow. The dataset is downloaded to a mounted, Parallelstore-backed volume, so model training can read the data directly from the volume.
Prerequisites
- Make sure that your nodes have at least 8 GiB of available memory.
- Create a PersistentVolumeClaim that requests a Parallelstore-backed volume.
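The memory prerequisite can be compared against the allocatable memory that `kubectl get nodes -o jsonpath='{.items[*].status.allocatable.memory}'` reports, typically as a quantity such as `32874152Ki`. The following is a minimal sketch of that comparison, assuming integer quantities with binary suffixes (Ki, Mi, Gi, Ti) or plain bytes. The helper names `parse_memory_quantity` and `has_enough_memory` are hypothetical and not part of any Kubernetes tooling, and allocatable memory is only an upper bound, because Pods already scheduled on the node reduce what is actually available.

```python
# Hypothetical helpers: convert a Kubernetes memory quantity to bytes and
# compare it with the 8 GiB minimum stated in the prerequisites.
BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}
MIN_MEMORY_BYTES = 8 * 1024**3  # 8 GiB


def parse_memory_quantity(quantity: str) -> int:
    """Return the number of bytes in a quantity such as '16Gi' or '32874152Ki'."""
    quantity = quantity.strip()
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    # Assumption: a value without a recognized suffix is a plain byte count.
    return int(quantity)


def has_enough_memory(quantity: str) -> bool:
    """Return True if the quantity is at least 8 GiB."""
    return parse_memory_quantity(quantity) >= MIN_MEMORY_BYTES


if __name__ == "__main__":
    # Sample quantities chosen for illustration only.
    for sample in ["32874152Ki", "4Gi", "8192Mi"]:
        print(sample, has_enough_memory(sample))
```

Decimal suffixes such as `G` or fractional values are deliberately not handled here and raise a `ValueError`.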
Save the following YAML manifest (parallelstore-csi-job-example.yaml) for your model training Job.
apiVersion: batch/v1
kind: Job
metadata:
  name: parallelstore-csi-job-example
spec:
  template:
    metadata:
      annotations:
        gke-parallelstore/cpu-limit: "0"
        gke-parallelstore/memory-limit: "0"
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 100
        fsGroup: 100
      containers:
      - name: tensorflow
        image: jupyter/tensorflow-notebook@sha256:173f124f638efe870bb2b535e01a76a80a95217e66ed00751058c51c09d6d85d
        command: ["bash", "-c"]
        args:
        - |
          pip install transformers datasets
          python - <<EOF
          from datasets import load_dataset
          dataset = load_dataset("glue", "cola", cache_dir='/data')
          dataset = dataset["train"]
          from transformers import AutoTokenizer
          import numpy as np
          tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
          tokenized_data = tokenizer(dataset["sentence"], return_tensors="np", padding=True)
          tokenized_data = dict(tokenized_data)
          labels = np.array(dataset["label"])
          from transformers import TFAutoModelForSequenceClassification
          from tensorflow.keras.optimizers import Adam
          model = TFAutoModelForSequenceClassification.from_pretrained("bert-base-cased")
          model.compile(optimizer=Adam(3e-5))
          model.fit(tokenized_data, labels)
          EOF
        volumeMounts:
        - name: parallelstore-volume
          mountPath: /data
      volumes:
      - name: parallelstore-volume
        persistentVolumeClaim:
          claimName: parallelstore-pvc
      restartPolicy: Never
  backoffLimit: 1
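In the training script, `padding=True` asks the tokenizer to pad every sentence to the length of the longest one so that the batch forms a rectangular array, and the tokenizer also returns an `attention_mask` that marks real tokens with 1 and padding with 0. The following toy sketch illustrates only that idea in plain Python. The `pad_batch` helper and the token IDs are made up for illustration, the padding ID of 0 is an assumption, and the code does not call the transformers library.

```python
from typing import Dict, List


def pad_batch(sequences: List[List[int]], pad_id: int = 0) -> Dict[str, List[List[int]]]:
    """Right-pad every sequence to the longest length and build a matching mask."""
    max_len = max(len(seq) for seq in sequences)
    # Append pad_id until each row reaches max_len.
    input_ids = [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
    # 1 marks an original token, 0 marks a padding position.
    attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in sequences]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


if __name__ == "__main__":
    # Made-up token IDs standing in for three tokenized sentences.
    batch = pad_batch([[101, 7, 8, 102], [101, 9, 102], [101, 102]])
    for name, rows in batch.items():
        print(name, rows)
```

Because the script tokenizes the whole training split in a single call, every sentence is padded to the longest sentence in that split, which is simple but can use more memory than padding per batch.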
Apply the YAML manifest to the cluster.
kubectl apply -f parallelstore-csi-job-example.yaml
Check the progress of data loading and model training with the following commands:
POD_NAME=$(kubectl get pod | grep 'parallelstore-csi-job-example' | awk '{print $1}')
kubectl logs -f $POD_NAME -c tensorflow
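The `grep` and `awk` pipeline above takes the first column of every `kubectl get pod` line that mentions the Job name. The same selection can be sketched in Python, which makes it easier to notice when a retry (allowed by `backoffLimit: 1`) has left more than one matching Pod. The `find_pod_names` helper, the Pod name suffix, and the sample text are hypothetical and only mimic the general shape of the command's output.

```python
from typing import List


def find_pod_names(kubectl_output: str, job_name: str) -> List[str]:
    """Return the first column of every line that contains job_name."""
    names = []
    for line in kubectl_output.splitlines():
        fields = line.split()
        # Mirror grep (substring match on the line) and awk '{print $1}'.
        if fields and job_name in line:
            names.append(fields[0])
    return names


if __name__ == "__main__":
    # Illustrative text only; this is not real cluster output.
    sample = (
        "NAME                                  READY   STATUS\n"
        "parallelstore-csi-job-example-abcde   1/1     Running\n"
        "unrelated-pod                         1/1     Running\n"
    )
    print(find_pod_names(sample, "parallelstore-csi-job-example"))
```

If the list contains more than one name, pass each one to `kubectl logs` separately, because a shell variable holding several names would be split into multiple arguments.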