# Train a TensorFlow model with Keras on Google Kubernetes Engine
The following section provides an example of [fine-tuning a BERT model](https://huggingface.co/docs/transformers/training#train-a-tensorflow-model-with-keras) for sequence classification using the [Hugging Face transformers](https://github.com/huggingface/transformers) library with TensorFlow. The dataset is downloaded into a mounted Parallelstore-backed volume, which lets the model training read data directly from the volume.
Prerequisites
-------------

- Ensure your node has at least 8 GiB of memory available.
- [Create a PersistentVolumeClaim that requests a Parallelstore-backed volume](/kubernetes-engine/docs/how-to/persistent-volumes/parallelstore-csi-new-volume#pvc). A minimal sketch of such a claim follows this list.
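For reference, a PersistentVolumeClaim for this example might look like the following sketch. The storage class name `parallelstore-class` and the requested capacity are illustrative assumptions; use the values from the guide linked above. The claim name must match the `claimName` referenced by the Job manifest below.

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: parallelstore-pvc  # must match the claimName in the Job manifest
    spec:
      accessModes:
        - ReadWriteMany  # Parallelstore-backed volumes can be shared across Pods
      storageClassName: parallelstore-class  # assumed name; use your Parallelstore StorageClass
      resources:
        requests:
          storage: 12Ti  # assumed size; follow the sizing rules in the linked guide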
[[["Fácil de comprender","easyToUnderstand","thumb-up"],["Resolvió mi problema","solvedMyProblem","thumb-up"],["Otro","otherUp","thumb-up"]],[["Difícil de entender","hardToUnderstand","thumb-down"],["Información o código de muestra incorrectos","incorrectInformationOrSampleCode","thumb-down"],["Faltan la información o los ejemplos que necesito","missingTheInformationSamplesINeed","thumb-down"],["Problema de traducción","translationIssue","thumb-down"],["Otro","otherDown","thumb-down"]],["Última actualización: 2025-09-04 (UTC)"],[],[],null,["# Train a TensorFlow model with Keras on Google Kubernetes Engine\n\nThe following section provides an example of\n[fine-tuning a BERT model](https://huggingface.co/docs/transformers/training#train-a-tensorflow-model-with-keras)\nfor sequence classification using the\n[Hugging Face transformers](https://github.com/huggingface/transformers) library\nwith TensorFlow. The dataset is downloaded into a mounted\nParallelstore-backed volume, allowing the model training to directly read data\nfrom the volume.\n\nPrerequisites\n-------------\n\n- Ensure your node has at least 8 GiB of memory available.\n- [Create a PersistentVolumeClaim requesting for a Parallelstore-backed volume](/kubernetes-engine/docs/how-to/persistent-volumes/parallelstore-csi-new-volume#pvc).\n\nSave the following YAML manifest (`parallelstore-csi-job-example.yaml`) for your model training Job. \n\n apiVersion: batch/v1\n kind: Job\n metadata:\n name: parallelstore-csi-job-example\n spec:\n template:\n metadata:\n annotations:\n gke-parallelstore/cpu-limit: \"0\"\n gke-parallelstore/memory-limit: \"0\"\n spec:\n securityContext:\n runAsUser: 1000\n runAsGroup: 100\n fsGroup: 100\n containers:\n - name: tensorflow\n image: jupyter/tensorflow-notebook@sha256:173f124f638efe870bb2b535e01a76a80a95217e66ed00751058c51c09d6d85d\n command: [\"bash\", \"-c\"]\n args:\n - |\n pip install transformers datasets\n python - \u003c\u003cEOF\n from datasets import load_dataset\n dataset = load_dataset(\"glue\", \"cola\", cache_dir='/data')\n dataset = dataset[\"train\"]\n from transformers import AutoTokenizer\n import numpy as np\n tokenizer = AutoTokenizer.from_pretrained(\"bert-base-cased\")\n tokenized_data = tokenizer(dataset[\"sentence\"], return_tensors=\"np\", padding=True)\n tokenized_data = dict(tokenized_data)\n labels = np.array(dataset[\"label\"])\n from transformers import TFAutoModelForSequenceClassification\n from tensorflow.keras.optimizers import Adam\n model = TFAutoModelForSequenceClassification.from_pretrained(\"bert-base-cased\")\n model.compile(optimizer=Adam(3e-5))\n model.fit(tokenized_data, labels)\n EOF\n volumeMounts:\n - name: parallelstore-volume\n mountPath: /data\n volumes:\n - name: parallelstore-volume\n persistentVolumeClaim:\n claimName: parallelstore-pvc\n restartPolicy: Never\n backoffLimit: 1\n\nApply the YAML manifest to the cluster.\n\n`kubectl apply -f parallelstore-csi-job-example.yaml`\n\nCheck your data loading and model training progress with the following command: \n\n POD_NAME=$(kubectl get pod | grep 'parallelstore-csi-job-example' | awk '{print $1}')\n kubectl logs -f $POD_NAME -c tensorflow\n\n| **Note:** The model training takes approximately five minutes to complete."]]