To create a custom model, you need a Python training script that creates and trains the custom model. You initialize your training job with the Python training script, then invoke the training job's run method to run the script.

In this topic, you create the training script, then specify command arguments for your training script.
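To preview how those two steps look in code, here is a minimal sketch using the Vertex AI SDK for Python. The project, bucket, display names, and container images below are illustrative placeholders, and the full job setup is covered later in this tutorial:

from google.cloud import aiplatform

# Placeholders: substitute your own project, region, and staging bucket.
aiplatform.init(
    project="my-project",
    location="us-central1",
    staging_bucket="gs://my-bucket",
)

# Initialize the training job with the Python training script (task.py).
job = aiplatform.CustomTrainingJob(
    display_name="my-training-job",  # illustrative name
    script_path="task.py",
    container_uri="us-docker.pkg.dev/vertex-ai/training/tf-cpu.2-8:latest",  # illustrative image
    requirements=["google-cloud-bigquery", "db-dtypes"],
    model_serving_container_image_uri=(
        "us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-8:latest"  # illustrative image
    ),
)

# Invoke the job's run method to execute the script with command-line arguments.
model = job.run(
    dataset=dataset,  # a Vertex AI TabularDataset created earlier in the tutorial
    model_display_name="my-model",
    args=CMDARGS,  # the arguments you define at the end of this topic
)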
Create a training script
In this section, you create a training script. This script is a new file in your notebook environment named task.py. Later in this tutorial, you pass this script to the aiplatform.CustomTrainingJob constructor. When the script runs, it does the following:
- Loads the data in the BigQuery dataset you created.
- Uses the TensorFlow Keras API to build, compile, and train your model.
- Specifies the number of epochs and the batch size to use when the Keras Model.fit method is invoked.
- Specifies where to save model artifacts using the AIP_MODEL_DIR environment variable. AIP_MODEL_DIR is set by Vertex AI and contains the URI of a directory for saving model artifacts (see the snippet after this list). For more information, see Environment variables for special Cloud Storage directories.
- Exports a TensorFlow SavedModel to the model directory. For more information, see Using the SavedModel format on the TensorFlow website.
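As a quick illustration of the environment the script runs in, the following sketch reads the special variables Vertex AI sets inside the custom training container; the example values in the comments are illustrative:

import os

# Set by Vertex AI inside the custom training container:
model_dir = os.getenv("AIP_MODEL_DIR")                      # directory URI for model artifacts, e.g. a gs:// path
training_data_uri = os.getenv("AIP_TRAINING_DATA_URI")      # training split, e.g. a bq:// table URI
validation_data_uri = os.getenv("AIP_VALIDATION_DATA_URI")  # validation split
test_data_uri = os.getenv("AIP_TEST_DATA_URI")              # test split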
To create your training script, run the following code in your notebook:
%%writefile task.py

import argparse
import os

import numpy as np
import pandas as pd
import tensorflow as tf

from google.cloud import bigquery
from google.cloud import storage

# Read environment variables
training_data_uri = os.getenv("AIP_TRAINING_DATA_URI")
validation_data_uri = os.getenv("AIP_VALIDATION_DATA_URI")
test_data_uri = os.getenv("AIP_TEST_DATA_URI")

# Read args
parser = argparse.ArgumentParser()
parser.add_argument('--label_column', required=True, type=str)
parser.add_argument('--epochs', default=10, type=int)
parser.add_argument('--batch_size', default=10, type=int)
args = parser.parse_args()

# Set up training variables
LABEL_COLUMN = args.label_column

# See https://cloud.google.com/vertex-ai/docs/workbench/managed/executor#explicit-project-selection for issues regarding permissions.
PROJECT_NUMBER = os.environ["CLOUD_ML_PROJECT_ID"]
bq_client = bigquery.Client(project=PROJECT_NUMBER)


# Download a table
def download_table(bq_table_uri: str):
    # Remove bq:// prefix if present
    prefix = "bq://"
    if bq_table_uri.startswith(prefix):
        bq_table_uri = bq_table_uri[len(prefix):]

    # Download the BigQuery table as a dataframe
    # This requires the "BigQuery Read Session User" role on the custom training service account.
    table = bq_client.get_table(bq_table_uri)
    return bq_client.list_rows(table).to_dataframe()


# Download dataset splits
df_train = download_table(training_data_uri)
df_validation = download_table(validation_data_uri)
df_test = download_table(test_data_uri)


def convert_dataframe_to_dataset(
    df_train: pd.DataFrame,
    df_validation: pd.DataFrame,
):
    df_train_x, df_train_y = df_train, df_train.pop(LABEL_COLUMN)
    df_validation_x, df_validation_y = df_validation, df_validation.pop(LABEL_COLUMN)

    y_train = tf.convert_to_tensor(np.asarray(df_train_y).astype("float32"))
    y_validation = tf.convert_to_tensor(np.asarray(df_validation_y).astype("float32"))

    # Convert to numpy representation
    x_train = tf.convert_to_tensor(np.asarray(df_train_x).astype("float32"))
    x_test = tf.convert_to_tensor(np.asarray(df_validation_x).astype("float32"))

    # Convert to one-hot representation
    num_species = len(df_train_y.unique())
    y_train = tf.keras.utils.to_categorical(y_train, num_classes=num_species)
    y_validation = tf.keras.utils.to_categorical(y_validation, num_classes=num_species)

    dataset_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))
    dataset_validation = tf.data.Dataset.from_tensor_slices((x_test, y_validation))
    return (dataset_train, dataset_validation)


# Create datasets
dataset_train, dataset_validation = convert_dataframe_to_dataset(df_train, df_validation)

# Shuffle train set
dataset_train = dataset_train.shuffle(len(df_train))


def create_model(num_features):
    # Create model
    Dense = tf.keras.layers.Dense
    model = tf.keras.Sequential(
        [
            Dense(
                100,
                activation=tf.nn.relu,
                kernel_initializer="uniform",
                input_dim=num_features,
            ),
            Dense(75, activation=tf.nn.relu),
            Dense(50, activation=tf.nn.relu),
            Dense(25, activation=tf.nn.relu),
            Dense(3, activation=tf.nn.softmax),
        ]
    )

    # Compile Keras model (learning_rate replaces the deprecated lr argument)
    optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.001)
    model.compile(
        loss="categorical_crossentropy", metrics=["accuracy"], optimizer=optimizer
    )

    return model


# Create the model
model = create_model(num_features=dataset_train._flat_shapes[0].dims[0].value)

# Set up datasets
dataset_train = dataset_train.batch(args.batch_size)
dataset_validation = dataset_validation.batch(args.batch_size)

# Train the model
model.fit(dataset_train, epochs=args.epochs, validation_data=dataset_validation)

tf.saved_model.save(model, os.getenv("AIP_MODEL_DIR"))
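If you want to sanity-check task.py outside Vertex AI, one option is a local dry run that supplies the same environment variables yourself. This is only a sketch: the project ID and bq:// table URIs are placeholders, and it assumes your local credentials can read those BigQuery tables:

import os
import subprocess

# Placeholders for the variables Vertex AI would normally set.
os.environ["CLOUD_ML_PROJECT_ID"] = "my-project"
os.environ["AIP_TRAINING_DATA_URI"] = "bq://my-project.my_dataset.training"
os.environ["AIP_VALIDATION_DATA_URI"] = "bq://my-project.my_dataset.validation"
os.environ["AIP_TEST_DATA_URI"] = "bq://my-project.my_dataset.test"
os.environ["AIP_MODEL_DIR"] = "/tmp/model"  # local directory instead of a gs:// URI

# Run the script with a small epoch count just to confirm it executes end to end.
subprocess.run(
    ["python", "task.py", "--label_column=species", "--epochs=2", "--batch_size=10"],
    check=True,
)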
After you create the script, it appears in the root folder of your notebook.
Define arguments for your training script
You pass the following command-line arguments to your training script:
- label_column: identifies the column in your data that contains what you want to predict. In this case, that column is species. You defined this in a variable named LABEL_COLUMN when you processed your data. For more information, see Download, preprocess, and split the data.
- epochs: the number of epochs used when you train your model. An epoch is one iteration over the data during model training. This tutorial uses 20 epochs.
- batch_size: the number of samples that are processed before your model updates. This tutorial uses a batch size of 10.
To define the arguments that are passed to your script, run the following code:
[[["Facile da capire","easyToUnderstand","thumb-up"],["Il problema è stato risolto","solvedMyProblem","thumb-up"],["Altra","otherUp","thumb-up"]],[["Difficile da capire","hardToUnderstand","thumb-down"],["Informazioni o codice di esempio errati","incorrectInformationOrSampleCode","thumb-down"],["Mancano le informazioni o gli esempi di cui ho bisogno","missingTheInformationSamplesINeed","thumb-down"],["Problema di traduzione","translationIssue","thumb-down"],["Altra","otherDown","thumb-down"]],["Ultimo aggiornamento 2025-09-04 UTC."],[],[],null,["# Create a training script\n\nTo create a custom model, you need a Python training script that creates and trains the custom model. You initialize your training job with the Python training script, then invoke the training job's [`run`](/python/docs/reference/aiplatform/latest/google.cloud.aiplatform.CustomTrainingJob#google_cloud_aiplatform_CustomTrainingJob_run) method to run the script.\n\n\u003cbr /\u003e\n\nIn this topic, you create the training script, then specify command arguments\nfor your training script.\n\nCreate a training script\n------------------------\n\nIn this section, you create a training script. This script is a new file in your\nnotebook environment named `task.py`. Later in this tutorial, you pass this\nscript to the [`aiplatform.CustomTrainingJob`](/python/docs/reference/aiplatform/latest/google.cloud.aiplatform.CustomTrainingJob) constructor. When the script runs, it does the following:\n\n- Loads the data in the BigQuery dataset you created.\n\n- Uses the\n [TensorFlow Keras API](https://www.tensorflow.org/api_docs/python/tf/keras) to\n build, compile, and train your model.\n\n- Specifies the number of epochs and the batch size to use when the Keras\n [`Model.fit`](https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit)\n method is invoked.\n\n- Specifies where to save model artifacts using the `AIP_MODEL_DIR` environment\n variable. `AIP_MODEL_DIR` is set by Vertex AI and contains the URI of a\n directory for saving model artifacts. For more information, see [Environment\n variables for special Cloud Storage\n directories](/vertex-ai/docs/training/code-requirements#environment-variables).\n\n- Exports a TensorFlow\n [`SavedModel`](https://www.tensorflow.org/api_docs/python/tf/saved_model) to\n the model directory. 
For more information, see [Using the `SavedModel`\n format](https://www.tensorflow.org/guide/saved_model#the_savedmodel_format_on_disk)\n on the TensorFlow website.\n\nTo create your training script, run the following code in your notebook: \n\n %%writefile task.py\n\n import argparse\n import numpy as np\n import os\n\n import pandas as pd\n import tensorflow as tf\n\n from google.cloud import bigquery\n from google.cloud import storage\n\n # Read environmental variables\n training_data_uri = os.getenv(\"AIP_TRAINING_DATA_URI\")\n validation_data_uri = os.getenv(\"AIP_VALIDATION_DATA_URI\")\n test_data_uri = os.getenv(\"AIP_TEST_DATA_URI\")\n\n # Read args\n parser = argparse.ArgumentParser()\n parser.add_argument('--label_column', required=True, type=str)\n parser.add_argument('--epochs', default=10, type=int)\n parser.add_argument('--batch_size', default=10, type=int)\n args = parser.parse_args()\n\n # Set up training variables\n LABEL_COLUMN = args.label_column\n\n # See https://cloud.google.com/vertex-ai/docs/workbench/managed/executor#explicit-project-selection for issues regarding permissions.\n PROJECT_NUMBER = os.environ[\"CLOUD_ML_PROJECT_ID\"]\n bq_client = bigquery.Client(project=PROJECT_NUMBER)\n\n\n # Download a table\n def download_table(bq_table_uri: str):\n # Remove bq:// prefix if present\n prefix = \"bq://\"\n if bq_table_uri.startswith(prefix):\n bq_table_uri = bq_table_uri[len(prefix) :]\n\n # Download the BigQuery table as a dataframe\n # This requires the \"BigQuery Read Session User\" role on the custom training service account.\n table = bq_client.get_table(bq_table_uri)\n return bq_client.list_rows(table).to_dataframe()\n\n # Download dataset splits\n df_train = download_table(training_data_uri)\n df_validation = download_table(validation_data_uri)\n df_test = download_table(test_data_uri)\n\n def convert_dataframe_to_dataset(\n df_train: pd.DataFrame,\n df_validation: pd.DataFrame,\n ):\n df_train_x, df_train_y = df_train, df_train.pop(LABEL_COLUMN)\n df_validation_x, df_validation_y = df_validation, df_validation.pop(LABEL_COLUMN)\n\n y_train = tf.convert_to_tensor(np.asarray(df_train_y).astype(\"float32\"))\n y_validation = tf.convert_to_tensor(np.asarray(df_validation_y).astype(\"float32\"))\n\n # Convert to numpy representation\n x_train = tf.convert_to_tensor(np.asarray(df_train_x).astype(\"float32\"))\n x_test = tf.convert_to_tensor(np.asarray(df_validation_x).astype(\"float32\"))\n\n # Convert to one-hot representation\n num_species = len(df_train_y.unique())\n y_train = tf.keras.utils.to_categorical(y_train, num_classes=num_species)\n y_validation = tf.keras.utils.to_categorical(y_validation, num_classes=num_species)\n\n dataset_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))\n dataset_validation = tf.data.Dataset.from_tensor_slices((x_test, y_validation))\n return (dataset_train, dataset_validation)\n\n # Create datasets\n dataset_train, dataset_validation = convert_dataframe_to_dataset(df_train, df_validation)\n\n # Shuffle train set\n dataset_train = dataset_train.shuffle(len(df_train))\n\n def create_model(num_features):\n # Create model\n Dense = tf.keras.layers.Dense\n model = tf.keras.Sequential(\n [\n Dense(\n 100,\n activation=tf.nn.relu,\n kernel_initializer=\"uniform\",\n input_dim=num_features,\n ),\n Dense(75, activation=tf.nn.relu),\n Dense(50, activation=tf.nn.relu),\n Dense(25, activation=tf.nn.relu),\n Dense(3, activation=tf.nn.softmax),\n ]\n )\n\n # Compile Keras model\n optimizer = tf.keras.optimizers.RMSprop(lr=0.001)\n 
model.compile(\n loss=\"categorical_crossentropy\", metrics=[\"accuracy\"], optimizer=optimizer\n )\n\n return model\n\n # Create the model\n model = create_model(num_features=dataset_train._flat_shapes[0].dims[0].value)\n\n # Set up datasets\n dataset_train = dataset_train.batch(args.batch_size)\n dataset_validation = dataset_validation.batch(args.batch_size)\n\n # Train the model\n model.fit(dataset_train, epochs=args.epochs, validation_data=dataset_validation)\n\n tf.saved_model.save(model, os.getenv(\"AIP_MODEL_DIR\"))\n\nAfter you create the script, it appears in the root folder of your notebook:\n\nDefine arguments for your training script\n-----------------------------------------\n\nYou pass the following command-line arguments to your training script:\n\n- `label_column` - This identifies the column in your data that contains what\n you want to predict. In this case, that column is `species`. You defined this\n in a variable named `LABEL_COLUMN` when you processed your data. For more\n information, see\n [Download, preprocess, and split the data](/vertex-ai/docs/tutorials/tabular-bq-prediction/create-dataset#download-process-public-dataset).\n\n- `epochs` - This is the number of epochs used when you train your model. An\n *epoch* is an iteration over the data when training your model. This tutorial\n uses 20 epochs.\n\n- `batch_size` - This is the number of samples that are processed before your\n model updates. This tutorial uses a batch size of 10.\n\nTo define the arguments that are passed to your script, run the following code: \n\n JOB_NAME = \"custom_job_unique\"\n\n EPOCHS = 20\n BATCH_SIZE = 10\n\n CMDARGS = [\n \"--label_column=\" + LABEL_COLUMN,\n \"--epochs=\" + str(EPOCHS),\n \"--batch_size=\" + str(BATCH_SIZE),\n ]"]]