Manual feature preprocessing
You can use the TRANSFORM clause of the CREATE MODEL statement in combination with manual preprocessing functions to define custom data preprocessing. You can also use these manual preprocessing functions outside of the TRANSFORM clause.
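As a minimal sketch of custom preprocessing inside CREATE MODEL: the model name `mydataset.penguin_model` is hypothetical, and the query assumes the public table `bigquery-public-data.ml_datasets.penguins` with the columns `body_mass_g`, `flipper_length_mm`, and `species`. The split points and the model type are illustrative choices, not recommendations.

```sql
-- Hypothetical model name; replace mydataset with a dataset you own.
CREATE OR REPLACE MODEL `mydataset.penguin_model`
  TRANSFORM(
    -- Scalar function: assigns each row to a bucket using fixed split points.
    ML.BUCKETIZE(body_mass_g, [3000, 4000, 5000]) AS mass_bucket,
    -- Analytic function: scales using statistics computed over all training rows.
    ML.STANDARD_SCALER(flipper_length_mm) OVER () AS flipper_scaled,
    -- The label column is passed through unchanged.
    species
  )
  OPTIONS (
    model_type = 'LOGISTIC_REG',
    input_label_cols = ['species']
  ) AS
SELECT
  body_mass_g,
  flipper_length_mm,
  species
FROM `bigquery-public-data.ml_datasets.penguins`
WHERE body_mass_g IS NOT NULL;
```

Only the columns listed in the TRANSFORM clause reach training, so the label column has to be listed there as well.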
If you want to decouple data preprocessing from model training, you can create a transform-only model that only performs data transformations by using the TRANSFORM clause.
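A transform-only model could look roughly like the following sketch. The model name `mydataset.penguin_transform` is hypothetical, and the source table and its columns (`body_mass_g`, `island`) are assumptions made for illustration.

```sql
-- Hypothetical model name; the model stores only the transformations.
CREATE OR REPLACE MODEL `mydataset.penguin_transform`
  TRANSFORM(
    -- Rescale the numerical column to the range [0, 1].
    ML.MIN_MAX_SCALER(body_mass_g) OVER () AS mass_scaled,
    -- Encode the string column as integer labels.
    ML.LABEL_ENCODER(island) OVER () AS island_encoded
  )
  OPTIONS (model_type = 'TRANSFORM_ONLY') AS
SELECT
  body_mass_g,
  island
FROM `bigquery-public-data.ml_datasets.penguins`
WHERE body_mass_g IS NOT NULL;
```

Because no training happens here, the statistics learned by the analytic functions are the only state the model holds.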
You can use the ML.TRANSFORM function to increase the transparency of feature preprocessing. This function lets you return the preprocessed data from a model's TRANSFORM clause, so that you can see the actual training data that goes into model training, as well as the actual prediction data that goes into model serving.
For information about feature preprocessing support in BigQuery ML, see Feature preprocessing overview.
For information about the supported SQL statements and functions for each model type, see End-to-end user journey for each model.
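To inspect what a model's TRANSFORM clause produces, a call to ML.TRANSFORM might look like the sketch below. It assumes a hypothetical model `mydataset.my_model` whose TRANSFORM clause reads the columns `body_mass_g`, `flipper_length_mm`, and `species`; substitute the model and input columns you actually use.

```sql
-- Returns the preprocessed columns defined by the model's TRANSFORM clause.
SELECT *
FROM ML.TRANSFORM(
  MODEL `mydataset.my_model`,  -- hypothetical model created with a TRANSFORM clause
  (
    -- The input query supplies the raw columns that the TRANSFORM clause reads.
    SELECT
      body_mass_g,
      flipper_length_mm,
      species
    FROM `bigquery-public-data.ml_datasets.penguins`
    WHERE body_mass_g IS NOT NULL
  )
);
```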
Types of preprocessing functions
There are several types of manual preprocessing functions:
Scalar functions operate on a single row. For example, ML.BUCKETIZE.
Table-valued functions operate on all rows and output a table. For example, ML.FEATURES_AT_TIME.
Analytic functions operate on all rows and output the result for each row based on the statistics collected across all rows. For example, ML.QUANTILE_BUCKETIZE.
You must always use an empty OVER() clause with ML analytic functions.
When you use ML analytic functions inside the TRANSFORM clause during training, the same statistics are automatically applied to the input in prediction.
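The difference between a scalar and an analytic function can be sketched in a plain query outside of any TRANSFORM clause. The table and column are assumptions for illustration, and the split points and the bucket count of 4 are arbitrary.

```sql
SELECT
  body_mass_g,
  -- Scalar: depends only on the current row and the fixed split points.
  ML.BUCKETIZE(body_mass_g, [3000, 4000, 5000]) AS fixed_bucket,
  -- Analytic: bucket boundaries are quantiles computed across all rows,
  -- which is why the empty OVER () clause is required.
  ML.QUANTILE_BUCKETIZE(body_mass_g, 4) OVER () AS quantile_bucket
FROM `bigquery-public-data.ml_datasets.penguins`
WHERE body_mass_g IS NOT NULL;
```

Run as a standalone query like this, the quantile boundaries are recomputed from whatever rows the query selects. Inside a TRANSFORM clause they are fixed at training time and reused for prediction.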
The following sections describe the available preprocessing functions.
General functions
Use the following function on string or numerical expressions to do data cleanup:
ML.IMPUTER

Numerical functions
Use the following functions on numerical expressions to regularize data:
ML.BUCKETIZE
ML.MAX_ABS_SCALER
ML.MIN_MAX_SCALER
ML.NORMALIZER
ML.POLYNOMIAL_EXPAND
ML.QUANTILE_BUCKETIZE
ML.ROBUST_SCALER
ML.STANDARD_SCALER

Categorical functions
Use the following functions to categorize data:
ML.FEATURE_CROSS
ML.HASH_BUCKETIZE
ML.LABEL_ENCODER
ML.MULTI_HOT_ENCODER
ML.ONE_HOT_ENCODER

Text functions
Use the following functions on text string expressions:
ML.NGRAMS
ML.BAG_OF_WORDS
ML.TF_IDF

Image functions
Use the following functions on image data:
ML.CONVERT_COLOR_SPACE
ML.CONVERT_IMAGE_TYPE
ML.DECODE_IMAGE
ML.RESIZE_IMAGE

Known limitations
BigQuery ML supports both automatic preprocessing and manual preprocessing in the model export. See the supported data types and functions for exporting models trained with the BigQuery ML TRANSFORM clause.

Last updated 2025-09-04 UTC.