REST Resource: models

Resource: Model

JSON representation
{
  "etag": string,
  "modelReference": {
    object (ModelReference)
  },
  "creationTime": string,
  "lastModifiedTime": string,
  "description": string,
  "friendlyName": string,
  "labels": {
    string: string,
    ...
  },
  "expirationTime": string,
  "location": string,
  "encryptionConfiguration": {
    object (EncryptionConfiguration)
  },
  "modelType": enum (ModelType),
  "trainingRuns": [
    {
      object (TrainingRun)
    }
  ],
  "featureColumns": [
    {
      object (StandardSqlField)
    }
  ],
  "labelColumns": [
    {
      object (StandardSqlField)
    }
  ]
}
Fields
etag

string

Output only. A hash of this resource.

modelReference

object (ModelReference)

Required. Unique identifier for this model.

creationTime

string (int64 format)

Output only. The time when this model was created, in milliseconds since the epoch.

lastModifiedTime

string (int64 format)

Output only. The time when this model was last modified, in milliseconds since the epoch.
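
The time fields on this resource (creationTime, lastModifiedTime, expirationTime) are int64 values carried as JSON strings, so a client has to parse them before doing any date arithmetic. A minimal Python sketch of that conversion follows; the helper name millis_to_datetime and the sample value are illustrative assumptions, not part of the API.

```python
from datetime import datetime, timedelta, timezone

# Reference point for "milliseconds since the epoch".
_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def millis_to_datetime(value: str) -> datetime:
    """Convert an int64-format string of epoch milliseconds to an aware UTC datetime."""
    return _EPOCH + timedelta(milliseconds=int(value))


# Hypothetical creationTime value, for illustration only.
created = millis_to_datetime("1560000000000")
print(created.isoformat())
```

Adding a timedelta to a fixed epoch keeps the millisecond precision exact, which a float division by 1000 would not guarantee.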

description

string

Optional. A user-friendly description of this model.

friendlyName

string

Optional. A descriptive name for this model.

labels

map (key: string, value: string)

The labels associated with this model. You can use these to organize and group your models. Label keys and values can be no longer than 63 characters and can contain only lowercase letters, numeric characters, underscores, and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter, and each label in the list must have a different key.

An object containing a list of "key": value pairs. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.
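
The label rules above can be checked on the client before sending a request. The following Python sketch is one possible reading of those rules, not the service's own validator: the function names are hypothetical, and "international characters are allowed" is interpreted here as "any letter that is not uppercase".

```python
from typing import Dict, List

MAX_LABEL_LENGTH = 63  # limit stated for both keys and values


def _is_allowed(ch: str) -> bool:
    """Lowercase or caseless letters, digits, underscores, and dashes pass."""
    return ch in "_-" or ch.isdigit() or (ch.isalpha() and not ch.isupper())


def validate_labels(labels: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the labels pass this sketch."""
    problems = []
    for key, value in labels.items():
        # Keys must be non-empty and start with a letter; values may be empty.
        if not key or not key[0].isalpha():
            problems.append(f"key {key!r} must start with a letter")
        for kind, text in (("key", key), ("value", value)):
            if len(text) > MAX_LABEL_LENGTH:
                problems.append(f"{kind} {text!r} is longer than 63 characters")
            if not all(_is_allowed(ch) for ch in text):
                problems.append(f"{kind} {text!r} contains a disallowed character")
    return problems


print(validate_labels({"team": "data-science", "env": ""}))
```

Key uniqueness needs no separate check here because a Python dict cannot hold duplicate keys.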

expirationTime

string (int64 format)

Optional. The time when this model expires, in milliseconds since the epoch. If not present, the model will persist indefinitely. Expired models will be deleted and their storage reclaimed. The defaultTableExpirationMs property of the encapsulating dataset can be used to set a default expirationTime on newly created models.

location

string

Output only. The geographic location where the model resides. This value is inherited from the dataset.

encryptionConfiguration

object (EncryptionConfiguration)

Custom encryption configuration (e.g., Cloud KMS keys). This shows the encryption configuration of the model data while stored in BigQuery storage.

modelType

enum (ModelType)

Output only. Type of the model resource.

trainingRuns[]

object (TrainingRun)

Output only. Information for all training runs in increasing order of startTime.

featureColumns[]

object (StandardSqlField)

Output only. Input feature columns that were used to train this model.

labelColumns[]

object (StandardSqlField)

Output only. Label columns that were used to train this model. The output of the model will have a "predicted_" prefix to these columns.

ModelReference

Id path of a model.

JSON representation
{
  "projectId": string,
  "datasetId": string,
  "modelId": string
}
Fields
projectId

string

Required. The ID of the project containing this model.

datasetId

string

Required. The ID of the dataset containing this model.

modelId

string

Required. The ID of the model. The ID must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_). The maximum length is 1,024 characters.
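
The modelId constraints translate directly into a regular expression. This is a small sketch; the function name is hypothetical, and it additionally assumes that an empty ID is not acceptable, which the text above does not state.

```python
import re

# Letters, digits, and underscores only, 1 to 1,024 characters.
_MODEL_ID_RE = re.compile(r"[A-Za-z0-9_]{1,1024}")


def is_valid_model_id(model_id: str) -> bool:
    """Check a candidate modelId against the documented character set and length."""
    return _MODEL_ID_RE.fullmatch(model_id) is not None


print(is_valid_model_id("churn_model_v2"))
print(is_valid_model_id("churn-model"))
```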

ModelType

Indicates the type of the Model.

Enums
MODEL_TYPE_UNSPECIFIED
LINEAR_REGRESSION Linear regression model.
LOGISTIC_REGRESSION Logistic regression based classification model.
KMEANS K-means clustering model.
TENSORFLOW [Beta] An imported TensorFlow model.

TrainingRun

Information about a single training query run for the model.

JSON representation
{
  "trainingOptions": {
    object (TrainingOptions)
  },
  "startTime": string,
  "results": [
    {
      object (IterationResult)
    }
  ],
  "evaluationMetrics": {
    object (EvaluationMetrics)
  }
}
Fields
trainingOptions

object (TrainingOptions)

Options that were used for this training run, including both user-specified options and defaults.

startTime

string (Timestamp format)

The start time of this training run.

results[]

object (IterationResult)

Output of each iteration run. The number of results never exceeds maxIterations (results.size() <= maxIterations).

evaluationMetrics

object (EvaluationMetrics)

The evaluation metrics over training/eval data that were computed at the end of training.

TrainingOptions

JSON representation
{
  "maxIterations": string,
  "lossType": enum (LossType),
  "learnRate": number,
  "l1Regularization": number,
  "l2Regularization": number,
  "minRelativeProgress": number,
  "warmStart": boolean,
  "earlyStop": boolean,
  "inputLabelColumns": [
    string
  ],
  "dataSplitMethod": enum (DataSplitMethod),
  "dataSplitEvalFraction": number,
  "dataSplitColumn": string,
  "learnRateStrategy": enum (LearnRateStrategy),
  "initialLearnRate": number,
  "labelClassWeights": {
    string: number,
    ...
  },
  "distanceType": enum (DistanceType),
  "numClusters": string,
  "modelUri": string,
  "optimizationStrategy": enum (OptimizationStrategy),
  "kmeansInitializationMethod": enum (KmeansInitializationMethod),
  "kmeansInitializationColumn": string
}
Fields
maxIterations

string (int64 format)

The maximum number of iterations in training. Used only for iterative training algorithms.

lossType

enum (LossType)

Type of loss function used during training run.

learnRate

number

Learning rate in training. Used only for iterative training algorithms.

l1Regularization

number

L1 regularization coefficient.

l2Regularization

number

L2 regularization coefficient.

minRelativeProgress

number

When earlyStop is true, stops training when accuracy improvement is less than 'minRelativeProgress'. Used only for iterative training algorithms.

warmStart

boolean

Whether to train a model from the last checkpoint.

earlyStop

boolean

Whether to stop early when the loss doesn't improve significantly any more (compared to minRelativeProgress). Used only for iterative training algorithms.
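
To make the interaction between earlyStop and minRelativeProgress concrete, here is a hedged Python sketch. The exact progress measure the service uses is not given on this page; the sketch assumes it is the relative decrease in loss between consecutive iterations, and the function name is hypothetical.

```python
from typing import Sequence


def stopping_iteration(
    losses: Sequence[float],
    min_relative_progress: float,
    early_stop: bool = True,
) -> int:
    """Return the 0-based index of the last iteration that would run.

    `losses` holds one loss value per iteration and must be non-empty.
    """
    if not early_stop:
        return len(losses) - 1
    for i in range(1, len(losses)):
        prev, cur = losses[i - 1], losses[i]
        # Assumed definition: relative improvement over the previous loss.
        relative_progress = (prev - cur) / abs(prev) if prev != 0 else 0.0
        if relative_progress < min_relative_progress:
            return i  # progress too small: stop after this iteration
    return len(losses) - 1


print(stopping_iteration([1.0, 0.5, 0.499, 0.3], min_relative_progress=0.01))
```

With earlyStop disabled, training in this sketch simply runs through every iteration, which mirrors the role of maxIterations as the only remaining bound.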

inputLabelColumns[]

string

Name of input label columns in training data.

dataSplitMethod

enum (DataSplitMethod)

The data split type for training and evaluation, e.g. RANDOM.

dataSplitEvalFraction

number

The fraction of evaluation data over the whole input data. The rest of data will be used as training data. The format should be double. Accurate to two decimal places. Default value is 0.2.

dataSplitColumn

string

The column to split data with. This column won't be used as a feature.
1. When dataSplitMethod is CUSTOM, the corresponding column should be boolean. Rows tagged true are eval data, and rows tagged false are training data.
2. When dataSplitMethod is SEQUENTIAL, the first DATA_SPLIT_EVAL_FRACTION rows (from smallest to largest) in the corresponding column are used as training data, and the rest are eval data. The split respects the ordering of orderable data types: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#data-type-properties
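
The CUSTOM case can be illustrated with a short Python sketch. It only mimics the documented behavior on in-memory rows (dicts); the function name, the column name is_eval, and the sample rows are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def custom_split(rows: List[Row], split_column: str) -> Tuple[List[Row], List[Row]]:
    """Split rows into (training, evaluation) using a boolean tag column."""
    training: List[Row] = []
    evaluation: List[Row] = []
    for row in rows:
        # The split column is dropped so that it is not used as a feature.
        features = {k: v for k, v in row.items() if k != split_column}
        if row[split_column]:
            evaluation.append(features)  # true-tagged rows are eval data
        else:
            training.append(features)  # false-tagged rows are training data
    return training, evaluation


rows = [
    {"x": 1.0, "label": 0, "is_eval": False},
    {"x": 2.0, "label": 1, "is_eval": True},
    {"x": 3.0, "label": 1, "is_eval": False},
]
train, evaluation = custom_split(rows, "is_eval")
print(len(train), len(evaluation))
```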

learnRateStrategy

enum (LearnRateStrategy)

The strategy to determine learn rate for the current iteration.

initialLearnRate

number

Specifies the initial learning rate for the line search learn rate strategy.

labelClassWeights

map (key: string, value: number)

Weights associated with each label class, for rebalancing the training data. Only applicable for classification models.

An object containing a list of "key": value pairs. Example: { "name": "wrench", "mass": "1.3kg", "count": "3" }.

distanceType

enum (DistanceType)

Distance type for clustering models.

numClusters

string (int64 format)

Number of clusters for clustering models.

modelUri

string

[Beta] Google Cloud Storage URI from which the model was imported. Only applicable for imported models.

optimizationStrategy

enum (OptimizationStrategy)

Optimization strategy for training linear regression models.

kmeansInitializationMethod

enum (KmeansInitializationMethod)

The method used to initialize the centroids for the k-means algorithm.

kmeansInitializationColumn

string

The column used to provide the initial centroids for the k-means algorithm when kmeansInitializationMethod is CUSTOM.

LossType

Loss metric to evaluate model training performance.

Enums
LOSS_TYPE_UNSPECIFIED
MEAN_SQUARED_LOSS Mean squared loss, used for linear regression.
MEAN_LOG_LOSS Mean log loss, used for logistic regression.

DataSplitMethod

Indicates the method to split input data into multiple tables.

Enums
DATA_SPLIT_METHOD_UNSPECIFIED
RANDOM Splits data randomly.
CUSTOM Splits data with the user provided tags.
SEQUENTIAL Splits data sequentially.
NO_SPLIT Data split will be skipped.
AUTO_SPLIT Splits data automatically: Uses NO_SPLIT if the data size is small. Otherwise uses RANDOM.

LearnRateStrategy

Indicates the learning rate optimization strategy to use.

Enums
LEARN_RATE_STRATEGY_UNSPECIFIED
CONSTANT Use a constant learning rate.

DistanceType

Distance metric used to compute the distance between two points.

Enums
DISTANCE_TYPE_UNSPECIFIED
EUCLIDEAN Euclidean distance.
COSINE Cosine distance.

OptimizationStrategy

Indicates the optimization strategy used for training.

Enums
OPTIMIZATION_STRATEGY_UNSPECIFIED
BATCH_GRADIENT_DESCENT Uses an iterative batch gradient descent algorithm.
NORMAL_EQUATION Uses a normal equation to solve linear regression problem.

KmeansInitializationMethod

Indicates the method used to initialize the centroids for the k-means clustering algorithm.

Enums
KMEANS_INITIALIZATION_METHOD_UNSPECIFIED
RANDOM Initializes the centroids randomly.
CUSTOM Initializes the centroids using data specified in kmeansInitializationColumn.

IterationResult

Information about a single iteration of the training run.

JSON representation
{
  "index": number,
  "durationMs": string,
  "trainingLoss": number,
  "evalLoss": number,
  "learnRate": number,
  "clusterInfos": [
    {
      object (ClusterInfo)
    }
  ],
  "arimaResult": {
    object (ArimaResult)
  }
}
Fields
index

number

Index of the iteration, 0 based.

durationMs

string (Int64Value format)

Time taken to run the iteration in milliseconds.

trainingLoss

number

Loss computed on the training data at the end of iteration.

evalLoss

number

Loss computed on the eval data at the end of iteration.

learnRate

number

Learn rate used for this iteration.

clusterInfos[]

object (ClusterInfo)

Information about top clusters for clustering models.

arimaResult

object (ArimaResult)

ClusterInfo

Information about a single cluster for clustering model.

JSON representation
{
  "centroidId": string,
  "clusterRadius": number,
  "clusterSize": string
}
Fields
centroidId

string (int64 format)

Centroid id.

clusterRadius

number

Cluster radius, the average distance from centroid to each point assigned to the cluster.

clusterSize

string (Int64Value format)

Cluster size, the total number of points assigned to the cluster.

ArimaResult

(Auto-)arima fitting result. Wrap everything in ArimaResult for easier refactoring if we want to use model-specific iteration results.

JSON representation
{
  "arimaModelInfo": [
    {
      object (ArimaModelInfo)
    }
  ],
  "seasonalPeriods": [
    enum (SeasonalPeriodType)
  ]
}
Fields
arimaModelInfo[]

object (ArimaModelInfo)

This message is repeated because multiple ARIMA models are fitted in auto-ARIMA. For a non-auto-ARIMA model, the list contains exactly one element.

seasonalPeriods[]

enum (SeasonalPeriodType)

Seasonal periods. Repeated because multiple periods are supported for one time series.

ArimaModelInfo

Arima model information.

JSON representation
{
  "nonSeasonalOrder": {
    object (ArimaOrder)
  },
  "arimaCoefficients": {
    object (ArimaCoefficients)
  },
  "arimaFittingMetrics": {
    object (ArimaFittingMetrics)
  }
}
Fields
nonSeasonalOrder

object (ArimaOrder)

Non-seasonal order.

arimaCoefficients

object (ArimaCoefficients)

Arima coefficients.

arimaFittingMetrics

object (ArimaFittingMetrics)

Arima fitting metrics.

ArimaOrder

Arima order, can be used for both non-seasonal and seasonal parts.

JSON representation
{
  "p": string,
  "d": string,
  "q": string
}
Fields
p

string (int64 format)

Order of the autoregressive part.

d

string (int64 format)

Order of the differencing part.

q

string (int64 format)

Order of the moving-average part.

ArimaCoefficients

Arima coefficients.

JSON representation
{
  "autoRegressiveCoefficients": [
    number
  ],
  "movingAverageCoefficients": [
    number
  ],
  "interceptCoefficient": number
}
Fields
autoRegressiveCoefficients[]

number

Auto-regressive coefficients, an array of double.

movingAverageCoefficients[]

number

Moving-average coefficients, an array of double.

interceptCoefficient

number

Intercept coefficient, just a double not an array.

ArimaFittingMetrics

ARIMA model fitting metrics.

JSON representation
{
  "logLikelihood": number,
  "aic": number,
  "variance": number
}
Fields
logLikelihood

number

Log-likelihood.

aic

number

Akaike information criterion (AIC).

variance

number

Variance.

SeasonalPeriodType

Enums
SEASONAL_PERIOD_TYPE_UNSPECIFIED
NO_SEASONALITY No seasonality.
DAILY Daily period, 24 hours.
WEEKLY Weekly period, 7 days.
MONTHLY Monthly period, which can be 30 days or irregular.
QUARTERLY Quarterly period, which can be 90 days or irregular.
YEARLY Yearly period, which can be 365 days or irregular.

EvaluationMetrics

Evaluation metrics of a model. These are either computed on all training data or just the eval data based on whether eval data was used during training. These are not present for imported models.

JSON representation
{

  // Union field metrics can be only one of the following:
  "regressionMetrics": {
    object (RegressionMetrics)
  },
  "binaryClassificationMetrics": {
    object (BinaryClassificationMetrics)
  },
  "multiClassClassificationMetrics": {
    object (MultiClassClassificationMetrics)
  },
  "clusteringMetrics": {
    object (ClusteringMetrics)
  }
  // End of list of possible types for union field metrics.
}
Fields

Union field metrics.

metrics can be only one of the following:

regressionMetrics

object (RegressionMetrics)

Populated for regression models and explicit feedback type matrix factorization models.

binaryClassificationMetrics

object (BinaryClassificationMetrics)

Populated for binary classification/classifier models.

multiClassClassificationMetrics

object (MultiClassClassificationMetrics)

Populated for multi-class classification/classifier models.

clusteringMetrics

object (ClusteringMetrics)

Populated for clustering models.
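
Because metrics is a union field, a client typically needs to find out which one of the four members is present before reading it. The following Python sketch does that on a parsed JSON object; the function name and the sample payload are hypothetical.

```python
from typing import Any, Dict, Tuple

# The four members of the `metrics` union, as listed above.
METRIC_FIELDS = (
    "regressionMetrics",
    "binaryClassificationMetrics",
    "multiClassClassificationMetrics",
    "clusteringMetrics",
)


def active_metrics(evaluation_metrics: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Return (field name, value) for the single populated union member."""
    present = [name for name in METRIC_FIELDS if name in evaluation_metrics]
    if len(present) != 1:
        raise ValueError(f"expected exactly one metrics field, found {present}")
    name = present[0]
    return name, evaluation_metrics[name]


# Hypothetical payload for illustration only.
name, value = active_metrics({"regressionMetrics": {"rSquared": 0.9}})
print(name, value)
```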

RegressionMetrics

Evaluation metrics for regression and explicit feedback type matrix factorization models.

JSON representation
{
  "meanAbsoluteError": number,
  "meanSquaredError": number,
  "meanSquaredLogError": number,
  "medianAbsoluteError": number,
  "rSquared": number
}
Fields
meanAbsoluteError

number

Mean absolute error.

meanSquaredError

number

Mean squared error.

meanSquaredLogError

number

Mean squared log error.

medianAbsoluteError

number

Median absolute error.

rSquared

number

R^2 score.
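
For readers who want to sanity-check these five values, the sketch below computes them in plain Python using their textbook definitions. These formulas are assumptions: this page does not state how the service computes each metric (in particular, meanSquaredLogError is taken here as the mean of squared differences of log(1 + y)), and the function name is hypothetical.

```python
import math
from statistics import mean, median
from typing import Dict, Sequence


def regression_metrics(actual: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """Compute textbook regression metrics, keyed by the RegressionMetrics field names."""
    errors = [a - p for a, p in zip(actual, predicted)]
    avg = mean(actual)
    ss_res = sum(e * e for e in errors)             # residual sum of squares
    ss_tot = sum((a - avg) ** 2 for a in actual)    # total sum of squares
    return {
        "meanAbsoluteError": mean(abs(e) for e in errors),
        "meanSquaredError": mean(e * e for e in errors),
        # Assumed definition; requires values greater than -1.
        "meanSquaredLogError": mean(
            (math.log1p(a) - math.log1p(p)) ** 2 for a, p in zip(actual, predicted)
        ),
        "medianAbsoluteError": median(abs(e) for e in errors),
        # Undefined when all actual values are equal (ss_tot == 0).
        "rSquared": 1.0 - ss_res / ss_tot,
    }


print(regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))
```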

BinaryClassificationMetrics

Evaluation metrics for binary classification/classifier models.

JSON representation
{
  "aggregateClassificationMetrics": {
    object (AggregateClassificationMetrics)
  },
  "binaryConfusionMatrixList": [
    {
      object (BinaryConfusionMatrix)
    }
  ],
  "positiveLabel": string,
  "negativeLabel": string
}
Fields
aggregateClassificationMetrics

object (AggregateClassificationMetrics)

Aggregate classification metrics.

binaryConfusionMatrixList[]

object (BinaryConfusionMatrix)

Binary confusion matrix at multiple thresholds.

positiveLabel

string

Label representing the positive class.

negativeLabel

string

Label representing the negative class.

AggregateClassificationMetrics

Aggregate metrics for classification/classifier models. For multi-class models, the metrics are either macro-averaged or micro-averaged. When macro-averaged, the metrics are calculated for each label and then an unweighted average is taken of those values. When micro-averaged, the metric is calculated globally by counting the total number of correctly predicted rows.

JSON representation
{
  "precision": number,
  "recall": number,
  "accuracy": number,
  "threshold": number,
  "f1Score": number,
  "logLoss": number,
  "rocAuc": number
}
Fields
precision

number

Precision is the fraction of positive predictions that had positive actual labels. For multiclass this is a macro-averaged metric treating each class as a binary classifier.

recall

number

Recall is the fraction of actual positive labels that were given a positive prediction. For multiclass this is a macro-averaged metric.

accuracy

number

Accuracy is the fraction of predictions given the correct label. For multiclass this is a micro-averaged metric.

threshold

number

Threshold at which the metrics are computed. For binary classification models this is the positive class threshold. For multi-class classification models this is the confidence threshold.

f1Score

number

The F1 score is the harmonic mean of recall and precision. For multiclass this is a macro-averaged metric.

logLoss

number

Logarithmic Loss. For multiclass this is a macro-averaged metric.

rocAuc

number

Area Under a ROC Curve. For multiclass this is a macro-averaged metric.
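
The difference between macro-averaged and micro-averaged metrics is easiest to see on a small example. The Python sketch below computes macro-averaged precision and recall and micro-averaged accuracy from (actual, predicted) label pairs. It illustrates the averaging schemes only; the function name and the sample data are hypothetical, and it is not the service's implementation.

```python
from typing import Dict, List, Tuple


def aggregate_metrics(pairs: List[Tuple[str, str]]) -> Dict[str, float]:
    """Macro-average precision and recall; micro-average accuracy."""
    labels = sorted({label for pair in pairs for label in pair})
    precisions, recalls = [], []
    for label in labels:
        # Treat `label` as the positive class of a one-vs-rest binary classifier.
        tp = sum(1 for a, p in pairs if a == label and p == label)
        predicted = sum(1 for _, p in pairs if p == label)
        actual = sum(1 for a, _ in pairs if a == label)
        precisions.append(tp / predicted if predicted else 0.0)
        recalls.append(tp / actual if actual else 0.0)
    correct = sum(1 for a, p in pairs if a == p)
    return {
        "precision": sum(precisions) / len(labels),  # unweighted mean over labels
        "recall": sum(recalls) / len(labels),        # unweighted mean over labels
        "accuracy": correct / len(pairs),            # counted globally over all rows
    }


pairs = [("a", "a"), ("a", "a"), ("a", "b"), ("b", "b")]
print(aggregate_metrics(pairs))
```

In this sample, label "b" has only one actual row but contributes as much to the macro-averaged values as label "a", whereas accuracy weights every row equally.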

BinaryConfusionMatrix

Confusion matrix for binary classification models.

JSON representation
{
  "positiveClassThreshold": number,
  "truePositives": string,
  "falsePositives": string,
  "trueNegatives": string,
  "falseNegatives": string,
  "precision": number,
  "recall": number,
  "f1Score": number,
  "accuracy": number
}
Fields
positiveClassThreshold

number

Threshold value used when computing each of the following metrics.

truePositives

string (Int64Value format)

Number of true samples predicted as true.

falsePositives

string (Int64Value format)

Number of false samples predicted as true.

trueNegatives

string (Int64Value format)

Number of false samples predicted as false.

falseNegatives

string (Int64Value format)

Number of true samples predicted as false.

precision

number

The fraction of positive predictions that had positive actual labels.

recall

number

The fraction of actual positive labels that were given a positive prediction.

f1Score

number

The harmonic mean of recall and precision.

accuracy

number

The fraction of predictions given the correct label.
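
The four ratio fields can be reproduced from the four count fields, which is a useful consistency check when reading a BinaryConfusionMatrix. The Python sketch below uses the standard definitions (true positives are positive samples predicted positive, false negatives are positive samples predicted negative, and F1 is the harmonic mean of precision and recall). The function name and the sample counts are hypothetical.

```python
from typing import Dict


def derive_metrics(matrix: Dict[str, str]) -> Dict[str, float]:
    """Recompute precision, recall, f1Score, and accuracy from the count fields."""
    # Counts arrive as int64 values encoded in JSON strings.
    tp = int(matrix["truePositives"])
    fp = int(matrix["falsePositives"])
    tn = int(matrix["trueNegatives"])
    fn = int(matrix["falseNegatives"])
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"precision": precision, "recall": recall, "f1Score": f1, "accuracy": accuracy}


# Hypothetical counts for illustration only.
sample = {"truePositives": "8", "falsePositives": "2", "trueNegatives": "85", "falseNegatives": "5"}
print(derive_metrics(sample))
```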

MultiClassClassificationMetrics

Evaluation metrics for multi-class classification/classifier models.

JSON representation
{
  "aggregateClassificationMetrics": {
    object (AggregateClassificationMetrics)
  },
  "confusionMatrixList": [
    {
      object (ConfusionMatrix)
    }
  ]
}
Fields
aggregateClassificationMetrics

object (AggregateClassificationMetrics)

Aggregate classification metrics.

confusionMatrixList[]

object (ConfusionMatrix)

Confusion matrix at different thresholds.

ConfusionMatrix

Confusion matrix for multi-class classification models.

JSON representation
{
  "confidenceThreshold": number,
  "rows": [
    {
      object (Row)
    }
  ]
}
Fields
confidenceThreshold

number

Confidence threshold used when computing the entries of the confusion matrix.

rows[]

object (Row)

One row per actual label.

Row

A single row in the confusion matrix.

JSON representation
{
  "actualLabel": string,
  "entries": [
    {
      object (Entry)
    }
  ]
}
Fields
actualLabel

string

The original label of this row.

entries[]

object (Entry)

Info describing predicted label distribution.

Entry

A single entry in the confusion matrix.

JSON representation
{
  "predictedLabel": string,
  "itemCount": string
}
Fields
predictedLabel

string

The predicted label. For confidenceThreshold > 0, we will also add an entry indicating the number of items under the confidence threshold.

itemCount

string (Int64Value format)

Number of items being predicted as this label.
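
The nested ConfusionMatrix, Row, and Entry structure is convenient to flatten into a lookup table keyed by actual label and then predicted label. A minimal Python sketch, with a hypothetical function name and a hypothetical sample payload:

```python
from typing import Any, Dict


def to_nested_counts(confusion_matrix: Dict[str, Any]) -> Dict[str, Dict[str, int]]:
    """Map actualLabel -> predictedLabel -> itemCount (parsed to int)."""
    table: Dict[str, Dict[str, int]] = {}
    for row in confusion_matrix.get("rows", []):
        table[row["actualLabel"]] = {
            entry["predictedLabel"]: int(entry["itemCount"])  # int64 sent as string
            for entry in row.get("entries", [])
        }
    return table


# Hypothetical payload for illustration only.
matrix = {
    "confidenceThreshold": 0.0,
    "rows": [
        {"actualLabel": "cat", "entries": [
            {"predictedLabel": "cat", "itemCount": "40"},
            {"predictedLabel": "dog", "itemCount": "3"},
        ]},
        {"actualLabel": "dog", "entries": [
            {"predictedLabel": "cat", "itemCount": "5"},
            {"predictedLabel": "dog", "itemCount": "52"},
        ]},
    ],
}
print(to_nested_counts(matrix)["cat"]["dog"])
```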

ClusteringMetrics

Evaluation metrics for clustering models.

JSON representation
{
  "daviesBouldinIndex": number,
  "meanSquaredDistance": number,
  "clusters": [
    {
      object (Cluster)
    }
  ]
}
Fields
daviesBouldinIndex

number

Davies-Bouldin index.

meanSquaredDistance

number

Mean of squared distances between each sample and its cluster centroid.

clusters[]

object (Cluster)

[Beta] Information for all clusters.

Cluster

Message containing the information about one cluster.

JSON representation
{
  "centroidId": string,
  "featureValues": [
    {
      object (FeatureValue)
    }
  ],
  "count": string
}
Fields
centroidId

string (int64 format)

Centroid id.

featureValues[]

object (FeatureValue)

Values of highly variant features for this cluster.

count

string (Int64Value format)

Count of training data rows that were assigned to this cluster.

FeatureValue

Representative value of a single feature within the cluster.

JSON representation
{
  "featureColumn": string,

  // Union field value can be only one of the following:
  "numericalValue": number,
  "categoricalValue": {
    object (CategoricalValue)
  }
  // End of list of possible types for union field value.
}
Fields
featureColumn

string

The feature column name.

Union field value.

value can be only one of the following:

numericalValue

number

The numerical feature value. This is the centroid value for this feature.

categoricalValue

object (CategoricalValue)

The categorical feature value.

CategoricalValue

Representative value of a categorical feature.

JSON representation
{
  "categoryCounts": [
    {
      object (CategoryCount)
    }
  ]
}
Fields
categoryCounts[]

object (CategoryCount)

Counts of all categories for the categorical feature. If there are more than ten categories, the top ten (by count) are returned, plus one more CategoryCount with category "_OTHER_" whose count is the aggregate count of the remaining categories.
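
The top-ten-plus-"_OTHER_" summarization described above can be mimicked as follows. This Python sketch only illustrates the shape of the result; the function name is hypothetical, and the way the service breaks ties between equal counts is not documented here.

```python
from collections import Counter
from typing import Dict, Iterable, List


def summarize_categories(values: Iterable[str], limit: int = 10) -> List[Dict[str, str]]:
    """Build CategoryCount-shaped dicts: the top `limit` categories plus "_OTHER_"."""
    ranked = Counter(values).most_common()  # sorted by count, descending
    top, rest = ranked[:limit], ranked[limit:]
    # Counts are emitted as strings to match the int64-as-string JSON encoding.
    result = [{"category": name, "count": str(n)} for name, n in top]
    if rest:
        result.append({"category": "_OTHER_", "count": str(sum(n for _, n in rest))})
    return result


# Twelve hypothetical categories: c0 appears 12 times, c1 11 times, ..., c11 once.
values = [f"c{i}" for i in range(12) for _ in range(12 - i)]
print(summarize_categories(values)[-1])
```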

CategoryCount

Represents the count of a single category within the cluster.

JSON representation
{
  "category": string,
  "count": string
}
Fields
category

string

The name of the category.

count

string (Int64Value format)

The count of training samples matching the category within the cluster.

Methods

delete

Deletes the model specified by modelId from the dataset.

get

Gets the specified model resource by model ID.

list

Lists all models in the specified dataset.

patch

Patches specific fields in the specified model.