Package metrics (1.3.0)

API documentation for metrics package.

Modules

pairwise

API documentation for pairwise module.

Package Functions

accuracy_score

accuracy_score(
    y_true: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y_pred: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    *,
    normalize=True
) -> float

Accuracy classification score.

Examples:

>>> import bigframes.pandas as bpd
>>> import bigframes.ml.metrics
>>> bpd.options.display.progress_bar = None

>>> y_true = bpd.DataFrame([0, 2, 1, 3])
>>> y_pred = bpd.DataFrame([0, 1, 2, 3])
>>> accuracy_score = bigframes.ml.metrics.accuracy_score(y_true, y_pred)
>>> accuracy_score
0.5

If normalize is False, return the number of correctly classified samples:

>>> accuracy_score = bigframes.ml.metrics.accuracy_score(y_true, y_pred, normalize=False)
>>> accuracy_score
2

Parameters

Name    Description

y_true

Ground truth (correct) labels.

y_pred

Predicted labels, as returned by a classifier.

normalize

Defaults to True. If False, return the number of correctly classified samples. Otherwise, return the fraction of correctly classified samples.
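
The effect of normalize can be sketched in plain Python. The helper name accuracy below is illustrative and not part of bigframes; it assumes ordinary Python lists rather than DataFrame or Series inputs.

```python
def accuracy(y_true, y_pred, normalize=True):
    """Illustrative helper (not the bigframes API): fraction or count of matches."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Count positions where the predicted label equals the true label.
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    # normalize=True returns a fraction; normalize=False returns the raw count.
    return correct / len(y_true) if normalize else correct


print(accuracy([0, 2, 1, 3], [0, 1, 2, 3]))                   # 0.5
print(accuracy([0, 2, 1, 3], [0, 1, 2, 3], normalize=False))  # 2
```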

auc

auc(
    x: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
) -> float

Compute Area Under the Curve (AUC) using the trapezoidal rule.

This is a general function, given points on a curve. For computing the area under the ROC-curve, see roc_auc_score. For an alternative way to summarize a precision-recall curve, see average_precision_score.

Examples:

>>> import bigframes.pandas as bpd
>>> import bigframes.ml.metrics
>>> bpd.options.display.progress_bar = None

>>> x = bpd.DataFrame([1, 1, 2, 2])
>>> y = bpd.DataFrame([2, 3, 4, 5])
>>> auc = bigframes.ml.metrics.auc(x, y)
>>> auc
3.5

The input can be Series:

>>> df = bpd.DataFrame(
...     {"x": [1, 1, 2, 2],
...      "y": [2, 3, 4, 5],}
... )
>>> auc = bigframes.ml.metrics.auc(df["x"], df["y"])
>>> auc
3.5

Parameters

Name    Description

x

X coordinates. These must be either monotonically increasing or monotonically decreasing.

y

Y coordinates.
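
The trapezoidal rule itself can be sketched in a few lines of plain Python. The helper name trapezoid_area is hypothetical and the function works on ordinary lists; it is not the bigframes implementation.

```python
def trapezoid_area(x, y):
    """Illustrative helper (not the bigframes API): area under (x, y) points."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length and at least 2 points")
    area = 0.0
    # Each adjacent pair of points contributes a trapezoid of width dx
    # and mean height (y0 + y1) / 2.
    for i in range(1, len(x)):
        dx = x[i] - x[i - 1]
        area += dx * (y[i] + y[i - 1]) / 2.0
    return area


print(trapezoid_area([1, 1, 2, 2], [2, 3, 4, 5]))  # 3.5
```

Only the middle segment has nonzero width here, so the area is 1 * (3 + 4) / 2 = 3.5. For monotonically decreasing x, this sketch returns a negative value; it does not attempt any sign correction that a library implementation may apply.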

confusion_matrix

confusion_matrix(
    y_true: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y_pred: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
) -> pandas.core.frame.DataFrame

Compute confusion matrix to evaluate the accuracy of a classification.

By definition, a confusion matrix :math:`C` is such that :math:`C_{i, j}` is equal to the number of observations known to be in group :math:`i` and predicted to be in group :math:`j`.

Thus, in binary classification, the count of true negatives is :math:`C_{0,0}`, false negatives is :math:`C_{1,0}`, true positives is :math:`C_{1,1}`, and false positives is :math:`C_{0,1}`.

Examples:

>>> import bigframes.pandas as bpd
>>> import bigframes.ml.metrics
>>> bpd.options.display.progress_bar = None

>>> y_true = bpd.DataFrame([2, 0, 2, 2, 0, 1])
>>> y_pred = bpd.DataFrame([0, 0, 2, 2, 0, 2])
>>> confusion_matrix = bigframes.ml.metrics.confusion_matrix(y_true, y_pred)
>>> confusion_matrix
   0  1  2
0  2  0  0
1  0  0  1
2  1  0  2

>>> y_true = bpd.DataFrame(["cat", "ant", "cat", "cat", "ant", "bird"])
>>> y_pred = bpd.DataFrame(["ant", "ant", "cat", "cat", "ant", "cat"])
>>> confusion_matrix = bigframes.ml.metrics.confusion_matrix(y_true, y_pred)
>>> confusion_matrix
    ant  bird  cat
ant     2     0    0
bird    0     0    1
cat     1     0    2

Parameters

Name    Description

y_true

Ground truth (correct) target values.

y_pred

Estimated targets as returned by a classifier.
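
The counting rule in the definition (rows are true labels, columns are predicted labels) can be sketched as follows. The helper confusion_counts is illustrative only; it assumes plain lists and sorts the labels, which matches the row and column order shown in the examples above.

```python
def confusion_counts(y_true, y_pred):
    """Illustrative helper (not the bigframes API).

    Returns (labels, matrix), where matrix[i][j] counts samples whose true
    label is labels[i] and whose predicted label is labels[j].
    """
    labels = sorted(set(y_true) | set(y_pred))
    index = {label: k for k, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return labels, matrix


labels, matrix = confusion_counts([2, 0, 2, 2, 0, 1], [0, 0, 2, 2, 0, 2])
print(labels)  # [0, 1, 2]
for row in matrix:
    print(row)
# [2, 0, 0]
# [0, 0, 1]
# [1, 0, 2]
```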

f1_score

f1_score(
    y_true: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y_pred: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    *,
    average: str = "binary"
) -> pandas.core.series.Series

Compute the F1 score, also known as balanced F-score or F-measure.

The F1 score can be interpreted as a harmonic mean of the precision and recall, where an F1 score reaches its best value at 1 and its worst value at 0. The relative contributions of precision and recall to the F1 score are equal. The formula for the F1 score is: F1 = 2 * (precision * recall) / (precision + recall).

In the multi-class and multi-label case, this is the average of the F1 score of each class with weighting depending on the average parameter.

Examples:

>>> import bigframes.pandas as bpd
>>> import bigframes.ml.metrics
>>> bpd.options.display.progress_bar = None

>>> y_true = bpd.DataFrame([0, 1, 2, 0, 1, 2])
>>> y_pred = bpd.DataFrame([0, 2, 1, 0, 0, 1])
>>> f1_score = bigframes.ml.metrics.f1_score(y_true, y_pred, average=None)
>>> f1_score
0    0.8
1    0.0
2    0.0
dtype: float64

Parameters

Name    Description

y_true

Series or DataFrame of shape (n_samples,). Ground truth (correct) target values.

y_pred

Series or DataFrame of shape (n_samples,). Estimated targets as returned by a classifier.
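
As a quick check of the formula, the value 0.8 for class 0 in the example can be reproduced from that class's precision (2/3) and recall (1). The helper below is a hypothetical sketch, and returning 0.0 when both inputs are zero is an assumption of this sketch, not documented bigframes behavior.

```python
def f1_from_precision_recall(precision, recall):
    """Illustrative helper (not the bigframes API): harmonic mean of P and R."""
    if precision + recall == 0:
        # Assumption: report 0.0 when both are zero rather than dividing by zero.
        return 0.0
    return 2 * (precision * recall) / (precision + recall)


# Class 0 in the example has precision 2/3 and recall 1.
print(round(f1_from_precision_recall(2 / 3, 1.0), 6))  # 0.8
```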

mean_squared_error

mean_squared_error(
    y_true: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y_pred: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
) -> float

Mean squared error regression loss.

Examples:

>>> import bigframes.pandas as bpd
>>> import bigframes.ml.metrics
>>> bpd.options.display.progress_bar = None

>>> y_true = bpd.DataFrame([3, -0.5, 2, 7])
>>> y_pred = bpd.DataFrame([2.5, 0.0, 2, 8])
>>> mse = bigframes.ml.metrics.mean_squared_error(y_true, y_pred)
>>> mse
0.375

Parameters

Name    Description

y_true

Ground truth (correct) target values.

y_pred

Estimated target values.
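
The quantity being computed is the mean of the squared differences between true and predicted values. A minimal plain-Python sketch, using an illustrative helper named mse that is not part of bigframes:

```python
def mse(y_true, y_pred):
    """Illustrative helper (not the bigframes API): mean of squared residuals."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Squared residuals are 0.25, 0.25, 0 and 1, so the mean is 1.5 / 4.
print(mse([3, -0.5, 2, 7], [2.5, 0.0, 2, 8]))  # 0.375
```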

precision_score

precision_score(
    y_true: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y_pred: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    *,
    average: str = "binary"
) -> pandas.core.series.Series

Compute the precision.

The precision is the ratio tp / (tp + fp), where tp is the number of true positives and fp the number of false positives. The precision is intuitively the ability of the classifier not to label as positive a sample that is negative.

The best value is 1 and the worst value is 0.

Examples:

>>> import bigframes.pandas as bpd
>>> import bigframes.ml.metrics
>>> bpd.options.display.progress_bar = None

>>> y_true = bpd.DataFrame([0, 1, 2, 0, 1, 2])
>>> y_pred = bpd.DataFrame([0, 2, 1, 0, 0, 1])
>>> precision_score = bigframes.ml.metrics.precision_score(y_true, y_pred, average=None)
>>> precision_score
0    0.666667
1    0.000000
2    0.000000
dtype: float64

Parameters

Name    Description

y_true

Series or DataFrame of shape (n_samples,). Ground truth (correct) target values.

y_pred

Series or DataFrame of shape (n_samples,). Estimated targets as returned by a classifier.
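
The per-class values in the example follow directly from tp / (tp + fp). The sketch below uses a hypothetical helper, precision_per_class, on plain lists; assigning 0.0 to a class that is never predicted is an assumption of the sketch.

```python
def precision_per_class(y_true, y_pred):
    """Illustrative helper (not the bigframes API): tp / (tp + fp) per label."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        # tp: predicted as this label and truly this label.
        tp = sum(1 for t, p in zip(y_true, y_pred) if p == label and t == label)
        # fp: predicted as this label but truly something else.
        fp = sum(1 for t, p in zip(y_true, y_pred) if p == label and t != label)
        # Assumption: a class that is never predicted gets precision 0.0.
        scores[label] = tp / (tp + fp) if (tp + fp) else 0.0
    return scores


print(precision_per_class([0, 1, 2, 0, 1, 2], [0, 2, 1, 0, 0, 1]))
# {0: 0.6666666666666666, 1: 0.0, 2: 0.0}
```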

r2_score

r2_score(
    y_true: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y_pred: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    *,
    force_finite=True
) -> float

:math:`R^2` (coefficient of determination) regression score function.

The best possible score is 1.0, and the score can be negative (because the model can be arbitrarily worse). In the general case, when the true y is non-constant, a constant model that always predicts the average y, disregarding the input features, would get an :math:`R^2` score of 0.0.

In the particular case when y_true is constant, the :math:`R^2` score is not finite: it is either NaN (perfect predictions) or -Inf (imperfect predictions). To prevent such non-finite numbers from polluting higher-level experiments such as a grid search cross-validation, by default (force_finite=True) these cases are replaced with 1.0 (perfect predictions) or 0.0 (imperfect predictions), respectively.

Examples:

>>> import bigframes.pandas as bpd
>>> import bigframes.ml.metrics
>>> bpd.options.display.progress_bar = None

>>> y_true = bpd.DataFrame([3, -0.5, 2, 7])
>>> y_pred = bpd.DataFrame([2.5, 0.0, 2, 8])
>>> r2_score = bigframes.ml.metrics.r2_score(y_true, y_pred)
>>> r2_score
0.9486081370449679

Parameters

Name    Description

y_true

Ground truth (correct) target values.

y_pred

Estimated target values.
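
The score and the constant-y_true special case described above can be sketched in plain Python. This assumes the standard definition 1 - SS_res / SS_tot; the helper r2 is illustrative and is not the bigframes implementation.

```python
import math


def r2(y_true, y_pred, force_finite=True):
    """Illustrative helper (not the bigframes API): 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: how far predictions are from the truth.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: how far the truth is from its own mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        # y_true is constant, so the ratio is undefined.
        if force_finite:
            return 1.0 if ss_res == 0 else 0.0
        return math.nan if ss_res == 0 else -math.inf
    return 1.0 - ss_res / ss_tot


print(round(r2([3, -0.5, 2, 7], [2.5, 0.0, 2, 8]), 6))  # 0.948608
print(r2([1, 1, 1], [1, 2, 1]))                          # 0.0
print(r2([1, 1, 1], [1, 2, 1], force_finite=False))      # -inf
```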

recall_score

recall_score(
    y_true: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y_pred: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    *,
    average: str = "binary"
) -> pandas.core.series.Series

Compute the recall.

The recall is the ratio tp / (tp + fn), where tp is the number of true positives and fn the number of false negatives. The recall is intuitively the ability of the classifier to find all the positive samples.

The best value is 1 and the worst value is 0.

Examples:

>>> import bigframes.pandas as bpd
>>> import bigframes.ml.metrics
>>> bpd.options.display.progress_bar = None

>>> y_true = bpd.DataFrame([0, 1, 2, 0, 1, 2])
>>> y_pred = bpd.DataFrame([0, 2, 1, 0, 0, 1])
>>> recall_score = bigframes.ml.metrics.recall_score(y_true, y_pred, average=None)
>>> recall_score
0    1
1    0
2    0
dtype: int64

Parameters

Name    Description

y_true

Ground truth (correct) target values.

y_pred

Estimated targets as returned by a classifier.

average

This parameter is required for multiclass/multilabel targets. Possible values are None, 'micro', 'macro', 'samples', 'weighted', and 'binary'. Only average=None is supported.
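
Because only per-class output is available, an aggregate has to be derived from the per-class values. A minimal sketch, assuming the usual definition of a macro average as the unweighted mean of the per-class recalls; recall_per_class is a hypothetical helper on plain lists, not a bigframes function.

```python
def recall_per_class(y_true, y_pred):
    """Illustrative helper (not the bigframes API): tp / (tp + fn) per label."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        # tp: truly this label and predicted as this label.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        # fn: truly this label but predicted as something else.
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        # Assumption: a class with no true samples gets recall 0.0.
        scores[label] = tp / (tp + fn) if (tp + fn) else 0.0
    return scores


per_class = recall_per_class([0, 1, 2, 0, 1, 2], [0, 2, 1, 0, 0, 1])
print(per_class)  # {0: 1.0, 1: 0.0, 2: 0.0}

# Macro average: every class counts equally, regardless of its frequency.
macro_recall = sum(per_class.values()) / len(per_class)
print(round(macro_recall, 6))  # 0.333333
```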

roc_auc_score

roc_auc_score(
    y_true: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y_score: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
) -> float

Compute Area Under the Receiver Operating Characteristic Curve (ROC AUC) from prediction scores.

Examples:

>>> import bigframes.pandas as bpd
>>> import bigframes.ml.metrics
>>> bpd.options.display.progress_bar = None

>>> y_true = bpd.DataFrame([0, 0, 1, 1, 0, 1, 0, 1, 1, 1])
>>> y_score = bpd.DataFrame([0.1, 0.4, 0.35, 0.8, 0.65, 0.9, 0.5, 0.3, 0.6, 0.45])
>>> roc_auc_score = bigframes.ml.metrics.roc_auc_score(y_true, y_score)
>>> roc_auc_score
0.625

The input can be Series:

>>> df = bpd.DataFrame(
...     {"y_true": [0, 0, 1, 1, 0, 1, 0, 1, 1, 1],
...      "y_score": [0.1, 0.4, 0.35, 0.8, 0.65, 0.9, 0.5, 0.3, 0.6, 0.45],}
... )
>>> roc_auc_score = bigframes.ml.metrics.roc_auc_score(df["y_true"], df["y_score"])
>>> roc_auc_score
0.625

Parameters

Name    Description

y_true

True labels or binary label indicators. The binary and multiclass cases expect labels with shape (n_samples,) while the multilabel case expects binary label indicators with shape (n_samples, n_classes).

y_score

Target scores. In the binary case, this corresponds to an array of shape (n_samples,). Both probability estimates and non-thresholded decision values can be provided. The probability estimates correspond to the probability of the class with the greater label, i.e. estimator.classes_[1] and thus estimator.predict_proba(X, y)[:, 1]. The decision values correspond to the output of estimator.decision_function(X, y).
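
For binary labels, ROC AUC equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below uses that pairwise interpretation to reproduce the example value; it is not how bigframes computes the score, and the helper name pairwise_roc_auc is hypothetical.

```python
def pairwise_roc_auc(y_true, y_score):
    """Illustrative helper (not the bigframes API) for binary 0/1 labels."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # A correctly ordered pair counts 1, a tie counts 0.5, a misordered pair 0.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1, 0, 1, 0, 1, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.65, 0.9, 0.5, 0.3, 0.6, 0.45]
# 15 of the 6 * 4 = 24 positive/negative pairs are ordered correctly.
print(pairwise_roc_auc(y_true, y_score))  # 0.625
```

The pairwise loop is quadratic in the number of samples, so it is suitable only for small inputs like this one.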

roc_curve

roc_curve(
    y_true: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y_score: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    *,
    drop_intermediate: bool = True
) -> typing.Tuple[
    bigframes.series.Series, bigframes.series.Series, bigframes.series.Series
]

Compute Receiver operating characteristic (ROC).

Examples:

>>> import bigframes.pandas as bpd
>>> import bigframes.ml.metrics
>>> bpd.options.display.progress_bar = None

>>> y_true = bpd.DataFrame([1, 1, 2, 2])
>>> y_score = bpd.DataFrame([0.1, 0.4, 0.35, 0.8])
>>> fpr, tpr, thresholds = bigframes.ml.metrics.roc_curve(y_true, y_score, drop_intermediate=False)
>>> fpr
0    0.0
1    0.0
2    0.0
3    0.0
4    0.0
Name: fpr, dtype: Float64

>>> tpr
0         0.0
1    0.333333
2         0.5
3    0.833333
4         1.0
Name: tpr, dtype: Float64

>>> thresholds
0     inf
1     0.8
2     0.4
3    0.35
4     0.1
Name: thresholds, dtype: Float64

Parameters

Name    Description

y_true

Series or DataFrame of shape (n_samples,). True binary labels. If labels are not either {-1, 1} or {0, 1}, then pos_label should be explicitly given.

y_score

Series or DataFrame of shape (n_samples,). Target scores, which can be probability estimates of the positive class, confidence values, or non-thresholded measures of decisions (as returned by "decision_function" on some classifiers).
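
To show how the three returned sequences relate, the sketch below sweeps a threshold over the distinct scores and records the false positive rate and true positive rate at each step. It assumes binary labels in {0, 1} (unlike the {1, 2} labels in the example above, so its output is not meant to match that example), keeps every threshold, and uses a hypothetical helper name, roc_points.

```python
import math


def roc_points(y_true, y_score):
    """Illustrative helper (not the bigframes API) for binary 0/1 labels."""
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    # Start above every score so that nothing is predicted positive.
    thresholds = [math.inf] + sorted(set(y_score), reverse=True)
    fpr, tpr = [], []
    for thr in thresholds:
        # A sample is predicted positive when its score is >= the threshold.
        tp = sum(1 for t, s in zip(y_true, y_score) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, y_score) if s >= thr and t != 1)
        tpr.append(tp / n_pos)
        fpr.append(fp / n_neg)
    return fpr, tpr, thresholds


fpr, tpr, thresholds = roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(fpr)         # [0.0, 0.0, 0.5, 0.5, 1.0]
print(tpr)         # [0.0, 0.5, 0.5, 1.0, 1.0]
print(thresholds)  # [inf, 0.8, 0.4, 0.35, 0.1]
```

Applying the trapezoidal rule to these (fpr, tpr) points gives an area of 0.75 for this input.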