pandas_ml.confusion_matrix package

Submodules

class pandas_ml.confusion_matrix.abstract.ConfusionMatrixAbstract(y_true, y_pred, labels=None, display_sum=True, backend='matplotlib', true_name='Actual', pred_name='Predicted')

Bases: object

Abstract base class for confusion matrices

You should not instantiate this class directly; instantiate the ConfusionMatrix or BinaryConfusionMatrix class instead.
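
A minimal usage sketch (the class labels below are illustrative):

   >>> from pandas_ml import ConfusionMatrix
   >>> y_true = ['cat', 'dog', 'cat', 'cat', 'dog', 'bird']
   >>> y_pred = ['cat', 'dog', 'dog', 'cat', 'dog', 'cat']
   >>> cm = ConfusionMatrix(y_true, y_pred)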

PRED_NAME = 'Predicted'
TRUE_NAME = 'Actual'
binarize(select)

Returns a binary (one-vs-rest) confusion matrix for the selected class
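
For example, with the cm sketched above, a one-vs-rest view of a single class might look like:

   >>> bcm = cm.binarize('cat')   # 'cat' becomes the positive class
   >>> bcm.is_binary
   True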

classes

Returns classes (property)

classification_report

Returns a DataFrame with classification report

enlarge(select)

Enlarges the confusion matrix with new classes by adding empty rows and columns

get(actual=None, predicted=None)

Get confusion matrix value for a given actual class and a given predicted class

If only one parameter is given (actual or predicted), the diagonal value is returned, i.e. the value for actual == predicted
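
Continuing the sketch above:

   >>> cm.get('cat', 'dog')   # actual 'cat' predicted as 'dog'
   1
   >>> cm.get('cat')          # shorthand for cm.get('cat', 'cat')
   2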

is_binary

Returns False

len()

Returns the length (number of classes) of the confusion matrix. For example, 3 means a 3x3 matrix (3 rows, 3 columns)

max()

Returns the maximum value of the confusion matrix

min()

Returns the minimum value of the confusion matrix

plot(normalized=False, backend='matplotlib', ax=None, max_colors=10, **kwargs)

Plots confusion matrix
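
Typical usage with the matplotlib backend (a sketch):

   >>> import matplotlib.pyplot as plt
   >>> cm.plot(normalized=True)
   >>> plt.show()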

population

see also sum

pred

Returns the sum of predicted values for each class

print_stats(lst_stats=None)

Prints statistics

stats(lst_stats=None)

Returns an OrderedDict with statistics
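
For example, continuing the sketch above:

   >>> stats = cm.stats()   # OrderedDict of statistics
   >>> cm.print_stats()     # prints the same information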

stats_class

Returns a DataFrame with class statistics

stats_overall

Returns an OrderedDict with overall statistics

sum()

Returns the sum of the confusion matrix, also called the "population". It equals the number of elements in y_true (or, equivalently, y_pred)
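
Continuing the sketch above (three classes, six observations):

   >>> cm.len()   # a 3x3 matrix
   3
   >>> cm.sum()   # population, equal to len(y_true)
   6
   >>> cm.max(), cm.min()
   (2, 0)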

title

Returns title

to_array(normalized=False, sum=False)

Returns a Numpy Array

to_dataframe(normalized=False, calc_sum=False, sum_label='__all__')

Returns a Pandas DataFrame
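
For example:

   >>> df = cm.to_dataframe()                       # raw counts
   >>> df_norm = cm.to_dataframe(normalized=True)   # row-normalized view
   >>> arr = cm.to_array()                          # same data as a NumPy array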

toarray(*args, **kwargs)

see to_array

true

Returns the sum of actual (true) values for each class

y_pred(func=None)
y_true(func=None)
class pandas_ml.confusion_matrix.bcm.BinaryConfusionMatrix(*args, **kwargs)

Bases: pandas_ml.confusion_matrix.abstract.ConfusionMatrixAbstract

Binary confusion matrix class
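
A minimal sketch using boolean arrays (the values below are illustrative):

   >>> from pandas_ml import BinaryConfusionMatrix
   >>> y_true = [True, True, False, False, True, False]
   >>> y_pred = [True, False, False, True, True, False]
   >>> bcm = BinaryConfusionMatrix(y_true, y_pred)
   >>> bcm.TP, bcm.FP, bcm.FN, bcm.TN
   (2, 1, 1, 2)
   >>> bcm.ACC   # (TP + TN) / population = 4 / 6, about 0.667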

ACC

accuracy (ACC): ACC = (TP + TN) / (P + N) = (TP + TN) / TotalPopulation

DOR

Diagnostic odds ratio (DOR) = LR+ / LR−

F1_score

F1 score is the harmonic mean of precision and sensitivity: F1 = 2 TP / (2 TP + FP + FN), equivalently F1 = 2 * (precision * recall) / (precision + recall)

FDR

false discovery rate (FDR): FDR = FP / (FP + TP) = 1 - PPV

FN

false negative (FN), eqv. with miss or Type II error

FNR

miss rate or false negative rate (FNR): FNR = FN / P = FN / (FN + TP)

FOR

false omission rate (FOR): FOR = FN / NegativeTest

FP

false positive (FP), eqv. with false alarm or Type I error

FPR

false positive rate (FPR), eqv. with fall-out: FPR = FP / N = FP / (FP + TN)

LRN

Negative likelihood ratio (LR-) = FNR / TNR

LRP

Positive likelihood ratio (LR+) = TPR / FPR

MCC

Matthews correlation coefficient (MCC): MCC = (TP*TN - FP*FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

N

Condition negative

NPV

negative predictive value (NPV): NPV = TN / (TN + FN)

NegativeTest

test outcome negative: NegativeTest = TN + FN

P

Condition positive, eqv. with support

PPV

positive predictive value (PPV), eqv. with precision: PPV = TP / (TP + FP) = TP / PositiveTest

PositiveTest

test outcome positive: PositiveTest = TP + FP

SPC

same as TNR

TN

true negative (TN), eqv. with correct rejection

TNR

specificity (SPC) or true negative rate (TNR): SPC = TN / N = TN / (FP + TN)

TP

true positive (TP), eqv. with hit

TPR

true positive rate (TPR), eqv. with hit rate, recall, and sensitivity: TPR = TP / P = TP / (TP + FN)

dict_class(reversed=False)
classmethod help()

Returns a DataFrame reminder about terms:

* TN: True Negative
* FP: False Positive
* FN: False Negative
* TP: True Positive
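
Being a classmethod, it can be called without an instance:

   >>> from pandas_ml import BinaryConfusionMatrix
   >>> BinaryConfusionMatrix.help()   # DataFrame reminding the TN/FP/FN/TP layout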

hit

same as TP

informedness

Informedness = Sensitivity + Specificity - 1

inverse()

Inverts a binary confusion matrix, swapping the classes: False -> True, True -> False
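
A sketch with the bcm built above, assuming inverse() returns the inverted matrix; swapping the classes exchanges the roles of TP and TN:

   >>> inv = bcm.inverse()
   >>> inv.TP == bcm.TN
   True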

is_binary

Returns True

markedness

Markedness = Precision + NPV - 1

neg_class

Returns the negative class. If the BinaryConfusionMatrix was instantiated using y_true and y_pred as arrays of booleans, it returns False; otherwise it returns the name (string) of the negative class

pos_class

Returns the positive class. If the BinaryConfusionMatrix was instantiated using y_true and y_pred as arrays of booleans, it returns True; otherwise it returns the name (string) of the positive class

precision

same as PPV

prevalence

Prevalence = P / TotalPopulation

recall

same as TPR

sensitivity

same as TPR

specificity

same as TNR

stats(lst_stats=None)

Returns an ordered dictionary of statistics

support

same as P

y_pred(to_bool=False)
y_true(to_bool=False)
class pandas_ml.confusion_matrix.cm.ConfusionMatrix(y_true, y_pred, labels=None, display_sum=True, backend='matplotlib', true_name='Actual', pred_name='Predicted')

Bases: pandas_ml.confusion_matrix.abstract.ConfusionMatrixAbstract

class pandas_ml.confusion_matrix.cm.LabeledConfusionMatrix(y_true, y_pred, labels=None, display_sum=True, backend='matplotlib', true_name='Actual', pred_name='Predicted')

Bases: pandas_ml.confusion_matrix.abstract.ConfusionMatrixAbstract

Confusion matrix class (not binary)

pandas_ml.confusion_matrix.stats.binom_interval(success, total, confint=0.95)

Computes a two-sided binomial confidence interval, based on R's binom.test.
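
For example, for 5 successes in 10 trials (the approximate bounds below assume the Clopper-Pearson construction used by R's binom.test):

   >>> from pandas_ml.confusion_matrix.stats import binom_interval
   >>> low, up = binom_interval(5, 10)   # roughly (0.187, 0.813) at 95% confidence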

pandas_ml.confusion_matrix.stats.choose(n, k)

A fast way to calculate binomial coefficients by Andrew Dalke (contrib).
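
Dalke's method multiplies numerator and denominator terms incrementally instead of computing full factorials; a sketch of that style of implementation (not necessarily the exact code in this module):

   def choose(n, k):
       """Binomial coefficient C(n, k) without computing full factorials."""
       if not 0 <= k <= n:
           return 0
       k = min(k, n - k)        # symmetry: C(n, k) == C(n, n - k)
       num, den = 1, 1
       for t in range(1, k + 1):
           num *= n - t + 1     # n * (n-1) * ... * (n-k+1)
           den *= t             # builds up k!
       return num // den        # division is exact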

pandas_ml.confusion_matrix.stats.class_agreement(df)

Inspired by the R package e1071 (matchClasses.R, function classAgreement)

pandas_ml.confusion_matrix.stats.prop_test(df)

Inspired by the R package caret (confusionMatrix.R)

Module contents