Confusion Matrix in Machine Learning

Sandeep Sharma
3 min read · May 2, 2022


What is a Confusion Matrix?

A confusion matrix is a performance measurement technique for machine learning classification models, usually represented in the form of a matrix.

For a binary classifier it is a two-dimensional matrix with two rows and two columns, where the rows represent the predicted values and the columns the actual values.

For a binary classification problem, the target variable takes two values, called the actual values, which can be represented as 1 and 0, True or False, or Yes or No.
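
To make this concrete, here is a minimal sketch with made-up labels using scikit-learn's confusion_matrix. Note that scikit-learn lays the matrix out with actual values on the rows and predicted values on the columns, i.e. the transpose of the layout described above.

```python
# A small, made-up example of building a confusion matrix with scikit-learn.
from sklearn.metrics import confusion_matrix

y_actual    = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]  # actual values
y_predicted = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]  # predicted values

cm = confusion_matrix(y_actual, y_predicted)
print(cm)
# [[4 1]    <- actual 0: 4 TN, 1 FP
#  [1 4]]   <- actual 1: 1 FN, 4 TP
```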

Terminology in a Confusion Matrix:

True Positives (TP), True Negatives (TN), False Positives (FP), False Negatives (FN)

True Positives (TP): the count of data samples whose actual value is 1 and whose predicted value is also 1.
#Positive values being predicted correctly as Positive values.

True Negatives (TN): the count of data samples whose actual value is 0 and whose predicted value is also 0.
#Negative values being predicted correctly as Negative values.

False Positives (FP) / Type 1 Error: the count of data samples whose actual value is 0 but whose predicted value is 1.
#Negative values being predicted incorrectly as Positive values.

False Negatives (FN) / Type 2 Error: the count of data samples whose actual value is 1 but whose predicted value is 0.
#Positive values being predicted incorrectly as Negative values.
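
Continuing the same made-up sketch, the four counts can be unpacked directly from the 2x2 matrix; scikit-learn's ravel() returns them in the order TN, FP, FN, TP:

```python
# Continuing the sketch above: unpack the four counts from the 2x2 matrix.
tn, fp, fn, tp = cm.ravel()  # sklearn orders them TN, FP, FN, TP
print(f"TP={tp}, TN={tn}, FP={fp} (Type 1 error), FN={fn} (Type 2 error)")
# TP=4, TN=4, FP=1 (Type 1 error), FN=1 (Type 2 error)
```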

A type I error occurs when the null hypothesis is true, but is rejected.
A type II error occurs when the null hypothesis is false, but is not rejected.

Type I and Type II Error

Other things we can pull from the confusion matrix:

Other metrics

Recall and Precision: — Recall and precision are perhaps the most commonly used measures of performance for predictive classifiers.

Recall (Sensitivity / True Positive Rate (TPR)): how well can you find all the positive cases that are actually in your data? That's RECALL!
In simple words, the proportion of TP among all actual positive values: Recall = TP / (TP + FN).

Precision (Positive Predictive Value): when you make a positive prediction, how often are you correct? That's PRECISION.
In simple words, the proportion of TP among all predicted positive values: Precision = TP / (TP + FP).
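
Using the same made-up counts as above, recall and precision can be computed by hand and cross-checked against scikit-learn's helpers:

```python
# Recall and precision from the counts above, cross-checked with sklearn.
from sklearn.metrics import recall_score, precision_score

recall    = tp / (tp + fn)   # TP / (TP + FN)
precision = tp / (tp + fp)   # TP / (TP + FP)
print(recall, precision)                        # 0.8 0.8
print(recall_score(y_actual, y_predicted))      # 0.8
print(precision_score(y_actual, y_predicted))   # 0.8
```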

Recall vs Precision

Recall and precision typically trade off against each other: pushing recall up tends to pull precision down, and pushing precision up tends to pull recall down. To balance the two we use the F1-Score, the harmonic mean of precision and recall: F1 = 2 × (Precision × Recall) / (Precision + Recall).

F1 Score
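
As a final sketch on the same made-up labels, the F1-Score can be computed by hand or with scikit-learn's f1_score:

```python
# F1-Score: harmonic mean of precision and recall.
from sklearn.metrics import f1_score

f1 = 2 * precision * recall / (precision + recall)
print(f1)                                  # 0.8
print(f1_score(y_actual, y_predicted))     # 0.8
```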

Thank you for reading. Links to other blogs: —

The Poisson Distribution
The Geometric and Exponential Distributions
Uniform Distribution
Normal Distribution
Binomial Distribution
Central Limit Theorem
10 alternatives for Cloud based Jupyter notebook!!

Sandeep Sharma

Manager Data Science — Coffee Lover — Machine Learning — Statistics — Management Consultant — Product Management — Business Analyst