Recall vs Precision in Confusion Matrix
The confusion matrix provides insight not only into the overall performance of a predictive model, but also into which classes are being predicted correctly, which incorrectly, and what types of errors are being made.
Precision and recall are defined in terms of the cells of the confusion matrix: true positives, false positives, true negatives, and false negatives.
Recall (Sensitivity / True Positive Rate, TPR): How well can you find all the Positive results actually in your data? That's RECALL!
In simple words, recall measures how many of the actual positives our model correctly identifies: Recall = TP / (TP + FN).
Precision (Positive Predictive Value, PPV): When you make a Positive prediction, how often are you correct? That's PRECISION.
In simple words, precision is the ratio of true positives to all predicted positives: Precision = TP / (TP + FP).
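To make the definitions concrete, here is a minimal sketch that computes both metrics straight from the confusion-matrix cells. It uses scikit-learn (an assumption on my part; any library or hand count works), and the labels are made up purely for illustration:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical ground truth and predictions (1 = positive, 0 = negative)
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

# For binary labels, confusion_matrix returns [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

recall = tp / (tp + fn)     # share of actual positives we found
precision = tp / (tp + fp)  # share of positive predictions that were right

print(f"TP={tp}, FP={fp}, FN={fn}, TN={tn}")
print(f"Recall    = {recall:.2f}")     # 2 / (2 + 2) = 0.50
print(f"Precision = {precision:.2f}")  # 2 / (2 + 1) ≈ 0.67
```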
When to use what?
Sometimes people don't know which metric to use: in which scenarios we need recall, and in which we should prefer precision.
Precision is a good measure when the cost of a false positive is high. For instance, in email spam detection, a false positive means that a non-spam email (actual negative) has been identified as spam (predicted positive). The user might lose important emails if the spam detection model's precision is not high.
Recall is a good measure when the cost of a false negative is high, and it is especially informative on imbalanced data. Recall captures how many of the actual positives our model identifies by labeling them positive (true positives). For instance, in fraud detection, if a fraudulent transaction (actual positive) is predicted as non-fraudulent (predicted negative), the consequences for the bank can be very bad.
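Here is a small sketch of that fraud scenario (the numbers are invented for illustration). A model can look perfectly precise while still missing most of the fraud; scikit-learn's `precision_score` and `recall_score` make the gap visible:

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical fraud labels: 1 = fraudulent, 0 = legitimate
y_true = [0] * 95 + [1] * 5            # 5 fraudulent transactions out of 100
y_pred = [0] * 95 + [1, 0, 0, 0, 0]    # model flags only one of them

print(precision_score(y_true, y_pred))  # 1.00 -> every flag was correct
print(recall_score(y_true, y_pred))     # 0.20 -> but 4 of 5 frauds slipped through
```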
Imbalanced data refers to datasets where the target class has an uneven distribution of observations, i.e. one class label has a very high number of observations and the other a very low number.
In imbalanced classification, the majority class is typically treated as the negative outcome (e.g. "No", "0", "False", or a negative test result), and the minority class as the positive outcome (e.g. "Yes", "1", "True", or a positive test result).
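A minimal sketch (again with made-up data) of why plain accuracy is misleading on imbalanced data while recall exposes the problem: a "model" that always predicts the majority class scores 99% accuracy yet catches zero positives:

```python
from sklearn.metrics import accuracy_score, recall_score

# Hypothetical imbalanced dataset: 1 positive among 100 observations
y_true = [0] * 99 + [1]
y_pred = [0] * 100   # always predict the majority (negative) class

print(accuracy_score(y_true, y_pred))  # 0.99 -> looks great
print(recall_score(y_true, y_pred))    # 0.00 -> finds no positives at all
```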
A few more rules of thumb:
→ For rare cancer data modeling, anything that doesn't account for false negatives is a crime. Recall is a better measure than precision.
→ For YouTube recommendations, false negatives are less of a concern. Precision is better here.
→ PREcision is to PREgnancy tests (a positive result had better be correct) as reCALL is to CALL centers (you want to reach every customer).
Thank you for reading.