Decision Tree and Its Types

Sandeep Sharma
4 min read · Nov 8, 2021

Q: — What is a Decision Tree?
A: — It is a set of rules used to classify data into categories. At its core, it is a greedy divide-and-conquer algorithm.

It is a non-parametric supervised learning method that can be used for both classification and regression tasks. A decision tree is a hierarchical tree structure that divides a large collection of records into smaller sets of classes by applying a sequence of simple decision rules.

It lays out the possible outcomes and paths, helping decision-makers visualize the big picture of the current situation.

[Figure: a decision tree that classifies whether a person is Fit or Unfit]
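To make the Fit/Unfit example concrete, here is a minimal sketch using scikit-learn (an assumption; the original post does not show code). The feature names and data points are invented for illustration, and the learned rules depend entirely on the data you feed in.

```python
# A minimal sketch: training a decision tree on a hypothetical Fit/Unfit dataset.
# Features and values are invented for illustration; requires scikit-learn.
from sklearn.tree import DecisionTreeClassifier, export_text

# Columns: age, eats_fast_food (0/1), exercises (0/1)
X = [
    [25, 0, 1],
    [30, 1, 0],
    [45, 1, 0],
    [35, 0, 1],
    [50, 0, 0],
    [28, 1, 1],
]
y = ["Fit", "Unfit", "Unfit", "Fit", "Unfit", "Fit"]

clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X, y)

# Print the learned decision rules as plain text
print(export_text(clf, feature_names=["age", "eats_fast_food", "exercises"]))
print(clf.predict([[40, 1, 0]]))  # likely ["Unfit"] on this toy data
```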

A few terminologies: —

Root Node: — The topmost node of the tree, where the first split is made.

Leaf/Terminal Nodes: — Final nodes. Any node with no children is a leaf; it carries the final class label (or predicted value).

Intermediate or Internal Nodes: — All nodes other than the root and the leaves.

Branch/Sub-tree: — A subsection of the entire tree.

Parent/Child Node: — A node that is divided into sub-nodes is called a parent node, whereas the sub-nodes are called child nodes.

Pruning: — Reducing the size and complexity of a tree by removing sub-nodes; in other words, a way to overcome overfitting (see the sketch below).
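As a hedged illustration of pruning, scikit-learn exposes cost-complexity pruning through the ccp_alpha parameter; the dataset and the alpha value below are arbitrary choices for demonstration, not part of the original post.

```python
# A minimal sketch of cost-complexity pruning via scikit-learn's ccp_alpha.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

unpruned = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=0.01).fit(X_tr, y_tr)

# Pruning shrinks the tree and usually narrows the train/test gap (less overfitting)
print(unpruned.tree_.node_count, unpruned.score(X_te, y_te))
print(pruned.tree_.node_count, pruned.score(X_te, y_te))
```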

Types: — ID3, C4.5, C5.0, CART

ID3 (Iterative Dichotomiser 3): — The ID3 decision tree algorithm uses Information Gain to decide the splitting points. To measure how much information we gain from a split, it uses entropy to calculate the homogeneity of a sample. It builds the decision tree with a top-down greedy approach.

In simple words, the top-down approach means that we start building the tree from the root, and the greedy approach means that at each iteration we select the feature that looks best at the present moment to create a node.

# ID3 only works with discrete or nominal data.

[Figure: pseudocode of the ID3 algorithm]

Entropy: — A measure of the amount of uncertainty in a dataset. Entropy controls how a decision tree decides to split the data, and therefore where it draws its boundaries.

In simple words, it measures how random our dataset is.

For a dataset S with classes i appearing in proportions p_i:

Entropy(S) = − Σ p_i · log₂(p_i)

An entropy of 0 means the sample is perfectly homogeneous; for two balanced classes it reaches its maximum of 1.
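A minimal pure-Python sketch of this formula (no external dependencies):

```python
# Entropy(S) = -sum(p_i * log2(p_i)) over the class proportions p_i
from math import log2
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

print(entropy(["Fit", "Fit", "Fit", "Fit"]))      # 0.0: perfectly homogeneous
print(entropy(["Fit", "Unfit", "Fit", "Unfit"]))  # 1.0: maximally random (two classes)
```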

Information Gain: — The decrease in entropy when a node is split on an attribute. ID3 splits on the attribute that yields the largest decrease.

For a split of S on attribute A with values v:

IG(S, A) = Entropy(S) − Σ (|S_v| / |S|) · Entropy(S_v)
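A minimal sketch of information gain for a discrete attribute, with the entropy helper repeated so the snippet is self-contained; the toy data is invented for illustration. The last lines show ID3's greedy step: pick the attribute with the highest gain.

```python
from math import log2
from collections import Counter

def entropy(labels):  # same formula as in the previous sketch
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attr_index, labels):
    """IG(S, A) = Entropy(S) - sum(|S_v|/|S| * Entropy(S_v)) over values v of A."""
    n = len(labels)
    # Partition the labels by the attribute's value
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr_index], []).append(label)
    weighted = sum(len(part) / n * entropy(part) for part in partitions.values())
    return entropy(labels) - weighted

# Hypothetical toy data: attribute 0 = "exercises", attribute 1 = "eats fast food"
rows = [["yes", "no"], ["yes", "yes"], ["no", "yes"], ["no", "no"]]
labels = ["Fit", "Fit", "Unfit", "Unfit"]

# ID3's greedy step: split on the attribute with the highest information gain
best = max(range(2), key=lambda i: information_gain(rows, i, labels))
print(best, information_gain(rows, best, labels))  # attribute 0, gain 1.0
```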

C4.5 and C5.0

C4.5 is the successor to ID3, and C5.0 is the successor to C4.5. The basic difference between C4.5 and C5.0 is that C5.0 is faster and can be multithreaded, but otherwise it generates essentially the same classifiers. Instead of raw information gain, C4.5 selects attributes using the gain ratio, which normalizes the gain to reduce the bias toward attributes with many distinct values.

# C4.5 and C5.0 work with both discrete and continuous data.

Link: — Is C5.0 Better Than C4.5

CART (Classification and Regression Trees): —

The most popular algorithm in the statistical community. It is very similar to C4.5, but differs in that it supports numerical target variables (regression) and does not compute rule sets. CART builds binary trees and uses Gini impurity as its splitting criterion.

Gini impurity is a measure of node purity: the probability of misclassifying a randomly chosen sample from the node if it were labeled at random according to the node's class distribution.

For a node with classes i in proportions p_i:

Gini(S) = 1 − Σ p_i²

Gini is 0 for a perfectly pure node and reaches 0.5 for two evenly mixed classes.
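A minimal pure-Python sketch of the Gini formula:

```python
# Gini(S) = 1 - sum(p_i^2) over the class proportions p_i
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

print(gini(["Fit", "Fit", "Fit"]))             # 0.0: pure node
print(gini(["Fit", "Unfit", "Fit", "Unfit"]))  # 0.5: maximally impure (two classes)
```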

Comparison: —

Algorithm | Splitting criterion | Data types | Tasks
ID3 | Information gain | Discrete/nominal only | Classification
C4.5 / C5.0 | Gain ratio | Discrete and continuous | Classification
CART | Gini impurity | Discrete and continuous | Classification and regression

Points to remember: —

· The attribute to split on is selected using:

· Information gain in ID3

· Gain ratio in C4.5

· Gini or any other general function in CART

· ID3 only works with discrete or nominal data.

· C4.5 and C5.0 work with both discrete and continuous data.
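As a closing sketch, scikit-learn's DecisionTreeClassifier implements an optimized CART-style tree and lets you choose the splitting criterion, so the Gini and entropy criteria above can be compared directly; the dataset choice here is arbitrary.

```python
# Comparing splitting criteria in scikit-learn's CART-style trees.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

for criterion in ("gini", "entropy"):
    clf = DecisionTreeClassifier(criterion=criterion, random_state=0).fit(X, y)
    # The two criteria can grow different trees on the same data
    print(criterion, clf.get_depth(), clf.tree_.node_count)
```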

Past Blogs: —

Ensemble Techniques in Machine Learning
10 alternatives for Cloud based Jupyter notebook
Statistics — Univariate, Bivariate and Multivariate Analysis
Number System in Python

Thank you for reading this article!!

