Hey folks! In this article, we will be having a look at one of the most interesting concepts in the domain of Machine Learning and Artificial Intelligence โ **Decision Tree**.

Machine Learning is a vast domain that contains numerous algorithms to train the data on and predict the outcome of a model and thus increase the business standards and strategies.

Decision Tree delivers insights about the data in terms of predictions and probabilistic outcomes.

So, let us begin!

## What is a Decision Tree?

`Decision Tree`

is a Machine Learning Algorithm that makes use of a model of decisions and provides an outcome/prediction of an event in terms of chances or probabilities.

It is a non-parametric and predictive algorithm that delivers the outcome based on the modeling of certain decisions/rules framed from observing the traits in the data.

Being a supervised Machine Learning algorithm, Decision Trees **learn from the historic data**. So, the algorithm trains and builds the model of decisions based on it, and then predicts the outcome.

Decision Trees can perform either of the following tasks:

`Classification`

of a record based on probabilities to which category the records belongs to.`Estimation`

of the value of a certain target value.

We will learn more about the above tasks in the upcoming section.

In programming terms, a Decision Tree represents a flowchart-resembling structure which delivers the outcome based on the probabilistic conditions.

- The
`internal node`

represents the`condition`

on the attributes/variables. - Every
`branch`

represents the`outcome value`

of the condition. - The
`leaf nodes`

gives information about the`decision`

to be taken after resolving all the conditions of every attribute.

Decision Tree uses various algorithms such as `ID3`

, `CART`

, `C5.0`

, etc to identify the best attribute to be placed as the root node value that signifies the best homogeneous set of data variables.

It begins with the comparison between the root node and the attributes of the tree. The comparison value evaluates the model of decisions. This continues until it reaches the leaf node with the predicted outcome classification or estimation value.

Now, let us have a look at the different types of Decision Trees in detail.

## Types of Decision Tree

Decision Tree predicts the following types of values:

**Classification**: It predicts the class of the target value for the categorical data variables. For example, prediction if the attribute gives the outcome as YES or NO.**Regression**: It predicts the outcome value of the target variable for the numeric/continuous variables. For example, prediction of the Bike Rental Count in the near time.

## Assumptions of a Decision Tree

- Before the first iteration, i.e. before defining the model of decisions, the entire training dataset is assumed as the value for the root node.
- The data values/attributes are distributed recursively.
- A statistical approach is assumed to be used to finalize the order of placing the attribute as root/attribute node values.

## Implementation of Decision Tree using Python

Having understood the working of Decision Trees, let us now implement the same in Python.

**Please visit the below link to find the entire dataset.**

This data set contains a target variable โ โcntโ. It is a regression problem because here we have to predict the count of customer who would rent a bike in the near time based on certain conditions.

In the below piece of code, we have imported the necessary libraries such as os, numpy, pandas, sklearn, etc.

1 2 3 4 5 6 |
import os import pandas as pd import numpy as np from sklearn.model_selection import train_test_split import numpy as np from sklearn.tree import DecisionTreeRegressor |

Now, we have defined a function to judge the error value of the performance of the Decision Tree.

We have used `MAPE error metrics`

for this algorithm as this is a regression evaluation problem.

1 2 3 |
def MAPE(Y_actual,Y_Predicted): Mape = np.mean(np.abs((Y_actual - Y_Predicted)/Y_actual))*100 return Mape |

Taking it ahead, we segregate the dataset into two data sets comprising of independent and dependent variables. Further, we use the `train_test_split()`

function to divide the dataset into training and testing data.

1 2 3 4 5 |
bike = pd.read_csv("Bike.csv") X = bike.drop(['cnt'],axis=1) Y = bike['cnt'] # Splitting the dataset into 80% training data and 20% testing data. X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=.20, random_state=0) |

Now is the time to apply the decision tree algorithm. We used `DecisionTreeRegressor()`

function to build and train the model.

Further, we predict the values of the target variable in the testing dataset using the predict() function.

After this, we calculate the MAPE value (error value) and subtract it from 100 to get the accuracy of the model.

1 2 3 4 5 6 7 |
DT_model = DecisionTreeRegressor(max_depth=5).fit(X_train,Y_train) DT_predict = DT_model.predict(X_test) #Predictions on Testing data DT_MAPE = MAPE(Y_test,DT_predict) Accuracy_DT = 100 - DT_MAPE print("MAPE: ",DT_MAPE) print('Accuracy of Decision Tree model: {:0.2f}%.'.format(Accuracy_DT)) print("Predicted values:n",DT_predict) |

**Output:**

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 |
<span style="color: #008000;"><strong>MAPE: 18.076637888252062 Accuracy of Decision Tree model: 81.92%. Predicted values: [3488.27906977 4623.46721311 2424. 4623.46721311 4883.19047619 6795.11 7686.75 1728.12121212 2602.66666667 2755.17391304 4623.46721311 4623.46721311 1728.12121212 6795.11 5971.14285714 1249.52 4623.46721311 6795.11 3488.27906977 5971.14285714 3811.63636364 2424. 1249.52 6795.11 6795.11 3936.34375 3978.5 1728.12121212 6795.11 1728.12121212 3488.27906977 6795.11 6795.11 5971.14285714 1728.12121212 4623.46721311 1249.52 3936.34375 3219.58333333 7686.75 4623.46721311 2679.5 6795.11 2755.17391304 6060.78571429 6795.11 4623.46721311 6795.11 6795.11 4623.46721311 4883.19047619 4623.46721311 2602.66666667 4623.46721311 7686.75 6795.11 7099.5 1728.12121212 6795.11 1249.52 4623.46721311 6795.11 4623.46721311 1728.12121212 4883.19047619 4623.46721311 6795.11 6795.11 4623.46721311 1728.12121212 4623.46721311 4623.46721311 1178.5 6795.11 4623.46721311 6795.11 6795.11 1728.12121212 3488.27906977 6795.11 5971.14285714 6060.78571429 4623.46721311 3936.34375 1249.52 1728.12121212 4676.33333333 3936.34375 4623.46721311 6795.11 4623.46721311 4623.46721311 6795.11 5971.14285714 4883.19047619 4883.19047619 3488.27906977 6795.11 6795.11 6795.11 3811.63636364 6795.11 6795.11 6060.78571429 4623.46721311 1728.12121212 4623.46721311 6795.11 6795.11 4623.46721311 1728.12121212 6795.11 1728.12121212 3219.58333333 1728.12121212 6795.11 1249.52 6795.11 2755.17391304 1728.12121212 6795.11 5971.14285714 1728.12121212 4623.46721311 4623.46721311 4676.33333333 6795.11 3936.34375 4883.19047619 3978.5 4623.46721311 4623.46721311 3488.27906977 6795.11 6795.11 3488.27906977 4623.46721311 3936.34375 2600. 3488.27906977 2755.17391304 6795.11 4883.19047619 3488.27906977] </strong></span> |

## Advantages of Decision Trees

- Decision Tree are quite
`useful in Exploratory Data Analysis and Predictive Analysis of data`

and helps find the relationship between the variables. - It implicitly
`selects the best variables`

for building the model. - Decision Tree is almost unaffected by Outlier values.
- Moreover, it is a non-parametric method i.e. it does not hold any change due to the memory distribution or structure of the classifier.

## Limitations of Decision Trees

- Decision Tree may lead to
`Overfitting of data`

i.e. it can create complex tree structures. - Decision Tree may
`create biased trees`

if the data is not balanced.

## Conclusion

By this, we have come to the end of this topic. Feel free to comment below, in case you come across any question.

Please find the link to my GitHub profile wherein you would find the entire code along with other machine learning algorithms.