"IBM AI Engineering certificate program on Coursera - Machine Learning with Python"
- https://uki.blogspot.com/2022/09/ibmaiengineering.html
- https://uki.blogspot.com/2022/10/classification.html
I am also maintaining a PRIVATE Jupyter notebook on GitHub:
All images, unless otherwise marked, are copyrighted by IBM Developer Skills Network.
Introduction to Decision Trees
https://www.coursera.org/learn/machine-learning-with-python/lecture/gDedK/introduction-to-decision-trees
What are Decision Trees?
A decision tree is built by splitting the training set into distinct nodes, so that each node contains all of, or most of, one category of the data.
- internal node - the test
- branch - the result of the test
- leaf node - the assigned classification
Building Decision Trees
How to create a decision tree?
Use recursive partitioning, splitting on the most predictive feature at each step:
- choose an attribute from the dataset
- calculate the significance of the attribute in splitting the data
- split the data based on the value of the best attribute
- go to each branch and repeat from step 1 with the remaining attributes
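The steps above can be sketched in plain Python. This is a minimal illustration, not the course's implementation: it assumes categorical features stored as dictionaries, uses the weighted entropy after a split as the "significance" measure, and the names `build_tree`, `predict`, and `weighted_entropy_after_split` are hypothetical helpers introduced here.

```python
import math
from collections import Counter


def entropy(labels):
    """Base-2 entropy of a non-empty list of class labels."""
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total)
               for c in Counter(labels).values())


def weighted_entropy_after_split(rows, labels, feature):
    """Entropy of each group created by `feature`, weighted by group size."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    total = len(labels)
    return sum(len(g) / total * entropy(g) for g in groups.values())


def build_tree(rows, labels, features):
    """Recursively partition the data (assumes categorical features)."""
    # Stop at a pure node, or when no attributes are left: emit a leaf
    # holding the majority class.
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]

    # Steps 1-2: score every candidate attribute. The lowest weighted
    # entropy after the split corresponds to the highest information gain.
    best = min(features,
               key=lambda f: weighted_entropy_after_split(rows, labels, f))

    # Step 3: split the data on the values of the best attribute.
    remaining = [f for f in features if f != best]
    branches = {}
    for value in set(row[best] for row in rows):
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows = [r for r, _ in subset]
        sub_labels = [l for _, l in subset]
        # Step 4: repeat on each branch with the remaining attributes.
        branches[value] = build_tree(sub_rows, sub_labels, remaining)
    return {"feature": best, "branches": branches}


def predict(tree, row):
    """Follow the branches until a leaf (a plain class label) is reached."""
    while isinstance(tree, dict):
        tree = tree["branches"][row[tree["feature"]]]
    return tree
```

The returned tree is a nested dictionary: internal nodes hold the tested feature, branches are keyed by the feature's values, and leaves are class labels. `predict` raises a `KeyError` for a feature value that was not seen during training; a fuller implementation would need a fallback for that case.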
A node is pure when all of its samples belong to the same category.
The impurity of a node is measured by the entropy of the data in it.
Entropy is the amount of randomness or uncertainty in the data. The lower the entropy, the less uniform the distribution of categories and the purer (more homogeneous) the node. A completely homogeneous node has entropy = 0; a node split evenly between two categories has entropy = 1.
Entropy is computed separately for each node and measures the amount of information disorder in it.
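Compute the entropy from the frequency table of the categories in the node:
$$ entropy = - p(A) \log_2(p(A)) - p(B) \log_2(p(B)) $$
Where p(A) and p(B) are the proportions of samples in the node that belong to categories A and B.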
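As a rough sketch, the entropy formula can be computed directly from a list of class labels. The function name `entropy` and the example label counts are assumptions made for illustration; the sum runs over however many categories appear, so the two-category formula above is the special case.

```python
import math
from collections import Counter


def entropy(labels):
    """Base-2 entropy of a list of class labels (0.0 for an empty list)."""
    total = len(labels)
    if total == 0:
        return 0.0
    # Frequency table: how many samples fall into each category.
    counts = Counter(labels)
    # Sum of -p * log2(p) over the categories present in the node.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical nodes, for illustration only.
pure_node = ["A", "A", "A", "A"]    # one category only
even_node = ["A", "A", "B", "B"]    # split 50/50
mixed_node = ["A"] * 5 + ["B"] * 9  # 5 of A, 9 of B

print(entropy(pure_node), entropy(even_node), round(entropy(mixed_node), 3))
```

The pure node gives an entropy of 0 and the evenly split node gives 1, matching the two extremes described above; the mixed node falls in between.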
Choose the attribute whose split gives the higher information gain.
$$ information \ gain = (entropy \ before \ the \ split) - (weighted \ entropy \ after \ the \ split) $$
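A small sketch of the information gain calculation, assuming the split is given as a list of label groups (one per branch). The function names and the sample counts below are made up for illustration; each branch's entropy is weighted by the fraction of the parent's samples that it receives.

```python
import math
from collections import Counter


def entropy(labels):
    """Base-2 entropy of a non-empty list of class labels."""
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total)
               for c in Counter(labels).values())


def information_gain(parent_labels, child_groups):
    """Entropy before the split minus the weighted entropy after it."""
    total = len(parent_labels)
    weighted_after = sum(len(group) / total * entropy(group)
                         for group in child_groups)
    return entropy(parent_labels) - weighted_after


# Hypothetical node with 5 samples of A and 9 of B, split two different ways.
parent = ["A"] * 5 + ["B"] * 9
split_1 = [["A"] * 4 + ["B"] * 3, ["A"] * 1 + ["B"] * 6]
split_2 = [["A"] * 3 + ["B"] * 5, ["A"] * 2 + ["B"] * 4]

gain_1 = information_gain(parent, split_1)
gain_2 = information_gain(parent, split_2)
print(round(gain_1, 3), round(gain_2, 3))
```

Here `gain_1` is larger than `gain_2`, so under these assumed counts the attribute behind `split_1` would be chosen for this node.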