Decision Trees - Machine Learning with Python - IBM AI Engineering certificate program on Coursera

NOTE: This post is a continuation of the series:
"IBM AI Engineering certificate program on Coursera - Machine Learning with Python"



I am also maintaining a PRIVATE Jupyter notebook on GitHub:



Please note that the mathematical formulas (LaTeX) do not render on mobile phones; to read this post, please use a desktop Chrome browser.

All images, unless otherwise marked, are copyrighted by IBM Developer Skills Network.



Introduction to Decision Trees


A decision tree is built by splitting the training set into distinct nodes, so that each node contains all of, or most of, one category of the data.





  • internal node - a test on an attribute
  • branch - the result of the test
  • leaf node - the assigned classification


Building Decision Trees

How to create a decision tree?

Use recursive partitioning, splitting on the most predictive feature at each step:
  1. choose an attribute from the dataset
  2. calculate the significance of the attribute in splitting the data
  3. split the data based on the value of the best attribute
  4. go back to step 1 for each resulting branch
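
The four steps above can be sketched in plain Python. This is a minimal illustration, not the course's implementation: the function names (`entropy`, `information_gain`, `build_tree`), the nested-dict tree representation, the stopping rule (pure node or no attributes left), and the toy patient data are all assumptions made for the example. Significance is measured here with information gain.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy before the split minus the weighted entropy after it."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    weighted = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - weighted


def build_tree(rows, labels, attributes):
    """Recursively partition the data; returns a nested dict or a leaf label."""
    # Assumed stopping rule: the node is pure or no attributes remain.
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]

    # Steps 1-2: score every candidate attribute and keep the best one.
    best = max(attributes, key=lambda a: information_gain(rows, labels, a))

    # Step 3: split the data on each value of the best attribute.
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in sorted({row[best] for row in rows}):
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows = [r for r, _ in subset]
        sub_labels = [l for _, l in subset]
        # Step 4: repeat the procedure on each branch.
        tree[best][value] = build_tree(sub_rows, sub_labels, remaining)
    return tree


# Hypothetical toy data: which drug (A or B) suited each patient.
rows = [
    {"sex": "F", "cholesterol": "high"},
    {"sex": "F", "cholesterol": "normal"},
    {"sex": "M", "cholesterol": "high"},
    {"sex": "M", "cholesterol": "normal"},
    {"sex": "M", "cholesterol": "high"},
    {"sex": "M", "cholesterol": "normal"},
]
labels = ["A", "B", "B", "B", "B", "B"]

tree = build_tree(rows, labels, ["cholesterol", "sex"])
print(tree)
```

On this toy data the sketch splits on `sex` first, because that attribute yields the larger information gain, and then splits the mixed `F` branch on `cholesterol`.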

A pure node is one whose samples all belong to the same category.
The impurity of a node is measured by the entropy of its data.
Entropy is the amount of information disorder (randomness or uncertainty) in a node. The lower the entropy, the less uniform the distribution and the purer (more homogeneous) the node. A completely homogeneous node has entropy = 0, and a node split evenly between two categories has entropy = 1.

Build a frequency table of the categories in the node and compute the entropy with the formula:

$$ \text{entropy} = - p(A) \log_2(p(A)) - p(B) \log_2(p(B)) $$

Where
  • p(A) and p(B) are the proportions of samples in the node that belong to category A and category B
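
As a quick check of the formula, here is a small Python sketch that computes the entropy of a two-category node from its frequency table. The function name `binary_entropy` and the sample counts are assumptions chosen for illustration; a term with zero count is treated as contributing 0, following the usual convention.

```python
from math import log2


def binary_entropy(count_a, count_b):
    """Entropy (in bits) of a node with count_a samples of A and count_b of B."""
    total = count_a + count_b
    result = 0.0
    for count in (count_a, count_b):
        if count > 0:  # convention: 0 * log2(0) contributes 0
            p = count / total
            result -= p * log2(p)
    return result


print(round(binary_entropy(7, 7), 3))   # evenly mixed node -> 1.0
print(round(binary_entropy(14, 0), 3))  # pure node -> 0.0
print(round(binary_entropy(5, 9), 3))   # mostly B -> 0.94
```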

Which tree has less entropy after splitting?
Compare the candidate splits and choose the one with the higher information gain, that is, the one that reduces entropy the most.

$$ \text{information gain} = (\text{entropy before the split}) - (\text{weighted entropy after the split}) $$
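
To make the comparison concrete, the following Python sketch evaluates the information gain of two candidate splits of the same node. The counts are hypothetical, and the helper names `entropy` and `information_gain` are assumptions for this example. Each child's entropy is weighted by the fraction of the parent's samples that fall into it.

```python
from math import log2


def entropy(counts):
    """Entropy (in bits) of a node given its per-category sample counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def information_gain(parent_counts, child_counts):
    """Entropy before the split minus the weighted entropy after the split."""
    total = sum(parent_counts)
    weighted_after = sum(
        sum(child) / total * entropy(child) for child in child_counts
    )
    return entropy(parent_counts) - weighted_after


# Hypothetical node: 5 samples of category A and 9 of category B.
parent = (5, 9)

# Two candidate splits, each listing (A, B) counts per child node.
split_1 = [(4, 3), (1, 6)]
split_2 = [(2, 6), (3, 3)]

gain_1 = information_gain(parent, split_1)
gain_2 = information_gain(parent, split_2)
print(f"split 1 gain: {gain_1:.3f}")
print(f"split 2 gain: {gain_2:.3f}")
print("better split:", "split 1" if gain_1 > gain_2 else "split 2")
```

With these counts the first split produces purer children, so it has the higher information gain and would be the one chosen.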






