Iterative Dichotomiser 3 (ID3)

ID3 - Pseudocode #1

  1. Calculate the entropy/information-gain of every attribute π‘Ž of the data set 𝑆
  2. Partition (β€œsplit”) the set 𝑆 into subsets using the attribute for which the resulting entropy after splitting is minimized; or, equivalently, information gain is maximum.
  3. Make a decision tree node containing that attribute
  4. Recurse on subsets using the remaining attributes

ID3 - Pseudocode #2

function ID3(S, A) {
	if (all of S are labeled 1) return leaf 1
	if (all of S are labeled 0) return leaf 0
	if (A = βˆ…) return leaf with value = label majority in S
	else:
		j = argmax_{i∊A} Gain(S,i)
		T1 = ID3({(x,y) ∊ S : x_j = 1}, A\{j})
		T2 = ID3({(x,y) ∊ S : x_j = 0}, A\{j})
		return T.left(T2).right(T1);
}

ID3 - Implementations of Gain Measure

ID3 - Pruning

GENERIC TREE PRUNING PROCEDURE
inputs:
	function 𝑓(𝑇,π‘š) which is the bound/estimate for the generalization error of a decision tree 𝑇, based on a sample of size π‘š
    tree 𝑇

foreach node 𝑗 in a bottom-up walk on 𝑇 (from leaves to root):
	find 𝑇' which minimies 𝑓(𝑇',π‘š), where 𝑇' is any of the following:
		the current tree after replacing node 𝑗 with a leaf 1
		the current tree after replacing node 𝑗 with a leaf 0
		the current tree after replacing node 𝑗 with its left subtree
		the current tree after replacing node 𝑗 with its right subtree
		the current tree
	let T := T'