Classification and Regression Tree (CART) - Regression Tree
  • a term introduced by Leo Breiman for decision tree algorithms that can be used for classification or regression predictive modeling problems

CART - Model Representation

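The CART model is a binary tree: each internal node tests one input variable against a split point, and each leaf node holds a prediction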
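As a minimal sketch of this representation, the tree can be written as linked nodes, with prediction as a walk from the root to a leaf. The `Node` class, the `predict` function, the hand-built example tree, and the convention that values less than or equal to the threshold go left are all assumptions made for illustration, not part of the original notes.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    # Internal nodes use feature/threshold/left/right; leaves use only value.
    feature: Optional[int] = None      # index of the input variable tested here
    threshold: Optional[float] = None  # split/cut point on that variable
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    value: Optional[float] = None      # prediction stored at a leaf

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def predict(node: Node, x: Sequence[float]) -> float:
    # Walk down the tree; go left when x[feature] <= threshold (assumed convention).
    while not node.is_leaf():
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.value


# Hypothetical hand-built tree: one split on variable 0 at 2.5.
tree = Node(feature=0, threshold=2.5, left=Node(value=10.0), right=Node(value=20.0))
print(predict(tree, [1.0]), predict(tree, [3.0]))
```

Because every internal node has exactly two children, prediction costs one comparison per level of the tree.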

CART - Model Learning/Training

the model is built through a process known as binary recursive partitioning: the data is split into two partitions (branches), and each partition is then split again into smaller groups as the method moves down each branch

this involves REPEATING the following processes:

  1. selecting an input variable among all input variables
  2. selecting a split/cut point on that variable

Both are selected by a greedy algorithm that minimizes the cost function (e.g. the sum of squared residuals). The process repeats until a predefined stopping criterion is met (e.g. a minimum number of training instances assigned to each leaf node of the tree)
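The greedy search and the recursion can be sketched as follows. This is an illustrative implementation under stated assumptions, not a reference one: the function names (`ssr`, `best_split`, `build`), the choice of midpoints between sorted distinct values as candidate cut points, the dictionary-based tree, and the `min_leaf` stopping rule are all hypothetical choices made for the example.

```python
from typing import List, Optional, Tuple


def ssr(ys: List[float]) -> float:
    # Sum of squared residuals around the mean of ys.
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def best_split(X: List[List[float]], y: List[float],
               min_leaf: int = 1) -> Optional[Tuple[int, float, float]]:
    # Greedy step: try every variable and every candidate cut point,
    # keep the (feature, threshold, cost) with the smallest total SSR.
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # assumed candidate: midpoint of adjacent values
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue  # split would violate the stopping criterion
            cost = ssr(left) + ssr(right)
            if best is None or cost < best[2]:
                best = (j, t, cost)
    return best


def build(X: List[List[float]], y: List[float], min_leaf: int = 1) -> dict:
    # Recursive partitioning: split, then repeat on each partition.
    split = best_split(X, y, min_leaf) if len(y) >= 2 * min_leaf else None
    if split is None:
        return {"value": sum(y) / len(y)}  # leaf predicts the mean target
    j, t, _ = split
    li = [i for i, row in enumerate(X) if row[j] <= t]
    ri = [i for i, row in enumerate(X) if row[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": build([X[i] for i in li], [y[i] for i in li], min_leaf),
        "right": build([X[i] for i in ri], [y[i] for i in ri], min_leaf),
    }


# Made-up toy data: one input variable, four training instances.
X = [[1.0], [2.0], [3.0], [4.0]]
y = [1.0, 1.0, 5.0, 5.0]
tree = build(X, y, min_leaf=2)
```

The search is greedy because each split is chosen only for its immediate reduction in cost; no split is revisited once the recursion moves on.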

CART - Model Tree Pruning

cost complexity pruning (similar in spirit to Adjusted R-Squared, in that it penalizes model complexity):

  • 𝑇𝑟𝑒𝑒 𝑆𝑐𝑜𝑟𝑒 = 𝑆𝑆𝑅 + 𝛼·𝑇

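where:

  • 𝑆𝑆𝑅 is the sum of squared residuals of the tree
  • 𝛼 is a tuning parameter that sets the penalty per leaf
  • 𝑇 is the number of leaf (terminal) nodes in the tree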
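A short sketch of how the tree score could be used to compare candidate subtrees. It assumes SSR is the sum of squared residuals, T is the number of leaf nodes, and 𝛼 is a non-negative tuning parameter; the `tree_score` function, the candidate names, and all numbers are made up for illustration.

```python
def tree_score(ssr: float, n_leaves: int, alpha: float) -> float:
    # Tree Score = SSR + alpha * T, with T taken as the number of leaves (assumption).
    return ssr + alpha * n_leaves


# Hypothetical candidates: name -> (SSR, number of leaves).
candidates = {"full": (100.0, 8), "pruned": (130.0, 3)}


def pick(alpha: float) -> str:
    # Return the candidate with the lowest tree score for this alpha.
    scores = {name: tree_score(s, t, alpha) for name, (s, t) in candidates.items()}
    return min(scores, key=scores.get)


print(pick(0.0))   # no complexity penalty
print(pick(10.0))  # each leaf costs 10
```

With 𝛼 = 0 the score reduces to SSR, so the largest tree is preferred; as 𝛼 grows, each extra leaf must reduce SSR by more than 𝛼 to be worth keeping.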

CART - Resources