loading

Decision Tree — Step-by-Step Visualization

mediumAIMLClassificationDecision Tree

Step through how a Decision Tree learns to classify data — watch it find the best split by minimizing Gini impurity at each node.

Algorithm Pattern

Recursive Binary Splitting on Best Feature Threshold

Key Idea

A Decision Tree greedily selects the split (feature + threshold) that most reduces impurity (Gini or entropy) at each step.

Step-by-Step Approach

  1. Compute the Gini impurity of the current node: 1 - Σ(p_i²) where p_i is the fraction of each class.
  2. For every possible split (feature ≤ threshold), compute the weighted Gini of the two child nodes.
  3. Choose the split with the lowest weighted Gini (highest information gain).
  4. Partition the data into left (≤ threshold) and right (> threshold) subsets.
  5. Recurse on each subset until a stopping condition is met (pure leaf, max depth, min samples).

Common Gotchas

  • Gini impurity of a pure node = 0 (best possible). A 50/50 split has Gini = 0.5 (worst).
  • Decision trees overfit easily — use max_depth or min_samples_leaf to regularize.
  • The greedy split is locally optimal but may not be globally optimal.

Related Problems