
Dropout — Step-by-Step Visualization

easy · AI/ML · Regularization · Deep Learning

Step through dropout regularization — watch neurons randomly deactivated during training, forcing the network to learn redundant representations.

Algorithm Pattern

Random Neuron Masking

Key Idea

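Dropout randomly zeros neuron activations during training forward passes, which discourages neurons from co-adapting. With n droppable neurons there are 2^n possible masks, so training implicitly samples from an ensemble of up to 2^n thinned networks that all share the same weights.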
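To make the 2^n count concrete, the sketch below enumerates every keep/drop mask for a small layer. The helper name `thinned_masks` is hypothetical and only for illustration; real training never enumerates masks, it samples one per forward pass.

```python
from itertools import product

def thinned_masks(n):
    """Return every keep/drop mask over n neurons (1 = keep, 0 = drop)."""
    return list(product((0, 1), repeat=n))

masks = thinned_masks(3)
print(len(masks))  # 8, i.e. 2**3 distinct thinned networks
```

Each tuple corresponds to one thinned network. Because all of them reuse the same weights, the ensemble costs no extra parameters.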

Step-by-Step Approach

  1. For each neuron, sample a mask bit from Bernoulli(p): 1 (keep) with probability p, 0 (drop) with probability 1 - p.
  2. Multiply activations element-wise by the mask.
  3. Scale kept neurons by 1/p (inverted dropout) to maintain expected magnitude.
  4. Backward pass: gradient flows only through kept neurons.
  5. At test time: use all neurons, no dropout, no scaling.
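The five steps above can be sketched in plain Python as follows. This is a minimal illustration on flat lists of floats, not a production implementation. The function names, the `keep_prob` parameter (the p of the steps above), and the `rng` argument are assumptions introduced here.

```python
import random

def dropout_forward(activations, keep_prob, training=True, rng=random):
    """Inverted dropout on a list of floats; returns (output, mask)."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    if not training:
        # Step 5: at test time use all neurons, no mask and no scaling.
        return list(activations), None
    # Step 1: one Bernoulli(keep_prob) sample per neuron (1 = keep, 0 = drop).
    mask = [1 if rng.random() < keep_prob else 0 for _ in activations]
    # Steps 2-3: apply the mask and rescale the survivors by 1/keep_prob.
    out = [a * m / keep_prob for a, m in zip(activations, mask)]
    return out, mask

def dropout_backward(grad_out, mask, keep_prob):
    """Step 4: gradients pass only through kept neurons, with the same scale."""
    return [g * m / keep_prob for g, m in zip(grad_out, mask)]

# Example usage with a seeded generator so runs are repeatable.
rng = random.Random(0)
out, mask = dropout_forward([1.0, 2.0, 3.0, 4.0], keep_prob=0.5, rng=rng)
grads = dropout_backward([1.0, 1.0, 1.0, 1.0], mask, keep_prob=0.5)
```

The mask from the forward pass is reused in the backward pass, so a dropped neuron receives exactly zero gradient for that step.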

Common Gotchas

  • Inverted dropout (scaling by 1/p during training) means the test-time code is identical to that of a network without dropout.
  • Dropout is ONLY active during training — never at inference.
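  • Too high a drop rate (1 - p above 0.5 in deep layers) can keep the network from learning. Note that p above is the keep probability; some libraries use p for the drop probability instead.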
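A quick way to see why inverted dropout needs no test-time correction is to average one rescaled activation over many sampled masks. The sketch below is an empirical check under assumed names (`mean_after_dropout`, `keep_prob`, `trials`, `seed` are all hypothetical), not a proof.

```python
import random

def mean_after_dropout(activation, keep_prob, trials, seed=0):
    """Average one activation over many inverted-dropout samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        keep = rng.random() < keep_prob
        # Kept: rescale by 1/keep_prob. Dropped: contributes zero.
        total += activation / keep_prob if keep else 0.0
    return total / trials

for keep_prob in (0.9, 0.5, 0.2):
    estimate = mean_after_dropout(2.0, keep_prob, trials=100_000)
    print(keep_prob, estimate)  # each estimate should land close to 2.0
```

The mean stays near the original activation for every keep probability, but the variance of the rescaled value is a^2 (1 - p) / p, which grows as p shrinks. That added noise is one reason very high drop rates make training harder.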
