Question 1

What is the algorithm pattern for Dropout?

Accepted Answer

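Random Neuron Masking: Dropout randomly zeros neurons during training forward passes, which prevents neurons from co-adapting. Because each forward pass samples a different mask, it implicitly trains an ensemble of up to 2^n thinned networks with shared weights, where n is the number of neurons that can be dropped.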

Question 2

How do you solve Dropout step by step?

Accepted Answer

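For each neuron, sample a mask value from Bernoulli(p), where p is the probability of keeping the neuron. Multiply the activations element-wise by the mask. Scale the kept activations by 1/p (inverted dropout) so the expected activation is unchanged. Backward pass: the gradient flows only through kept neurons, scaled by the same 1/p. At test time: use all neurons, with no dropout and no scaling. Note that some libraries use the opposite convention: in PyTorch's nn.Dropout, p is the probability of dropping a neuron.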
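The steps above can be sketched in plain Python. This is a minimal illustration rather than a reference implementation: the function names `dropout_forward` and `dropout_backward`, the flat-list representation of activations, and the `rng` argument are assumptions made for the example, and `keep_prob` plays the role of p (the probability of keeping a neuron).

```python
import random


def dropout_forward(activations, keep_prob, rng, training=True):
    """Inverted dropout on a flat list of activations.

    keep_prob is the probability of KEEPING a unit (p in the text above).
    Returns (outputs, mask); mask is None when training is False.
    """
    if not training:
        # Test time: use all units, no mask and no scaling.
        return list(activations), None
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    # Each mask entry is 0 for a dropped unit and 1/keep_prob for a kept one,
    # so masking and scaling happen in a single multiplication.
    mask = [
        (1.0 / keep_prob) if rng.random() < keep_prob else 0.0
        for _ in activations
    ]
    outputs = [a * m for a, m in zip(activations, mask)]
    return outputs, mask


def dropout_backward(grad_outputs, mask):
    """Gradients pass only through kept units, with the same 1/keep_prob scale."""
    return [g * m for g, m in zip(grad_outputs, mask)]


# Usage: the mean of the outputs stays close to the mean of the inputs.
rng = random.Random(0)
outputs, mask = dropout_forward([1.0] * 10000, keep_prob=0.8, rng=rng)
mean_output = sum(outputs) / len(outputs)
```

Folding the 1/p factor into the mask means the backward pass can reuse the stored mask directly, which is one common way to keep the forward and backward scaling consistent.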

Question 3

What are common mistakes when solving Dropout?

Accepted Answer

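Forgetting the scaling or applying it in the wrong place: inverted dropout scales during training, which means the test-time code is identical to a network without dropout. Leaving dropout on at inference: dropout is ONLY active during training, never at inference. Setting the drop rate too high: a drop rate (the probability of dropping a neuron, i.e. 1 minus the keep probability) above about 0.5 in deep layers can prevent the network from learning.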
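A small sketch of how a training/inference switch guards against the second mistake. The `Dropout` class below is hypothetical and written only for illustration. It takes a drop rate rather than a keep probability, and converts between the two explicitly.

```python
import random


class Dropout:
    """Hypothetical minimal layer; drop_rate is the probability of DROPPING a unit."""

    def __init__(self, drop_rate, seed=None):
        if not 0.0 <= drop_rate < 1.0:
            raise ValueError("drop_rate must be in [0, 1)")
        self.keep_prob = 1.0 - drop_rate  # convert to the keep-probability convention
        self.training = True
        self._rng = random.Random(seed)

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def __call__(self, xs):
        if not self.training:
            # Inference: identity, so repeated calls give identical outputs.
            return list(xs)
        scale = 1.0 / self.keep_prob
        return [
            x * scale if self._rng.random() < self.keep_prob else 0.0
            for x in xs
        ]


layer = Dropout(drop_rate=0.5, seed=0)
x = [1.0] * 8

train_out = layer(x)        # each entry is either 0.0 or 2.0
eval_out = layer.eval()(x)  # identical to x
```

If the layer were left in training mode at inference, two calls on the same input could return different outputs, which is usually the first visible symptom of this mistake.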
