Advanced Deep Learning Interview Questions #11 - The Bias-Weight Divergence Trap
When biases update but weights freeze, the culprit is a forward-pass collapse that most engineers mistakenly attribute to gradient instability.
You’re in a Senior ML Engineer interview at DeepMind. The interviewer sets a trap:
“During debugging, you notice your biases are updating rapidly, but your weight matrices are completely frozen, despite both sharing the exact same upstream gradient vector from the next layer. Looking at the isolated backprop equations for weight gradients versus bias gradients, what explains this behavior?”
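The trap dissolves once you write the two gradients down. For a linear layer z = Wx + b, backprop gives ∂L/∂W = δ·xᵀ and ∂L/∂b = δ: both receive the same upstream δ, but only the weight gradient is scaled by the layer’s input x. If the upstream activations are all zero (for example, a dead ReLU layer), the weight gradients vanish exactly while the bias gradients pass through untouched. A minimal NumPy sketch, with illustrative layer sizes and a synthetic squared-norm loss:

```python
import numpy as np

rng = np.random.default_rng(0)

# Upstream layer output after a "dead" ReLU: every activation is zero.
x = np.zeros((4, 1))               # input to this linear layer, shape (in, 1)
W = rng.standard_normal((3, 4))    # weights, shape (out, in)
b = rng.standard_normal((3, 1))    # biases, shape (out, 1)

z = W @ x + b                      # forward pass
delta = 2 * z                      # upstream gradient dL/dz for L = ||z||^2

# Backprop: the SAME delta feeds both parameter gradients...
dW = delta @ x.T                   # dL/dW = delta @ x.T  -> scaled by the input
db = delta                         # dL/db = delta        -> independent of the input

print("max |dW|:", np.abs(dW).max())   # exactly 0.0: the weights are frozen
print("max |db|:", np.abs(db).max())   # nonzero: the biases keep updating
```

So “biases move, weights don’t” is a signature of dead inputs to the layer, and the fix lives upstream (initialization, learning rate, activation choice), not in the gradient computation itself.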


