AI Interview Prep

Computer Vision Interview Questions #5 – The Dead ReLU Trap

Why lowering the learning rate can't resurrect dead neurons - and how architectural gradient flow actually fixes it.

Hao Hoang
Jan 06, 2026 ∙ Paid

You’re in a Senior ML Interview at OpenAI. The interviewer sets a trap.

They show you a TensorBoard graph: 40% of your hidden layer neurons are outputting exactly zero. They have stopped updating entirely.
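In practice you would confirm this directly from the activations. Here is a minimal sketch in PyTorch (the function name, the loader, and the assumption of a fully-connected hidden layer are mine, not from the post): it hooks the ReLU module and counts the fraction of units that output exactly zero on every example it sees.

```python
import torch

def dead_unit_fraction(model, relu_layer, loader, device="cpu"):
    """Fraction of units in `relu_layer` that output exactly 0 for every input.

    Assumes a fully-connected hidden layer, i.e. the ReLU output has
    shape (batch, num_units). Names here are illustrative.
    """
    acts = []
    hook = relu_layer.register_forward_hook(lambda m, i, o: acts.append(o.detach()))
    model.eval()
    with torch.no_grad():
        for x, _ in loader:
            model(x.to(device))
    hook.remove()

    all_acts = torch.cat(acts, dim=0)   # (N, num_units)
    dead = (all_acts == 0).all(dim=0)   # a unit is "dead" if it is 0 on every example
    return dead.float().mean().item()
```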

The question: “How do you fix this?”

90% of candidates walk right into the trap.

They say: “It’s a learning rate issue. I would lower the learning rate to stop the weights from jumping too far.”

This answer is technically “safe,” but it fails the production test. Why? Because lowering the learning rate is preventative, not curative. It doesn’t solve the structural failure that has already occurred.

The reality: this isn’t a hyperparameter issue. It’s The Hard-Zero Lockout.

1️⃣ A large gradient update pushes a neuron’s weights such that w*x + b becomes negative for all inputs in your dataset.

2️⃣ Standard ReLU is max(0, x). If the input is negative, the output is 0.

3️⃣ Crucially: in that flat region, the gradient of ReLU is exactly 0. No gradient flows back through the neuron, its weights never receive another update, and no learning rate — however small — can bring it back. (The sketch below walks through the lockout.)
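A minimal sketch of the lockout itself, in PyTorch with toy numbers of my own choosing: once the bias has been pushed far enough negative that every pre-activation is below zero, the backward pass delivers exactly zero gradient to the neuron’s parameters, so no optimizer step at any learning rate can move them.

```python
import torch

torch.manual_seed(0)

# One "neuron": w*x + b followed by ReLU, evaluated on a small toy batch.
x = torch.randn(64, 8)                       # toy inputs
w = torch.randn(8, requires_grad=True)
b = torch.tensor(-50.0, requires_grad=True)  # pushed far negative by a bad update

pre = x @ w + b          # pre-activation: negative for every example
out = torch.relu(pre)    # hard zero everywhere
loss = out.sum()         # any downstream loss that depends on `out`
loss.backward()

print(out.abs().max())    # tensor(0.)  -> the neuron is silent
print(w.grad.abs().max()) # tensor(0.)  -> zero gradient: no learning rate can revive it
print(b.grad)             # tensor(0.)
```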
