Advanced Deep Learning Interview Questions #9 - The Local Minimum Trap
A flat gradient isn’t success, it’s often a saddle where curvature, not slope, determines whether your model is actually stuck.
You’re in a Senior ML Engineer interview at Google DeepMind. The interviewer sets a trap:
“You’re training a massive 50-billion parameter MLP on a cluster of H100 GPUs. Your monitoring tool shows the gradient norm has hit absolute zero, but your loss is still unacceptably high. What just happened?”
95% of candidates walk right into it.
Most candidates imm…


