LLM System Design Interview #24 - Why Backprop Is 3× Harder Than You Think
Why intern engineers underestimate training FLOPs by a factor of three - and how the two distinct gradient calculations in backprop make the backward pass twice as expensive as the forward pass.
You’re in a Machine Learning Systems interview at Google DeepMind and the interviewer asks:
“You’re asked to budget a training run. An intern engineer estimates the total FLOPs as 2 * num_params * num_tokens, arguing that the backward pass is roughly symmetrical to the forward pass. Why does this estimate understate the true cost by a factor of three, and what two distinct gradient calculations make the backward pass twice as expensive as the forward pass?”
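Before answering, it helps to make the accounting concrete. A minimal sketch of the standard "6ND" FLOPs rule (the function name and the 7B/1T example are illustrative, not from any specific training run): the forward pass costs about 2 FLOPs per parameter per token (one multiply plus one add per weight), while the backward pass performs two matmuls of that same size per layer - one for gradients with respect to activations, one for gradients with respect to weights.

```python
def training_flops(num_params: int, num_tokens: int) -> int:
    """Standard transformer training-FLOPs estimate: 6 * N * D."""
    forward = 2 * num_params * num_tokens               # y = W x  (multiply + add per weight)
    backward_wrt_inputs = 2 * num_params * num_tokens   # dL/dx = W^T @ dL/dy
    backward_wrt_weights = 2 * num_params * num_tokens  # dL/dW = dL/dy @ x^T
    return forward + backward_wrt_inputs + backward_wrt_weights

# Intern's 2ND estimate vs. the 6ND rule, e.g. a hypothetical 7B model on 1T tokens
n_params, n_tokens = 7 * 10**9, 10**12
intern_estimate = 2 * n_params * n_tokens
actual = training_flops(n_params, n_tokens)
print(actual / intern_estimate)  # → 3.0
```

The ratio is exactly 3: the intern counted only the forward pass's 2ND, while the two backward-pass matmuls contribute another 4ND.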


