Machine Learning System Design Interview #45 - The Temporal Blindness Trap
Your batch pipeline isn't wrong. It's frozen in time, trading compute savings for engagement decay that never shows up on your cloud bill.
You’re in a Senior ML Engineer interview at Netflix and the interviewer asks:
“You’re serving recommendations from a batch pipeline that precomputes results nightly. Engagement is dropping and users say the recs feel stale. Walk me through the real failure mode, and when batch stops being the right call.”
Don’t say: “We should retrain the model more often.” Too shallow. Retraining cadence isn’t the bottleneck here, and this answer tells the interviewer you’re treating a symptom you haven’t diagnosed.
Here’s the real problem. 👇
Batch prediction isn’t failing because your model is wrong. It’s failing because it’s temporally blind.
You precomputed every user’s recommendations at 2 AM. But a user’s intent isn’t a daily constant; it’s a live signal.
A user binges thrillers all week → your batch job learns “thriller fan”
Tonight they’re suddenly in the mood for a comedy
Your system keeps serving thrillers until the next batch run
You’re not serving recommendations. You’re serving yesterday’s guess about today’s user.
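To make the blindness concrete, here is a minimal Python sketch of the thriller-to-comedy scenario. Everything in it is an illustrative assumption rather than a real pipeline: the function names `batch_precompute` and `serve`, the genre-count "model", and the timestamps are all hypothetical. The point it shows is that serving is a plain lookup, so nothing that happens after the batch cutoff can influence the result.

```python
from collections import Counter
from datetime import datetime, timedelta


def batch_precompute(events, run_time):
    """Hypothetical nightly job: rank genres using only events before run_time."""
    counts = Counter(genre for ts, genre in events if ts < run_time)
    return [genre for genre, _ in counts.most_common()]


def serve(precomputed):
    """Serving is a lookup into precomputed results; it sees no live signal."""
    return precomputed[0]


# Assumed timeline: the batch job runs at 2 AM.
batch_run = datetime(2024, 1, 8, 2, 0)

# A week of thriller binges, all before the batch run.
history = [(batch_run - timedelta(days=d), "thriller") for d in range(1, 8)]

# Tonight's session: three comedy plays, 18 hours after the batch run.
tonight = [
    (batch_run + timedelta(hours=18, minutes=m), "comedy") for m in (0, 5, 10)
]

# Even if tonight's events are in the event log, the cutoff filters them out.
precomputed = batch_precompute(history + tonight, batch_run)
served = serve(precomputed)

# What the live session is actually signaling right now.
live_intent = Counter(genre for _, genre in tonight).most_common(1)[0][0]

# How old the served prediction is at the moment of the last comedy play.
staleness = tonight[-1][0] - batch_run

print(f"served={served}, live_intent={live_intent}, staleness={staleness}")
```

In this toy setup the system serves "thriller" while the session says "comedy", and the prediction is already more than 18 hours old. The model did nothing wrong given its inputs; the inputs simply stop at 2 AM.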
This is the core tradeoff most candidates miss:
a) Batch optimizes for throughput, not freshness. It’s brilliant when predictions are stable (credit risk, churn scoring) and the cost of staleness is near zero.


