Machine Learning System Design Interview #42 - The Base-Rate F1 Trap

Why a phenomenal 0.90 F1-score can quietly mask a completely untrained dummy model, and how to decouple aggregate metrics before they cause a silent production crash.

May 30, 2026


You’re in a Senior ML Engineer interview at Meta. The interviewer sets a trap:

“An engineer shows you a binary classification model boasting a phenomenal 0.90 F1-score on a newly curated validation set, claiming it’s ready for production deployment. Before even looking at the architecture, you flag this metric as a potential illusion. What hidden data profile characteristic are you suspecting, and how do you prove it?”

95% of candidates walk right into it.

Most candidates say: “A 0.90 F1-score is highly robust against class imbalance, unlike accuracy, so the model is fundamentally solid. To be safe, I’ll just check the confusion matrix, plot the ROC curve, and tune the classification threshold.”

𝐓𝐡𝐞 𝐑𝐞𝐚𝐥𝐢𝐭𝐲:

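They forgot how easily an aggregate metric can be inflated by a skewed base rate and, across data slices, distorted by Simpson’s Paradox. F1 for the positive class ignores true negatives, so it is not robust when positives are the majority. If your newly curated validation set is 90% positive, a completely brainless, untrained dummy model that ignores its input and randomly outputs the positive class 90% of the time has an expected precision of 0.90 (the base rate) and an expected recall of 0.90 (its own positive rate), which works out to an F1-score of about 0.90. A dummy that always predicts positive scores even higher, at roughly 0.95.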
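The arithmetic behind this claim is easy to check. For a predictor that ignores its input and outputs the positive class with probability q, on a set whose positive base rate is p, expected precision is about p and expected recall is about q, so F1 is roughly 2pq / (p + q). Below is a minimal sketch using only the Python standard library; the function names, sample size, and seed are illustrative assumptions, not part of any library.

```python
import random


def f1_score_binary(y_true, y_pred):
    """F1 for the positive class (label 1), computed from raw counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def expected_dummy_f1(base_rate, predict_rate):
    """Closed-form approximation: precision ~ base_rate, recall ~ predict_rate."""
    return 2 * base_rate * predict_rate / (base_rate + predict_rate)


def simulate_dummy_f1(base_rate=0.9, predict_rate=0.9, n=100_000, seed=0):
    """Score a 'model' that never looks at its input (hypothetical setup)."""
    rng = random.Random(seed)
    # Labels drawn with the given positive base rate.
    y_true = [1 if rng.random() < base_rate else 0 for _ in range(n)]
    # Predictions drawn independently of the labels.
    y_pred = [1 if rng.random() < predict_rate else 0 for _ in range(n)]
    return f1_score_binary(y_true, y_pred)


if __name__ == "__main__":
    print("simulated dummy F1 :", round(simulate_dummy_f1(), 3))
    print("closed-form estimate:", round(expected_dummy_f1(0.9, 0.9), 3))
    print("always-positive F1  :", round(expected_dummy_f1(0.9, 1.0), 3))
    print("balanced-set dummy  :", round(expected_dummy_f1(0.5, 0.5), 3))
```

On a balanced set the same coin-flip dummy lands near 0.50, which is why the reported 0.90 means little until it is compared against the dummy baseline for that specific validation set.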

You aren’t looking at a production-ready model; you are looking at a baseline illusion. A single global metric computed over the whole validation set can hide systematic failures inside critical data slices and on the minority class, which sets up a silent failure the moment the model meets a real-world distribution that differs from the validation set.
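One way to make that blind spot visible is to report F1 per class instead of only for the positive class. The sketch below is an illustration under stated assumptions (synthetic labels, a 90% positive base rate, hypothetical helper names), not a definitive evaluation recipe. It scores the same input-ignoring dummy and shows that its minority-class F1 and macro average collapse even though the positive-class F1 looks strong.

```python
import random


def f1_for_label(y_true, y_pred, label):
    """F1 when `label` is treated as the class of interest."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def per_class_report(y_true, y_pred, labels=(0, 1)):
    """Per-class F1 plus the unweighted (macro) average."""
    scores = {label: f1_for_label(y_true, y_pred, label) for label in labels}
    scores["macro"] = sum(scores[label] for label in labels) / len(labels)
    return scores


# Synthetic validation set: 90% positive (assumed for illustration).
rng = random.Random(0)
n = 100_000
y_true = [1 if rng.random() < 0.9 else 0 for _ in range(n)]
# Dummy predictions drawn independently of the labels.
y_dummy = [1 if rng.random() < 0.9 else 0 for _ in range(n)]

report = per_class_report(y_true, y_dummy)

if __name__ == "__main__":
    for key, value in report.items():
        print(f"F1[{key}] = {value:.3f}")
```

In this setup the positive-class F1 stays near 0.90 while the negative-class F1 sits near 0.10 and the macro average near 0.50. The same helper can be applied to any subset of rows to check an individual data slice.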

𝐓𝐡𝐞 𝐒𝐨𝐥𝐮𝐭𝐢𝐨𝐧:
