AI Interview Prep

Machine Learning System Design Interview #18 - The Semantic Imbalance Trap

Why rotating the same 45 deer won’t save your classifier, and how generative synthesis actually fixes class imbalance.

Hao Hoang
Dec 04, 2025

You’re in a senior ML interview at OpenAI. The interviewer sets a trap:

“We have 50,000 images of ‘city streets’ but only 45 images of ‘deer at night.’ How do we fix this class imbalance to prevent the model from ignoring the deer?”
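
To put numbers on "ignoring the deer," here is a minimal sketch using only the two counts from the question. The always-predict-street classifier is a hypothetical baseline added for illustration, not something the interviewer describes.

```python
# Counts taken from the interviewer's question.
n_street = 50_000
n_deer = 45

total = n_street + n_deer
imbalance_ratio = n_street / n_deer   # majority : minority
deer_share = n_deer / total           # fraction of the dataset that is deer

# Hypothetical degenerate classifier that always predicts "city street".
majority_accuracy = n_street / total  # it is right on every street image
deer_recall = 0 / n_deer              # and never finds a single deer

print(f"imbalance ratio: {imbalance_ratio:.0f}:1")
print(f"deer share of dataset: {deer_share:.4%}")
print(f"always-street accuracy: {majority_accuracy:.2%}")
print(f"deer recall: {deer_recall:.0%}")
```

The ratio comes out to roughly 1,111 to 1, so a model can score above 99.9% accuracy while detecting no deer at all. That is why plain accuracy is a misleading metric for this problem.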

90% of candidates walk right into the trap.

They say, “I will ramp up the data augmentation pipeline.” Then they start listing standard torchvision transforms: RandomHorizontalFlip, RandomRotation(30), ColorJitter, and maybe Mosaic augmentation (a technique popularized by YOLO-style detectors rather than a core torchvision transform).

It feels like the correct, robust MLOps answer.

The interviewer nods, notes, “You just trained the model to recognize those same 45 deer mirrored, tilted, and slightly greener,” and moves on.
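
The interviewer's point can be sketched without touching a single pixel. In the toy example below, the image IDs, the transform names, and the 5,000-sample target are all hypothetical stand-ins chosen for illustration. It only tracks which source image each augmented sample came from.

```python
import random

random.seed(0)

# The 45 real "deer at night" images, represented only by hypothetical IDs.
source_ids = [f"deer_{i:02d}" for i in range(45)]

# Names standing in for the geometric and photometric transforms the
# candidate listed; no pixels are touched in this sketch.
transform_names = ["hflip", "rotate_30", "color_jitter"]


def augment(source_id):
    """Return a (source_id, applied_transforms) record for one augmented copy."""
    applied = tuple(t for t in transform_names if random.random() < 0.5)
    return (source_id, applied)


# Oversample the minority class to 5,000 augmented samples (assumed target).
augmented = [augment(random.choice(source_ids)) for _ in range(5_000)]

distinct_sources = {source_id for source_id, _ in augmented}

print(len(augmented))         # number of training samples
print(len(distinct_sources))  # distinct underlying images, never more than 45
```

However large the augmented set grows, the number of distinct underlying deer is capped at 45. The transforms change orientation and color, not which animals, poses, or scenes the model gets to see.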
