AI Interview Prep

Computer Vision Interview Questions #14 – The Attention vs MLP Responsibility Trap

Why attention handles communication but MLPs do the real computation in modern vision transformers.

Hao Hoang
Jan 15, 2026

You’re in an AI Researcher interview at OpenAI and the interviewer asks:

“We know Self-Attention handles the context between tokens. So, why do we burn ~60% of our parameter budget on the Position-wise MLP layers? What is the MLP actually doing?”
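The ~60% figure isn’t arbitrary. In a standard encoder block with a 4x MLP expansion, the MLP holds roughly two thirds of the block’s weights, and counting the embeddings and head pulls that share down toward the quoted ~60%. A quick back-of-the-envelope check (ViT-Base width d = 768 assumed, biases ignored):

```python
# Rough per-block parameter count for a ViT-Base-style encoder block
# (d = 768, 4x MLP expansion). Biases, embeddings and the classification
# head are ignored -- illustrative assumptions, not an exact tally.
d = 768
attn_params = 4 * d * d                 # W_q, W_k, W_v and the output projection
mlp_params = 2 * d * (4 * d)            # d -> 4d -> d
print(attn_params, mlp_params)          # 2,359,296 vs 4,718,592
print(mlp_params / (attn_params + mlp_params))  # ~0.67
```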

Most candidates say: “It adds non-linearity and more parameters so the model can learn complex functions.”

It’s technically true but architecturally lazy. It treats the MLP as generic “muscle” without understanding its specific role in the signal processing pipeline.

To pass the interview, you need to explain the separation of duties between Communication and Computation.

The reality is that Self-Attention is just a fancy weighted average. It moves information between tokens, but it doesn’t really process that information.
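To make the split concrete, here is a minimal single-head sketch in NumPy. The toy sizes, missing residuals/LayerNorm, and ReLU standing in for GELU are simplifications for illustration, not a faithful ViT block. Attention builds a row-stochastic mixing matrix and takes a weighted average of the value vectors across tokens; the MLP then transforms each token on its own, never looking at its neighbours:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 4, 8                          # 4 tokens (patches), model width 8
x = rng.normal(size=(T, d))          # token embeddings

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# --- Communication: self-attention is a data-dependent weighted average ---
Wq, Wk, Wv = (rng.normal(size=(d, d)) * d**-0.5 for _ in range(3))
Q, K, V = x @ Wq, x @ Wk, x @ Wv
A = softmax(Q @ K.T / np.sqrt(d))    # (T, T), rows sum to 1: mixing weights
attn_out = A @ V                     # each token = weighted average of value vectors
# Information has moved between tokens, but no non-linearity has touched it yet.

# --- Computation: the MLP processes each token independently ---
W1 = rng.normal(size=(d, 4 * d)) * d**-0.5        # expand d -> 4d
W2 = rng.normal(size=(4 * d, d)) * (4 * d)**-0.5  # project 4d -> d
def mlp(token):
    return np.maximum(0.0, token @ W1) @ W2       # ReLU stands in for GELU

mlp_out = np.stack([mlp(attn_out[i]) for i in range(T)])  # token-by-token on purpose

print(A.round(2))       # the mixing matrix: who listens to whom
print(mlp_out.shape)    # (4, 8): the MLP never reads the other tokens
```

Delete the MLP and every token stays little more than a re-weighted average of linear projections of the input; that is the intuition behind calling attention “communication” and the MLP “computation.”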

Here is the breakdown:

