AI Interview Prep

LLM Agents Interview Questions #10 - The Semantic Leakage Trap

If you think stronger system prompts fix sycophancy, you’ve ignored that softmax attention mathematically forces biased tokens into the reasoning path.

Hao Hoang
Mar 04, 2026

You’re in a Senior AI Engineer interview at Anthropic and the interviewer asks:

“Your production RAG system is suffering from severe semantic leakage. A user injects a biased, false premise into their prompt (for example, ‘Since the sun is yellow from space...’), and the LLM blindly agrees, altering its output to match the bias. System prompts and few-shot examples aren’t stopping it. What is fundamentally happening at the attention layer to cause this sycophancy, and how do you architect a fix?”

Most candidates say: “I’d write a stricter system prompt like ‘DO NOT agree with false user premises,’ tweak the temperature, or retrieve more RAG chunks to try to drown out the user’s bias.”

Wrong approach. You’re putting a band-aid on a structural hemorrhage.

The reality is that prompt engineering can’t override fundamental architecture.
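To see why, consider the softmax itself. Here is a minimal toy sketch (the token labels like `false_premise` are hypothetical illustrations, not the article's code): because softmax exponentiates attention scores, every token in context, including an injected false premise, receives a strictly positive attention weight. A stricter system prompt can only shrink that weight; it can never drive it to exactly zero.

```python
# Toy illustration: scaled dot-product attention over three context
# "tokens". Softmax guarantees every token gets nonzero attention mass,
# so a biased user premise is always in the reasoning path.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())  # subtract max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)
d = 8  # toy head dimension

# Hypothetical context: a system instruction, retrieved evidence,
# and the user's false premise ("the sun is yellow from space").
tokens = ["system_instruction", "retrieved_evidence", "false_premise"]
query = rng.normal(size=d)
keys = rng.normal(size=(len(tokens), d))

scores = keys @ query / np.sqrt(d)  # scaled dot-product scores
weights = softmax(scores)

for tok, w in zip(tokens, weights):
    print(f"{tok:20s} attention weight = {w:.3f}")

# exp(.) > 0 for any finite score, so no prompt can zero out a token.
assert (weights > 0).all()
```

Whatever the raw scores are, `exp` maps them to positive numbers, so the normalized weights are all positive: the architecture mathematically cannot "ignore" the biased tokens, only down-weight them.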

Here is the breakdown of the problem and the solution:
