AI Interview Prep

Advanced Reinforcement Learning Interview Questions #16 - The Bootstrapping Bias Trap

Bootstrapping doesn’t just reduce variance; it injects your model’s current errors directly into the label, turning a bad initialization into self-reinforcing policy collapse.

Hao Hoang
Feb 11, 2026

You’re in a Senior RL Engineer interview at OpenAI and the interviewer drops this scenario:

“We accidentally initialized our Value Network to output -1000 for every state. We run one update step using Monte Carlo and one using Bootstrapping (TD-Learning). Which algorithm breaks immediately, and which one survives?”
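
To make the scenario concrete, here is a minimal sketch of the regression targets each method would build for a single update. It assumes a hypothetical three-step episode with a reward of +1 per step and a discount factor of 0.99; neither value comes from the post. The `value` function is a stand-in for the mis-initialized network, and the code only computes targets rather than running a full training loop.

```python
GAMMA = 0.99          # assumed discount factor (not given in the post)
BAD_INIT = -1000.0    # the faulty value-network output from the scenario

# Hypothetical 3-step episode: reward of +1 at every step, then termination.
rewards = [1.0, 1.0, 1.0]


def value(state):
    """Stand-in for the mis-initialized value network: -1000 for every state."""
    return BAD_INIT


def mc_targets(rewards, gamma):
    """Monte Carlo targets: the discounted return actually observed from each step."""
    targets, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        targets.append(g)
    return targets[::-1]


def td0_targets(rewards, gamma):
    """TD(0) targets: r_t + gamma * V(s_{t+1}), taking V(terminal) = 0."""
    last = len(rewards) - 1
    return [
        r + (0.0 if t == last else gamma * value(t + 1))
        for t, r in enumerate(rewards)
    ]


mc = mc_targets(rewards, GAMMA)
td = td0_targets(rewards, GAMMA)
print("MC targets:", mc)
print("TD targets:", td)
```

Under these assumptions the Monte Carlo targets are built purely from observed rewards (about 2.97 for the first step), so the -1000 initialization never enters the label. The non-terminal TD(0) targets land near -989 because each one includes the network's own -1000 estimate of the next state.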
