Advanced NLP Interview Questions #25 – The Back-Translation Direction Trap
Why generating synthetic sources (not targets) is what preserves decoder fluency in production NMT systems.
You’re in a Senior NLP Engineer interview at Google DeepMind, and the interviewer asks:
“We need to improve our Japanese-to-English translation model. We have 10k parallel pairs and 1 billion lines of monolingual English text. To use Back-Translation effectively, which direction do we generate data, and exactly how do we pa…
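
The direction named in the subtitle (synthetic sources, real targets) can be sketched in a few lines. This is a minimal illustration, not a production pipeline: `build_back_translation_pairs`, `toy_reverse_translate`, and the tiny lexicon are hypothetical names invented here, and the toy lookup stands in for a real English-to-Japanese model that would be trained on the small parallel set.

```python
from typing import Callable, Iterable, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def build_back_translation_pairs(
    mono_targets: Iterable[str],
    reverse_translate: Callable[[str], str],
) -> List[Pair]:
    """Pair each real target sentence with a machine-generated source."""
    pairs: List[Pair] = []
    for target in mono_targets:
        # Reverse direction: English -> Japanese. The output may be noisy.
        synthetic_source = reverse_translate(target)
        # Synthetic text goes on the SOURCE side; the human-written sentence
        # stays on the TARGET side, so the decoder is only ever trained to
        # produce real English.
        pairs.append((synthetic_source, target))
    return pairs


# Hypothetical stand-in for a reverse (English -> Japanese) model.
# A real system would call a trained NMT model here instead of a lookup.
_TOY_LEXICON = {"hello": "こんにちは", "thank you": "ありがとう"}


def toy_reverse_translate(english: str) -> str:
    return _TOY_LEXICON.get(english.lower(), "<unk>")


real_parallel: List[Pair] = [("おはよう", "good morning")]
mono_english = ["Hello", "Thank you"]

synthetic = build_back_translation_pairs(mono_english, toy_reverse_translate)

# Forward (Japanese -> English) training set: real pairs plus synthetic pairs.
training_data = real_parallel + synthetic
```

Every target in `training_data` is human-written English. Only the Japanese side of the synthetic pairs is machine-generated, so translation noise lands on the encoder input rather than on what the decoder learns to emit.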


