Computer Vision Interview Questions #4 - The L1 vs L2 Geometry Trap
Why rotating the feature space instantly exposes candidates who don’t understand metric invariance.
You’re in a Senior Computer Vision interview at Google and the interviewer asks:
“We’re building a similarity search for a new dataset. If I arbitrarily rotate the feature space by 45 degrees, which distance metric falls apart: 𝐋1 𝐨𝐫 𝐋2? And what does that tell you about our feature engineering strategy?”
Most candidates say: “Well, 𝐋1 (𝐌𝐚𝐧𝐡𝐚𝐭𝐭𝐚𝐧) is good for sparse data like text, and 𝐋2 (𝐄𝐮𝐜𝐥𝐢𝐝𝐞𝐚𝐧) is standard for images. Rotation shouldn’t really change the distances much.”
That answer fails the geometry check. A candidate who gives it has just shown they treat the choice of metric as a magic hyperparameter rather than a geometric tool.
The key concept here is 𝐂𝐨𝐨𝐫𝐝𝐢𝐧𝐚𝐭𝐞 𝐃𝐞𝐩𝐞𝐧𝐝𝐞𝐧𝐜𝐞. The interviewer is testing whether you understand the geometry of your feature space. Here is the breakdown:
1️⃣ 𝐋2 (𝐄𝐮𝐜𝐥𝐢𝐝𝐞𝐚𝐧) 𝐢𝐬 𝐑𝐨𝐭𝐚𝐭𝐢𝐨𝐧𝐚𝐥𝐥𝐲 𝐈𝐧𝐯𝐚𝐫𝐢𝐚𝐧𝐭.
Think of a circle. If you spin a circle, it looks exactly the same. The distance between two points “as the crow flies” doesn’t change just because you tilted your head (or the axes).
2️⃣ 𝐋1 (𝐌𝐚𝐧𝐡𝐚𝐭𝐭𝐚𝐧) 𝐢𝐬 𝐂𝐨𝐨𝐫𝐝𝐢𝐧𝐚𝐭𝐞 𝐃𝐞𝐩𝐞𝐧𝐝𝐞𝐧𝐭.
Think of a diamond, which is the shape of the L1 “unit circle”. Spin it 45 degrees and it no longer sits the same way relative to the axes. Manhattan distance counts steps along each coordinate axis, so the moment you rotate the axes, the same two points can end up a different L1 distance apart (see the sketch below).
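To make this concrete, here is a minimal NumPy sketch (the two toy vectors and the 45-degree angle are illustrative choices, not part of the interview question). It rotates the feature space and compares both distances before and after. The L2 result follows from orthogonality: for a rotation matrix R, ‖Rx − Ry‖₂ = ‖R(x − y)‖₂ = ‖x − y‖₂, and no such identity holds for L1.

```python
import numpy as np

# Minimal sketch: rotate a 2-D feature space by 45 degrees and
# compare L1 and L2 distances before and after the rotation.

def l1(a, b):
    """Manhattan distance: sum of absolute per-axis differences."""
    return np.abs(a - b).sum()

def l2(a, b):
    """Euclidean distance: straight-line distance."""
    return np.sqrt(((a - b) ** 2).sum())

# Two toy feature vectors (illustrative values).
x = np.array([0.0, 0.0])
y = np.array([1.0, 1.0])

# 45-degree rotation matrix.
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

x_rot, y_rot = R @ x, R @ y

print(f"L2 before: {l2(x, y):.4f}   after: {l2(x_rot, y_rot):.4f}")
print(f"L1 before: {l1(x, y):.4f}   after: {l1(x_rot, y_rot):.4f}")
```

Running this, the L2 distance stays at √2 ≈ 1.4142 in both frames, while the L1 distance drops from 2.0000 to about 1.4142 the moment the axes no longer line up with the data.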


