Embeddings Aren’t Magic: The Predictable Failure Modes of RAG Retrieval
Embedding-based RAG retrieval fails in predictable ways: when queries use different terms than the documents (e.g., ‘overtime’ vs. ‘non-employee labor’), when they contain negations, or when they depend on exact IDs or codes. The article argues that enterprise reliability comes from upstream filtering (expert-supplied keywords, document structure), not from rerankers layered on top of weak retrieval.
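Why it was selected: embedding models handle synonyms and spelling variants well (e.g., ‘cancel’ → ‘termination procedures’), but they are powerless against inconsistent terminology.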
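To make the upstream-filtering idea concrete, here is a minimal Python sketch of a pre-filter that runs before any embedding search. It is not the article's implementation. The glossary `EXPERT_TERMS`, the ID format in `ID_PATTERN`, the sample `DOCS`, and the function names are all hypothetical, chosen only to illustrate the three signals the summary mentions: expert keywords, exact IDs, and document structure.

```python
import re

# Hypothetical expert-maintained glossary: user-side term -> terms the documents use.
EXPERT_TERMS = {
    "overtime": ["non-employee labor"],
}

# Assumed ID format for illustration: 2+ uppercase letters, a hyphen, digits (e.g. "PO-4471").
ID_PATTERN = re.compile(r"\b[A-Z]{2,}-\d+\b")

# Toy corpus; "section" stands in for document structure.
DOCS = [
    {"id": 1, "section": "finance",
     "text": "Non-employee labor costs are approved by the department head."},
    {"id": 2, "section": "hr",
     "text": "Termination procedures for vendor contracts."},
    {"id": 3, "section": "finance",
     "text": "Purchase order PO-4471 covers contractor invoices."},
]


def expand_query(query):
    """Return glossary terms found in the query plus their document-side equivalents."""
    lowered = query.lower()
    terms = []
    for key, equivalents in EXPERT_TERMS.items():
        if key in lowered:
            terms.append(key)
            terms.extend(equivalents)
    return terms


def prefilter(query, docs, section=None):
    """Narrow the candidate set before embedding search.

    Keeps documents in the requested section (if given) that contain an exact
    ID from the query or an expert-mapped term. If the query carries neither
    signal, the section-filtered documents are returned unchanged so the
    embedding stage can rank them.
    """
    in_scope = [d for d in docs if section is None or d["section"] == section]
    ids = ID_PATTERN.findall(query)
    terms = expand_query(query)
    if not ids and not terms:
        return in_scope
    return [
        d for d in in_scope
        if any(i in d["text"] for i in ids)
        or any(t in d["text"].lower() for t in terms)
    ]


if __name__ == "__main__":
    for q in ("Who approves overtime?", "What does PO-4471 cover?"):
        print(q, "->", [d["id"] for d in prefilter(q, DOCS)])
```

In this sketch the embedding model only ever ranks documents that already passed the filter, so a terminology mismatch or an exact code is resolved by lookup rather than by vector similarity. Negation handling is not covered here.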
