Multi-Vector Retrieval Strategy: Separability Determines nDCG@10 Success
Choosing the wrong approximation strategy in multi-vector retrieval can cause a roughly 6x drop in nDCG@10, a loss larger than the gain from upgrading the model. Measure embedding-space separability as the standard deviation of MaxSim scores: use TokenANN or MUVERA when the spread is high, and LEMUR when the spread is low.
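Why selected: with the same model and dataset, the wrong approximation strategy drops nDCG@10 from 0.701 to 0.109, a loss that exceeds the gain from a model upgrade.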
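
The separability check described above can be sketched as follows. This is a minimal illustration, not the source's implementation: it assumes separability is the population standard deviation of MaxSim scores between one query and a set of candidate documents, and that token embeddings are plain lists of floats. The helper names (`maxsim`, `maxsim_std`, `pick_strategy`) and the `threshold` parameter are hypothetical; the source gives no cutoff value, so it would need to be calibrated on held-out data.

```python
import statistics


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim(query_tokens, doc_tokens):
    """Late-interaction MaxSim score: for each query token, take the
    similarity of its best-matching document token, then sum."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


def maxsim_std(query_tokens, docs):
    """Population standard deviation of MaxSim scores over candidate docs.
    Assumption: this spread is the separability measure the text refers to."""
    scores = [maxsim(query_tokens, doc) for doc in docs]
    return statistics.pstdev(scores)


def pick_strategy(spread, threshold):
    """Map spread to a strategy family as the text recommends.
    `threshold` is hypothetical; the source gives no numeric cutoff."""
    return "TokenANN/MUVERA" if spread >= threshold else "LEMUR"


# Toy example: one query with two token vectors, three candidate documents.
query = [[1.0, 0.0], [0.0, 1.0]]
docs = [
    [[1.0, 0.0], [0.0, 1.0]],    # matches both query tokens
    [[0.0, 1.0]],                # matches only the second query token
    [[-1.0, 0.0], [0.0, -1.0]],  # matches neither
]
spread = maxsim_std(query, docs)
print(spread, pick_strategy(spread, threshold=0.5))
```

In practice the spread would presumably be averaged over a sample of queries rather than read from a single one, and the embeddings would come from the retrieval model itself.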
