3x Faster Search: Parallel Test-Time Scaling with Instructed-Retriever-1
Databricks' Instructed-Retriever-1 cuts search latency roughly threefold and brings time to first token (TTFT) down to about 2 seconds through parallel test-time scaling, with no loss in quality. A single unified model handles both query generation and reranking, parallelizing the work with multi-pivot groupwise reranking, and reaches Pareto-optimal recall-precision tradeoffs for enterprise RAG systems.
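Why it was selected: Instructed-Retriever-1 reduces search latency by more than 3x and brings TTFT down to about 2 seconds, with no reconfiguration required.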
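
For readers who want a feel for the general pattern, here is a minimal sketch of parallel query fan-out followed by groupwise reranking with shared pivot documents. It is an illustration under stated assumptions, not Databricks' implementation: the toy corpus, the word-overlap scorer, and the names `search`, `score_group`, and `retrieve` are all hypothetical, and the reading of "multi-pivot" used here (a few shared documents placed in every group so that group-local scores can be compared) is one plausible interpretation only.

```python
import asyncio

# Hypothetical toy corpus; a real system would query a search index.
CORPUS = {
    "d1": "quarterly revenue report for the retail division",
    "d2": "retail division hiring plan",
    "d3": "revenue forecast and quarterly targets",
    "d4": "office parking policy",
    "d5": "quarterly revenue summary by region",
}


def overlap(query: str, text: str) -> int:
    """Toy relevance score: number of shared lowercase words."""
    return len(set(query.lower().split()) & set(text.lower().split()))


async def search(query: str, k: int = 3) -> list[str]:
    """Stand-in for one retrieval call (assumed; sleep simulates I/O latency)."""
    await asyncio.sleep(0.01)
    ranked = sorted(CORPUS, key=lambda d: (-overlap(query, CORPUS[d]), d))
    return ranked[:k]


async def score_group(query: str, group: list[str]) -> dict[str, float]:
    """Stand-in for one reranker call that scores a small group of documents."""
    await asyncio.sleep(0.01)
    return {d: float(overlap(query, CORPUS[d])) for d in group}


async def retrieve(query: str, variants: list[str],
                   n_pivots: int = 1, group_size: int = 2) -> list[str]:
    # 1. Fan out: run every query variant concurrently, so the wall-clock
    #    cost is close to the slowest call rather than the sum of all calls.
    results = await asyncio.gather(*(search(v) for v in variants))
    candidates = sorted({d for hits in results for d in hits})
    if not candidates:
        return []

    # 2. Pick pivot documents and include them in every group (assumption:
    #    shared pivots give each group a common reference point).
    pivots, rest = candidates[:n_pivots], candidates[n_pivots:]
    groups = [pivots + rest[i:i + group_size]
              for i in range(0, len(rest), group_size)] or [pivots]

    # 3. Score all groups concurrently.
    group_scores = await asyncio.gather(*(score_group(query, g) for g in groups))

    # 4. Merge: express each score relative to the mean pivot score
    #    observed in the same group, then sort globally.
    merged: dict[str, float] = {}
    for scores in group_scores:
        baseline = sum(scores[p] for p in pivots) / len(pivots)
        for doc, score in scores.items():
            merged[doc] = score - baseline
    return sorted(merged, key=lambda d: (-merged[d], d))
```

Calling `asyncio.run(retrieve("quarterly revenue", ["quarterly revenue report", "revenue forecast", "retail division"]))` runs the three variant searches concurrently and then reranks two groups in parallel. The point of the sketch is only the shape of the pipeline: independent calls are issued together, and the pivots make scores from separate groups comparable when they are merged.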
