We talked to the authors of RaBitQ, the people affected by Google's TurboQuant paper.

Five things stuck with us:

• Vector quantization has hit its theoretical ceiling. RaBitQ is mathematically proven asymptotically optimal. The remaining gains are on the engineering side: hardware, data distribution, latency.

• Compression won't shrink storage demand. It might grow it. Smaller vectors mean larger models run on smaller devices, which creates new workloads instead of replacing old ones.

• Since RaBitQ, mathematical vector quantization is three steps: random rotation (a form of Johnson-Lindenstrauss transformation) to spread information evenly across dimensions, grid construction, then quantization. (See the sketch after this post.)

• Vector quantization is influencing transformer inference. On the surface, KV cache compression and ANN vector compression are different problems. Mathematically, they share most of the same logic.

• KV cache is cheap storage traded for expensive compute. Quantization makes that trade more favorable on both sides. (See the sizing example below.)

• Want to see engineered RaBitQ in production? Try IVF_RABITQ in Milvus 2.6.

Note: The views expressed are those of the interviewees and do not represent Zilliz. Views belong to Jianyang Gao (RaBitQ first author), Cheng Long (RaBitQ co-author), and Li Liu (Zilliz Engineering).

Full conversation: milvus.io/blog/interview
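The three-step recipe can be illustrated in a few lines of Python. This is a minimal sketch, not the actual RaBitQ estimator (the paper adds an unbiased correction factor and a more careful codebook); the function names and the 1-bit sign grid are illustrative assumptions, chosen to show the rotate-then-snap-to-grid structure.

```python
import numpy as np

def random_rotation(d, seed=0):
    """Step 1: sample a random orthogonal matrix (a Johnson-Lindenstrauss-style
    rotation) by QR-decomposing a Gaussian matrix."""
    rng = np.random.default_rng(seed)
    q, r = np.linalg.qr(rng.standard_normal((d, d)))
    # Fix column signs so Q is (approximately) uniform over the orthogonal group.
    return q * np.sign(np.diag(r))

def quantize(x, rotation):
    """Steps 2-3: the grid here is the bi-valued set {-1/sqrt(d), +1/sqrt(d)}^d;
    quantization snaps the rotated vector to it, storing one bit per dimension."""
    rotated = rotation @ x
    return rotated > 0                       # d bits instead of d floats

def reconstruct(bits, rotation):
    """Map a bit code back to its unit-norm grid point in the original space."""
    d = bits.shape[0]
    grid_point = np.where(bits, 1.0, -1.0) / np.sqrt(d)
    return rotation.T @ grid_point

# Tiny demo: quantize a unit vector and check how well the code preserves it.
d = 128
rng = np.random.default_rng(1)
x = rng.standard_normal(d)
x /= np.linalg.norm(x)
R = random_rotation(d)
code = quantize(x, R)
x_hat = reconstruct(code, R)
print("cosine(x, x_hat) =", float(x @ x_hat))   # roughly 0.8 for sign grids
```

The rotation is what makes the fixed sign grid usable: it spreads each vector's energy evenly across dimensions, so no coordinate is too large or too small for a one-bit cell.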
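To make the storage-for-compute trade concrete, here is a back-of-the-envelope KV cache sizing. The model shape (32 layers, 8 KV heads, head dimension 128, a 128k-token context) is a hypothetical Llama-style configuration, not a figure from the interview, and the 1-bit line assumes the sign-grid scheme sketched above.

```python
# Illustrative KV-cache sizing for one sequence (assumed model shape).
layers, kv_heads, head_dim = 32, 8, 128
seq_len = 128_000

floats_per_token = 2 * layers * kv_heads * head_dim      # K and V per token
fp16_bytes = floats_per_token * 2 * seq_len              # 16 bits per value
one_bit_bytes = floats_per_token / 8 * seq_len           # 1 bit per value

print(f"fp16 KV cache:  {fp16_bytes / 2**30:5.1f} GiB")   # ~15.6 GiB
print(f"1-bit KV cache: {one_bit_bytes / 2**30:5.2f} GiB (16x smaller)")
```

At 16x less memory per token, the same device holds far longer contexts, or the same context at far less memory bandwidth per decoding step, which is the "more favorable on both sides" point.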
