𝗬𝗼𝘂 𝗰𝗮𝗻 𝗿𝘂𝗻 𝟮𝟱 𝗺𝗶𝗹𝗹𝗶𝗼𝗻 𝘃𝗲𝗰𝘁𝗼𝗿𝘀 𝗶𝗻 𝗠𝗶𝗹𝘃𝘂𝘀 𝘂𝘀𝗶𝗻𝗴 𝘂𝗻𝗱𝗲𝗿 ...

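TL;DR · AI Summary
Milvus can run 25 million 1280-dimensional image vectors on a single machine with 32GB of memory by combining FP16 storage, mmap, and scalar filtering.
Key points
- FP16 storage cuts each vector dimension from 4 bytes to 2 bytes.
- mmap lets Milvus access raw vector data through memory-mapped files instead of loading all of it into memory.
- Scalar filtering narrows each query to a few thousand vectors, which significantly improves query efficiency.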
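
As a rough check on the sizing behind these points, the raw vector payload can be computed directly. This is a back-of-the-envelope sketch, not a Milvus sizing tool: it counts only raw vector bytes (no index structures, scalar fields, or runtime overhead), and the function name is illustrative.

```python
def raw_vector_bytes(num_vectors: int, dim: int, bytes_per_dim: int) -> int:
    """Raw vector payload only; ignores index, scalar fields, and overhead."""
    return num_vectors * dim * bytes_per_dim


NUM_VECTORS = 25_000_000  # from the post
DIM = 1280                # from the post
GIB = 1024 ** 3

fp32 = raw_vector_bytes(NUM_VECTORS, DIM, 4)  # 4 bytes per dimension
fp16 = raw_vector_bytes(NUM_VECTORS, DIM, 2)  # 2 bytes per dimension

print(f"FP32: {fp32 / 1e9:.0f} GB ({fp32 / GIB:.1f} GiB)")  # FP32: 128 GB (119.2 GiB)
print(f"FP16: {fp16 / 1e9:.0f} GB ({fp16 / GIB:.1f} GiB)")  # FP16: 64 GB (59.6 GiB)
```

Even at FP16 the raw data is about twice the 32GB available, so halving the bytes per dimension is not enough on its own; the data also has to stay on disk and be read on demand.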
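Outline
- Milvus memory optimization approach
  - FP16 optimization
    - Reduces the memory used per vector dimension
  - mmap
    - Accesses raw vector data through memory-mapped files
  - Scalar filtering
    - Narrows the query scope and improves query efficiency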
Milvus
@milvusio
𝗬𝗼𝘂 𝗰𝗮𝗻 𝗿𝘂𝗻 𝟮𝟱 𝗺𝗶𝗹𝗹𝗶𝗼𝗻 𝘃𝗲𝗰𝘁𝗼𝗿𝘀 𝗶𝗻 𝗠𝗶𝗹𝘃𝘂𝘀 𝘂𝘀𝗶𝗻𝗴 𝘂𝗻𝗱𝗲𝗿 𝟭𝗚𝗕 𝗼𝗳 𝗺𝗲𝗺𝗼𝗿𝘆.

A user had 25M image vectors, each with 1280 dimensions, and only 32GB of memory available for Milvus on a single machine. The default FP32 sizing estimate […] tried more advanced indexes, but neither worked out:

• 𝗔𝗜𝗦𝗔𝗤 looked right for constrained hardware, but the build path was too heavy for the machine.
• 𝗜𝗩𝗙_𝗙𝗟𝗔𝗧 built successfully, but the collection load hung at 14% and never finished.

After working with our developers, the user switched to 𝗙𝗟𝗔𝗧, the simplest index in Milvus. FLAT avoided extra ANN structures and build/load complexity, while Milvus provided the pieces that made the setup practical:

• 𝗙𝗣𝟭𝟲 storage cut each vector dimension from 4 bytes to 2 bytes, reducing raw vector data by half.
• 𝗺𝗺𝗮𝗽 let Milvus access raw vector data through memory-mapped files instead of loading it all into process memory.
• 𝗦𝗰𝗮𝗹𝗮𝗿 𝗳𝗶𝗹𝘁𝗲𝗿𝗶𝗻𝗴 narrowed each query first using fields like dataid and classid, so Milvus compared only a few thousand vectors instead of 25 million.

𝗧𝗵𝗲 𝗿𝗲𝘀𝘂𝗹𝘁: 𝗮𝗿𝗼𝘂𝗻𝗱 𝟲𝟬𝟬𝗠𝗕 𝗼𝗳 𝗿𝗲𝘀𝗶𝗱𝗲𝗻𝘁 𝗺𝗲𝗺𝗼𝗿𝘆 𝗮𝗻𝗱 𝘄𝗮𝗿𝗺 𝗾𝘂𝗲𝗿𝗶𝗲𝘀 𝘂𝗻𝗱𝗲𝗿 𝟭𝟬𝟬𝗺𝘀.

When the real search space is much smaller than the full collection, as in multi-tenant RAG, labeled image search, or e-commerce search, FLAT + FP16 + mmap can be a practical option.

Full breakdown in the blog:
milvus.io/blog/25-millio…
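
To make the interaction of the three pieces concrete, here is a minimal sketch that uses NumPy as a stand-in. It is not Milvus code and does not reflect how Milvus implements FLAT, FP16, or mmap internally; the sizes are toy values, and `class_id` and `filtered_flat_search` are hypothetical names chosen for illustration (loosely modeled on the classid field mentioned in the post).

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(0)

# Toy sizes so the sketch runs quickly; the post's collection is 25M x 1280.
NUM_VECTORS, DIM = 20_000, 64

# Hypothetical scalar field kept in memory (small: one integer per vector).
class_id = rng.integers(0, 50, size=NUM_VECTORS)

# 1) FP16 storage: write the vectors to disk at 2 bytes per dimension.
path = os.path.join(tempfile.mkdtemp(), "vectors.f16")
vectors_f32 = rng.standard_normal((NUM_VECTORS, DIM)).astype(np.float32)
vectors_f32.astype(np.float16).tofile(path)

# 2) mmap: map the file instead of reading it all into process memory.
vectors = np.memmap(path, dtype=np.float16, mode="r", shape=(NUM_VECTORS, DIM))


def filtered_flat_search(query, wanted_class, top_k=5):
    """Scalar filter first, then exact brute-force L2 search on the survivors."""
    candidate_ids = np.flatnonzero(class_id == wanted_class)
    # Only the candidate rows are read from the mapped file.
    candidates = np.asarray(vectors[candidate_ids], dtype=np.float32)
    distances = np.linalg.norm(candidates - query, axis=1)
    order = np.argsort(distances)[:top_k]
    return candidate_ids[order], distances[order]


# Query with the original FP32 version of vector 123, restricted to its class.
ids, dists = filtered_flat_search(vectors_f32[123], wanted_class=class_id[123])
print(ids, dists)
```

The point of the sketch is the order of operations: the filter selects a small set of row ids, so the exact comparison touches only those rows of the mapped file, and resident memory tracks the filtered working set rather than the full collection.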
5:58 PM · Jun 5, 2026