𝗬𝗼𝘂 𝗰𝗮𝗻 𝗿𝘂𝗻 𝟮𝟱 𝗺𝗶𝗹𝗹𝗶𝗼𝗻 𝘃𝗲𝗰𝘁𝗼𝗿𝘀 𝗶𝗻 𝗠𝗶𝗹𝘃𝘂𝘀 𝘂𝘀𝗶𝗻𝗴 𝘂𝗻𝗱𝗲𝗿 ...

Milvus (@milvusio) · June 5, 2026

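TL;DR · AI summary

Milvus can serve 25 million 1280-dimensional image vectors on a single machine with 32 GB of memory by combining FP16 storage, mmap, and scalar filtering.

Key points

FP16 storage cuts each vector dimension from 4 bytes to 2 bytes.
mmap lets Milvus access raw vector data through memory-mapped files instead of loading all of it into memory.
Scalar filtering narrows each query to a few thousand vectors, which sharply reduces the work per query.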
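
The FP16 saving can be checked with back-of-the-envelope arithmetic. The sketch below uses the figures from the post (25 million vectors, 1280 dimensions) and counts raw vector bytes only. It ignores index structures, scalar fields, and any Milvus-internal overhead, and the helper name raw_vector_bytes is hypothetical.

```python
# Back-of-the-envelope size of the raw vector data only.
# Assumption: no index overhead, no scalar fields, no allocator overhead.
NUM_VECTORS = 25_000_000  # from the post
DIM = 1280                # from the post


def raw_vector_bytes(num_vectors: int, dim: int, bytes_per_dim: int) -> int:
    """Raw storage for dense vectors: count x dimensions x bytes per dimension."""
    return num_vectors * dim * bytes_per_dim


fp32_bytes = raw_vector_bytes(NUM_VECTORS, DIM, 4)  # FP32: 4 bytes per dimension
fp16_bytes = raw_vector_bytes(NUM_VECTORS, DIM, 2)  # FP16: 2 bytes per dimension

print(f"FP32 raw vectors: {fp32_bytes / 1e9:.0f} GB")
print(f"FP16 raw vectors: {fp16_bytes / 1e9:.0f} GB")
```

Under these assumptions the raw data comes to about 128 GB in FP32 and 64 GB in FP16. Both figures exceed 32 GB, so halving precision alone does not make the data fit; mmap and scalar filtering cover the rest.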

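Outline

§ Introduction
A user needs to serve 25 million image vectors on a single machine with 32 GB of memory.
· Index attempts and failures
The user tried the AISAQ and IVF_FLAT indexes; neither worked out.
· Solution: the FLAT index
The user settled on the FLAT index, combined with FP16, mmap, and scalar filtering.
› What FP16 does
FP16 cuts each vector dimension from 4 bytes to 2 bytes.
› What mmap does
mmap lets Milvus access raw vector data through memory-mapped files.
› What scalar filtering does
Scalar filtering narrows each query to a few thousand vectors, which sharply reduces the work per query.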

Mind map

Milvus memory optimization
- FP16
  - Reduces the memory used per vector dimension
- mmap
  - Accesses raw vector data through memory-mapped files
- Scalar filtering
  - Narrows the query scope and speeds up queries

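Highlights

FP16 storage cuts each vector dimension from 4 bytes to 2 bytes, reducing raw vector data by half.
(paragraph 3)
mmap lets Milvus access raw vector data through memory-mapped files instead of loading all of it into memory.
(paragraph 3)
Scalar filtering narrows each query to a few thousand vectors, which sharply reduces the work per query.
(paragraph 3)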

#Milvus #VectorDatabase #FP16 #MemoryOptimization

𝗬𝗼𝘂 𝗰𝗮𝗻 𝗿𝘂𝗻 𝟮𝟱 𝗺𝗶𝗹𝗹𝗶𝗼𝗻 𝘃𝗲𝗰𝘁𝗼𝗿𝘀 𝗶𝗻 𝗠𝗶𝗹𝘃𝘂𝘀 𝘂𝘀𝗶𝗻𝗴 𝘂𝗻𝗱𝗲𝗿 𝟭𝗚𝗕 𝗼𝗳 𝗺𝗲𝗺𝗼𝗿𝘆. A user had 25M image vectors, each with 1280 dimensions, and only 32GB of memory available for Milvus on a single machine. The default FP32 sizing estimate

tried more advanced indexes, but neither worked out:
• 𝗔𝗜𝗦𝗔𝗤 looked right for constrained hardware, but the build path was too heavy for the machine.
• 𝗜𝗩𝗙_𝗙𝗟𝗔𝗧 built successfully, but the collection load hung at 14% and never finished.

After working with our developers, the user switched to 𝗙𝗟𝗔𝗧, the simplest index in Milvus. FLAT avoided extra ANN structures and build/load complexity, while Milvus provided the pieces that made the setup practical:
• 𝗙𝗣𝟭𝟲 storage cut each vector dimension from 4 bytes to 2 bytes, reducing raw vector data by half.
• 𝗺𝗺𝗮𝗽 let Milvus access raw vector data through memory-mapped files instead of loading it all into process memory.
• 𝗦𝗰𝗮𝗹𝗮𝗿 𝗳𝗶𝗹𝘁𝗲𝗿𝗶𝗻𝗴 narrowed each query first using fields like dataid and classid, so Milvus compared only a few thousand vectors instead of 25 million.

𝗧𝗵𝗲 𝗿𝗲𝘀𝘂𝗹𝘁: 𝗮𝗿𝗼𝘂𝗻𝗱 𝟲𝟬𝟬𝗠𝗕 𝗼𝗳 𝗿𝗲𝘀𝗶𝗱𝗲𝗻𝘁 𝗺𝗲𝗺𝗼𝗿𝘆 𝗮𝗻𝗱 𝘄𝗮𝗿𝗺 𝗾𝘂𝗲𝗿𝗶𝗲𝘀 𝘂𝗻𝗱𝗲𝗿 𝟭𝟬𝟬𝗺𝘀.

When the real search space is much smaller than the full collection, as in multi-tenant RAG, labeled image search, or e-commerce search, FLAT + FP16 + mmap can be a practical option.

Full breakdown in the blog:

milvus.io/blog/25-millio…

5:58 PM · Jun 5, 2026
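
To make the combination concrete, here is a toy sketch of the idea in standard-library Python: FP16 rows stored in a file, read through a memory map, with a scalar filter applied before an exhaustive (FLAT-style) distance scan. This is not Milvus code and does not use the Milvus API. The file layout, the helper names, and the in-memory metadata list are assumptions made for illustration; only the field names dataid and classid come from the post.

```python
import mmap
import os
import struct
import tempfile

DIM = 4                          # tiny for illustration; the post uses 1280
ROW = struct.Struct(f"<{DIM}e")  # "e" = IEEE 754 half precision, 2 bytes each

# Hypothetical toy rows: (dataid, classid, vector).
ROWS = [
    (1, 10, [0.0, 0.0, 0.0, 0.0]),
    (1, 10, [1.0, 1.0, 1.0, 1.0]),
    (2, 10, [0.5, 0.5, 0.5, 0.5]),
    (1, 20, [0.25, 0.25, 0.25, 0.25]),
]


def write_fp16_file(path, rows):
    """Store every vector as consecutive FP16 values, one fixed-size row each."""
    with open(path, "wb") as f:
        for _, _, vec in rows:
            f.write(ROW.pack(*vec))


def filtered_flat_search(path, rows, query, dataid, classid, limit):
    """Filter on scalar fields first, then brute-force scan only the survivors."""
    # Step 1: the scalar filter selects candidate row numbers.
    candidates = [i for i, (d, c, _) in enumerate(rows)
                  if d == dataid and c == classid]
    scored = []
    # Step 2: map the file; only the rows actually read are touched.
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        for i in candidates:
            vec = ROW.unpack_from(mm, i * ROW.size)
            dist = sum((a - b) ** 2 for a, b in zip(vec, query))  # squared L2
            scored.append((dist, i))
    scored.sort()
    return scored[:limit]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "vectors.fp16")
    write_fp16_file(path, ROWS)
    hits = filtered_flat_search(path, ROWS, [0.9] * DIM,
                                dataid=1, classid=10, limit=2)

print([row_id for _, row_id in hits])
```

In this sketch the scan cost depends on the number of rows that pass the filter, not on the size of the file, which matches the post's point that FLAT becomes practical when the real search space is a few thousand vectors.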
