Most people's vector database defaults to keeping all data in memory. ๐ง๐ต๐ฎ๐ ๐บ๐ถ๐ด๐ต๐ ๐ฏ๐ฒ ...

- ้ป่ฎคๅ จๅ ๅญๅญๅจๅจ1ไบฟ+ๅ้่งๆจกไธๆๆฌ้ซ3-10ๅ
- MMapๆนๆก็จๆฌๅฐ็ฃ็ๆ้ๅ ่ฝฝ๏ผๅปถ่ฟ็จณๅฎไธๅ ๅญๅ ็จ้ไฝ
- ๅๅฑๅญๅจ้ๅๆๅท็ญๆฐๆฎๅบๅ็ๅบๆฏ๏ผๆพ่่็ๅ ๅญๅ็ฃ็
Two ways to stop paying for idle data (both config changes in https://t.co/0Ad3zYktCi" / X
Post
Conversation
Most people's vector database defaults to keeping all data in memory. ๐ง๐ต๐ฎ๐ ๐บ๐ถ๐ด๐ต๐ ๐ฏ๐ฒ ๐ฐ๐ผ๐๐๐ถ๐ป๐ด ๐๐ผ๐ ๐ฏ-๐ญ๐ฌ๐ ๐บ๐ผ๐ฟ๐ฒ ๐๐ต๐ฎ๐ป ๐ถ๐ ๐ป๐ฒ๐ฒ๐ฑ๐ ๐๐ผ (once you scale to production with 100M+ vectors). Two ways to stop paying for idle data (both config changes in Milvus, not redesigns): โข ๐ ๐ ๐ฎ๐ฝ (v2.3+):data on local disk, loaded on demand. Stable latency. Disk must hold the full dataset. โข ๐ง๐ถ๐ฒ๐ฟ๐ฒ๐ฑ ๐๐๐ผ๐ฟ๐ฎ๐ด๐ฒ (v2.6+): hot data cached locally, cold data in S3. Cache misses add 50-200ms. Both need NVMe SSDs (10K+ IOPS). ๐ข๐ป ๐ญ๐ฌ๐ฌ๐ ๐๐ฒ๐ฐ๐๐ผ๐ฟ๐ (๐ณ๐ฒ๐ด-๐ฑ๐ถ๐บ, ๐ณ๐น๐ผ๐ฎ๐๐ฏ๐ฎ): ๐ฃ๐ถ๐ฐ๐ธ ๐ ๐ ๐ฎ๐ฝ ๐ถ๐ณ: โข P99 < 20ms โ data is local, no network fetch, no surprise spikes. ~77-230 GB memory (vs 768 GB default). โข Uniform access โ tiered storage's cache doesn't help if everything gets hit equally ๐ฃ๐ถ๐ฐ๐ธ ๐๐ถ๐ฒ๐ฟ๐ฒ๐ฑ ๐๐๐ผ๐ฟ๐ฎ๐ด๐ฒ ๐ถ๐ณ: โข Cost is the priority โ <77 GB memory (vs 768 GB default). Saves on both memory and disk (70-90% less) โข Clear 80/20 access pattern โ hot data cached, cold data stays cheap in S3 โข 500M+ vectors โ one node's disk can't hold it all Full config walkthrough !Image 1: ๐milvus.io/blog/how-to-cu
The media could not be played.