Running local models on an M4 with 24GB memory
Hacker News Best · 1,675 words (about 7 min read)
85
Running Qwen 3.5-9B (q4_k_s quantization) on an M4 MacBook with 24GB RAM achieves ~40 tokens/sec and supports 128K context and tool use for local development.
Why selected: Qwen 3.5-9B (q4_k_s) runs at 40 tokens/sec on an M4 Mac, with support for 128K context and tool use.
Featured · Article · #LLM #local inference #M4 #Qwen #LM Studio · English