M3 vs. Opus Code Audit Comparison: Equal Performance, Sharply Lower Cost

AI HOT Picks · June 6, 2026

Content quality: 6.8

TL;DR · AI Summary

In a code audit benchmark, MiniMax M3 matched Claude Opus 4.8's bug detection rate (13 of 17) at a cost of $0.07 per run, versus $1.30 for the cheapest Opus run.

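Key Points

Given the same codebase and the same prompt, MiniMax M3 and Claude Opus 4.8 each caught 13 of the 17 planted bugs.
A single MiniMax M3 run cost $0.07; the cheapest Claude Opus 4.8 run cost $1.30.
With detection equal, M3 was roughly 18.6 times cheaper per run on this specific code audit task ($1.30 / $0.07 ≈ 18.6).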
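The cost comparison above reduces to simple arithmetic. The following is a minimal sketch using only the figures quoted in this summary; the variable and function names are illustrative assumptions, not part of Kilo's benchmark.

```python
# Figures quoted in the summary; names here are illustrative only.
M3_COST_USD = 0.07      # MiniMax M3, cost per run
OPUS_COST_USD = 1.30    # cheapest reported Claude Opus 4.8 run
BUGS_PLANTED = 17
BUGS_CAUGHT = 13        # identical for both models


def cost_ratio(expensive: float, cheap: float) -> float:
    """How many times more the expensive run costs than the cheap one."""
    return expensive / cheap


def cost_per_bug(run_cost: float, bugs_caught: int) -> float:
    """Average spend per detected bug for a single run."""
    return run_cost / bugs_caught


ratio = cost_ratio(OPUS_COST_USD, M3_COST_USD)
detection_rate = BUGS_CAUGHT / BUGS_PLANTED

print(f"Cost ratio (Opus / M3): {ratio:.1f}x")
print(f"Detection rate: {detection_rate:.1%}")
print(f"M3 cost per bug:   ${cost_per_bug(M3_COST_USD, BUGS_CAUGHT):.4f}")
print(f"Opus cost per bug: ${cost_per_bug(OPUS_COST_USD, BUGS_CAUGHT):.4f}")
```

Because both models caught the same number of bugs, the per-bug cost ratio equals the per-run cost ratio.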

Outline

§ Test setup
Both models were given the same codebase and the same prompt, with 17 known bugs planted in advance.
· Model performance
MiniMax M3 and Claude Opus 4.8 each found 13 of the 17 bugs.
· Cost analysis
A MiniMax M3 run cost $0.07, against $1.30 for the cheapest Claude Opus 4.8 run.
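A planted-bug benchmark of this kind is typically scored by comparing what a model reports against the known list. The sketch below shows one possible way to do that, assuming each bug has a stable identifier. The `BUG-xx` IDs, the `score_audit` function, and the choice of which bugs were caught are all hypothetical placeholders; the source does not describe Kilo's actual scoring method or which 13 bugs were found.

```python
def score_audit(planted: list[str], reported: list[str]) -> dict:
    """Score a model's reported findings against the planted bug list."""
    planted_set = set(planted)
    reported_set = set(reported)
    if not planted_set:
        raise ValueError("planted bug list must not be empty")

    caught = planted_set & reported_set      # planted bugs the model found
    missed = planted_set - reported_set      # planted bugs it did not find
    unplanted = reported_set - planted_set   # reports outside the planted set

    return {
        "caught": len(caught),
        "missed": sorted(missed),
        "unplanted_reports": len(unplanted),
        "detection_rate": len(caught) / len(planted_set),
    }


# Hypothetical data: 17 planted bugs, a model that reports the first 13
# plus one finding that was not on the planted list.
planted = [f"BUG-{i:02d}" for i in range(1, 18)]
reported = planted[:13] + ["EXTRA-01"]

result = score_audit(planted, reported)
print(result["caught"], "of", len(planted), "planted bugs caught")
print("Missed:", result["missed"])
```

Tracking unplanted reports separately keeps a model from inflating its score by flagging everything, though whether the original benchmark penalized such reports is not stated.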

Mind Map

M3 vs. Opus code audit comparison
- Performance
  - Detection rate: 13/17 (tied)
- Cost
  - MiniMax M3: $0.07
  - Claude Opus: $1.30
- Controlled variables
  - Same codebase
  - Same prompt

Highlights

MiniMax M3 caught 13 bugs for $0.07, while the cheapest Claude run cost $1.30.
— Kilo (@kilocode)
Both caught 13 of the 17 bugs.
— MiniMax (official)
The test used the same codebase, the same prompt, and 17 bugs planted in advance.
— Kilo (@kilocode)

#MiniMax M3 #Claude Opus #Code Audit #LLM Benchmark #Cost Optimization

$0.07 for M3, $3.39 for Opus. Both caught 13 of 17 bugs. Really interesting breakdown from

Definitely worth the read

Quote

Kilo

@kilocode

17h

We gave the same code audit to Claude Opus 4.8 and MiniMax M3. Same codebase. Same prompt. 17 known bugs planted in advance. MiniMax M3 caught 13 of them for $0.07. The cheapest Claude run caught the same 13 for $1.30. Here's the breakdown. 🧵

Image 4: Black-and-yellow comparison graphic from Kilo Code’s Code Audit Benchmark. The headline reads “MiniMax M3 vs Claude Opus 4.8.” Below, benchmark results show 17 bugs planted, with both models catching 13 bugs. Cost per run is highlighted as $0.07 for MiniMax M3 and $1.30 for Claude Opus 4.8. A note at the bottom reads, “Same codebase. Same prompt. Full breakdown in thread.” The Kilo Code logo appears in the upper-right corner.
