What's new with KernelBench Hard?

traeai has indexed 1 item related to KernelBench Hard. The most recent is "Read more from @MiniMax_AI:", posted by OpenRouter (@OpenRouterAI).

Paper

KernelBench Hard

Q: What is KernelBench Hard?

A benchmark that evaluates how models perform on system kernel-level tasks.

Alias: Kernel Benchmark Hard

1 highly relevant item tracked

TraeAI Observations

If you only read 3

MiniMax Launches M3 Open-Weights Model: First to Combine Coding, Agentic, and Long Context Capabilities

OpenRouter (@OpenRouterAI) · June 1 · 82 words (about 1 minute)

MiniMax introduces M3, the first open-weight model combining coding, agentic, and long-context capabilities, achieving 59%+ on benchmarks like SWE-Bench Pro with 1M context support, advancing open-source LLMs toward multi-capability frontiers.

Why it was selected: MiniMax M3 scores 59.0% on the SWE-Bench Pro benchmark, ahead of most open-source models.

Featured · Tweet · #Open-source model #Large language model #Coding capability #Long context #MiniMax · English

Cross-material Q&A · KernelBench Hard

Answers are based on: 1 item related to KernelBench Hard