Company

IntologyAI

Q: What's new with IntologyAI?

traeai has indexed 1 item related to IntologyAI. The most recent is "Very interesting results from this NanoGPT-Bench eval. There is so much talk about self-improving a...", posted by elvis (@omarsar0).

A startup focused on evaluating AI agents; it released the NanoGPT-Bench benchmark.

1 highly relevant item tracked

TraeAI Observations

If you read only 3

Very interesting results from this NanoGPT-Bench eval. There is so much talk about self-improving a...

elvis (@omarsar0) · score 6.2

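Coding agents recover only 9.3% of human progress on AI R&D tasks. They rely mainly on hyperparameter tuning and neglect algorithmic innovation, which suggests that current AI agents are not yet capable of real scientific research.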

Very interesting results from this NanoGPT-Bench eval.

elvis (@omarsar0) · May 20 · 152 words (about 1 minute)

Coding agents recover only 9.3% of human progress in AI research tasks, primarily tuning hyperparameters and ignoring algorithmic innovation, revealing their current inability to conduct real AI R&D.

Why it was selected: Codex, Claude Code, and Autoresearch recovered only 9.3% of human research progress in the NanoGPT-Bench evaluation.

Featured · Tweet · #NanoGPT-Bench #Codex #Claude Code #Autoresearch #AI agents · English

Cross-material Q&A · IntologyAI

Answers are based on: 1 item related to IntologyAI