
Company

Epoch AI

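Alias: epochai

The research team that released FrontierCode.

7 highly relevant items tracked

TraeAI Observations

Related materials

7 items related to Epoch AI have been collected, sorted by score.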

[AINews] FrontierCode: Benchmarking for Code Quality over Slop

Latent Space · 1,922 words (about 8 min) · Score: 85

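FrontierCode is a new code-quality benchmark that measures whether code is mergeable, not merely whether it passes unit tests.

Why it was selected: FrontierCode was built by open-source maintainers over more than 40 hours and aims to evaluate whether code is mergeable.

Featured · Article · #FrontierCode #Code Quality #AI Engineering #Benchmarking · English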

How open model ecosystems compound

Interconnects AI · 1,141 words (about 5 min) · Score: 85

China's open AI ecosystem reduces redundant R&D compute costs, enhancing model development efficiency and sustainability.

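Why it was selected: The openness of China's AI ecosystem reduces redundant R&D compute costs, allowing labs to keep operating for longer.

Featured · Article · #AI #Machine Learning #Open Source #China · Chinese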

AI Dev 26 x SF | Ara Khan: Evals Are Broken — Use Them Anyway

DeepLearning.AI · 6,775 words (about 28 min) · Score: 78

AI evals are fundamentally broken—over-reliance on objective metrics misleads—but they remain critical when built, interpreted, and embedded properly in agent workflows.

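Why it was selected: Today's mainstream evals (such as benchmarks from Epoch AI and OpenAI) suffer from "false precision": models with similar scores can differ significantly in actual capability.

Featured · Video · #AI Evaluation #Agent Systems #Benchmarking #LLM #Engineering Practice · English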

Some ideas for what comes next, May 2026

Interconnects AI · 1,700 words (about 7 min) · Score: 75

Author predicts 2026 will be a key year for AI development, with open models facing both challenges and opportunities.

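Why it was selected: 2026 will be a pivotal year for AI development, with open models facing more challenges and more opportunities.

Featured · Article · #AI #OpenAI #Claude Code #Codex · Chinese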
Memory has grown to nearly two-thirds of AI chip component costs

AI Chip Component Costs: Memory at 63%

Hacker News Best · 1,217 words (about 5 min) · Score: 75

Memory now accounts for 63% of total AI chip component costs, making it the largest single cost driver.

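Why it was selected: Memory accounts for 63% of AI chip component costs, far more than any other component.

Featured · Article · #AI chip #memory cost #compute efficiency #hardware architecture #data center · English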

FrontierMath Evaluation Reveals Fatal Errors, Updated Scores to Follow

AI HOT 精选 (AI HOT Picks) · 118 words (about 1 min) · Score: 55

A FrontierMath evaluation found fatal errors in about 33% of problems; Epoch AI will release a corrected dataset with updated scores.

Why it was selected: About 33% of the problems in FrontierMath Tiers 1-4 were flagged as containing fatal errors.

Featured · Article · #AI Evaluation #Math Benchmark #Data Correction #Epoch AI #Model Assessment · English
