Concept

CritPt

A benchmark for evaluating large language models on theoretical physics reasoning tasks.

Related materials

1 item related to CritPt has been collected, sorted by rating.

watching a team of agents tackling a hard theoretical physics problem is quite mesmerizing - self-co...

The Physics-Intern framework boosts Gemini 3.1 Pro's performance on the CritPt benchmark from 17.7% to 31.4% via multi-agent collaboration, setting a new SOTA in theoretical physics reasoning.

Why it was selected: Physics-Intern uses a multi-agent collaboration framework to solve complex theoretical physics problems.

Featured · Tweet · #AI Agent #Theoretical Physics #LLM Reasoning #Gemini #CritPt · Mixed Chinese and English
