Don't Just Give Agents Tools — They Can't Choose Wisely! Fudan × Tongyi Propose New CUA Training Paradigm
量子位 (QbitAI) · 3,966 words (approx. 16 min read)
Fudan and Tongyi introduce ToolCUA, which addresses agents' inability to choose between GUI actions and tool calls. Using synthetic trajectory generation and trajectory-level reward design, it reaches 46.85% accuracy on OSWorld-MCP, surpassing Claude-4-Sonnet.
Why it was selected: ToolCUA reaches 46.85% accuracy on OSWorld-MCP, surpassing Claude-4-Sonnet and approaching Claude-4.5-Sonnet.
Featured · Article · #Agent #CUA #Tool Selection #Reinforcement Learning #Open Source · Chinese
