PlanningBench: Bringing LLMs from “Saying” to “Doing”
Hunyuan (@TXhunyuan), 147 characters (about 1 minute read)
Tencent and Renmin University of China’s Gaoling School of AI release PlanningBench, an open‑source, scalable, verifiable framework for evaluating and training LLM planning capabilities, featuring 30+ real‑world tasks and automated verification.
Why it was selected: PlanningBench provides 30+ real-world planning tasks and supports the evaluation of LLM planning capabilities.
Featured Tweet: #LLM #Planning #Open Source #Evaluation Framework
