Claude's Pass Rate Falls Below 4%: SaaS-Bench Shatters the Computer-Use Illusion of a 'Fully Automated Office'
The SaaS-Bench evaluation shows that mainstream large models fully pass fewer than 4% of real office tasks, revealing how far AI remains from fully automated office work.
Why it was selected: Claude Opus 4.7 fully passed only 4 of 106 real office tasks (3.8%).













