Anthropic says Opus 4.7 hits 80.6% on Document Reasoning — up from 57.1%. But "reasoning about docu...

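- Opus 4.7's score on the Document Reasoning benchmark rose from 57.1% to 80.6%, but that is not the same as real-world parsing ability
- On ParseBench, Opus 4.7 improved markedly on chart recognition (+42.3 points), while layout understanding got worse
- LlamaParse Agentic reaches 84.9% overall at about 1.2¢/page, making it a better fit for enterprise document parsing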

Anthropic says Opus 4.7 hits 80.6% on Document Reasoning, up from 57.1%. But "reasoning about documents" ≠ "parsing documents for agents." We ran it on ParseBench.

→ Charts: 13.5% → 55.8% (+42.3), huge
→ Formatting: 64.2% → 69.4% (+5.2)
→ Content: 89.7% → 90.3% (+0.6)
→ Tables: 86.5% → 87.2% (+0.7)
→ Layout: 16.5% → 14.0% (-2.5), regressed

Real chart gains, but at ~1.5¢/page. Enterprise scale? Not yet.

LlamaParse Agentic: 84.9% overall. ~1.2¢/page.

The frontier for general document understanding is long. No single model solves it.

→ github.com/run-llama/Pars
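For anyone who wants to sanity-check the arithmetic, here is a minimal sketch that recomputes the per-category deltas from the scores quoted in the post and projects the two approximate per-page prices onto a page volume. The names `SCORES`, `deltas`, and `cost_usd` are made up for illustration, and the one-million-page volume is an assumed example, not a figure from the post. The prices are the rough "~" values quoted above, so treat the output as an order-of-magnitude estimate.

```python
# Before/after ParseBench scores (percent) as quoted in the post.
SCORES = {
    "Charts": (13.5, 55.8),
    "Formatting": (64.2, 69.4),
    "Content": (89.7, 90.3),
    "Tables": (86.5, 87.2),
    "Layout": (16.5, 14.0),
}


def deltas(scores):
    """Return the change in percentage points per category, rounded to 0.1."""
    return {name: round(after - before, 1) for name, (before, after) in scores.items()}


def cost_usd(cents_per_page, pages):
    """Convert a per-page price in cents into a total cost in US dollars."""
    return cents_per_page * pages / 100


if __name__ == "__main__":
    for name, delta in deltas(SCORES).items():
        print(f"{name}: {delta:+.1f} points")

    pages = 1_000_000  # assumed volume, for illustration only
    print(f"Opus 4.7 at ~1.5 cents/page: ${cost_usd(1.5, pages):,.0f}")
    print(f"LlamaParse Agentic at ~1.2 cents/page: ${cost_usd(1.2, pages):,.0f}")
```

The deltas are kept in percentage points, not relative percent, to match how the post reports them.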
