LlamaIndex 🦙 (@llama_index) · April 16, 2026

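AI Summary

Opus 4.7's score on the Document Reasoning benchmark rose from 57.1% to 80.6%, but that does not equate to real-world parsing ability.
On ParseBench, Opus 4.7 improved markedly on chart recognition (+42.3 percentage points), while layout understanding regressed.
LlamaParse Agentic reaches 84.9% overall at roughly 1.2¢/page, making it a better fit for enterprise-scale document parsing.

#LLM #DocumentParsing #Anthropic #LlamaIndex #AIEvaluation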

Anthropic says Opus 4.7 hits 80.6% on Document Reasoning, up from 57.1%.

But "reasoning about documents" ≠ "parsing documents for agents."

We ran it on ParseBench.

→ Charts: 13.5% → 55.8% (+42.3), huge
→ Formatting: 64.2% → 69.4% (+5.2)
→ Content: 89.7% → 90.3% (+0.6)
→ Tables: 86.5% → 87.2% (+0.7)
→ Layout: 16.5% → 14.0% (-2.5), regressed

Real chart gains, but at ~1.5¢/page. Enterprise scale? Not yet.

LlamaParse Agentic: 84.9% overall. ~1.2¢/page.

The frontier for general document understanding is long. No single model solves it.

→ github.com/run-llama/Pars

![Image 2: Image](https://x.com/llama_index/status/2044886527352647859/photo/1)
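To make the numbers in the post concrete, here is a minimal Python sketch that recomputes the per-category deltas and compares parsing cost at the two quoted per-page prices. The scores and prices come from the post; the page volume (`PAGES`) and the helper names `deltas` and `cost_usd` are assumptions made only for illustration, and this is not how ParseBench itself computes or aggregates scores.

```python
# Per-category ParseBench scores quoted in the post, in percent: (before, after).
SCORES = {
    "Charts": (13.5, 55.8),
    "Formatting": (64.2, 69.4),
    "Content": (89.7, 90.3),
    "Tables": (86.5, 87.2),
    "Layout": (16.5, 14.0),
}

# Approximate per-page prices quoted in the post, in US cents.
OPUS_CENTS_PER_PAGE = 1.5
LLAMAPARSE_CENTS_PER_PAGE = 1.2

# Hypothetical workload size, chosen only to illustrate the cost gap.
PAGES = 1_000_000


def deltas(scores):
    """Return the change in percentage points for each category."""
    return {name: round(after - before, 1) for name, (before, after) in scores.items()}


def cost_usd(pages, cents_per_page):
    """Return the total cost in US dollars for parsing `pages` pages."""
    return pages * cents_per_page / 100


if __name__ == "__main__":
    for name, delta in deltas(SCORES).items():
        print(f"{name}: {delta:+.1f} points")
    opus = cost_usd(PAGES, OPUS_CENTS_PER_PAGE)
    llamaparse = cost_usd(PAGES, LLAMAPARSE_CENTS_PER_PAGE)
    print(f"Opus 4.7: ${opus:,.0f}  LlamaParse Agentic: ${llamaparse:,.0f}")
```

Under the assumed volume of one million pages, the quoted prices work out to roughly $15,000 versus $12,000. The sketch deliberately does not average the five categories, since the post does not say how the overall ParseBench score is weighted.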
