
- Existing benchmarks such as ChartQA test only standalone charts, ignoring the contextual semantics of charts within full documents.
- ParseBench provides 568 pages of real enterprise documents with diverse embedded charts, covering complex types such as discrete and continuous series.
- Each chart datapoint is bootstrapped by a model and verified by human annotators, ensuring accurate evaluation of AI agents' ability to extract numeric values.
ParseBench is the first benchmark to include VLM chart understanding 📊📈📉 over enterprise documents.

🟠 Existing benchmarks (ChartQA, ChartXiv) test over charts in isolation, not a chart's inclusion in the overall document, and don't reference real-world docs.
✅ ParseBench contains 568 pages with a diversity of charts embedded in real-world documents.
✅ It contains a mix of charts: discrete series, continuous series, bar/point/line graphs, charts without clear markers, and more.
✅ Each chart has a set of ground-truth datapoints bootstrapped with an initial model and verified by human annotators (with a tolerance).

Come check it out!
Blog: llamaindex.ai/blog/parsebenc
Paper: arxiv.org/abs/2604.08538
Website: parsebench.ai/?utm_medium=so
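The tolerance-verified ground truth described above hints at how a datapoint-match score could be computed. Below is a minimal sketch, assuming a simple greedy matcher with exact label matching and a relative value tolerance; the function and parameter names are hypothetical, and the actual ChartDataPointMatch metric may differ:

```python
import math

# Hypothetical sketch of a tolerance-based datapoint match, in the spirit of
# ParseBench's ChartDataPointMatch metric. The exact matching rules are not
# described in the post; all names here are assumptions.

def datapoint_match(predicted, ground_truth, rel_tol=0.05):
    """Fraction of ground-truth (label, value) points recovered by the model.

    A predicted point counts as a match when its label is exact and its
    value falls within a relative tolerance of the ground-truth value.
    """
    if not ground_truth:
        return 1.0
    unmatched = list(ground_truth)
    hits = 0
    for label, value in predicted:
        for gt in unmatched:
            gt_label, gt_value = gt
            if label == gt_label and math.isclose(value, gt_value, rel_tol=rel_tol):
                hits += 1
                unmatched.remove(gt)  # each ground-truth point matches at most once
                break
    return hits / len(ground_truth)
```

For example, a prediction of `[("Q1", 102), ("Q2", 250)]` against ground truth `[("Q1", 100), ("Q2", 200)]` scores 0.5: Q1 is within the 5% tolerance, Q2 is not.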

Quote

LlamaIndex 🦙
@llama_index
Let's talk parsing charts 📊📈. Last week we released ParseBench, the first document OCR benchmark for AI agents. New in ParseBench: ChartDataPointMatch. Most document parsers look at a chart and OCR the caption. Agents need the actual numbers. That's the gap between "OCR'd the
