Document Arena

View overall rankings across AI models in document analysis and long-content reasoning.

Jun 10, 2026

252,314 votes

29 models

Rank	Rank Spread	Model	Organization · License	Score	Votes	Price	Context
1	1-5	claude-opus-4-6	Anthropic · Proprietary	1507±6	32,122	$5 / $25	1M
2	1-5	claude-opus-4-6-thinking	Anthropic · Proprietary	1507±7	20,246	$5 / $25	1M
3	1-7	claude-opus-4-7-thinking	Anthropic · Proprietary	1498±7	13,878	$5 / $25	1M
4	1-8	claude-opus-4-7	Anthropic · Proprietary	1496±7	14,110	$5 / $25	1M
5	1-11	claude-fable-5	Anthropic · Proprietary	1495±15	1,461	$10 / $50	1M
6	3-10	claude-sonnet-4-6	Anthropic · Proprietary	1487±6	49,424	$3 / $15	1M
7	3-11	gpt-5.5-high	OpenAI · Proprietary	1485±7	11,789	$5 / $30	1.1M
8	4-11	gpt-5.5	OpenAI · Proprietary	1483±7	12,050	$5 / $30	1.1M
9	6-12	gpt-5.4	OpenAI · Proprietary	1474±7	24,400	$2.50 / $15	1.1M
10	5-12	claude-opus-4-8-thinking	Anthropic · Proprietary	1473±11	3,431	$5 / $25	1M
11	5-12	claude-opus-4-8	Anthropic · Proprietary	1472±11	3,223	$5 / $25	1M
12	9-15	claude-opus-4-5-20251101	Anthropic · Proprietary	1461±10	7,987	$5 / $25	200K
13	12-17	kimi-k2.6	Moonshot · Modified MIT	1451±8	8,574	$0.95 / $4	262.1K
14	12-17	claude-sonnet-4-5-20250929	Anthropic · Proprietary	1449±7	24,162	$3 / $15	200K
15	12-23	muse-spark	Meta · Proprietary	1442±18	1,081	N/A	N/A
16	13-19	gemini-3.1-pro-preview	Google · Proprietary	1441±6	38,090	$2 / $12	1M
17	13-21	minimax-m3	MiniMax · Proprietary	1438±10	3,636	$0.60 / $2.40	N/A
18	15-23	gemini-3-pro	Google · Proprietary	1433±9	10,748	$2 / $12	1M
19	15-23	kimi-k2.5-thinking	Moonshot · Modified MIT	1429±7	16,540	$0.60 / $3	N/A
20	16-25	gemma-4-31b	Google · Apache 2.0	1424±9	7,970	N/A	N/A
21	17-26	gemini-2.5-pro	Google · Proprietary	1420±6	24,980	$1.25 / $10	1M
22	17-26	claude-haiku-4-5-20251001	Anthropic · Proprietary	1418±6	26,371	$1 / $5	200K
23	16-29	glm-5v-turbo	Z.ai · Proprietary	1413±15	1,389	$1.20 / $4	202.8K
24	20-29	gemini-3-flash	Google · Proprietary	1413±9	7,188	$0.50 / $3	1M
25	20-29	grok-4.20-beta-0309-reasoning	xAI · Proprietary	1410±7	14,105	$2 / $6	2M
26	21-29	gpt-5.2-high	OpenAI · Proprietary	1405±9	7,096	$1.75 / $14	400K
27	23-29	gpt-5.5-instant	OpenAI · Proprietary	1402±8	8,539	$5 / $30	1.1M
28	23-29	gpt-5.1	OpenAI · Proprietary	1401±9	8,253	$1.25 / $10	400K
29	23-29	gpt-5.2	OpenAI · Proprietary	1401±6	28,188	$1.75 / $14	400K
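
The page does not say how the rank spread is derived. One plausible reading, offered here only as an assumption, is that it follows from overlapping score intervals: a model's best possible rank is one plus the number of models whose lower bound lies above its upper bound, and its worst possible rank is the number of models whose upper bound reaches its lower bound. The Python sketch below applies that rule to the first eight rows of the table. The function name `rank_spread` is hypothetical, and because the published scores are rounded and only eight rows are used, the worst-rank values for the lower rows are capped at 8 and will not match the full table.

```python
# Scores and interval half-widths copied from the first eight table rows.
ROWS = [
    ("claude-opus-4-6", 1507, 6),
    ("claude-opus-4-6-thinking", 1507, 7),
    ("claude-opus-4-7-thinking", 1498, 7),
    ("claude-opus-4-7", 1496, 7),
    ("claude-fable-5", 1495, 15),
    ("claude-sonnet-4-6", 1487, 6),
    ("gpt-5.5-high", 1485, 7),
    ("gpt-5.5", 1483, 7),
]


def rank_spread(rows):
    """Return {model: (best_rank, worst_rank)} under the ASSUMED overlap rule."""
    intervals = {name: (score - hw, score + hw) for name, score, hw in rows}
    spread = {}
    for name, (lo, hi) in intervals.items():
        # Models that are clearly ahead: their whole interval sits above ours.
        clearly_ahead = sum(1 for olo, _ in intervals.values() if olo > hi)
        # Models that could be ahead or tied: their interval reaches our lower
        # bound. This count includes the model itself.
        not_clearly_behind = sum(1 for _, ohi in intervals.values() if ohi >= lo)
        spread[name] = (1 + clearly_ahead, not_clearly_behind)
    return spread


SPREAD = rank_spread(ROWS)
for model, (best, worst) in SPREAD.items():
    print(f"{model}: {best}-{worst}")
```

On this subset the rule gives the same spread as the table for the first four rows, and the same best rank for all eight, which is consistent with the assumption but does not confirm it.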

Leaderboard Plots

Fraction of Model A Wins for All Non-tied A vs. B Battles

Confidence Intervals on Model Strength (via Bootstrapping)

Battle Count for Each Combination of Models (without Ties)

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)
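
The plot titles above name two computations: the fraction of non-tied battles a model wins against another, and bootstrapped confidence intervals. The Python sketch below illustrates both on a tiny set of battles. Everything in it is an assumption made for illustration: the battle records are hypothetical toy data (not leaderboard votes), the names `model-x`, `model-y`, `win_fraction`, and `bootstrap_interval` are invented, and the bootstrap is applied to a raw win fraction rather than to the fitted model-strength score the leaderboard reports.

```python
import random

# HYPOTHETICAL toy battles: (model shown as A, model shown as B, winner),
# where winner is "a", "b", or "tie". These are not real arena votes.
BATTLES = (
    [("model-x", "model-y", "a")] * 4      # model-x wins as side A
    + [("model-y", "model-x", "b")] * 2    # model-x wins as side B
    + [("model-x", "model-y", "b")]        # model-y wins as side B
    + [("model-y", "model-x", "a")]        # model-y wins as side A
    + [("model-x", "model-y", "tie")] * 2  # ties are excluded below
)


def win_fraction(battles, a, b):
    """Fraction of non-tied battles between a and b that a won."""
    wins = total = 0
    for left, right, winner in battles:
        if winner == "tie" or {left, right} != {a, b}:
            continue
        total += 1
        wins += (left if winner == "a" else right) == a
    return wins / total if total else float("nan")


def bootstrap_interval(battles, a, b, rounds=1000, seed=0):
    """Percentile 95% interval from resampling battles with replacement."""
    rng = random.Random(seed)
    stats = []
    for _ in range(rounds):
        sample = rng.choices(battles, k=len(battles))
        frac = win_fraction(sample, a, b)
        if frac == frac:  # drop NaN from resamples with no decisive battle
            stats.append(frac)
    stats.sort()
    return stats[int(0.025 * len(stats))], stats[int(0.975 * len(stats)) - 1]


POINT = win_fraction(BATTLES, "model-x", "model-y")
LOW, HIGH = bootstrap_interval(BATTLES, "model-x", "model-y")
print(f"win fraction {POINT:.2f}, bootstrap interval [{LOW:.2f}, {HIGH:.2f}]")
```

With only eight decisive battles the interval is wide, which mirrors the table: models with few votes carry the largest ± values.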
