Code Arena | Image-to-WebDev

Overall rankings of AI models by their ability to generate websites from images and screenshots, including agentic coding workflows that involve multi-step reasoning and tool use.

May 14, 2026

31,859 votes

23 models

Rank	Rank Spread	Model	Organization · License	Score	Votes	Price (input / output per 1M tokens)	Context
1	1-3	claude-opus-4-7-thinking	Anthropic · Proprietary	1581 +15/-15	2,075	$5 / $25	1M
2	1-6	claude-sonnet-4-6	Anthropic · Proprietary	1557 +13/-13	3,158	$3 / $15	1M
3	1-6	claude-opus-4-7	Anthropic · Proprietary	1556 +14/-14	2,377	$5 / $25	1M
4	2-8	claude-opus-4-6-thinking	Anthropic · Proprietary	1538 +13/-13	2,997	$5 / $25	1M
5	2-8	gpt-5.5-xhigh (codex-harness)	OpenAI · Proprietary	1537 +15/-15	1,816	N/A	N/A
6	2-8	claude-opus-4-6	Anthropic · Proprietary	1534 +13/-13	3,043	$5 / $25	1M
7	4-8	kimi-k2.6	Moonshot · Modified MIT	1522 +17/-17	1,451	$0.95 / $4	262.1K
8	4-8	gpt-5.5-high (codex-harness)	OpenAI · Proprietary	1519 +15/-15	1,965	N/A	N/A
9	9-11	gemini-3.1-pro-preview	Google · Proprietary	1490 +12/-12	3,597	$2 / $12	1M
10	9-11	gpt-5.5 (codex-harness)	OpenAI · Proprietary	1489 +15/-15	1,935	N/A	N/A
11	9-15	qwen3.6-plus	Alibaba · Proprietary	1467 +13/-13	2,602	$0.33 / $1.95	1M
12	11-18	gemini-3-pro	Google · Proprietary	1453 +20/-20	1,091	$2 / $12	1M
13	11-17	gemini-3-flash	Google · Proprietary	1447 +10/-10	4,435	$0.50 / $3	1M
14	11-19	gpt-5.3-codex (codex-harness)	OpenAI · Proprietary	1441 +14/-14	2,506	$1.75 / $14	400K
15	11-19	kimi-k2.5-thinking	Moonshot · Modified MIT	1440 +16/-16	1,740	$0.60 / $3	N/A
16	12-19	gpt-5.4	OpenAI · Proprietary	1435 +18/-18	1,220	$2.50 / $15	1.1M
17	14-20	gemini-3-flash (thinking-minimal)	Google · Proprietary	1421 +10/-10	4,369	$0.50 / $3	1M
18	12-20	gpt-5.1-high	OpenAI · Proprietary	1421 +20/-20	1,112	$1.25 / $10	400K
19	13-20	kimi-k2.5-instant	Moonshot · Modified MIT	1415 +20/-20	1,093	$0.38 / $2.02	262.1K
20	17-20	grok-4.3	xAI · Proprietary	1396 +21/-21	965	$1.25 / $2.50	1M
21	21-22	gpt-5.1	OpenAI · Proprietary	1344 +19/-19	1,264	$1.25 / $10	400K
22	21-22	gemini-3.1-flash-lite-preview	Google · Proprietary	1329 +13/-13	3,742	$0.25 / $1.50	1M
23	23-23	gemini-2.5-pro	Google · Proprietary	1276 +19/-19	1,185	$1.25 / $10	1M
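
The page does not say how the Rank Spread column is computed. One plausible reading, sketched below as an assumption rather than the leaderboard's actual method, is an interval-overlap rule: a model's best possible rank counts only the models whose whole score interval sits above its own, and its worst possible rank counts every model whose interval reaches above its lower bound. The `Entry` type and `rank_spread` function are hypothetical names introduced for illustration; the scores and intervals are the top eight rows of the table. Those eight rows are self-contained because no lower-ranked interval overlaps theirs.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    model: str
    score: int
    plus: int   # upper half-width of the score interval
    minus: int  # lower half-width of the score interval


def rank_spread(entries):
    """Return {model: (best_rank, worst_rank)} under the assumed overlap rule."""
    spreads = {}
    for e in entries:
        lo, hi = e.score - e.minus, e.score + e.plus
        others = [o for o in entries if o.model != e.model]
        # Best case: only models whose entire interval lies above ours must outrank us.
        best = 1 + sum(1 for o in others if o.score - o.minus > hi)
        # Worst case: any model whose interval extends above our lower bound could outrank us.
        worst = 1 + sum(1 for o in others if o.score + o.plus > lo)
        spreads[e.model] = (best, worst)
    return spreads


# Top eight rows of the table above (score, +interval, -interval).
TOP8 = [
    Entry("claude-opus-4-7-thinking", 1581, 15, 15),
    Entry("claude-sonnet-4-6", 1557, 13, 13),
    Entry("claude-opus-4-7", 1556, 14, 14),
    Entry("claude-opus-4-6-thinking", 1538, 13, 13),
    Entry("gpt-5.5-xhigh (codex-harness)", 1537, 15, 15),
    Entry("claude-opus-4-6", 1534, 13, 13),
    Entry("kimi-k2.6", 1522, 17, 17),
    Entry("gpt-5.5-high (codex-harness)", 1519, 15, 15),
]

SPREADS = rank_spread(TOP8)
for model, (best, worst) in SPREADS.items():
    print(f"{model}: {best}-{worst}")
```

Under this assumption the sketch reproduces the listed spreads for these eight rows (1-3, 1-6, 1-6, 2-8, 2-8, 2-8, 4-8, 4-8). The displayed scores are rounded, so rows lower in the table may differ by one position when recomputed this way.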

Remove Style Control Leaderboard Plots

[Figure: Fraction of Model A Wins for All Non-tied A vs. B Battles]
[Figure: Confidence Intervals on Model Strength (via Bootstrapping)]
[Figure: Average Win Rate Against All Other Models (Uniform Sampling and No Ties)]
[Figure: Battle Count for Each Combination of Models (without Ties)]
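
The plot titles above name three quantities that can be computed directly from a list of pairwise battles: the battle count per model pair without ties, the fraction of non-tied battles that model A wins against model B, and each model's average win rate with every opponent weighted equally. The following is a minimal sketch of those definitions as they read from the titles, not the leaderboard's own code. The battle records, model names (`m1`, `m2`, `m3`), and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical battle records: (model_a, model_b, winner), winner in {"a", "b", "tie"}.
BATTLES = [
    ("m1", "m2", "a"),
    ("m1", "m2", "b"),
    ("m1", "m2", "a"),
    ("m2", "m1", "a"),
    ("m1", "m3", "a"),
    ("m3", "m1", "tie"),
    ("m2", "m3", "b"),
    ("m2", "m3", "a"),
]


def pairwise_stats(battles):
    """Count wins and non-tied battles for every ordered model pair."""
    wins = defaultdict(int)    # wins[(x, y)]: times x beat y
    counts = defaultdict(int)  # counts[(x, y)]: non-tied battles between x and y
    for a, b, winner in battles:
        if winner == "tie":
            continue  # ties are excluded from every statistic here
        counts[(a, b)] += 1
        counts[(b, a)] += 1
        if winner == "a":
            wins[(a, b)] += 1
        else:
            wins[(b, a)] += 1
    return wins, counts


def win_fraction(wins, counts):
    """Fraction of non-tied battles that x wins against y, per ordered pair."""
    return {pair: wins[pair] / n for pair, n in counts.items()}


def average_win_rate(fractions):
    """Mean pairwise win fraction per model, each opponent weighted equally."""
    per_model = defaultdict(list)
    for (x, _y), frac in fractions.items():
        per_model[x].append(frac)
    return {m: sum(fs) / len(fs) for m, fs in per_model.items()}


wins, counts = pairwise_stats(BATTLES)
fractions = win_fraction(wins, counts)
averages = average_win_rate(fractions)

print(counts[("m1", "m2")])     # non-tied battles between m1 and m2
print(fractions[("m1", "m3")])  # m1's win fraction against m3
print(averages)
```

Weighting each opponent equally is one reading of "uniform sampling": a model's average is not dominated by whichever opponent it happened to face most often.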
