
WebDev Leaderboard

Compare the performance of AI models on web development tasks built in the Code Arena.
The legacy WebDev leaderboard is still available at web.lmarena.ai.

Last Updated: Dec 4, 2025
Total Votes: 50,534
Total Models: 24

| Rank | Rank Spread (Upper-Lower) | Score | CI (+/-) | Votes | Organization | License |
|---|---|---|---|---|---|---|
| 1 | 1◄─►1 | 1511 | +14/-14 | 2,323 | Anthropic | Proprietary |
| 2 | 2◄─►3 | 1476 | +10/-10 | 7,154 | Google | Proprietary |
| 3 | 2◄─►3 | 1472 | +14/-14 | 2,377 | Anthropic | Proprietary |
| 4 | 4◄─►8 | 1399 | +12/-12 | 3,943 | OpenAI | Proprietary |
| 5 | 4◄─►8 | 1398 | +9/-9 | 6,217 | Anthropic | Proprietary |
| 6 | 4◄─►8 | 1395 | +11/-11 | 3,429 | OpenAI | Proprietary |
| 7 | 4◄─►8 | 1392 | +9/-9 | 6,028 | Anthropic | Proprietary |
| 8 | 4◄─►8 | 1387 | +9/-9 | 7,311 | Anthropic | Proprietary |
| 9 | 9◄─►11 | 1366 | +10/-10 | 5,806 | Z.ai | MIT |
| 10 | 9◄─►12 | 1354 | +10/-10 | 5,270 | OpenAI | Proprietary |
| 11 | 9◄─►12 | 1350 | +10/-10 | 5,118 | Moonshot | Modified MIT |
| 12 | 10◄─►12 | 1341 | +11/-11 | 3,614 | OpenAI | Proprietary |
| 13 | 13◄─►13 | 1316 | +10/-10 | 5,783 | MiniMax | Apache 2.0 |
| 14 | 14◄─►16 | 1293 | +10/-10 | 5,154 | DeepSeek AI | MIT |
| 15 | 14◄─►16 | 1289 | +9/-9 | 5,972 | Alibaba | Apache 2.0 |
| 16 | 14◄─►17 | 1285 | +9/-9 | 5,992 | Anthropic | Proprietary |
| 17 | 16◄─►18 | 1264 | +15/-15 | 1,943 | KwaiKAT | Proprietary |
| 18 | 17◄─►19 | 1252 | +16/-16 | 1,564 | OpenAI | Proprietary |
| 19 | 18◄─►21 | 1229 | +13/-13 | 2,978 | xAI | Proprietary |
| 20 | 19◄─►21 | 1213 | +12/-12 | 3,504 | Google | Proprietary |
| 21 | 19◄─►21 | 1205 | +19/-19 | 1,258 | xAI | Proprietary |
| 22 | 22◄─►23 | 1153 | +22/-22 | 943 | xAI | Proprietary |
| 23 | 22◄─►24 | 1143 | +21/-21 | 1,014 | xAI | Proprietary |
| 24 | 23◄─►24 | 1103 | +21/-21 | 1,031 | Mistral | Proprietary |
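
The rank spread values above appear consistent with a simple interval-overlap rule: a model's best possible rank is one plus the number of models whose lower bound exceeds its upper bound, and its worst possible rank is the model count minus the number of models whose upper bound falls below its lower bound. The page does not state this rule, so the sketch below is an assumption; the function name `rank_spread` is made up for illustration. The input is the score and +/- margin of the first 13 rows, which form a closed group because rank 13 does not overlap its neighbors.

```python
# Scores and +/- margins copied from the first 13 leaderboard rows.
# The overlap rule itself is an ASSUMPTION, not the site's documented method.
rows = [
    (1511, 14), (1476, 10), (1472, 14), (1399, 12), (1398, 9),
    (1395, 11), (1392, 9), (1387, 9), (1366, 10), (1354, 10),
    (1350, 10), (1341, 11), (1316, 10),
]


def rank_spread(rows):
    """Return a (best_rank, worst_rank) pair per model from interval overlap."""
    bounds = [(score - margin, score + margin) for score, margin in rows]
    n = len(bounds)
    spreads = []
    for lo_i, hi_i in bounds:
        # Models whose whole interval sits above this model's interval.
        clearly_better = sum(1 for lo_j, _ in bounds if lo_j > hi_i)
        # Models whose whole interval sits below this model's interval.
        clearly_worse = sum(1 for _, hi_j in bounds if hi_j < lo_i)
        spreads.append((1 + clearly_better, n - clearly_worse))
    return spreads


for rank, (best, worst) in enumerate(rank_spread(rows), start=1):
    print(f"rank {rank:>2}: {best} to {worst}")
```

Under this rule a narrow spread such as 1 to 1 means no other interval overlaps the model's own, while a wide spread such as 4 to 8 means the five models in that group cannot be separated at the reported margins.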

Leaderboard Plots

Battle Count for Each Combination of Models (without Ties)

Confidence Intervals on Model Strength (via Bootstrapping)

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)

Fraction of Model A Wins for All Non-tied A vs. B Battles
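
As a rough illustration of the last two plot titles, the sketch below computes the fraction of non-tied wins for each ordered pair of models, then averages each model's fractions over its opponents with equal weight (one reading of "uniform sampling"). The battle log, the model names `alpha`, `beta`, `gamma`, and both function names are hypothetical; the page does not publish its raw data or its exact computation.

```python
from collections import defaultdict

# HYPOTHETICAL battle log: (model_a, model_b, winner), winner is "a", "b", or "tie".
battles = [
    ("alpha", "beta", "a"),
    ("alpha", "beta", "b"),
    ("alpha", "beta", "a"),
    ("alpha", "gamma", "a"),
    ("gamma", "alpha", "tie"),
    ("beta", "gamma", "b"),
    ("gamma", "beta", "b"),
]


def win_fractions(battles):
    """Map (x, y) to the fraction of non-tied x-vs-y battles that x won."""
    wins = defaultdict(int)  # wins[(x, y)] counts how often x beat y
    for a, b, winner in battles:
        if winner == "a":
            wins[(a, b)] += 1
        elif winner == "b":
            wins[(b, a)] += 1
        # Ties are dropped entirely.
    frac = {}
    for x, y in list(wins):
        total = wins[(x, y)] + wins.get((y, x), 0)
        frac[(x, y)] = wins[(x, y)] / total
        frac[(y, x)] = wins.get((y, x), 0) / total
    return frac


def average_win_rate(frac):
    """Average each model's pairwise win fraction, weighting opponents equally."""
    per_model = defaultdict(list)
    for (x, _), value in frac.items():
        per_model[x].append(value)
    return {model: sum(v) / len(v) for model, v in per_model.items()}


frac = win_fractions(battles)
avg = average_win_rate(frac)
for model in sorted(avg, key=avg.get, reverse=True):
    print(f"{model}: {avg[model]:.3f}")
```

Weighting opponents equally keeps a model's average from being dominated by whichever pairing happened to receive the most votes.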