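Text Generation Arena: Text Generation Model Leaderboard
The latest leaderboard of AI text generation models, based on anonymous user votes in the Text Generation Arena. It lists each model's Elo score, 95% confidence interval, vote count, organization, and license.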
Top model: Claude Opus 4.6 (thinking)
Highest score: 1,502
Number of models: 360
Data version: May 28, 2026
Data source: LM Arena
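About this leaderboard
This leaderboard ranks today's leading large AI models by overall strength on text generation tasks. The data comes from LMArena (formerly LMSYS Chatbot Arena), currently the world's largest crowdsourced evaluation platform for AI models. Users chat with two anonymous models at the same time and vote for the better answer, so the ranking is determined entirely by real user preferences rather than by laboratory benchmarks.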
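Evaluation method at a glance
Anonymous blind testing: users chat with two models whose identities are hidden and vote on answer quality, which removes brand bias.
Elo scoring: based on the Elo rating system from chess (a Bradley-Terry model), each model's strength score is computed from head-to-head results. The higher the score, the more likely users are to prefer that model in real conversations.
Broad scenario coverage: includes programming, creative writing, mathematical reasoning, knowledge Q&A, role-play, and other frequent real-world use cases.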
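As a rough illustration of what a score gap means, the sketch below converts two scores into an expected preference rate using the standard Elo logistic curve. The base-10 curve with a scale of 400 is the conventional chess setting and is an assumption here; the source does not state which scale the Arena uses, and the function name is hypothetical.

```python
def expected_win_probability(score_a: float, score_b: float, scale: float = 400.0) -> float:
    """Probability that model A is preferred over model B.

    Uses the logistic form shared by Elo and Bradley-Terry models.
    The scale of 400 is the chess convention (assumed, not from the source).
    """
    return 1.0 / (1.0 + 10.0 ** ((score_b - score_a) / scale))


# Scores taken from the table: rank 1 (1,502) versus rank 4 (1,494).
p = expected_win_probability(1502, 1494)
print(f"Expected preference rate for the higher-rated model: {p:.3f}")
```

Under these assumptions an 8-point gap corresponds to a preference rate only slightly above 50%, which is why small score differences near the top of the table should not be over-read.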
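DataLearner adds Chinese-language commentary and in-depth analysis on top of the raw data, and links each leaderboard model to the DataLearner model library so that you can view model details, API pricing, benchmark scores, and other information in one click.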
Full rankings
| Rank | Model | Score | 95% CI | Votes | Organization | License |
|---|---|---|---|---|---|---|
| 1 | Claude Opus 4.6 (thinking) | 1,502 | +/-4 | 34,186 | Anthropic | Proprietary |
| 2 | Opus 4.7 (thinking) | 1,500 | +/-5 | 19,973 | Anthropic | Proprietary |
| 3 | Claude Opus 4.6 | 1,498 | +/-4 | 36,512 | Anthropic | Proprietary |
| 4 | Opus 4.7 | 1,494 | +/-5 | 20,724 | Anthropic | Proprietary |
| 5 | Muse Spark | 1,489 | +/-6 | 12,228 | Facebook AI Research | Proprietary |
| 6 | Gemini 3.1 Pro Preview | 1,487 | +/-4 | 43,742 | Google DeepMind | Proprietary |
| 7 | Gemini 3.0 Pro (Preview 11-2025) | 1,486 | +/-4 | 41,332 | Google DeepMind | Proprietary |
| 8 | gpt-5.5-high | 1,482 | +/-6 | 16,573 | OpenAI | Proprietary |
| 9 | gpt-5.4-high | 1,480 | +/-5 | 28,246 | OpenAI | Proprietary |
| 10 | gemini-3.5-flash | 1,479 | +/-7 | 9,045 | Google | Proprietary |
| 11 | GPT-5.5 | 1,476 | +/-6 | 16,852 | OpenAI | Proprietary |
| 12 | gpt-5.2-chat-latest-20260210 | 1,476 | +/-4 | 32,280 | OpenAI | Proprietary |
| 13 | | 1,476 | +/-5 | 24,468 | xAI | Proprietary |
| 14 | | 1,475 | +/-5 | 29,068 | xAI | Proprietary |
| 15 | qwen3.7-max-preview | 1,475 | +/-10 | 3,755 | Alibaba | Proprietary |
| 16 | GLM 5.1 | 1,474 | +/-6 | 13,957 | Zhipu AI | MIT |
| 17 | gpt-5.5-instant | 1,474 | +/-5 | 24,925 | OpenAI | Proprietary |
| 18 | Gemini 3.0 Flash | 1,473 | +/-4 | 30,732 | Google DeepMind | Proprietary |
| 19 | Claude Opus 4 (thinking-32k) | 1,473 | +/-4 | 37,130 | Anthropic | Proprietary |
| 20 | | 1,472 | +/-5 | 28,630 | xAI | Proprietary |
| 21 | ernie-5.1 | 1,470 | +/-6 | 14,675 | Baidu | Proprietary |
| 22 | Claude Sonnet 4.6 | 1,470 | +/-5 | 27,474 | Anthropic | Proprietary |
| 23 | gpt-5.4 | 1,469 | +/-5 | 29,672 | OpenAI | Proprietary |
| 24 | Claude Opus 4 | 1,469 | +/-3 | 66,107 | Anthropic | Proprietary |
| 25 | | 1,466 | +/-3 | 63,569 | xAI | Proprietary |
| 26 | qwen3.5-max-preview | 1,466 | +/-5 | 20,212 | Alibaba | Proprietary |
| 27 | mimo-v2.5-pro | 1,465 | +/-6 | 15,722 | Xiaomi | MIT |
| 28 | kimi-k2.6 | 1,462 | +/-6 | 15,765 | Moonshot | Modified MIT |
| 29 | Gemini 3.0 Flash (minimal) | 1,461 | +/-4 | 52,876 | Google DeepMind | Proprietary |
| 30 | | 1,460 | +/-3 | 65,655 | xAI | Proprietary |
| 31 | qwen3.6-max-preview | 1,459 | +/-9 | 4,648 | Alibaba | Proprietary |
| 32 | deepseek-v4-pro-thinking | 1,458 | +/-6 | 15,852 | DeepSeek | MIT |
| 33 | GLM-5 | 1,457 | +/-5 | 21,930 | Zhipu AI | MIT |
| 34 | dola-seed-2.0-pro | 1,456 | +/-4 | 37,742 | Bytedance | Proprietary |
| 35 | Claude Sonnet 4.5 | 1,455 | +/-3 | 76,121 | Anthropic | Proprietary |
| 36 | Claude Sonnet 4.5 (thinking-32k) | 1,455 | +/-3 | 77,813 | Anthropic | Proprietary |
| 37 | GPT-5.1 Pro (high) | 1,455 | +/-4 | 40,856 | OpenAI | Proprietary |
| 38 | deepseek-v4-pro | 1,454 | +/-6 | 16,920 | DeepSeek | MIT |
| 39 | gemma-4-31b | 1,452 | +/-8 | 5,855 | Google | Apache 2.0 |
| 40 | gpt-5.4-mini-high | 1,451 | +/-5 | 26,397 | OpenAI | Proprietary |
| 41 | Kimi K2 Thinking | 1,449 | +/-4 | 36,795 | Moonshot AI | Modified MIT |
| 42 | ERNIE 5.0 | 1,449 | +/-7 | 9,752 | Baidu | Proprietary |
| 43 | Opus 4.1 (thinking-16k) | 1,449 | +/-3 | 49,833 | Anthropic | Proprietary |
| 44 | gpt-5.3-chat-latest | 1,449 | +/-4 | 30,882 | OpenAI | Proprietary |
| 45 | mimo-v2-pro | 1,448 | +/-5 | 22,638 | Xiaomi | Proprietary |
| 46 | ERNIE 5.0 | 1,448 | +/-4 | 34,159 | Baidu | Proprietary |
| 47 | Opus 4.1 | 1,447 | +/-3 | 77,373 | Anthropic | Proprietary |
| 48 | | 1,447 | +/-6 | 15,773 | xAI | Proprietary |
| 49 | Gemini 2.5 Pro Experimental 03-25 | 1,446 | +/-3 | 122,636 | Google DeepMind | Proprietary |
| 50 | Qwen3.5-397B-A17B | 1,445 | +/-4 | 31,970 | Alibaba | Apache 2.0 |
| 51 | GPT-4.5 | 1,445 | +/-6 | 14,547 | OpenAI | Proprietary |
| 52 | qwen3.6-plus | 1,444 | +/-5 | 18,202 | Alibaba | Proprietary |
| 53 | chatgpt-4o-latest-20250326 | 1,443 | +/-3 | 82,471 | OpenAI | Proprietary |
| 54 | GLM-4.7 | 1,443 | +/-6 | 12,133 | Zhipu AI | MIT |
| 55 | GPT-5.1 Instant | 1,439 | +/-4 | 43,501 | OpenAI | Proprietary |
| 56 | gemma-4-26b-a4b | 1,439 | +/-8 | 5,789 | Google | Apache 2.0 |
| 57 | GPT-5.2 Pro (high) | 1,438 | +/-4 | 46,111 | OpenAI | Proprietary |
| 58 | deepseek-v4-flash-thinking | 1,437 | +/-6 | 16,545 | DeepSeek | MIT |
| 59 | longcat-flash-chat-2602-exp | 1,436 | +/-5 | 23,731 | Meituan | Proprietary |
| 60 | Qwen3 Max (Preview) | 1,435 | +/-5 | 27,736 | Alibaba | Proprietary |
| 61 | GPT-5.2 | 1,435 | +/-4 | 46,492 | OpenAI | Proprietary |
| 62 | mimo-v2.5 | 1,434 | +/-6 | 15,979 | Xiaomi | MIT |
| 63 | GPT-5-Pro (high) | 1,434 | +/-5 | 31,947 | OpenAI | Proprietary |
| 64 | gemini-3.1-flash-lite-preview | 1,433 | +/-4 | 35,135 | Google | Proprietary |
| 65 | deepseek-v4-flash | 1,433 | +/-6 | 16,725 | DeepSeek | MIT |
| 66 | kimi-k2.5-instant | 1,432 | +/-7 | 8,197 | Moonshot | Modified MIT |
| 67 | OpenAI o3 | 1,431 | +/-4 | 59,775 | OpenAI | Proprietary |
| 68 | | 1,431 | +/-3 | 54,616 | xAI | Proprietary |
| 69 | kimi-k2-thinking-turbo | 1,430 | +/-3 | 60,235 | Moonshot | Modified MIT |
| 70 | amazon-nova-experimental-chat-26-02-10 | 1,427 | +/-10 | 3,418 | Amazon | Proprietary |
| 71 | GPT-5 | 1,427 | +/-4 | 31,595 | OpenAI | Proprietary |
| 72 | GLM-4.6 | 1,426 | +/-4 | 35,661 | Zhipu AI | MIT |
| 73 | DeepSeek V3.2-Exp (thinking) | 1,425 | +/-7 | 9,064 | DeepSeek-AI | MIT |
| 74 | DeepSeek V3.2 | 1,424 | +/-4 | 46,204 | DeepSeek-AI | MIT |
| 75 | Claude Opus 4 (thinking-16k) | 1,424 | +/-4 | 36,900 | Anthropic | Proprietary |
| 76 | qwen3-max-2025-09-23 | 1,424 | +/-6 | 9,158 | Alibaba | Proprietary |
| 77 | DeepSeek V3.2-Exp | 1,423 | +/-6 | 11,941 | DeepSeek-AI | MIT |
| 78 | Qwen3-235B-A22B-2507 | 1,423 | +/-3 | 95,473 | Alibaba | Apache 2.0 |
| 79 | DeepSeek-R1-0528 | 1,422 | +/-6 | 18,467 | DeepSeek-AI | MIT |
| 80 | DeepSeek V3.2 (thinking) | 1,422 | +/-4 | 40,111 | DeepSeek-AI | MIT |
| 81 | | 1,421 | +/-8 | 6,820 | xAI | Proprietary |
| 82 | ERNIE 5.0 | 1,420 | +/-9 | 4,708 | Baidu | Proprietary |
| 83 | kimi-k2-0905-preview | 1,418 | +/-6 | 11,795 | Moonshot | Modified MIT |
| 84 | DeepSeek-V3.1 | 1,418 | +/-6 | 14,969 | DeepSeek-AI | MIT |
| 85 | deepseek-v3.1-terminus-thinking | 1,418 | +/-10 | 3,468 | DeepSeek | MIT |
| 86 | Kimi K2 | 1,417 | +/-5 | 27,643 | Moonshot AI | Modified MIT |
| 87 | qwen3.5-122b-a10b | 1,417 | +/-4 | 26,670 | Alibaba | Apache 2.0 |
| 88 | DeepSeek-V3.1 (thinking) | 1,417 | +/-7 | 11,746 | DeepSeek-AI | MIT |
| 89 | DeepSeek-V3.1 Terminus | 1,416 | +/-10 | 3,705 | DeepSeek-AI | MIT |
| 90 | amazon-nova-experimental-chat-26-01-10 | 1,416 | +/-10 | 3,414 | Amazon | Proprietary |
| 91 | hunyuan-hy3-preview | 1,416 | +/-8 | 5,812 | Tencent | tencent-hunyuan-community |
| 92 | Qwen3-VL-235B-A22B-Instruct | 1,415 | +/-6 | 11,515 | Alibaba | Apache 2.0 |
| 93 | Mistral Large 3 | 1,415 | +/-4 | 42,553 | MistralAI | Apache 2.0 |
| 94 | mimo-v2-omni | 1,414 | +/-11 | 2,968 | Xiaomi | Proprietary |
| 95 | gpt-4.1-2025-04-14 | 1,413 | +/-4 | 50,997 | OpenAI | Proprietary |
| 96 | | 1,413 | +/-5 | 23,278 | MiniMaxAI | Modified MIT |
| 97 | Claude Opus 4 | 1,412 | +/-4 | 44,223 | Anthropic | Proprietary |
| 98 | | 1,412 | +/-4 | 32,909 | xAI | Proprietary |
| 99 | GLM-4.5 | 1,411 | +/-5 | 24,322 | Zhipu AI | MIT |
| 100 | Gemini 2.5 Flash | 1,411 | +/-3 | 122,458 | Google DeepMind | Proprietary |
| 101 | claude-haiku-4-5-20251001 | 1,411 | +/-3 | 78,134 | Anthropic | Proprietary |
| 102 | Magistral-Medium-2506 | 1,410 | +/-3 | 92,031 | MistralAI | Proprietary |
| 103 | | 1,410 | +/-4 | 41,413 | xAI | Proprietary |
| 104 | qwen3.5-27b | 1,408 | +/-5 | 25,772 | Alibaba | Apache 2.0 |
| 105 | gemini-2.5-flash-preview-09-2025 | 1,405 | +/-4 | 32,925 | Google | Proprietary |
| 106 | | 1,404 | +/-5 | 18,729 | xAI | Proprietary |
| 107 | qwen3-235b-a22b-no-thinking | 1,403 | +/-5 | 38,226 | Alibaba | Apache 2.0 |
| 108 | gpt-5.4-nano-high | 1,403 | +/-5 | 25,617 | OpenAI | Proprietary |
| 109 | qwen3-next-80b-a3b-instruct | 1,402 | +/-5 | 22,881 | Alibaba | Apache 2.0 |
| 110 | o1-2024-12-17 | 1,402 | +/-4 | 27,807 | OpenAI | Proprietary |
| 111 | longcat-flash-chat | 1,401 | +/-6 | 11,405 | Meituan | MIT |
| 112 | qwen3-235b-a22b-thinking-2507 | 1,400 | +/-7 | 8,993 | Alibaba | Apache 2.0 |
| 113 | Claude Sonnet 4 (thinking-32k) | 1,399 | +/-4 | 35,127 | Anthropic | Proprietary |
| 114 | DeepSeek-R1 | 1,398 | +/-5 | 18,524 | DeepSeek-AI | MIT |
| 115 | qwen3.5-35b-a3b | 1,396 | +/-4 | 27,304 | Alibaba | Apache 2.0 |
| 116 | qwen3.5-flash | 1,396 | +/-4 | 29,647 | Alibaba | Proprietary |
| 117 | qwen3-vl-235b-a22b-thinking | 1,396 | +/-7 | 7,947 | Alibaba | Apache 2.0 |
| 118 | hunyuan-vision-1.5-thinking | 1,396 | +/-12 | 2,220 | Tencent | Proprietary |
| 119 | DeepSeek-V3-0324 | 1,395 | +/-4 | 45,518 | DeepSeek-AI | MIT |
| 120 | amazon-nova-experimental-chat-12-10 | 1,395 | +/-10 | 3,681 | Amazon | Proprietary |
| 121 | Step 3.5 Flash | 1,394 | +/-4 | 34,466 | StepFunAI | Apache 2.0 |
| 122 | mimo-v2-flash (non-thinking) | 1,393 | +/-4 | 44,619 | Xiaomi | MIT |
| 123 | | 1,391 | +/-4 | 36,265 | MiniMaxAI | Modified MIT |
| 124 | gpt-5-mini-high | 1,390 | +/-5 | 27,039 | OpenAI | Proprietary |
| 125 | o4-mini-2025-04-16 | 1,390 | +/-4 | 45,452 | OpenAI | Proprietary |
| 126 | Claude Sonnet 4 | 1,389 | +/-4 | 40,323 | Anthropic | Proprietary |
| 127 | o1-preview | 1,388 | +/-5 | 31,122 | OpenAI | Proprietary |
| 128 | qwen3-coder-480b-a35b-instruct | 1,388 | +/-5 | 25,741 | Alibaba | Apache 2.0 |
| 129 | hunyuan-t1-20250711 | 1,387 | +/-9 | 4,711 | Tencent | Proprietary |
| 130 | mimo-v2-flash (thinking) | 1,387 | +/-6 | 10,974 | Xiaomi | MIT |
| 131 | Claude Sonnet 3.7 (thinking-32k) | 1,387 | +/-4 | 38,827 | Anthropic | Proprietary |
| 132 | mistral-medium-2505 | 1,387 | +/-5 | 33,230 | Mistral | Proprietary |
| 133 | minimax-m2.1-preview | 1,385 | +/-5 | 17,138 | MiniMax | MIT |
| 134 | qwen3-30b-a3b-instruct-2507 | 1,384 | +/-5 | 23,746 | Alibaba | Apache 2.0 |
| 135 | gpt-4.1-mini-2025-04-14 | 1,382 | +/-4 | 39,339 | OpenAI | Proprietary |
| 136 | hunyuan-turbos-20250416 | 1,382 | +/-6 | 10,725 | Tencent | Proprietary |
| 137 | gemini-2.5-flash-lite-preview-09-2025-no-thinking | 1,380 | +/-3 | 47,246 | Google | Proprietary |
| 138 | trinity-large-preview | 1,378 | +/-4 | 28,284 | Arcee AI | Apache 2.0 |
| 139 | GLM-4.6V | 1,378 | +/-11 | 2,808 | Zhipu AI | MIT |
| 140 | qwen3-235b-a22b | 1,375 | +/-5 | 26,268 | Alibaba | Apache 2.0 |
| 141 | gemini-2.5-flash-lite-preview-06-17-thinking | 1,375 | +/-5 | 32,907 | Google | Proprietary |
| 142 | qwen2.5-max | 1,374 | +/-4 | 32,623 | Alibaba | Proprietary |
| 143 | glm-4.5-air | 1,373 | +/-4 | 31,095 | Z.ai | MIT |
| 144 | claude-3-5-sonnet-20241022 | 1,372 | +/-3 | 88,350 | Anthropic | Proprietary |
| 145 | trinity-large-thinking | 1,371 | +/-5 | 23,918 | Arcee AI | Apache 2.0 |
| 146 | Claude Sonnet 3.7 | 1,371 | +/-4 | 43,194 | Anthropic | Proprietary |
| 147 | qwen3-next-80b-a3b-thinking | 1,370 | +/-6 | 13,700 | Alibaba | Apache 2.0 |
| 148 | glm-4.7-flash | 1,368 | +/-6 | 11,736 | Z.ai | MIT |
| 149 | amazon-nova-experimental-chat-11-10 | 1,367 | +/-4 | 25,407 | Amazon | Proprietary |
| 150 | gemma-3-27b-it | 1,366 | +/-4 | 47,545 | Google | Gemma |
| 151 | minimax-m1 | 1,364 | +/-4 | 35,214 | MiniMax | Apache 2.0 |
| 152 | o3-mini-high | 1,363 | +/-5 | 18,589 | OpenAI | Proprietary |
| 153 | | 1,362 | +/-5 | 16,968 | xAI | Proprietary |
| 154 | nvidia-nemotron-3-super-120b-a12b | 1,361 | +/-7 | 7,458 | Nvidia | NVIDIA Open Model |
| 155 | gemini-2.0-flash-001 | 1,360 | +/-4 | 43,762 | Google | Proprietary |
| 156 | deepseek-v3 | 1,358 | +/-5 | 21,770 | DeepSeek | DeepSeek |
| 157 | mistral-small-2506 | 1,357 | +/-5 | 17,712 | Mistral | Apache 2.0 |
| 158 | | 1,357 | +/-5 | 22,724 | xAI | Proprietary |
| 159 | intellect-3 | 1,356 | +/-8 | 5,329 | Prime Intellect | MIT |
| 160 | command-a-03-2025 | 1,354 | +/-3 | 56,283 | Cohere | CC-BY-NC-4.0 |
| 161 | glm-4.5v | 1,353 | +/-8 | 4,958 | Z.ai | MIT |
| 162 | gemini-2.0-flash-lite-preview-02-05 | 1,353 | +/-4 | 24,955 | Google | Proprietary |
| 163 | gpt-oss-120b | 1,353 | +/-4 | 30,639 | OpenAI | Apache 2.0 |
| 164 | gemini-1.5-pro-002 | 1,351 | +/-3 | 55,606 | Google | Proprietary |
| 165 | amazon-nova-experimental-chat-10-20 | 1,350 | +/-6 | 11,474 | Amazon | Proprietary |
| 166 | hunyuan-turbos-20250226 | 1,349 | +/-12 | 2,220 | Tencent | Proprietary |
| 167 | step-3 | 1,348 | +/-7 | 6,545 | StepFun | Apache 2.0 |
| 168 | amazon-nova-experimental-chat-10-09 | 1,348 | +/-11 | 2,839 | Amazon | Proprietary |
| 169 | o3-mini | 1,348 | +/-4 | 57,344 | OpenAI | Proprietary |
| 170 | llama-3.1-nemotron-ultra-253b-v1 | 1,347 | +/-12 | 2,549 | Nvidia | Nvidia Open Model |
| 171 | qwen3-32b | 1,347 | +/-9 | 3,926 | Alibaba | Apache 2.0 |
| 172 | mercury-2 | 1,347 | +/-11 | 3,123 | Inception AI | Proprietary |
| 173 | ling-flash-2.0 | 1,346 | +/-7 | 7,010 | InclusionAI | MIT |
| 174 | minimax-m2 | 1,346 | +/-8 | 6,875 | MiniMax | Apache 2.0 |
| 175 | qwen-plus-0125 | 1,346 | +/-8 | 5,819 | Alibaba | Proprietary |
| 176 | gpt-4o-2024-05-13 | 1,346 | +/-3 | 112,881 | OpenAI | Proprietary |
| 177 | nvidia-llama-3.3-nemotron-super-49b-v1.5 | 1,343 | +/-10 | 3,345 | Nvidia | Nvidia Open |
| 178 | glm-4-plus-0111 | 1,343 | +/-8 | 5,760 | Zhipu | Proprietary |
| 179 | claude-3-5-sonnet-20240620 | 1,342 | +/-3 | 82,419 | Anthropic | Proprietary |
| 180 | gemma-3-12b-it | 1,342 | +/-10 | 3,829 | Google | Gemma |
| 181 | hunyuan-turbo-0110 | 1,341 | +/-12 | 2,290 | Tencent | Proprietary |
| 182 | nova-2-lite | 1,338 | +/-6 | 12,246 | Amazon | Proprietary |
| 183 | gpt-5-nano-high | 1,337 | +/-7 | 8,270 | OpenAI | Proprietary |
| 184 | o1-mini | 1,337 | +/-4 | 51,981 | OpenAI | Proprietary |
| 185 | qwq-32b | 1,336 | +/-4 | 25,402 | Alibaba | Apache 2.0 |
| 186 | | 1,335 | +/-4 | 63,498 | xAI | Proprietary |
| 187 | gemini-advanced-0514 | 1,335 | +/-5 | 50,148 | Google | Proprietary |
| 188 | gpt-4o-2024-08-06 | 1,335 | +/-4 | 45,499 | OpenAI | Proprietary |
| 189 | llama-3.1-405b-instruct-bf16 | 1,335 | +/-4 | 41,375 | Meta | Llama 3.1 Community |
| 190 | step-2-16k-exp-202412 | 1,334 | +/-9 | 4,833 | StepFun | Proprietary |
| 191 | llama-3.1-405b-instruct-fp8 | 1,333 | +/-4 | 59,656 | Meta | Llama 3.1 Community |
| 192 | olmo-3.1-32b-instruct | 1,330 | +/-6 | 12,225 | Ai2 | Apache 2.0 |
| 193 | yi-lightning | 1,328 | +/-5 | 27,332 | 01 AI | Proprietary |
| 194 | molmo-2-8b | 1,328 | +/-21 | 805 | Ai2 | Apache 2.0 |
| 195 | llama-3.3-nemotron-49b-super-v1 | 1,328 | +/-12 | 2,218 | Nvidia | Nvidia |
| 196 | qwen3-30b-a3b | 1,327 | +/-5 | 26,495 | Alibaba | Apache 2.0 |
| 197 | llama-4-maverick-17b-128e-instruct | 1,327 | +/-4 | 39,987 | Meta | Llama 4 |
| 198 | hunyuan-large-2025-02-10 | 1,326 | +/-10 | 3,738 | Tencent | Proprietary |
| 199 | gpt-4-turbo-2024-04-09 | 1,324 | +/-4 | 98,114 | OpenAI | Proprietary |
| 200 | deepseek-v2.5-1210 | 1,323 | +/-8 | 6,795 | DeepSeek | DeepSeek |
| 201 | claude-3-5-haiku-20241022 | 1,323 | +/-3 | 69,993 | Anthropic | Proprietary |
| 202 | gemini-1.5-pro-001 | 1,323 | +/-4 | 79,138 | Google | Proprietary |
| 203 | llama-4-scout-17b-16e-instruct | 1,323 | +/-5 | 30,299 | Meta | Llama |
| 204 | gpt-4.1-nano-2025-04-14 | 1,322 | +/-8 | 6,103 | OpenAI | Proprietary |
| 205 | Claude3-Opus | 1,321 | +/-3 | 194,909 | Anthropic | Proprietary |
| 206 | ring-flash-2.0 | 1,321 | +/-7 | 7,148 | InclusionAI | MIT |
| 207 | step-1o-turbo-202506 | 1,320 | +/-7 | 9,038 | StepFun | Proprietary |
| 208 | glm-4-plus | 1,319 | +/-5 | 26,126 | Zhipu AI | Proprietary |
| 209 | llama-3.3-70b-instruct | 1,318 | +/-3 | 54,745 | Meta | Llama-3.3 |
| 210 | gemma-3n-e4b-it | 1,318 | +/-5 | 22,600 | Google | Gemma |
| 211 | qwen-max-0919 | 1,318 | +/-6 | 16,478 | Alibaba | Qwen |
| 212 | gpt-oss-20b | 1,318 | +/-6 | 10,633 | OpenAI | Apache 2.0 |
| 213 | gpt-4o-mini-2024-07-18 | 1,318 | +/-4 | 68,709 | OpenAI | Proprietary |
| 214 | nvidia-nemotron-3-nano-30b-a3b-bf16 | 1,317 | +/-6 | 15,513 | Nvidia | NVIDIA Open Model |
| 215 | qwen2.5-plus-1127 | 1,315 | +/-6 | 10,187 | Alibaba | Proprietary |
| 216 | athene-v2-chat | 1,314 | +/-5 | 24,739 | NexusFlow | NexusFlow |
| 217 | mistral-large-2407 | 1,314 | +/-4 | 45,459 | Mistral | Mistral Research |
| 218 | gpt-4-0125-preview | 1,313 | +/-4 | 93,439 | OpenAI | Proprietary |
| 219 | granite-4.1-8b | 1,312 | +/-10 | 3,614 | IBM | Apache 2.0 |
| 220 | gpt-4-1106-preview | 1,312 | +/-4 | 100,105 | OpenAI | Proprietary |
| 221 | hunyuan-standard-2025-02-10 | 1,311 | +/-10 | 3,904 | Tencent | Proprietary |
| 222 | gemini-1.5-flash-002 | 1,309 | +/-4 | 34,902 | Google | Proprietary |
| 223 | | 1,308 | +/-4 | 52,567 | xAI | Proprietary |
| 224 | deepseek-v2.5 | 1,307 | +/-5 | 24,572 | DeepSeek | DeepSeek |
| 225 | mercury | 1,306 | +/-14 | 1,957 | Inception AI | Proprietary |
| 226 | athene-70b-0725 | 1,306 | +/-6 | 19,621 | NexusFlow | CC-BY-NC-4.0 |
| 227 | olmo-3-32b-think | 1,305 | +/-8 | 5,947 | Ai2 | Apache 2.0 |
| 228 | mistral-large-2411 | 1,305 | +/-4 | 28,073 | Mistral | MRL |
| 229 | magistral-medium-2506 | 1,304 | +/-6 | 11,641 | Mistral | Proprietary |
| 230 | mistral-small-3.1-24b-instruct-2503 | 1,303 | +/-5 | 33,220 | Mistral | Apache 2.0 |
| 231 | gemma-3-4b-it | 1,303 | +/-9 | 4,171 | Google | Gemma |
| 232 | qwen2.5-72b-instruct | 1,303 | +/-4 | 39,406 | Alibaba | Qwen |
| 233 | llama-3.1-nemotron-70b-instruct | 1,299 | +/-8 | 7,140 | Nvidia | Llama 3.1 |
| 234 | hunyuan-large-vision | 1,294 | +/-9 | 5,374 | Tencent | Proprietary |
| 235 | llama-3.1-70b-instruct | 1,293 | +/-4 | 55,240 | Meta | Llama 3.1 Community |
| 236 | amazon-nova-pro-v1.0 | 1,290 | +/-5 | 24,745 | Amazon | Proprietary |
| 237 | jamba-1.5-large | 1,289 | +/-7 | 8,662 | AI21 Labs | Jamba Open |
| 238 | gemma-2-27b-it | 1,288 | +/-3 | 75,754 | Google | Gemma license |
| 239 | reka-core-20240904 | 1,288 | +/-7 | 7,312 | Reka AI | Proprietary |
| 240 | ibm-granite-h-small | 1,287 | +/-8 | 5,677 | IBM | Apache 2.0 |
| 241 | gpt-4-0314 | 1,286 | +/-5 | 54,173 | OpenAI | Proprietary |
| 242 | llama-3.1-tulu-3-70b | 1,286 | +/-10 | 2,846 | Ai2 | Llama 3.1 |
| 243 | gemini-1.5-flash-001 | 1,286 | +/-5 | 62,833 | Google | Proprietary |
| 244 | llama-3.1-nemotron-51b-instruct | 1,286 | +/-10 | 3,749 | Nvidia | Llama 3.1 |
| 245 | olmo-3.1-32b-think | 1,285 | +/-7 | 8,505 | Ai2 | Apache 2.0 |
| 246 | claude-3-sonnet-20240229 | 1,280 | +/-4 | 109,284 | Anthropic | Proprietary |
| 247 | gemma-2-9b-it-simpo | 1,279 | +/-7 | 10,072 | Princeton | MIT |
| 248 | nemotron-4-340b-instruct | 1,276 | +/-5 | 19,659 | Nvidia | NVIDIA Open Model |
| 249 | command-r-plus-08-2024 | 1,276 | +/-7 | 9,866 | Cohere | CC-BY-NC-4.0 |
| 250 | llama-3-70b-instruct | 1,276 | +/-4 | 156,876 | Meta | Llama 3 Community |
| 251 | gpt-4-0613 | 1,274 | +/-4 | 88,723 | OpenAI | Proprietary |
| 252 | mistral-small-24b-instruct-2501 | 1,274 | +/-6 | 14,681 | Mistral | Apache 2.0 |
| 253 | glm-4-0520 | 1,273 | +/-7 | 9,788 | Z.ai | Proprietary |
| 254 | reka-flash-20240904 | 1,272 | +/-7 | 7,536 | Reka AI | Proprietary |
| 255 | qwen2.5-coder-32b-instruct | 1,270 | +/-8 | 5,432 | Alibaba | Apache 2.0 |
| 256 | c4ai-aya-expanse-32b | 1,267 | +/-5 | 27,124 | Cohere | CC-BY-NC-4.0 |
| 257 | gemma-2-9b-it | 1,266 | +/-4 | 54,611 | Google | Gemma license |
| 258 | deepseek-coder-v2 | 1,264 | +/-6 | 15,147 | DeepSeek | DeepSeek License |
| 259 | command-r-plus | 1,261 | +/-4 | 77,554 | Cohere | CC-BY-NC-4.0 |
| 260 | qwen2-72b-instruct | 1,261 | +/-5 | 37,325 | Alibaba | Qianwen LICENSE |
| 261 | claude-3-haiku-20240307 | 1,260 | +/-4 | 117,701 | Anthropic | Proprietary |
| 262 | amazon-nova-lite-v1.0 | 1,260 | +/-5 | 19,372 | Amazon | Proprietary |
| 263 | gemini-1.5-flash-8b-001 | 1,258 | +/-4 | 35,558 | Google | Proprietary |
| 264 | Phi 4 - 14B | 1,256 | +/-5 | 24,126 | Microsoft Azure | MIT |
| 265 | olmo-2-0325-32b-instruct | 1,251 | +/-11 | 3,334 | Ai2 | Apache-2.0 |
| 266 | command-r-08-2024 | 1,249 | +/-7 | 10,140 | Cohere | CC-BY-NC-4.0 |
| 267 | mistral-large-2402 | 1,241 | +/-5 | 62,436 | Mistral | Proprietary |
| 268 | amazon-nova-micro-v1.0 | 1,241 | +/-5 | 19,364 | Amazon | Proprietary |
| 269 | jamba-1.5-mini | 1,239 | +/-7 | 8,858 | AI21 Labs | Jamba Open |
| 270 | ministral-8b-2410 | 1,237 | +/-9 | 4,781 | Mistral | MRL |
| 271 | gemini-pro-dev-api | 1,235 | +/-7 | 18,354 | Google | Proprietary |
| 272 | qwen1.5-110b-chat | 1,233 | +/-6 | 26,195 | Alibaba | Qianwen LICENSE |
| 273 | hunyuan-standard-256k | 1,233 | +/-12 | 2,728 | Tencent | Proprietary |
| 274 | reka-flash-21b-20240226-online | 1,233 | +/-7 | 15,450 | Reka AI | Proprietary |
| 275 | qwen1.5-72b-chat | 1,232 | +/-5 | 39,302 | Alibaba | Qianwen LICENSE |
| 276 | mixtral-8x22b-instruct-v0.1 | 1,229 | +/-5 | 51,416 | Mistral | Apache 2.0 |
| 277 | command-r | 1,226 | +/-5 | 54,036 | Cohere | CC-BY-NC-4.0 |
| 278 | reka-flash-21b-20240226 | 1,226 | +/-6 | 24,806 | Reka AI | Proprietary |
| 279 | gpt-3.5-turbo-0125 | 1,224 | +/-5 | 66,207 | OpenAI | Proprietary |
| 280 | llama-3-8b-instruct | 1,223 | +/-4 | 104,642 | Meta | Llama 3 Community |
| 281 | c4ai-aya-expanse-8b | 1,223 | +/-7 | 9,818 | Cohere | CC-BY-NC-4.0 |
| 282 | mistral-medium | 1,222 | +/-6 | 34,550 | Mistral | Proprietary |
| 283 | gemini-pro | 1,222 | +/-12 | 6,390 | Google | Proprietary |
| 284 | llama-3.1-tulu-3-8b | 1,221 | +/-11 | 2,896 | Ai2 | Llama 3.1 |
| 285 | yi-1.5-34b-chat | 1,213 | +/-5 | 24,146 | 01 AI | Apache-2.0 |
| 286 | zephyr-orpo-141b-A35b-v0.1 | 1,212 | +/-11 | 4,652 | HuggingFace | Apache 2.0 |
| 287 | llama-3.1-8b-instruct | 1,211 | +/-4 | 49,605 | Meta | Llama 3.1 Community |
| 288 | granite-3.1-8b-instruct | 1,208 | +/-11 | 3,090 | IBM | Apache 2.0 |
| 289 | qwen1.5-32b-chat | 1,203 | +/-6 | 21,741 | Alibaba | Qianwen LICENSE |
| 290 | gpt-3.5-turbo-1106 | 1,202 | +/-9 | 16,619 | OpenAI | Proprietary |
| 291 | gemma-2-2b-it | 1,199 | +/-4 | 46,616 | Google | Gemma license |
| 292 | phi-3-medium-4k-instruct | 1,197 | +/-5 | 25,055 | Microsoft | MIT |
| 293 | mixtral-8x7b-instruct-v0.1 | 1,196 | +/-4 | 73,503 | Mistral | Apache 2.0 |
| 294 | dbrx-instruct-preview | 1,194 | +/-6 | 32,191 | Databricks | DBRX LICENSE |
| 295 | internlm2_5-20b-chat | 1,191 | +/-7 | 9,901 | InternLM | Other |
| 296 | qwen1.5-14b-chat | 1,190 | +/-7 | 17,839 | Alibaba | Qianwen LICENSE |
| 297 | wizardlm-70b | 1,184 | +/-9 | 8,214 | Microsoft | Llama 2 Community |
| 298 | deepseek-llm-67b-chat | 1,184 | +/-12 | 4,932 | DeepSeek | DeepSeek License |
| 299 | yi-34b-chat | 1,183 | +/-7 | 15,483 | 01 AI | Yi License |
| 300 | granite-3.0-8b-instruct | 1,181 | +/-9 | 6,638 | IBM | Apache 2.0 |
| 301 | openchat-3.5 | 1,181 | +/-10 | 7,968 | OpenChat | Apache-2.0 |
| 302 | openchat-3.5-0106 | 1,181 | +/-8 | 12,637 | OpenChat | Apache-2.0 |
| 303 | gemma-1.1-7b-it | 1,181 | +/-6 | 23,893 | Google | Gemma license |
| 304 | snowflake-arctic-instruct | 1,179 | +/-6 | 32,832 | Snowflake | Apache 2.0 |
| 305 | granite-3.1-2b-instruct | 1,178 | +/-11 | 3,188 | IBM | Apache 2.0 |
| 306 | tulu-2-dpo-70b | 1,177 | +/-10 | 6,535 | AllenAI/UW | AI2 ImpACT Low-risk |
| 307 | openhermes-2.5-mistral-7b | 1,174 | +/-10 | 5,006 | NousResearch | Apache-2.0 |
| 308 | vicuna-33b | 1,172 | +/-6 | 22,479 | LMSYS | Non-commercial |
| 309 | starling-lm-7b-beta | 1,171 | +/-7 | 16,056 | Nexusflow | Apache-2.0 |
| 310 | phi-3-small-8k-instruct | 1,170 | +/-6 | 17,766 | Microsoft | MIT |
| 311 | llama-2-70b-chat | 1,170 | +/-6 | 38,492 | Meta | Llama 2 Community |
| 312 | starling-lm-7b-alpha | 1,167 | +/-8 | 10,224 | UC Berkeley | CC-BY-NC-4.0 |
| 313 | llama-3.2-3b-instruct | 1,166 | +/-8 | 7,936 | Meta | Llama 3.2 |
| 314 | nous-hermes-2-mixtral-8x7b-dpo | 1,164 | +/-12 | 3,777 | NousResearch | Apache-2.0 |
| 315 | qwq-32b-preview | 1,155 | +/-11 | 3,231 | Alibaba | Apache 2.0 |
| 316 | granite-3.0-2b-instruct | 1,155 | +/-8 | 6,837 | IBM | Apache 2.0 |
| 317 | llama2-70b-steerlm-chat | 1,154 | +/-13 | 3,585 | Nvidia | Llama 2 Community |
| 318 | solar-10.7b-instruct-v1.0 | 1,151 | +/-13 | 4,155 | Upstage AI | CC-BY-NC-4.0 |
| 319 | dolphin-2.2.1-mistral-7b | 1,151 | +/-15 | 1,679 | Cognitive Computations | Apache-2.0 |
| 320 | mpt-30b-chat | 1,149 | +/-12 | 2,572 | MosaicML | CC-BY-NC-SA-4.0 |
| 321 | mistral-7b-instruct-v0.2 | 1,149 | +/-7 | 19,402 | Mistral | Apache-2.0 |
| 322 | wizardlm-13b | 1,148 | +/-9 | 7,044 | Microsoft | Llama 2 Community |
| 323 | falcon-180b-chat | 1,146 | +/-17 | 1,295 | TII | Falcon-180B TII License |
| 324 | qwen1.5-7b-chat | 1,143 | +/-10 | 4,737 | Alibaba | Qianwen LICENSE |
| 325 | phi-3-mini-4k-instruct-june-2024 | 1,142 | +/-6 | 12,297 | Microsoft | MIT |
| 326 | llama-2-13b-chat | 1,141 | +/-7 | 19,174 | Meta | Llama 2 Community |
| 327 | vicuna-13b | 1,140 | +/-7 | 19,367 | LMSYS | Llama 2 Community |
| 328 | qwen-14b-chat | 1,138 | +/-11 | 4,964 | Alibaba | Qianwen LICENSE |
| 329 | palm-2 | 1,137 | +/-9 | 8,554 | Google | Proprietary |
| 330 | gemma-7b-it | 1,136 | +/-9 | 8,925 | Google | Gemma license |
| 331 | codellama-34b-instruct | 1,136 | +/-9 | 7,366 | Meta | Llama 2 Community |
| 332 | zephyr-7b-beta | 1,130 | +/-9 | 11,118 | HuggingFace | MIT |
| 333 | phi-3-mini-128k-instruct | 1,128 | +/-7 | 20,685 | Microsoft | MIT |
| 334 | phi-3-mini-4k-instruct | 1,127 | +/-6 | 20,118 | Microsoft | MIT |
| 335 | guanaco-33b | 1,126 | +/-12 | 2,921 | UW | Non-commercial |
| 336 | zephyr-7b-alpha | 1,126 | +/-16 | 1,785 | HuggingFace | MIT |
| 337 | stripedhyena-nous-7b | 1,120 | +/-11 | 5,182 | Together AI | Apache 2.0 |
| 338 | codellama-70b-instruct | 1,118 | +/-18 | 1,143 | Meta | Llama 2 Community |
| 339 | gemma-1.1-2b-it | 1,115 | +/-8 | 10,854 | Google | Gemma license |
| 340 | vicuna-7b | 1,114 | +/-9 | 6,923 | LMSYS | Llama 2 Community |
| 341 | smollm2-1.7b-instruct | 1,114 | +/-14 | 2,199 | HuggingFace | Apache 2.0 |
| 342 | llama-3.2-1b-instruct | 1,110 | +/-8 | 8,045 | Meta | Llama 3.2 |
| 343 | mistral-7b-instruct | 1,109 | +/-9 | 8,977 | Mistral | Apache 2.0 |
| 344 | llama-2-7b-chat | 1,107 | +/-7 | 14,148 | Meta | Llama 2 Community |
| 345 | gemma-2b-it | 1,092 | +/-11 | 4,780 | Google | Gemma license |
| 346 | qwen1.5-4b-chat | 1,089 | +/-9 | 7,597 | Alibaba | Qianwen LICENSE |
| 347 | olmo-7b-instruct | 1,073 | +/-11 | 6,328 | Ai2 | Apache-2.0 |
| 348 | koala-13b | 1,070 | +/-10 | 6,965 | UC Berkeley | Non-commercial |
| 349 | alpaca-13b | 1,067 | +/-11 | 5,745 | Stanford | Non-commercial |
| 350 | gpt4all-13b-snoozy | 1,065 | +/-15 | 1,743 | Nomic AI | Non-commercial |
| 351 | mpt-7b-chat | 1,061 | +/-12 | 3,924 | MosaicML | CC-BY-NC-SA-4.0 |
| 352 | chatglm3-6b | 1,055 | +/-12 | 4,658 | Tsinghua | Apache-2.0 |
| 353 | RWKV-4-Raven-14B | 1,041 | +/-11 | 4,845 | RWKV | Apache 2.0 |
| 354 | chatglm2-6b | 1,023 | +/-14 | 2,658 | Tsinghua | Apache-2.0 |
| 355 | oasst-pythia-12b | 1,021 | +/-11 | 6,310 | OpenAssistant | Apache 2.0 |
| 356 | chatglm-6b | 995 | +/-13 | 4,914 | Tsinghua | Non-commercial |
| 357 | fastchat-t5-3b | 991 | +/-12 | 4,203 | LMSYS | Apache 2.0 |
| 358 | dolly-v2-12b | 980 | +/-14 | 3,412 | Databricks | MIT |
| 359 | llama-13b | 972 | +/-16 | 2,391 | Meta | Non-commercial |
| 360 | stablelm-tuned-alpha-7b | 952 | +/-13 | 3,287 | Stability AI | CC-BY-NC-SA-4.0 |
The data is for reference only; official sources take precedence. Links next to model names lead to the DataLearner model detail pages.
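Frequently Asked Questions (FAQ)
What is Text Generation Arena (LMArena)?
Text Generation Arena (formerly LMSYS Chatbot Arena) is currently one of the most influential anonymous evaluation platforms for large models. Users put a question to two models whose identities are hidden and vote on answer quality. The system uses an Elo algorithm to aggregate millions of votes into a dynamic leaderboard that is widely cited in academia and industry.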
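How is the Arena Elo score calculated?
The Elo algorithm originates from the chess rating system. After each battle the winner's score rises and the loser's falls, by an amount that depends on the gap between their prior ratings. The 95% confidence interval (CI) expresses the uncertainty in a model's score, which largely depends on how many battles the model has taken part in: a narrower CI means more data and a more reliable ranking.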
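The update rule and the role of the confidence interval can be sketched as follows. This is a minimal illustration of the classic online Elo update, not the Arena's actual fitting procedure; the K-factor of 4, the scale of 400, and all function names are assumptions made for the example. The overlap check is only a rough heuristic for reading the "95% CI" column.

```python
def expected_score(r_a: float, r_b: float, scale: float = 400.0) -> float:
    """Expected result for A against B on the Elo logistic curve (scale assumed)."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / scale))


def elo_update(r_winner: float, r_loser: float, k: float = 4.0) -> tuple[float, float]:
    """Return new (winner, loser) ratings after one battle.

    The transfer is larger when the win was less expected, i.e. when the
    winner started with the lower rating. k is a hypothetical step size.
    """
    gain = k * (1.0 - expected_score(r_winner, r_loser))
    return r_winner + gain, r_loser - gain


def intervals_overlap(score_a: float, ci_a: float, score_b: float, ci_b: float) -> bool:
    """True if [score - ci, score + ci] for A and B share at least one point."""
    return abs(score_a - score_b) <= ci_a + ci_b


# An upset (lower-rated model wins) moves ratings more than an expected win.
upset_gain = elo_update(1400, 1500)[0] - 1400
expected_gain = elo_update(1500, 1400)[0] - 1500
print(upset_gain > expected_gain)

# Rank 1 (1,502 +/-4) versus rank 2 (1,500 +/-5), then versus rank 5 (1,489 +/-6).
print(intervals_overlap(1502, 4, 1500, 5))
print(intervals_overlap(1502, 4, 1489, 6))
```

When two intervals overlap, as for the top two rows of the table, the vote data does not clearly separate the models, so their relative order should be treated as provisional.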
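Why does the same model appear in both a "Thinking" version and a regular version?
Some models support an "Extended Thinking" mode, in which the model performs deeper internal reasoning before giving its final answer. This mode usually scores higher on logical reasoning, math, and programming tasks, but it also has longer response latency and higher cost. The Arena evaluates the two modes separately so that users can choose according to their actual needs.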
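How do I use the leaderboard to choose a large language model that suits me?
Weigh several factors together: overall performance (the Elo score), cost (closed-source APIs are billed by usage, while open-source models can be self-hosted), Chinese-language support, degree of openness, and response speed.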
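One way to apply this advice is to filter the table by license and score before comparing candidates in detail. The sketch below uses five rows copied from the table above; the `Entry` type, the `shortlist` helper, the set of licenses treated as self-hostable, and the score threshold are all hypothetical choices made for illustration, and actual license terms should be checked against each model's official page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    model: str
    score: int
    license: str


# A handful of rows taken from the rankings table.
ROWS = [
    Entry("Claude Opus 4.6 (thinking)", 1502, "Proprietary"),
    Entry("GLM 5.1", 1474, "MIT"),
    Entry("mimo-v2.5-pro", 1465, "MIT"),
    Entry("kimi-k2.6", 1462, "Modified MIT"),
    Entry("Qwen3.5-397B-A17B", 1445, "Apache 2.0"),
]

# Assumption: these license labels are treated as permitting self-hosting.
SELF_HOSTABLE = {"MIT", "Modified MIT", "Apache 2.0"}


def shortlist(rows: list[Entry], licenses: set[str], min_score: int) -> list[Entry]:
    """Keep rows with an accepted license and a score at or above min_score, best first."""
    keep = [r for r in rows if r.license in licenses and r.score >= min_score]
    return sorted(keep, key=lambda r: r.score, reverse=True)


picks = shortlist(ROWS, SELF_HOSTABLE, min_score=1460)
for entry in picks:
    print(entry.model, entry.score, entry.license)
```

Cost, Chinese-language support, and latency are not columns in this table, so they would have to be added from other sources before they could be used as filters.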

















