

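LMArena Math Arena Mathematical Reasoning Leaderboard

The latest ranking of large AI models by mathematical reasoning ability, based on anonymous user votes in LMArena Math Arena. It lists each model's Elo score, 95% confidence interval, vote count, organization, and license.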

Top model: gemini-3.5-flash
Highest score: 1524.00
Number of models: 349
Data version: May 28, 2026
Data source: LM Arena

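About this leaderboard

This leaderboard shows how current large AI models rank on mathematical reasoning tasks. The data comes from LMArena's Math track, where real users vote in anonymous blind tests on how well each model solves math problems.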

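Evaluation method overview

Anonymous blind testing: after a user poses a math problem, two models with hidden identities each answer it, and the user votes for the better solution, which removes brand bias.

Elo scoring: Elo scores are computed with a Bradley-Terry model. A higher score means users chose that model's answers more often in math scenarios.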
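To make the scoring step concrete, here is a minimal sketch of fitting a Bradley-Terry model to pairwise votes and mapping the result onto an Elo-like scale. It is an illustration under stated assumptions, not LMArena's actual pipeline: the function names are hypothetical, ties are ignored, the vote counts are toy data, and the 400-point scale and 1000-point offset are conventional Elo choices that this page does not specify.

```python
import math
from collections import defaultdict


def fit_bradley_terry(wins, n_iter=500):
    """Fit Bradley-Terry strengths from pairwise win counts.

    wins maps (winner, loser) -> number of votes. Ties are ignored in
    this sketch, and every model must have at least one win.
    """
    models = sorted({m for pair in wins for m in pair})
    total_wins = defaultdict(float)
    games = defaultdict(float)  # unordered pair -> number of comparisons
    for (winner, loser), count in wins.items():
        total_wins[winner] += count
        games[frozenset((winner, loser))] += count

    strength = {m: 1.0 for m in models}
    for _ in range(n_iter):
        updated = {}
        for i in models:
            # Classic minorization-maximization update for Bradley-Terry.
            denom = sum(
                games[frozenset((i, j))] / (strength[i] + strength[j])
                for j in models
                if j != i and frozenset((i, j)) in games
            )
            updated[i] = total_wins[i] / denom
        # Strengths are only defined up to a common factor, so pin their
        # geometric mean at 1 after every sweep.
        log_mean = sum(math.log(v) for v in updated.values()) / len(updated)
        strength = {m: v / math.exp(log_mean) for m, v in updated.items()}
    return strength


def to_elo_scale(strength, scale=400.0, offset=1000.0):
    """Map strengths to Elo-like scores (scale and offset are assumptions)."""
    return {m: offset + scale * math.log10(s) for m, s in strength.items()}


# Toy votes: (winner, loser) -> count. Not real leaderboard data.
votes = {
    ("A", "B"): 7, ("B", "A"): 3,
    ("B", "C"): 6, ("C", "B"): 4,
    ("A", "C"): 8, ("C", "A"): 2,
}
ratings = to_elo_scale(fit_bradley_terry(votes))
for name, score in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(name, round(score, 1))
```

With only two models, the fitted strengths reproduce the observed win rate exactly, which is a quick sanity check on the update rule.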


Full ranking table

Rank | Model | Score | 95% CI | Votes | Organization | License
1 | gemini-3.5-flash | 1524.00 | +/-26 | 526 | Google | Proprietary

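Data is for reference only; the official source takes precedence. Links next to model names lead to the DataLearner model detail pages.

Frequently asked questions (FAQ)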

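01. What is LMArena Math Arena?

LMArena Math Arena is LMArena's anonymous evaluation platform focused on mathematical reasoning. Users submit real math problems (algebra, geometry, competition math, and so on). The system shows different models' solutions side by side with the model names hidden, users vote for the better solution, and the votes are aggregated with an Elo algorithm into a dynamic leaderboard.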

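02. How does Math Arena differ from static benchmarks such as MATH-500 and AIME?

Static benchmarks such as MATH-500, AIME, and AMC use a fixed problem set with automatic grading. They are highly reproducible but easy to optimize against specifically ("leaderboard gaming"). Math Arena draws on open-ended math questions from real users, so the test content is not fixed and better reflects how a model performs naturally in real math scenarios. The two approaches complement each other.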

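03. Do thinking models perform better in the math Arena?

Overall, models with chain-of-thought or extended reasoning tend to rank higher in the math Arena. The Thinking mode of the Claude Opus series, GPT's high-compute modes, and DeepSeek's thinking versions all sit near the top of the leaderboard, which suggests that longer reasoning time improves answer quality on math problems. The effect is not uniform, though: in the table, deepseek-v4-flash-thinking (1441) scores slightly below deepseek-v4-flash (1446), and gaps of this size are smaller than the listed confidence intervals.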
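A score gap is easier to read once it is converted into an expected head-to-head win rate. The sketch below assumes the conventional logistic Elo relationship with a 400-point scale, which this page does not state, and the function name is hypothetical. It uses the two Claude Opus 4.6 rows from the ranking table (1514 with thinking, 1506 without).

```python
def expected_win_rate(score_a, score_b, scale=400.0):
    """Probability that A is preferred over B under a logistic Elo model.

    The 400-point scale is an assumption (the usual Elo convention).
    """
    return 1.0 / (1.0 + 10.0 ** ((score_b - score_a) / scale))


# Scores taken from the ranking table on this page.
thinking = 1514.0  # Claude Opus 4.6 (thinking)
standard = 1506.0  # Claude Opus 4.6
p = expected_win_rate(thinking, standard)
print(f"Expected win rate of the thinking variant: {p:.3f}")
```

Under these assumptions an 8-point gap corresponds to a win rate of roughly 51%, so small differences near the top of the table say little about which model a given user would prefer on a given problem.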

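04. How do Chinese models perform on math ability?

Chinese models such as DeepSeek, the Qwen3 series, and GLM perform strongly in Math Arena and rank among the global leaders. DeepSeek is open-sourced under the MIT license, and series such as Qwen3-235B support Chinese-language math scenarios, which makes them an important reference when choosing an open-source math reasoning model.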

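Coverage of varied math scenarios: diverse real-world math tasks, including algebra, geometry, computational reasoning, and competition math.

On top of the raw data, DataLearner provides Chinese-language interpretation and in-depth analysis, and links leaderboard models to the DataLearner model library so that you can view model details, API pricing, evaluation scores, and other information in one click.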

2 | gpt-5.4-high | 1515.00 | +/-15 | 1,682 | OpenAI | Proprietary
3 | Claude Opus 4.6 (thinking) | 1514.00 | +/-14 | 1,981 | Anthropic | Proprietary
4 | Claude Opus 4.6 | 1506.00 | +/-13 | 2,230 | Anthropic | Proprietary
5 | Gemini 3.1 Pro Preview | 1500.00 | +/-12 | 2,601 | Google Deep Mind | Proprietary
6 | Opus 4.7 (thinking) | 1500.00 | +/-18 | 1,161 | Anthropic | Proprietary
7 | gpt-5.5-high | 1499.00 | +/-19 | 1,003 | OpenAI | Proprietary
8 | GPT-5.5 | 1497.00 | +/-19 | 1,000 | OpenAI | Proprietary
9 | qwen3.7-max-preview | 1495.00 | +/-40 | 220 | Alibaba | Proprietary
10 | Opus 4.7 | 1494.00 | +/-17 | 1,170 | Anthropic | Proprietary
11 | qwen3.6-max-preview | 1492.00 | +/-31 | 327 | Alibaba | Proprietary
12 | mimo-v2.5-pro | 1486.00 | +/-20 | 866 | Xiaomi | MIT
13 | GLM 5.1 | 1481.00 | +/-20 | 860 | Zhipu AI | MIT
14 | ernie-5.1 | 1480.00 | +/-20 | 836 | Baidu | Proprietary
15 | deepseek-v4-pro-thinking | 1479.00 | +/-20 | 886 | DeepSeek | MIT
16 | Gemini 3.0 Pro (Preview 11-2025) | 1478.00 | +/-11 | 2,653 | Google Deep Mind | Proprietary
17 | kimi-k2.6 | 1478.00 | +/-19 | 887 | Moonshot | Modified MIT
18 | Gemini 3.0 Flash | 1476.00 | +/-13 | 2,004 | Google Deep Mind | Proprietary
19 | qwen3.5-max-preview | 1472.00 | +/-16 | 1,272 | Alibaba | Proprietary
20 | grok-4.20-beta-0309-reasoning | 1472.00 | +/-15 | 1,734 | xAI | Proprietary
21 | Kimi K2 Thinking | 1472.00 | +/-12 | 2,259 | Moonshot AI | Modified MIT
22 | Claude Opus 4 (thinking-32k) | 1470.00 | +/-12 | 2,265 | Anthropic | Proprietary
23 | gemma-4-31b | 1470.00 | +/-28 | 398 | Google | Apache 2.0
24 | gemma-4-26b-a4b | 1468.00 | +/-28 | 369 | Google | Apache 2.0
25 | Claude Opus 4 | 1467.00 | +/-10 | 4,078 | Anthropic | Proprietary
26 | gpt-5.5-instant | 1466.00 | +/-16 | 1,409 | OpenAI | Proprietary
27 | Muse Spark | 1463.00 | +/-20 | 795 | Facebook AI Research Lab | Proprietary
28 | GPT-5.2 Pro (high) | 1460.00 | +/-11 | 2,889 | OpenAI | Proprietary
29 | grok-4.20-multi-agent-beta-0309 | 1459.00 | +/-14 | 1,704 | xAI | Proprietary
30 | Claude Sonnet 4.6 | 1458.00 | +/-14 | 1,719 | Anthropic | Proprietary
31 | gpt-5.2-chat-latest-20260210 | 1457.00 | +/-14 | 1,985 | OpenAI | Proprietary
32 | qwen3.6-plus | 1456.00 | +/-18 | 1,112 | Alibaba | Proprietary
33 | GPT-5.1 Pro (high) | 1455.00 | +/-12 | 2,500 | OpenAI | Proprietary
34 | GPT-5.4 | 1455.00 | +/-14 | 1,763 | OpenAI | Proprietary
35 | Gemini 3.0 Flash (minimal) | 1455.00 | +/-11 | 3,175 | Google Deep Mind | Proprietary
36 | Claude Sonnet 4.5 (thinking-32k) | 1454.00 | +/-9 | 4,651 | Anthropic | Proprietary
37 | mimo-v2-pro | 1453.00 | +/-15 | 1,538 | Xiaomi | Proprietary
38 | grok-4.20-beta1 | 1452.00 | +/-15 | 1,491 | xAI | Proprietary
39 | dola-seed-2.0-pro | 1450.00 | +/-13 | 2,260 | Bytedance | Proprietary
40 | OpenAI o3 | 1448.00 | +/-10 | 3,732 | OpenAI | Proprietary
41 | deepseek-v4-flash | 1446.00 | +/-19 | 992 | DeepSeek | MIT
42 | Qwen3.5-397B-A17B | 1446.00 | +/-13 | 2,045 | Alibaba | Apache 2.0
43 | Grok 4.1 Thinking | 1444.00 | +/-10 | 3,727 | xAI | Proprietary
44 | Opus 4.1 (thinking-16k) | 1443.00 | +/-11 | 3,028 | Anthropic | Proprietary
45 | mimo-v2.5 | 1442.00 | +/-19 | 914 | Xiaomi | MIT
46 | Gemini 2.5 Pro Experimental 03-25 | 1442.00 | +/-7 | 7,541 | Google Deep Mind | Proprietary
47 | kimi-k2.5-instant | 1442.00 | +/-25 | 515 | Moonshot | Modified MIT
48 | deepseek-v4-flash-thinking | 1441.00 | +/-19 | 948 | DeepSeek | MIT
49 | GLM-5 | 1440.00 | +/-16 | 1,350 | Zhipu AI | MIT
50 | kimi-k2-thinking-turbo | 1440.00 | +/-10 | 3,688 | Moonshot | Modified MIT
51 | longcat-flash-chat-2602-exp | 1440.00 | +/-15 | 1,513 | Meituan | Proprietary
52 | gpt-5.4-mini-high | 1439.00 | +/-15 | 1,586 | OpenAI | Proprietary
53 | Qwen3 Max (Preview) | 1439.00 | +/-15 | 1,524 | Alibaba | Proprietary
54 | ERNIE 5.0 | 1439.00 | +/-13 | 2,094 | Baidu | Proprietary
55 | gpt-5.4-nano-high | 1437.00 | +/-15 | 1,474 | OpenAI | Proprietary
56 | deepseek-v4-pro | 1437.00 | +/-18 | 1,045 | DeepSeek | MIT
57 | gemini-3.1-flash-lite-preview | 1436.00 | +/-13 | 2,157 | Google | Proprietary
58 | GPT-5-Pro (high) | 1434.00 | +/-14 | 1,887 | OpenAI | Proprietary
59 | Opus 4.1 | 1433.00 | +/-9 | 4,723 | Anthropic | Proprietary
60 | GPT-5.2 | 1433.00 | +/-11 | 2,794 | OpenAI | Proprietary
61 | Grok 4.1 | 1431.00 | +/-9 | 4,143 | xAI | Proprietary
62 | DeepSeek V3.2 | 1430.00 | +/-11 | 2,954 | DeepSeek-AI | MIT
63 | qwen3-max-2025-09-23 | 1429.00 | +/-24 | 584 | Alibaba | Proprietary
64 | GLM-4.7 | 1429.00 | +/-21 | 711 | Zhipu AI | MIT
65 | DeepSeek V3.2-Exp (thinking) | 1429.00 | +/-26 | 481 | DeepSeek-AI | MIT
66 | amazon-nova-experimental-chat-26-02-10 | 1428.00 | +/-39 | 207 | Amazon | Proprietary
67 | hunyuan-hy3-preview | 1428.00 | +/-28 | 378 | Tencent | tencent-hunyuan-community
68 | grok-4-0709 | 1428.00 | +/-12 | 2,264 | xAI | Proprietary
69 | qwen3.5-27b | 1428.00 | +/-15 | 1,561 | Alibaba | Apache 2.0
70 | gpt-5.3-chat-latest | 1428.00 | +/-14 | 1,944 | OpenAI | Proprietary
71 | Claude Sonnet 4.5 | 1427.00 | +/-9 | 4,673 | Anthropic | Proprietary
72 | DeepSeek V3.2 (thinking) | 1426.00 | +/-12 | 2,456 | DeepSeek-AI | MIT
73 | grok-4.3 | 1425.00 | +/-20 | 846 | xAI | Proprietary
74 | Grok 4 Fast | 1424.00 | +/-29 | 399 | xAI | Proprietary
75 | GPT-5.1 Instant | 1424.00 | +/-11 | 2,867 | OpenAI | Proprietary
76 | qwen3.5-122b-a10b | 1422.00 | +/-14 | 1,682 | Alibaba | Apache 2.0
77 | GLM-4.6 | 1421.00 | +/-13 | 2,107 | Zhipu AI | MIT
78 | grok-4-1-fast-reasoning | 1420.00 | +/-10 | 3,382 | xAI | Proprietary
79 | Claude Opus 4 (thinking-16k) | 1420.00 | +/-12 | 2,239 | Anthropic | Proprietary
80 | Qwen3-235B-A22B-2507 | 1420.00 | +/-8 | 5,844 | Alibaba | Apache 2.0
81 | qwen3-next-80b-a3b-instruct | 1419.00 | +/-17 | 1,212 | Alibaba | Apache 2.0
82 | DeepSeek V3.2-Exp | 1418.00 | +/-21 | 775 | DeepSeek-AI | MIT
83 | longcat-flash-chat | 1417.00 | +/-22 | 689 | Meituan | MIT
84 | kimi-k2-0905-preview | 1416.00 | +/-21 | 759 | Moonshot | Modified MIT
85 | o4-mini-2025-04-16 | 1416.00 | +/-11 | 2,939 | OpenAI | Proprietary
86 | DeepSeek-V3.1 | 1415.00 | +/-18 | 992 | DeepSeek-AI | MIT
87 | MiniMax-M2.7 | 1415.00 | +/-16 | 1,378 | MiniMaxAI | Modified MIT
88 | DeepSeek-V3.1 (thinking) | 1414.00 | +/-22 | 665 | DeepSeek-AI | MIT
89 | GLM-4.5 | 1413.00 | +/-15 | 1,424 | Zhipu AI | MIT
90 | GPT-5 | 1413.00 | +/-14 | 1,787 | OpenAI | Proprietary
91 | gemini-2.5-flash-preview-09-2025 | 1413.00 | +/-13 | 1,945 | Google | Proprietary
92 | grok-4-fast-reasoning | 1412.00 | +/-18 | 1,085 | xAI | Proprietary
93 | DeepSeek-R1 | 1411.00 | +/-14 | 1,606 | DeepSeek-AI | MIT
94 | deepseek-v3.1-terminus-thinking | 1410.00 | +/-41 | 200 | DeepSeek | MIT
95 | Qwen3-VL-235B-A22B-Instruct | 1410.00 | +/-23 | 704 | Alibaba | Apache 2.0
96 | amazon-nova-experimental-chat-26-01-10 | 1409.00 | +/-33 | 263 | Amazon | Proprietary
97 | GPT-4.5 | 1409.00 | +/-15 | 1,393 | OpenAI | Proprietary
98 | o1-2024-12-17 | 1409.00 | +/-11 | 2,986 | OpenAI | Proprietary
99 | ERNIE 5.0 | 1407.00 | +/-23 | 618 | Baidu | Proprietary
100 | Gemini 2.5 Flash | 1407.00 | +/-7 | 7,775 | Google Deep Mind | Proprietary
101 | Step 3.5 Flash | 1406.00 | +/-13 | 2,146 | StepFunAI | Apache 2.0
102 | gpt-5-mini-high | 1406.00 | +/-15 | 1,460 | OpenAI | Proprietary
103 | o3-mini-high | 1406.00 | +/-13 | 1,909 | OpenAI | Proprietary
104 | qwen3-vl-235b-a22b-thinking | 1405.00 | +/-28 | 428 | Alibaba | Apache 2.0
105 | chatgpt-4o-latest-20250326 | 1404.00 | +/-8 | 5,725 | OpenAI | Proprietary
106 | Claude Opus 4 | 1403.00 | +/-11 | 2,768 | Anthropic | Proprietary
107 | Claude Sonnet 4 (thinking-32k) | 1403.00 | +/-13 | 2,023 | Anthropic | Proprietary
108 | qwen3.5-flash | 1403.00 | +/-14 | 1,865 | Alibaba | Proprietary
109 | Mistral Large 3 | 1402.00 | +/-11 | 2,737 | MistralAI | Apache 2.0
110 | qwen3.5-35b-a3b | 1402.00 | +/-14 | 1,666 | Alibaba | Apache 2.0
111 | hunyuan-t1-20250711 | 1402.00 | +/-38 | 236 | Tencent | Proprietary
112 | amazon-nova-experimental-chat-12-10 | 1400.00 | +/-37 | 234 | Amazon | Proprietary
113 | ERNIE 5.0 | 1400.00 | +/-34 | 268 | Baidu | Proprietary
114 | Magistral-Medium-2506 | 1399.00 | +/-8 | 5,729 | MistralAI | Proprietary
115 | qwen3-32b | 1399.00 | +/-30 | 316 | Alibaba | Apache 2.0
116 | qwen3-235b-a22b-thinking-2507 | 1398.00 | +/-24 | 490 | Alibaba | Apache 2.0
117 | MiniMax M2.5 | 1398.00 | +/-13 | 2,188 | MiniMaxAI | Modified MIT
118 | amazon-nova-experimental-chat-11-10 | 1398.00 | +/-15 | 1,584 | Amazon | Proprietary
119 | DeepSeek-R1-0528 | 1396.00 | +/-20 | 869 | DeepSeek-AI | MIT
120 | amazon-nova-experimental-chat-10-20 | 1396.00 | +/-20 | 805 | Amazon | Proprietary
121 | DeepSeek-V3.1 Terminus | 1395.00 | +/-39 | 218 | DeepSeek-AI | MIT
122 | claude-haiku-4-5-20251001 | 1395.00 | +/-9 | 4,744 | Anthropic | Proprietary
123 | qwen3-235b-a22b-no-thinking | 1394.00 | +/-12 | 2,390 | Alibaba | Apache 2.0
124 | qwen3-235b-a22b | 1393.00 | +/-14 | 1,604 | Alibaba | Apache 2.0
125 | minimax-m2.1-preview | 1393.00 | +/-18 | 1,010 | MiniMax | MIT
126 | glm-4.5-air | 1391.00 | +/-15 | 1,540 | Z.ai | MIT
127 | nvidia-llama-3.3-nemotron-super-49b-v1.5 | 1390.00 | +/-39 | 194 | Nvidia | Nvidia Open
128 | qwen3-next-80b-a3b-thinking | 1389.00 | +/-20 | 829 | Alibaba | Apache 2.0
129 | trinity-large-thinking | 1389.00 | +/-16 | 1,358 | Arcee AI | Apache 2.0
130 | grok-3-mini-high | 1388.00 | +/-18 | 977 | xAI | Proprietary
131 | Kimi K2 | 1388.00 | +/-14 | 1,694 | Moonshot AI | Modified MIT
132 | Claude Sonnet 4 | 1388.00 | +/-12 | 2,474 | Anthropic | Proprietary
133 | o1-preview | 1386.00 | +/-10 | 4,569 | OpenAI | Proprietary
134 | Claude Sonnet 3.7 (thinking-32k) | 1384.00 | +/-11 | 2,793 | Anthropic | Proprietary
135 | intellect-3 | 1384.00 | +/-31 | 332 | Prime Intellect | MIT
136 | GPT OSS 120B | 1383.00 | +/-14 | 1,795 | OpenAI | Apache 2.0
137 | o3-mini | 1382.00 | +/-8 | 4,722 | OpenAI | Proprietary
138 | qwen3-30b-a3b-instruct-2507 | 1381.00 | +/-15 | 1,427 | Alibaba | Apache 2.0
139 | mimo-v2-flash (non-thinking) | 1380.00 | +/-11 | 2,746 | Xiaomi | MIT
140 | llama-3.1-nemotron-ultra-253b-v1 | 1380.00 | +/-37 | 209 | Nvidia | Nvidia Open Model
141 | qwen3-coder-480b-a35b-instruct | 1377.00 | +/-15 | 1,627 | Alibaba | Apache 2.0
142 | Grok 3 | 1375.00 | +/-11 | 2,677 | xAI | Proprietary
143 | nvidia-nemotron-3-super-120b-a12b | 1375.00 | +/-25 | 511 | Nvidia | NVIDIA Open Model
144 | mimo-v2-flash (thinking) | 1374.00 | +/-22 | 633 | Xiaomi | MIT
145 | gpt-4.1-2025-04-14 | 1373.00 | +/-10 | 3,227 | OpenAI | Proprietary
146 | minimax-m1 | 1371.00 | +/-13 | 1,799 | MiniMax | Apache 2.0
147 | DeepSeek-V3-0324 | 1370.00 | +/-10 | 3,191 | DeepSeek-AI | MIT
148 | grok-3-mini-beta | 1370.00 | +/-14 | 1,530 | xAI | Proprietary
149 | glm-4.7-flash | 1366.00 | +/-21 | 718 | Z.ai | MIT
150 | gemini-2.5-flash-lite-preview-06-17-thinking | 1365.00 | +/-12 | 2,094 | Google | Proprietary
151 | gemini-2.5-flash-lite-preview-09-2025-no-thinking | 1365.00 | +/-11 | 2,875 | Google | Proprietary
152 | qwen2.5-max | 1364.00 | +/-10 | 3,306 | Alibaba | Proprietary
153 | qwq-32b | 1364.00 | +/-14 | 1,720 | Alibaba | Apache 2.0
154 | step-3 | 1364.00 | +/-31 | 353 | StepFun | Apache 2.0
155 | Claude Sonnet 3.7 | 1362.00 | +/-10 | 3,357 | Anthropic | Proprietary
156 | o1-mini | 1362.00 | +/-8 | 7,499 | OpenAI | Proprietary
157 | trinity-large-preview | 1361.00 | +/-14 | 1,813 | Arcee AI | Apache 2.0
158 | glm-4.5v | 1357.00 | +/-34 | 276 | Z.ai | MIT
159 | minimax-m2 | 1357.00 | +/-33 | 318 | MiniMax | Apache 2.0
160 | gemini-2.0-flash-001 | 1356.00 | +/-9 | 4,067 | Google | Proprietary
161 | ling-flash-2.0 | 1355.00 | +/-27 | 461 | Ant Group | MIT
162 | gpt-4.1-mini-2025-04-14 | 1355.00 | +/-11 | 2,693 | OpenAI | Proprietary
163 | nvidia-nemotron-3-nano-30b-a3b-bf16 | 1354.00 | +/-19 | 987 | Nvidia | NVIDIA Open Model
164 | qwen3-30b-a3b | 1353.00 | +/-14 | 1,708 | Alibaba | Apache 2.0
165 | claude-3-5-sonnet-20241022 | 1350.00 | +/-7 | 10,019 | Anthropic | Proprietary
166 | mistral-medium-2505 | 1349.00 | +/-12 | 2,229 | Mistral | Proprietary
167 | hunyuan-turbos-20250416 | 1348.00 | +/-20 | 845 | Tencent | Proprietary
168 | gpt-5-nano-high | 1345.00 | +/-27 | 494 | OpenAI | Proprietary
169 | claude-3-5-sonnet-20240620 | 1341.00 | +/-7 | 11,359 | Anthropic | Proprietary
170 | ring-flash-2.0 | 1339.00 | +/-27 | 453 | Ant Group | MIT
171 | mistral-small-2506 | 1339.00 | +/-18 | 1,042 | Mistral | Apache 2.0
172 | gemini-1.5-pro-002 | 1338.00 | +/-7 | 7,610 | Google | Proprietary
173 | GPT OSS 20B | 1336.00 | +/-22 | 680 | OpenAI | Apache 2.0
174 | nova-2-lite | 1335.00 | +/-20 | 825 | Amazon | Proprietary
175 | gemini-2.0-flash-lite-preview-02-05 | 1326.00 | +/-10 | 2,814 | Google | Proprietary
176 | qwen-plus-0125 | 1324.00 | +/-19 | 732 | Alibaba | Proprietary
177 | gemma-3-27b-it | 1322.00 | +/-9 | 3,579 | Google | Gemma
178 | llama-3.1-405b-instruct-fp8 | 1319.00 | +/-8 | 8,482 | Meta | Llama 3.1 Community
179 | llama-4-maverick-17b-128e-instruct | 1319.00 | +/-11 | 2,839 | Meta | Llama 4
180 | gemma-3-12b-it | 1317.00 | +/-27 | 389 | Google | Gemma
181 | llama-3.1-405b-instruct-bf16 | 1315.00 | +/-8 | 5,215 | Meta | Llama 3.1 Community
182 | granite-4.1-8b | 1315.00 | +/-40 | 218 | IBM | Apache 2.0
183 | step-2-16k-exp-202412 | 1313.00 | +/-20 | 642 | StepFun | Proprietary
184 | athene-v2-chat | 1312.00 | +/-9 | 3,412 | NexusFlow | NexusFlow
185 | Claude3-Opus | 1312.00 | +/-6 | 25,769 | Anthropic | Proprietary
186 | olmo-3-32b-think | 1311.00 | +/-32 | 314 | Ai2 | Apache 2.0
187 | deepseek-v3 | 1311.00 | +/-11 | 2,721 | DeepSeek | DeepSeek
188 | command-a-03-2025 | 1309.00 | +/-9 | 3,991 | Cohere | CC-BY-NC-4.0
189 | llama-4-scout-17b-16e-instruct | 1309.00 | +/-13 | 1,944 | Meta | Llama
190 | gpt-4o-2024-08-06 | 1308.00 | +/-8 | 6,826 | OpenAI | Proprietary
191 | olmo-3.1-32b-instruct | 1307.00 | +/-23 | 696 | Ai2 | Apache 2.0
192 | yi-lightning | 1306.00 | +/-10 | 3,921 | 01 AI | Proprietary
193 | gpt-4o-2024-05-13 | 1305.00 | +/-7 | 15,103 | OpenAI | Proprietary
194 | gemini-advanced-0514 | 1305.00 | +/-10 | 6,395 | Google | Proprietary
195 | qwen2.5-plus-1127 | 1305.00 | +/-14 | 1,404 | Alibaba | Proprietary
196 | gpt-4-1106-preview | 1303.00 | +/-8 | 13,306 | OpenAI | Proprietary
197 | hunyuan-turbos-20250226 | 1301.00 | +/-31 | 238 | Tencent | Proprietary
198 | step-1o-turbo-202506 | 1300.00 | +/-24 | 564 | StepFun | Proprietary
199 | gpt-4-0125-preview | 1299.00 | +/-8 | 12,374 | OpenAI | Proprietary
200 | glm-4-plus-0111 | 1298.00 | +/-19 | 721 | Z.ai | Proprietary
201 | gemini-1.5-pro-001 | 1297.00 | +/-8 | 10,492 | Google | Proprietary
202 | olmo-3.1-32b-think | 1297.00 | +/-26 | 473 | Ai2 | Apache 2.0
203 | qwen2.5-72b-instruct | 1296.00 | +/-8 | 5,415 | Alibaba | Qwen
204 | gpt-4-turbo-2024-04-09 | 1296.00 | +/-8 | 13,217 | OpenAI | Proprietary
205 | llama-3.3-70b-instruct | 1296.00 | +/-8 | 5,779 | Meta | Llama-3.3
206 | grok-2-2024-08-13 | 1294.00 | +/-7 | 8,950 | xAI | Proprietary
207 | hunyuan-large-2025-02-10 | 1293.00 | +/-24 | 497 | Tencent | Proprietary
208 | deepseek-v2.5-1210 | 1293.00 | +/-17 | 1,031 | DeepSeek | DeepSeek
209 | qwen-max-0919 | 1291.00 | +/-12 | 2,249 | Alibaba | Qwen
210 | hunyuan-standard-2025-02-10 | 1290.00 | +/-24 | 499 | Tencent | Proprietary
211 | gemini-1.5-flash-002 | 1288.00 | +/-9 | 4,789 | Google | Proprietary
212 | mistral-large-2407 | 1288.00 | +/-8 | 6,664 | Mistral | Mistral Research
213 | deepseek-v2.5 | 1288.00 | +/-10 | 3,649 | DeepSeek | DeepSeek
214 | glm-4-plus | 1287.00 | +/-10 | 3,599 | Z.ai | Proprietary
215 | claude-3-5-haiku-20241022 | 1285.00 | +/-7 | 6,365 | Anthropic | Proprietary
216 | magistral-medium-2506 | 1285.00 | +/-26 | 553 | Mistral | Proprietary
217 | gpt-4-0314 | 1283.00 | +/-10 | 7,052 | OpenAI | Proprietary
218 | mistral-large-2411 | 1282.00 | +/-9 | 3,574 | Mistral | MRL
219 | hunyuan-large-vision | 1280.00 | +/-30 | 351 | Tencent | Proprietary
220 | hunyuan-turbo-0110 | 1279.00 | +/-31 | 243 | Tencent | Proprietary
221 | ibm-granite-h-small | 1279.00 | +/-32 | 358 | IBM | Apache 2.0
222 | llama-3.1-nemotron-70b-instruct | 1278.00 | +/-17 | 1,041 | Nvidia | Llama 3.1
223 | mistral-small-3.1-24b-instruct-2503 | 1277.00 | +/-13 | 2,131 | Mistral | Apache 2.0
224 | gpt-4o-mini-2024-07-18 | 1276.00 | +/-7 | 9,322 | OpenAI | Proprietary
225 | gpt-4-0613 | 1275.00 | +/-8 | 11,181 | OpenAI | Proprietary
226 | gpt-4.1-nano-2025-04-14 | 1274.00 | +/-23 | 582 | OpenAI | Proprietary
227 | qwen2-72b-instruct | 1273.00 | +/-9 | 4,835 | Alibaba | Qianwen LICENSE
228 | grok-2-mini-2024-08-13 | 1272.00 | +/-8 | 7,261 | xAI | Proprietary
229 | deepseek-coder-v2 | 1271.00 | +/-13 | 1,858 | DeepSeek | DeepSeek License
230 | llama-3.1-nemotron-51b-instruct | 1271.00 | +/-22 | 507 | Nvidia | Llama 3.1
231 | qwen2.5-coder-32b-instruct | 1270.00 | +/-19 | 725 | Alibaba | Apache 2.0
232 | amazon-nova-pro-v1.0 | 1269.00 | +/-10 | 2,978 | Amazon | Proprietary
233 | llama-3.1-70b-instruct | 1269.00 | +/-8 | 7,677 | Meta | Llama 3.1 Community
234 | Phi 4 - 14B | 1265.00 | +/-10 | 2,764 | Microsoft Azure | MIT
235 | llama-3.1-tulu-3-70b | 1263.00 | +/-25 | 397 | Ai2 | Llama 3.1
236 | mistral-small-24b-instruct-2501 | 1261.00 | +/-13 | 1,683 | Mistral | Apache 2.0
237 | athene-70b-0725 | 1261.00 | +/-10 | 2,921 | NexusFlow | CC-BY-NC-4.0
238 | gemma-3n-e4b-it | 1260.00 | +/-15 | 1,573 | Google | Gemma
239 | llama-3-70b-instruct | 1257.00 | +/-7 | 20,941 | Meta | Llama 3 Community
240 | gemini-1.5-flash-001 | 1257.00 | +/-8 | 8,392 | Google | Proprietary
241 | gemma-3-4b-it | 1254.00 | +/-28 | 423 | Google | Gemma
242 | claude-3-sonnet-20240229 | 1253.00 | +/-8 | 13,766 | Anthropic | Proprietary
243 | nemotron-4-340b-instruct | 1252.00 | +/-12 | 2,352 | Nvidia | NVIDIA Open Model
244 | hunyuan-standard-256k | 1250.00 | +/-29 | 361 | Tencent | Proprietary
245 | glm-4-0520 | 1247.00 | +/-16 | 1,191 | Z.ai | Proprietary
246 | reka-core-20240904 | 1245.00 | +/-14 | 1,207 | Reka AI | Proprietary
247 | gemma-2-27b-it | 1245.00 | +/-7 | 10,170 | Google | Gemma license
248 | jamba-1.5-large | 1245.00 | +/-15 | 1,147 | AI21 Labs | Jamba Open
249 | amazon-nova-lite-v1.0 | 1244.00 | +/-11 | 2,511 | Amazon | Proprietary
250 | mistral-large-2402 | 1244.00 | +/-9 | 7,987 | Mistral | Proprietary
251 | c4ai-aya-expanse-32b | 1232.00 | +/-10 | 3,854 | Cohere | CC-BY-NC-4.0
252 | reka-flash-20240904 | 1232.00 | +/-14 | 1,284 | Reka AI | Proprietary
253 | claude-3-haiku-20240307 | 1231.00 | +/-7 | 14,983 | Anthropic | Proprietary
254 | command-r-plus-08-2024 | 1230.00 | +/-14 | 1,467 | Cohere | CC-BY-NC-4.0
255 | gemini-1.5-flash-8b-001 | 1229.00 | +/-8 | 5,036 | Google | Proprietary
256 | mixtral-8x22b-instruct-v0.1 | 1228.00 | +/-9 | 6,778 | Mistral | Apache 2.0
257 | olmo-2-0325-32b-instruct | 1227.00 | +/-28 | 375 | Ai2 | Apache-2.0
258 | amazon-nova-micro-v1.0 | 1224.00 | +/-11 | 2,455 | Amazon | Proprietary
259 | qwen1.5-110b-chat | 1221.00 | +/-11 | 3,188 | Alibaba | Qianwen LICENSE
260 | mistral-medium | 1220.00 | +/-11 | 4,406 | Mistral | Proprietary
261 | gemma-2-9b-it | 1217.00 | +/-8 | 7,110 | Google | Gemma license
262 | phi-3-medium-4k-instruct | 1215.00 | +/-11 | 3,238 | Microsoft | MIT
263 | ministral-8b-2410 | 1213.00 | +/-20 | 683 | Mistral | MRL
264 | qwq-32b-preview | 1213.00 | +/-24 | 480 | Alibaba | Apache 2.0
265 | yi-1.5-34b-chat | 1213.00 | +/-11 | 2,985 | 01 AI | Apache-2.0
266 | command-r-plus | 1213.00 | +/-8 | 9,769 | Cohere | CC-BY-NC-4.0
267 | reka-flash-21b-20240226-online | 1211.00 | +/-14 | 2,028 | Reka AI | Proprietary
268 | qwen1.5-72b-chat | 1208.00 | +/-10 | 5,327 | Alibaba | Qianwen LICENSE
269 | internlm2_5-20b-chat | 1207.00 | +/-15 | 1,387 | InternLM | Other
270 | llama-3.1-tulu-3-8b | 1207.00 | +/-26 | 363 | Ai2 | Llama 3.1
271 | command-r-08-2024 | 1205.00 | +/-14 | 1,601 | Cohere | CC-BY-NC-4.0
272 | gemma-2-9b-it-simpo | 1205.00 | +/-15 | 1,285 | Princeton | MIT
273 | gpt-3.5-turbo-1106 | 1202.00 | +/-15 | 2,134 | OpenAI | Proprietary
274 | qwen1.5-32b-chat | 1200.00 | +/-12 | 2,649 | Alibaba | Qianwen LICENSE
275 | c4ai-aya-expanse-8b | 1200.00 | +/-15 | 1,307 | Cohere | CC-BY-NC-4.0
276 | gpt-3.5-turbo-0125 | 1199.00 | +/-8 | 8,626 | OpenAI | Proprietary
277 | reka-flash-21b-20240226 | 1198.00 | +/-11 | 3,363 | Reka AI | Proprietary
278 | gemini-pro | 1198.00 | +/-19 | 993 | Google | Proprietary
279 | granite-3.1-2b-instruct | 1197.00 | +/-26 | 391 | IBM | Apache 2.0
280 | granite-3.0-8b-instruct | 1196.00 | +/-19 | 873 | IBM | Apache 2.0
281 | zephyr-orpo-141b-A35b-v0.1 | 1196.00 | +/-22 | 589 | HuggingFace | Apache 2.0
282 | dbrx-instruct-preview | 1195.00 | +/-11 | 4,001 | Databricks | DBRX LICENSE
283 | gemini-pro-dev-api | 1195.00 | +/-14 | 2,274 | Google | Proprietary
284 | phi-3-mini-4k-instruct-june-2024 | 1193.00 | +/-14 | 1,568 | Microsoft | MIT
285 | phi-3-small-8k-instruct | 1193.00 | +/-13 | 2,092 | Microsoft | MIT
286 | llama-3-8b-instruct | 1192.00 | +/-8 | 14,252 | Meta | Llama 3 Community
287 | mixtral-8x7b-instruct-v0.1 | 1191.00 | +/-9 | 9,663 | Mistral | Apache 2.0
288 | granite-3.1-8b-instruct | 1190.00 | +/-28 | 382 | IBM | Apache 2.0
289 | llama-3.1-8b-instruct | 1189.00 | +/-8 | 7,135 | Meta | Llama 3.1 Community
290 | jamba-1.5-mini | 1186.00 | +/-16 | 1,094 | AI21 Labs | Jamba Open
291 | command-r | 1175.00 | +/-9 | 6,682 | Cohere | CC-BY-NC-4.0
292 | granite-3.0-2b-instruct | 1168.00 | +/-19 | 908 | IBM | Apache 2.0
293 | qwen1.5-14b-chat | 1167.00 | +/-13 | 2,184 | Alibaba | Qianwen LICENSE
294 | llama-3.2-3b-instruct | 1165.00 | +/-16 | 1,136 | Meta | Llama 3.2
295 | gemma-2-2b-it | 1162.00 | +/-8 | 6,599 | Google | Gemma license
296 | snowflake-arctic-instruct | 1162.00 | +/-11 | 4,793 | Snowflake | Apache 2.0
297 | gemma-1.1-7b-it | 1159.00 | +/-11 | 3,039 | Google | Gemma license
298 | starling-lm-7b-beta | 1158.00 | +/-14 | 1,973 | Nexusflow | Apache-2.0
299 | openchat-3.5-0106 | 1158.00 | +/-14 | 1,726 | OpenChat | Apache-2.0
300 | wizardlm-70b | 1157.00 | +/-19 | 903 | Microsoft | Llama 2 Community
301 | deepseek-llm-67b-chat | 1155.00 | +/-23 | 576 | DeepSeek | DeepSeek License
302 | smollm2-1.7b-instruct | 1152.00 | +/-33 | 271 | HuggingFace | Apache 2.0
303 | openhermes-2.5-mistral-7b | 1151.00 | +/-20 | 697 | NousResearch | Apache-2.0
304 | yi-34b-chat | 1151.00 | +/-13 | 2,043 | 01 AI | Yi License
305 | phi-3-mini-4k-instruct | 1150.00 | +/-12 | 2,564 | Microsoft | MIT
306 | tulu-2-dpo-70b | 1145.00 | +/-19 | 888 | AllenAI/UW | AI2 ImpACT Low-risk
307 | phi-3-mini-128k-instruct | 1139.00 | +/-13 | 2,813 | Microsoft | MIT
308 | llama-2-70b-chat | 1136.00 | +/-10 | 4,740 | Meta | Llama 2 Community
309 | mistral-7b-instruct-v0.2 | 1127.00 | +/-12 | 2,605 | Mistral | Apache-2.0
310 | starling-lm-7b-alpha | 1126.00 | +/-16 | 1,300 | UC Berkeley | CC-BY-NC-4.0
311 | qwen-14b-chat | 1125.00 | +/-24 | 534 | Alibaba | Qianwen LICENSE
312 | dolphin-2.2.1-mistral-7b | 1124.00 | +/-32 | 219 | Cognitive Computations | Apache-2.0
313 | openchat-3.5 | 1124.00 | +/-18 | 945 | OpenChat | Apache-2.0
314 | llama-3.2-1b-instruct | 1124.00 | +/-16 | 1,162 | Meta | Llama 3.2
315 | qwen1.5-7b-chat | 1120.00 | +/-20 | 690 | Alibaba | Qianwen LICENSE
316 | gemma-7b-it | 1117.00 | +/-16 | 1,120 | Google | Gemma license
317 | vicuna-33b | 1115.00 | +/-13 | 2,663 | LMSYS | Non-commercial
318 | palm-2 | 1114.00 | +/-19 | 901 | Google | Proprietary
319 | llama2-70b-steerlm-chat | 1114.00 | +/-27 | 440 | Nvidia | Llama 2 Community
320 | llama-2-13b-chat | 1110.00 | +/-13 | 2,218 | Meta | Llama 2 Community
321 | solar-10.7b-instruct-v1.0 | 1109.00 | +/-22 | 604 | Upstage AI | CC-BY-NC-4.0
322 | codellama-34b-instruct | 1108.00 | +/-19 | 770 | Meta | Llama 2 Community
323 | gemma-1.1-2b-it | 1106.00 | +/-16 | 1,355 | Google | Gemma license
324 | mpt-30b-chat | 1095.00 | +/-34 | 242 | MosaicML | CC-BY-NC-SA-4.0
325 | nous-hermes-2-mixtral-8x7b-dpo | 1093.00 | +/-21 | 628 | NousResearch | Apache-2.0
326 | llama-2-7b-chat | 1086.00 | +/-14 | 1,656 | Meta | Llama 2 Community
327 | qwen1.5-4b-chat | 1085.00 | +/-18 | 988 | Alibaba | Qianwen LICENSE
328 | stripedhyena-nous-7b | 1084.00 | +/-20 | 676 | Together AI | Apache 2.0
329 | vicuna-13b | 1082.00 | +/-14 | 2,146 | LMSYS | Llama 2 Community
330 | zephyr-7b-beta | 1082.00 | +/-17 | 1,250 | HuggingFace | MIT
331 | mistral-7b-instruct | 1081.00 | +/-19 | 974 | Mistral | Apache 2.0
332 | guanaco-33b | 1080.00 | +/-32 | 280 | UW | Non-commercial
333 | gemma-2b-it | 1069.00 | +/-22 | 597 | Google | Gemma license
334 | wizardlm-13b | 1064.00 | +/-21 | 669 | Microsoft | Llama 2 Community
335 | olmo-7b-instruct | 1054.00 | +/-19 | 848 | Ai2 | Apache-2.0
336 | vicuna-7b | 1047.00 | +/-22 | 658 | LMSYS | Llama 2 Community
337 | chatglm3-6b | 1041.00 | +/-23 | 576 | Tsinghua | Apache-2.0
338 | gpt4all-13b-snoozy | 997.00 | +/-37 | 211 | Nomic AI | Non-commercial
339 | alpaca-13b | 990.00 | +/-23 | 652 | Stanford | Non-commercial
340 | mpt-7b-chat | 984.00 | +/-25 | 471 | MosaicML | CC-BY-NC-SA-4.0
341 | RWKV-4-Raven-14B | 982.00 | +/-24 | 544 | RWKV | Apache 2.0
342 | koala-13b | 979.00 | +/-21 | 751 | UC Berkeley | Non-commercial
343 | chatglm-6b | 976.00 | +/-25 | 525 | Tsinghua | Non-commercial
344 | chatglm2-6b | 971.00 | +/-35 | 227 | Tsinghua | Apache-2.0
345 | oasst-pythia-12b | 959.00 | +/-22 | 687 | OpenAssistant | Apache 2.0
346 | dolly-v2-12b | 949.00 | +/-29 | 370 | Databricks | MIT
347 | fastchat-t5-3b | 919.00 | +/-26 | 462 | LMSYS | Apache 2.0
348 | llama-13b | 918.00 | +/-33 | 252 | Meta | Non-commercial
349 | stablelm-tuned-alpha-7b | 890.00 | +/-29 | 353 | Stability AI | CC-BY-NC-SA-4.0
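
When reading the table, the 95% CI column matters as much as the score. The sketch below checks whether two models' intervals overlap, using three rows copied from the table. It assumes the "+/-" value is a symmetric half-width around the score, and it treats non-overlap only as a conservative rule of thumb, not as a formal significance test. The helper names are hypothetical.

```python
def interval(score, half_width):
    """Return the (low, high) bounds implied by score +/- half_width."""
    return (score - half_width, score + half_width)


def intervals_overlap(a, b):
    """True if the intervals of two (score, half_width) pairs overlap."""
    lo_a, hi_a = interval(*a)
    lo_b, hi_b = interval(*b)
    return lo_a <= hi_b and lo_b <= hi_a


# (score, half-width) pairs copied from the ranking table.
rows = {
    "gemini-3.5-flash": (1524.0, 26.0),
    "gpt-5.4-high": (1515.0, 15.0),
    "Gemini 3.0 Pro (Preview 11-2025)": (1478.0, 11.0),
}

top = rows["gemini-3.5-flash"]
for name, row in rows.items():
    if name != "gemini-3.5-flash":
        print(name, "overlaps with the top model:", intervals_overlap(top, row))
```

Here the first- and second-ranked models overlap, so their order could plausibly flip as more votes arrive, while the rank 16 model's interval sits entirely below the leader's. Models with few votes, such as the leader with 526, have wide intervals and deserve extra caution.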