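Open-source LLMs step up again: Databricks open-sources DBRX-16×12B, a 132-billion-parameter mixture-of-experts large language model that outperforms Mixtral-8×7B-MoE on benchmarks, under a license that allows free commercial use!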

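Large language models built on mixture-of-experts (MoE) techniques are an important direction in current LLM development. Last year, MistralAI open-sourced Mixtral-8×7B-MoE, one of the world's most influential MoE models, which drew a great deal of attention. Today, March 27, 2024, Databricks announced the open-source release of DBRX, a brand-new mixture-of-experts large language model with 132 billion parameters.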

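DBRX is a transformer-based large language model open-sourced by Databricks. It has 132 billion parameters in total, organized as 16 expert networks. Each inference uses 4 of these experts, which activates 36 billion parameters.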
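
To make the "4 of 16 experts" idea concrete, here is a minimal sketch of top-k expert routing in plain Python. It assumes a common gating scheme (softmax over router scores, keep the top k, renormalize); the article does not describe DBRX's actual router, so the function names and the random router scores are purely illustrative. The Mixtral figure of 2 experts per token comes from Mistral's public description, not from this article.

```python
import math
import random

NUM_EXPERTS = 16  # number of expert networks, from the article
TOP_K = 4         # experts used per inference step, from the article


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_logits, top_k=TOP_K):
    """Pick the top_k experts for one token and renormalize their weights.

    Assumed gating scheme for illustration, not DBRX's confirmed router.
    """
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}


random.seed(0)
# Hypothetical router scores for a single token.
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
weights = route(logits)
print("selected experts:", sorted(weights))

# Number of distinct expert subsets a token can be routed to.
dbrx_combos = math.comb(16, 4)    # 16 experts, choose 4
mixtral_combos = math.comb(8, 2)  # 8 experts, choose 2
print(dbrx_combos, mixtral_combos, dbrx_combos // mixtral_combos)  # 1820 28 65
```

Only the selected experts run for a given token, which is why the activated parameter count (36 billion) is far below the total (132 billion).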

The table below compares it with other well-known mixture-of-experts models:

| Model | Total parameters (billions) | Number of experts | Parameters used at inference (billions) | Model card |
| ------------ | ------------ | ------------ | ------------ | ------------ |
| DBRX | 132 | 16 | 36 | https://www.datalearner.com/ai-models/pretrained-models/DBRX-Instruct |
| Mixtral-8×7B-MoE | 46.7 | 8 | 12 | https://www.datalearner.com/ai-models/pretrained-models/Mistral-7B-MoE |
| Grok-1 | 314 | 8 | 86 | |
| DeepSeekMoE-16B | 16.4 | 8 | 2.8 | |
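
As a quick sanity check on these figures, the following sketch computes the share of each model's parameters that is active at inference. The numbers are copied from the comparison table and expressed in billions (the article's 1320 亿 equals 132 billion); the dictionary layout and function name are illustrative choices, not part of any library.

```python
# (total parameters, parameters used at inference), both in billions,
# copied from the comparison table in the article.
models = {
    "DBRX": (132.0, 36.0),
    "Mixtral-8×7B-MoE": (46.7, 12.0),
    "Grok-1": (314.0, 86.0),
    "DeepSeekMoE-16B": (16.4, 2.8),
}


def active_ratio(total_b, active_b):
    """Fraction of the model's parameters used per inference step."""
    return active_b / total_b


ratios = {name: active_ratio(total, active) for name, (total, active) in models.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1%} of parameters active")
```

By this measure DBRX, Mixtral and Grok-1 each activate roughly a quarter of their parameters, while DeepSeekMoE-16B activates a noticeably smaller share.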

| Model | DBRX Instruct | Mixtral Instruct | GPT-3.5 Turbo (API) | GPT-4 Turbo (API) |
| ------------ | ------------ | ------------ | ------------ | ------------ |
| Answer in Beginning Third of Context | 45.1% | 41.3% | 37.3%* | 49.3% |
| Answer in Middle Third of Context | 45.3% | 42.7% | 37.3%* | 49.0% |
| Answer in Last Third of Context | 48.0% | 44.4% | 37.0%* | 50.9% |
| 2K Context | 59.1% | 64.6% | 36.3% | 69.3% |
| 4K Context | 65.1% | 59.9% | 35.9% | 63.5% |
| 8K Context | 59.5% | 55.3% | 45.0% | 61.5% |
| 16K Context | 27.0% | 20.1% | 31.7% | 26.0% |
| 32K Context | 19.9% | 14.0% | N/A | 28.5% |

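| Model | DBRX Instruct | Mixtral Instruct | LLaMa2-70B Chat | GPT 3.5 Turbo (API) | GPT 4 Turbo (API) |
| ------------ | ------------ | ------------ | ------------ | ------------ | ------------ |
| Natural Questions | 60.0% | 59.1% | 56.5% | 57.7% | 63.9% |
| HotPotQA | 55.0% | 54.2% | 54.7% | 53.0% | 62.9% |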