DBRX: Databricks's 132B-total / 36B-active open MoE
In one sentence: Databricks releases DBRX, an open-weights Mixture-of-Experts model with 132B total parameters (36B active per token) that beats Llama 2 70B on many benchmarks at lower inference cost.
Databricks, the company behind Apache Spark and the recent buyer of MosaicML, has released the weights of a large language model, called DBRX, for free.
The technical twist: it's a "Mixture of Experts." Think of a team of 16 specialists: for each token it generates, the system picks the 4 most relevant, and only those do any work. The total size is huge (132 billion parameters), but only 36 billion are used per token, so inference is faster and cheaper than for a dense model of the same total size.
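To make the "pick 4 of 16" idea concrete, here is a minimal sketch of top-k expert routing in plain Python. Only the numbers 16, 4, 132B and 36B come from the text above. The function names, the toy experts, and the choice to renormalize the selected weights with a softmax are illustrative assumptions, not DBRX's actual implementation.

```python
import math
import random

NUM_EXPERTS = 16  # from the text: 16 specialists
TOP_K = 4         # from the text: 4 are picked per token


def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_scores, k=TOP_K):
    """Return [(expert_index, weight)] for the k highest-scoring experts.

    Assumption: weights of the chosen experts are renormalized to sum to 1.
    """
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_layer(x, experts, router_scores):
    """Weighted sum of the selected experts' outputs; the others never run."""
    return sum(w * experts[i](x) for i, w in route(router_scores))


# Hypothetical toy experts: each just scales its input by a constant.
# In a real model each expert is a feed-forward network.
experts = [lambda x, s=i: x * (s + 1) for i in range(NUM_EXPERTS)]

rng = random.Random(0)
scores = [rng.gauss(0, 1) for _ in range(NUM_EXPERTS)]  # stand-in for a learned router

selected = route(scores)
output = moe_layer(2.0, experts, scores)

# Share of parameters touched per token, using the figures quoted above.
active_fraction = 36 / 132
print(f"experts used: {sorted(i for i, _ in selected)}")
print(f"active share of parameters: {active_fraction:.0%}")
```

The active share works out to roughly 27%, a little more than 4/16 = 25%. A plausible reading is that some parameters, such as attention layers and embeddings, are shared and run for every token regardless of which experts are chosen.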
It's downloadable under an open license; Databricks also serves it via API on their platform.
Companies
Databricks, MosaicML
Tools
DBRX
Tags
Sources