DeepSeek-Coder v1: China enters the open-source coding model race
In one sentence: DeepSeek releases coding models from 1.3B to 33B parameters, trained on 2 trillion tokens with fill-in-the-middle (FIM) training, topping HumanEval among open-weight models at release.
Through most of 2023, the strongest AI models for writing code came largely from American companies such as OpenAI and Google. DeepSeek, a Chinese company, changed this with the release of DeepSeek-Coder in November 2023.
DeepSeek trained a family of models ranging from 1.3 billion to 33 billion parameters on a very large corpus of 2 trillion tokens. Of this corpus, 87% was code and the remaining 13% was natural language text, in English and Chinese, to help the model understand instructions.
A technique called "fill-in-the-middle" (FIM) made these models especially good at completing code in the middle of a file, not just at the end, which is an essential capability for the autocomplete tools developers use every day. At release, the 33B model outperformed other open-source models on major coding benchmarks such as HumanEval. DeepSeek-Coder signaled that AI competition was no longer a purely Western affair.
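To make the fill-in-the-middle idea concrete, here is a minimal sketch of how a training document might be rearranged so that a model learns to predict a missing middle span from the code on both sides. The sentinel strings and function names are hypothetical placeholders chosen for illustration; DeepSeek-Coder's tokenizer defines its own special tokens, and its actual data pipeline may differ.

```python
import random

# Hypothetical sentinel strings, used only for illustration.
# Real tokenizers define their own special tokens for this purpose.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def split_for_fim(text, rng):
    """Cut text at two random points into (prefix, middle, suffix)."""
    a, b = sorted(rng.randint(0, len(text)) for _ in range(2))
    return text[:a], text[a:b], text[b:]


def to_fim_example(text, rng):
    """Rearrange a document so the middle span comes last.

    The model still reads left to right, but now sees the prefix and
    the suffix before it has to generate the middle.
    """
    prefix, middle, suffix = split_for_fim(text, rng)
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}{middle}"


def build_infill_prompt(prefix, suffix):
    """Build an inference-time prompt; the model's output fills the gap."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"
```

At inference time, an editor plugin would pass the code before the cursor as the prefix and the code after it as the suffix, for example `build_infill_prompt("def add(a, b):\n    return ", "\n")`, and insert whatever the model generates at the cursor position.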
Companies
DeepSeek