GPT-J 6B: the open source model that matches GPT-3 Curie on many benchmarks
In one sentence: EleutherAI releases GPT-J, a 6B-parameter model trained in JAX on TPUs, with performance comparable to GPT-3 Curie, shipped under Apache 2.0.
EleutherAI takes another step: it releases GPT-J, a 6-billion-parameter model that is free and permissively licensed. It is roughly 2.2x the size of GPT-Neo 2.7B, and on many tests it approaches or beats GPT-3 Curie (OpenAI's second-largest GPT-3 model at the time).
The remarkable part is that it was trained largely by a single researcher (Ben Wang) on TPUs donated by Google via the TRC (TPU Research Cloud) program. No private data center, no millions spent on GPUs.
For the open-source community it becomes a reference model for fine-tuning, chatbots, and experiments. It stays one of the go-to open models until LLaMA lands in 2023.
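For anyone planning to run or fine-tune a model of this size, a back-of-the-envelope estimate of weight memory is a useful first check. The sketch below assumes the nominal 6 billion parameters from the text (the exact count differs slightly) and standard bytes-per-parameter for each precision. It counts weights only, not activations, optimizer state, or framework overhead, and the helper name `weight_memory_gb` is hypothetical.

```python
# Rough weight-memory estimate for a model with a given parameter count.
# Assumption: weights only; activations and optimizer state are NOT included.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gb(n_params: float, dtype: str) -> float:
    """Approximate weight storage in GB (1 GB = 10^9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


N_PARAMS = 6e9  # nominal "6B" from the text, not the exact parameter count

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_gb(N_PARAMS, dtype):.0f} GB")
```

Under these assumptions the weights alone come to about 24 GB in float32 and about 12 GB in float16, so half precision or quantization matters a lot for whether the model fits on a single accelerator.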
Companies
EleutherAI
Tools
GPT-J