OpenAI Jukebox: generating whole songs with vocals
In one sentence OpenAI releases Jukebox, a generative model that produces raw songs (audio + vocals + lyrics) conditioned on artist and genre, built on a stack of VQ-VAE and autoregressive transformers.
Until then, music-generation models mostly produced MIDI sequences or short instrumental clips. OpenAI tries something more ambitious: generate raw audio, complete with sung vocals and words.
It's called Jukebox. You give it an artist name (Frank Sinatra, Elvis, Kanye West), a genre, and a few lines of lyrics, and it returns a two-minute song. It's not perfect — the voice is "wet", words sometimes wrong — but it's something that didn't exist before.
For musicians, it's the first taste of a future where generative tools enter the studio. For OpenAI, a demonstration that transformers work well outside text too.
Companies
OpenAI
Tools
Jukebox
Tags
Sources