Reading level
EnCodec is an audio codec that uses neural networks instead of classical algorithms: it takes a sound, compresses it into very small codes, and reconstructs it with high fidelity. It works at very low bitrates — as low as 1.5 kbps for mono audio, less than an SMS — while maintaining perceptual quality above traditional codecs like Opus or EVS. The most important part for AI is the RVQ structure (Residual Vector Quantization): audio is represented as sequences of discrete tokens, perfect for use by language models. That is why EnCodec became the de facto vocoder for systems like AudioLM, SoundStorm, VALL-E and Voicebox.
Companies
Meta AI
Tools
EnCodec, RVQ
Tags
EnCodecNeural CodecAudio CompressionMeta AIRVQVocoder
Sources