Imagine a professional musician exploring new compositions without playing a single note on an instrument, or a small business owner easily adding a soundtrack to their latest video ad on Instagram. That's the promise of AudioCraft, our latest AI tool that generates high-quality, realistic audio and music from text.
AudioCraft consists of three models: MusicGen, AudioGen, and EnCodec. MusicGen, which was trained on Meta-owned and specifically licensed music, generates music from text prompts, while AudioGen, which was trained on public sound effects, generates audio from text prompts. Today, we're excited to release an improved version of our EnCodec decoder, which allows higher-quality music generation with fewer artifacts. We're also releasing our pre-trained AudioGen models, which let you generate environmental sounds and sound effects like a dog barking, cars honking, or footsteps on a wooden floor. Finally, we're sharing all of the AudioCraft model weights and code.
We're open-sourcing these models, giving researchers and practitioners access to train their own models with their own datasets for the first time and helping to advance the field of AI-generated audio and music.