There is a new way to create AI sounds! Stability AI has introduced Stable Audio Open, an exciting new tool that uses text descriptions to generate audio clips. From drum beats to ambient sounds, you can now create a wide range of audio elements with just a few words.
Stable Audio Open and Stable Audio may sound similar, but they cater to different needs. Stable Audio is a paid service for professionals who want to create full songs and high-quality music for commercial use. Stable Audio Open, by contrast, is free and makes short audio clips from text descriptions, which suits simple projects. Want to try it? Here is all you need to know.
What is Stable Audio Open?
Stable Audio Open is a generative AI model developed by Stability AI that produces sounds and short audio clips from text descriptions. Users can create audio elements ranging from drum beats to ambient noises simply by typing a description.
“Stable Audio Open 1.0 generates variable-length (up to 47s) stereo audio at 44.1kHz from text prompts. Jupyter Notebook. Thanks to @StabilityAI Stable Audio Team. Page: https://t.co/wvXRhx0AkK Code: https://t.co/FqMAGtH3ad Jupyter: please try it… https://t.co/2MoK0Yd2MZ” (camenduru, @camenduru, June 5, 2024)
The core functionality of Stable Audio Open lies in its ability to transform textual descriptions into audio recordings. Here’s a step-by-step breakdown of how it operates:
- Text input: Users provide a text description of the desired sound. For example, “Rock beat played in a treated studio, session drumming on an acoustic kit.”
- AI processing: The model processes the text input using advanced natural language processing (NLP) techniques to understand the desired attributes of the sound, such as genre, instruments, and environment.
- Audio generation: Using its trained neural network, the model generates an audio clip up to 47 seconds in length that matches the input description.
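The three steps above can be sketched in Python. This is only a minimal sketch under stated assumptions, not Stability AI's own code: it assumes the `diffusers` library's `StableAudioPipeline`, the Hugging Face repository id `stabilityai/stable-audio-open-1.0`, and that you have access to the weights (the model page may ask you to accept the license first). `ClipRequest` and `generate` are hypothetical names introduced for illustration, and the step count is an arbitrary choice. The 47-second and 44.1kHz figures are the ones quoted in this article.

```python
from dataclasses import dataclass

MODEL_ID = "stabilityai/stable-audio-open-1.0"  # assumed Hugging Face repo id
MAX_SECONDS = 47.0        # maximum clip length quoted in the article
SAMPLE_RATE_HZ = 44_100   # output sample rate quoted in the article


@dataclass(frozen=True)
class ClipRequest:
    """Hypothetical container for step 1: the text input plus a duration."""
    prompt: str
    seconds: float

    def __post_init__(self):
        # Reject requests the model is not described as supporting.
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if not 0 < self.seconds <= MAX_SECONDS:
            raise ValueError(f"seconds must be in (0, {MAX_SECONDS}]")

    @property
    def frames(self) -> int:
        """Number of audio frames per channel for the requested duration."""
        return round(self.seconds * SAMPLE_RATE_HZ)


def generate(request: ClipRequest, model_id: str = MODEL_ID, steps: int = 100):
    """Steps 2 and 3: encode the prompt and generate audio.

    Heavy imports live inside the function so the rest of the sketch
    works without torch or diffusers installed.
    """
    import torch
    from diffusers import StableAudioPipeline

    use_gpu = torch.cuda.is_available()
    pipe = StableAudioPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.float16 if use_gpu else torch.float32,
    ).to("cuda" if use_gpu else "cpu")

    result = pipe(
        request.prompt,
        audio_end_in_s=request.seconds,
        num_inference_steps=steps,  # assumed value; tune for quality vs. speed
    )
    # First generated waveform; expected to be shaped (channels, samples).
    return result.audios[0]
```

A call might look like `generate(ClipRequest("Rock beat played in a treated studio, session drumming on an acoustic kit.", 10))`. Validating the duration up front avoids loading a large model only to fail on an out-of-range request.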
Stable Audio Open was trained on around 486,000 samples drawn from two libraries of freely licensed audio, Freesound and the Free Music Archive. This training set exposed the model to a diverse range of sounds and musical elements, which improves its ability to generate high-quality audio from text descriptions.
What can you do with Stable Audio Open?
Here is a quick look:
- Versatile audio creation: The model can create a variety of audio types, including drum beats, instrumental riffs, ambient sounds, and production elements suitable for multimedia projects like videos, films, and TV shows.
- Style transfer and editing: Users can “edit” existing songs or apply the style of one genre to another. For example, incorporating smooth jazz elements into a rock track.
- Custom fine-tuning: A unique aspect of Stable Audio Open is its open-source nature, which allows users to fine-tune the model with their own custom audio data. This enables personalization and the creation of sounds tailored to specific needs or styles. For instance, a musician can input their own recordings to generate new variations.
You can check the early Stable Audio Open examples here.
What can’t you do with Stable Audio Open?
While Stable Audio Open offers significant capabilities, it has some notable limitations:
- Incomplete songs and vocals: The model is not optimized for creating full songs, complex melodies, or high-quality vocal tracks. Users looking for these advanced features are directed to Stability AI’s premium Stable Audio service.
- Non-commercial use: The terms of service for Stable Audio Open prohibit commercial use. This means it is intended for personal, educational, or experimental purposes rather than commercial projects.
- Bias and representation: The training data, while extensive, may not equally represent all musical styles and cultures. This can result in biases in the generated audio, particularly for non-English descriptions or underrepresented musical genres.
How to use Stable Audio Open
Stable Audio Open is available on Hugging Face, a popular platform for artificial intelligence models. Once on the model’s page, you will find options to download the model weights. These weights are essential for running the model locally or integrating it into your own applications.
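As a rough illustration of the download step, the sketch below fetches the model files with the `huggingface_hub` library's `snapshot_download` function. The repository id `stabilityai/stable-audio-open-1.0`, the target folder name, and the helper names `model_page_url` and `download_weights` are assumptions made for this example. You may need to log in and accept the license on the model page before the download succeeds.

```python
from pathlib import Path

MODEL_ID = "stabilityai/stable-audio-open-1.0"  # assumed Hugging Face repo id


def model_page_url(model_id: str = MODEL_ID) -> str:
    """Return the model page address on Hugging Face for a given repo id."""
    return f"https://huggingface.co/{model_id}"


def download_weights(target_dir: str = "stable-audio-open", token=None) -> Path:
    """Download all files of the model repository into target_dir.

    The import is local so this module loads even when huggingface_hub
    is not installed. Pass a Hugging Face access token if the repository
    requires authentication.
    """
    from huggingface_hub import snapshot_download

    local_path = snapshot_download(
        repo_id=MODEL_ID,
        local_dir=target_dir,
        token=token,
    )
    return Path(local_path)


if __name__ == "__main__":
    print("Model page:", model_page_url())
    print("Weights saved to:", download_weights())
```

Running the script prints the model page address and then the folder containing the downloaded files, which you can point your own application at.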
Recap
In summary, Stability AI’s launch of Stable Audio Open is a big step forward in making AI-generated sounds easy and accessible. With this tool, you can create all sorts of audio, like drum beats and ambient sounds, just by typing in words. While Stable Audio Open and Stable Audio seem alike, they’re actually different. Stable Audio Open is free and great for simple projects, while Stable Audio, a paid service, is for pros making full songs and top-notch music for businesses.
So whether you’re new to this or a pro, Stable Audio Open is here to help you make cool sounds!