Stability AI unveiled Stable Audio, its AI music generator, on September 13, 2023. The AI music generation field already includes tools such as OpenAI Jukebox, but Stability AI, having partnered with AudioSparx, aims to offer something fresh in this domain.
This is not Stability AI’s first rodeo in AI-generated audio. Last year, the company introduced Dance Diffusion, an AI tool designed to create songs and sound effects based on user-provided prompts. Dance Diffusion never left its prototype phase, however, as the R&D team pivoted to focus on the new music generator.
As for the technology behind it, Stable Audio uses audio diffusion models to generate music. The tool is aimed at both individual and commercial use.
What is Stable Audio: Stability AI unveils its music generator
Just days after its launch, Stable Audio, Stability AI’s music generator, is already attracting widespread attention. The launch is a notable step for the company, particularly given that its earlier venture, Dance Diffusion, never left the prototype stage.
Stable Audio is engineered to produce original, high-quality audio in 44.1 kHz stereo. As officially stated, Stable Audio employs “a latent diffusion for audio model, trained on data from AudioSparx, a leading music library.”
Ed Newton-Rex, the VP of audio for Stability AI, recently spoke with TechCrunch, elucidating the company’s broader goals.
“Stability AI is on a mission to unlock humanity’s potential by building foundational AI models across a number of content types or ‘modalities.’ We started with Stable Diffusion and have grown to include languages, code and now music. We believe the future of generative AI is multimodality.”
Stable Audio was trained on more than 800,000 audio files supplied by AudioSparx, including music, single-instrument stems, and sound effects, along with text metadata. In total, the dataset covers more than 19,500 hours of audio.
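As a rough sanity check on those figures, the short calculation below estimates the average length of a training file. Both numbers are reported as lower bounds ("over" and "more than"), so the result is only a ballpark, and the function and variable names are our own.

```python
# Figures quoted for the AudioSparx training set. Both are lower bounds,
# so the computed average is only an approximation.
TOTAL_HOURS = 19_500
TOTAL_FILES = 800_000


def average_clip_seconds(total_hours: float, total_files: int) -> float:
    """Return the mean clip length in seconds."""
    return total_hours * 3600 / total_files


avg_seconds = average_clip_seconds(TOTAL_HOURS, TOTAL_FILES)
print(f"Average clip length: about {avg_seconds:.0f} seconds")
```

That works out to roughly 88 seconds per file, which suggests the set mixes full-length tracks with much shorter stems and sound effects.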
Contrary to rumors that Stable Audio was a product of Harmonai, Harmonai is in fact the music research arm of Stability AI. Stability’s dedicated audio team was formed in April and drew inspiration from Dance Diffusion to build Stable Audio.
In his TechCrunch interview, Newton-Rex contrasted Stable Audio with its predecessor, Dance Diffusion:
“Dance Diffusion generated short, random audio clips from a limited sound palette, and the user had to fine-tune the model themselves if they wanted any control. Stable Audio can generate longer audio, and the user can guide generation using a text prompt and by setting the desired duration.” He added, “Some prompts work fantastically, like EDM and more beat-driven music, as well as ambient music, and some generate audio that’s a bit more ‘out there,’ like more melodic music, classical and jazz.”
How to use Stable Audio?
Follow these steps to get started in minutes:
- Navigate to Stable Audio’s official website to ensure you’re accessing the platform securely and getting the full range of features.
- Locate and click the “Try it out for free” button, which is typically found in the upper-right corner of the homepage, to start your AI music journey.
- Sign in with a valid email address or conveniently use your Google account. Make sure the details are accurate to avoid future login issues.
- Review and accept the Terms of Service to proceed. Optionally, you can choose to subscribe to Stable Audio’s newsletter to keep abreast of updates, tips, and promotions.
- Once you’re logged in, you’ll find yourself on the main dashboard. Stable Audio offers helpful hints on the left-hand pane to guide you on the types of prompts you can use for generating music.
- To generate music, enter your chosen prompt and sound characteristics into the designated field. For example, you could input something like “Heavy metal, thrash, headbanging, concert promotion, shredding guitar, aggressive, 180 bpm”.
- After inputting your prompt, click on the arrow button to initiate the music generation process. A short wait later, you can listen to the audio outcome and assess if it meets your creative vision.
- Below is our own creation. Click the play button to get a sense of what Stable Audio can achieve:
How to enter Stable Audio prompts like a pro?
Delve into Stable Audio’s prompt system with an expert mindset, tailoring each command to meet your creative needs. To maximize your output, consider these tips:
Specify the details
Whether you’re envisioning a specific genre or a nuanced mood, make it clear. The more detailed your prompt, the closer the output will align with your artistic vision.
Dictate the atmosphere
Stable Audio—Stability AI’s Music Generator—lets you articulate mood preferences directly in your prompt. Want something upbeat, soulful, or perhaps meditative? Just say so, and the AI will oblige.
Handpick your instruments
Fancy the richness of swelling strings or the timbre of reverberated guitars? Stable Audio suggests “adjectives can be a big plus when naming instruments.” Be as specific as you can to guide the AI toward your musical preferences.
Calibrate the tempo
Specify the beats per minute (BPM) when you want to control the pace of your music and keep it consistent with the genre. This helps ensure the generated piece matches both your taste and your tempo requirements.
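The tips above amount to assembling a comma-separated prompt from a genre, moods, instruments, and a tempo. Here is a minimal sketch of that idea. Note that `build_prompt` is a hypothetical helper of our own for composing the text. It does not call Stable Audio or any Stability AI API, and the resulting string would still be pasted into the prompt field by hand.

```python
def build_prompt(genre, moods=(), instruments=(), bpm=None):
    """Join prompt parts into one comma-separated text prompt.

    Hypothetical helper for illustration only: it just builds a string
    and does not talk to Stable Audio.
    """
    parts = [genre, *moods, *instruments]
    if bpm is not None:
        parts.append(f"{bpm} bpm")
    # Drop empty entries and stray whitespace before joining.
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    genre="Ambient",
    moods=["meditative", "soulful"],
    instruments=["swelling strings", "reverberated guitars"],
    bpm=90,
)
print(prompt)
```

Keeping each ingredient in its own argument makes it easy to vary one element at a time, such as the tempo, and compare the generated results.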
Pricing of Stable Audio
Stable Audio offers a free version for aspiring creators. While the free version limits the scope of your music production, it is a good way to explore what the music generator can do.
Pricing Tier | Cost | Monthly Track Generations | Track Duration | License |
Free | Free | 20 | Up to 45 seconds | Non-commercial use |
Professional | $11.99 a month | 500 | Up to 90 seconds | Commercial use |
Enterprise | Custom amount | Custom | Custom | Commercial use |
If the free tier leaves you wanting more, consider upgrading to the ‘Professional’ tier. At $11.99 per month, it allows 500 track generations per month, each up to 90 seconds long. It also includes a commercial use license, making it suitable for small to medium-sized businesses.
For larger organizations, Stable Audio offers an ‘Enterprise’ package with custom pricing, generation limits, and track durations, so businesses can use the technology at scale.
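To put the tier limits in perspective, the sketch below works out the maximum amount of audio each fixed-price tier allows per month and the effective cost per generation. It uses only the figures from the pricing table. The `Tier` class and its method names are our own, and the Enterprise tier is left out because its limits are custom.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    monthly_cost_usd: float
    monthly_generations: int
    max_seconds: int  # maximum duration of a single track

    def max_minutes_per_month(self) -> float:
        """Upper bound on generated audio per month, in minutes."""
        return self.monthly_generations * self.max_seconds / 60

    def cost_per_generation_usd(self) -> float:
        """Monthly price divided by the monthly generation allowance."""
        return self.monthly_cost_usd / self.monthly_generations


# Values taken from the pricing table.
TIERS = [
    Tier("Free", 0.0, 20, 45),
    Tier("Professional", 11.99, 500, 90),
]

for tier in TIERS:
    print(
        f"{tier.name}: up to {tier.max_minutes_per_month():.0f} min/month, "
        f"${tier.cost_per_generation_usd():.3f} per generation"
    )
```

On these numbers, the free tier tops out at 15 minutes of audio per month, while Professional allows up to 750 minutes (12.5 hours) at about 2.4 cents per generation, assuming every track uses its full duration.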
By offering these diverse pricing options, Stable Audio enables a wide audience—from novices to seasoned professionals—to engage with the platform. This flexible pricing strategy not only democratizes access to high-quality AI-generated music but also empowers users to select a package that best suits their creative needs and budget constraints.
Featured image credit: Kerem Gülen/Midjourney