Dataconomy

AudioLM indiscernibly mimics speech and music

by Önder Erdine
September 27, 2022
in News, Artificial Intelligence
  • Google’s research division has launched AudioLM, a framework for creating high-quality audio that retains consistency across time.
  • The most impressive part is that it does so without being trained on transcripts or annotations, yet the generated speech remains syntactically and semantically plausible.
  • Furthermore, it keeps the speaker’s identity and prosody to the point that the listener cannot discern which portion of the audio is genuine and which was generated by artificial intelligence.
  • The most crucial feature of AudioLM is that a single model handles several tasks, rather than merely repeating speech and melodies.
  • AudioLM is not yet publicly accessible; it is only a language model that may be applied in a variety of applications.

We showed them chess games, and they quickly became unbeatable opponents; we let them read our texts, and they soon began to write. They also learned to paint and edit photos. Did anyone doubt that artificial intelligence could do the same with speech and music?

Table of Contents

  • Google’s AudioLM performs miracles both with speech and music
  • AudioLM was trained in semantics and acoustics
  • Is AI becoming more dangerous by the day?

Google’s AudioLM performs miracles both with speech and music

Google’s research group has launched AudioLM, a framework for producing high-quality audio that maintains consistency across time. It starts from a recording just a few seconds long and extends it naturally and logically.

The most impressive aspect is that it does so without being trained on transcripts or annotations, yet the generated speech remains syntactically and semantically plausible. Furthermore, it preserves the speaker’s identity and prosody to the point that the listener is unable to determine which part of the audio is genuine and which was created by artificial intelligence.

Google’s new AI, AudioLM, can almost perfectly mimic speech and music
The new AI mimics not only the speech but also the background noise

The results are striking. AudioLM can not only mimic articulation, pitch, timbre, and intensity, but also reproduce the sound of the speaker’s breath and form understandable phrases. If the prompt is not a studio recording but one with background noise, AudioLM reproduces that noise to ensure continuity. More examples are available on the AudioLM website.


AudioLM was trained in semantics and acoustics

The creation of audio or music is not a new phenomenon. What is new is the approach taken by Google researchers to solve the problem. Two kinds of markers are extracted from each audio clip: semantic markers, which encode high-level structure (phonemes, lexicon, semantics…), and acoustic markers, which capture the details of the recording (speaker identity, recording quality, background noise…).

With this data already processed and intelligible for AI, AudioLM starts its job by constructing a hierarchy in which it predicts semantic markers first, which are subsequently used as constraints to predict acoustic markers. The acoustic markers are then used at the end to reconstruct a waveform that we can hear.
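
The hierarchy described above can be sketched in a few lines of Python. This is a toy illustration only, not Google's implementation: the vocabulary sizes, the function names (`next_semantic`, `next_acoustic`, `continue_audio`, `decode`), and the arithmetic rules standing in for trained models are all assumptions made for the example. What it shows is the order of operations: semantic markers are extended first, acoustic markers are then predicted conditioned on them, and a decoder turns the acoustic markers into samples.

```python
import random

# All constants below are hypothetical and chosen only to keep the toy small.
SEMANTIC_VOCAB = 8         # size of the semantic marker vocabulary
ACOUSTIC_VOCAB = 16        # size of the acoustic marker vocabulary
ACOUSTIC_PER_SEMANTIC = 2  # acoustic markers emitted per semantic marker


def next_semantic(history, rng):
    """Stand-in for a trained model: next semantic marker given the history."""
    return (history[-1] + rng.randint(0, 2)) % SEMANTIC_VOCAB


def next_acoustic(semantic_marker, acoustic_history, rng):
    """Stand-in for a trained model: next acoustic marker, conditioned on
    the current semantic marker and the previous acoustic marker."""
    prev = acoustic_history[-1] if acoustic_history else 0
    return (2 * semantic_marker + prev + rng.randint(0, 1)) % ACOUSTIC_VOCAB


def continue_audio(prompt_semantic, prompt_acoustic, steps, seed=0):
    """Extend a (non-empty) prompt by `steps` semantic markers."""
    rng = random.Random(seed)

    # Stage 1: continue the semantic markers (high-level structure).
    semantic = list(prompt_semantic)
    for _ in range(steps):
        semantic.append(next_semantic(semantic, rng))

    # Stage 2: predict acoustic markers for the new semantic markers only,
    # using each semantic marker as a constraint.
    acoustic = list(prompt_acoustic)
    for marker in semantic[len(prompt_semantic):]:
        for _ in range(ACOUSTIC_PER_SEMANTIC):
            acoustic.append(next_acoustic(marker, acoustic, rng))

    return semantic, acoustic


def decode(acoustic_markers):
    """Stand-in for an audio decoder: map each marker to a sample in [-1, 1)."""
    return [m / (ACOUSTIC_VOCAB / 2) - 1.0 for m in acoustic_markers]


if __name__ == "__main__":
    sem, ac = continue_audio([1, 2, 3], [4, 5, 6, 7, 8, 9], steps=4)
    print("semantic markers:", sem)
    print("acoustic markers:", ac)
    print("decoded samples: ", decode(ac))
```

In a real system each stand-in function would be a trained neural network; the point of the sketch is only that the prompt is preserved and the continuation is built top-down, from meaning to sound.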

AudioLM is considerably better at continuing piano compositions than models trained only on acoustic markers

This separation of semantics from acoustics, and the hierarchy between them, is not just useful for training language models that generate speech. According to the researchers, it also works better for continuing piano compositions, as demonstrated on their website, outperforming models trained exclusively on acoustic markers.



The most important aspect of AudioLM is that a single model can do all of this, rather than merely repeating speech and melodies. It is, therefore, a single language model that could be used for text-to-speech (a robot might read entire novels and replace professional voice actors) or to enable any gadget to speak with humans in a familiar voice. Amazon has already investigated the possibility of using the voices of loved ones in its Alexa devices.

Is AI becoming more dangerous by the day?

Programs like DALL-E 2 and Stable Diffusion are excellent tools for quickly sketching ideas or generating creative materials. Audio generation may prove even more significant: one can imagine firms using an announcer’s voice on demand. The voices of departed actors might even be used in dubbing films.

While this AI can produce incredibly convincing speech, it also raises the risk of fabricated recordings

You may be thinking that this idea, while thrilling, is also risky. Any audio recording could be tampered with for political or legal ends. According to Google, although people have difficulty distinguishing between what comes from a human and what comes from artificial intelligence, a computer can tell whether the audio is organic or not. Not only might machines replace us, but another machine will be required to check their work.



AudioLM is not yet available to the public; it is only a language model that may be implemented into various applications. However, this example, along with OpenAI’s Jukebox music software, highlights how swiftly we’re entering a new world where no one will ever know, or care, if that photo was shot by a person or if there’s someone on the other end of the phone.

Tags: artificial intelligence, AudioLM, Google, Machine Learning

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
