Dataconomy

Sesame’s AI voice is so real, it’s unsettling


by Kerem Gülen
March 5, 2025
in Artificial Intelligence, News

A new AI voice model has set the internet abuzz, with reactions oscillating between awe and unease. Sesame AI’s Conversational Speech Model (CSM) doesn’t just sound human—it feels human. Users describe extended, almost emotional interactions with the AI-generated voices, which exhibit breath sounds, hesitations, corrections, and even chuckles. For some, it’s a technological marvel. For others, it’s a glimpse into a future that feels uncomfortably close.

Sesame AI: A voice that feels alive

The core innovation behind Sesame’s CSM lies in its ability to simulate natural, dynamic conversation. Unlike traditional text-to-speech systems that simply read aloud, CSM actively engages. It stumbles over words, corrects itself, and modulates tone in a way that mimics real human unpredictability.


When one tester spoke to the model for 28 minutes, they noted its ability to debate moral topics, reacting naturally to prompts like, “How do you decide what’s right or wrong?” Others found themselves unintentionally forming attachments, with one Reddit user admitting, “I’m almost a bit worried I will start feeling emotionally attached to a voice assistant with this level of human-like sound.”

Sesame’s AI assistants, dubbed “Miles” and “Maya,” are designed not just for information retrieval but for deep, engaging conversations. The company describes its goal as achieving “voice presence”—the magical quality that makes spoken interactions feel real, understood, and valued.

That realism sometimes leads to oddly human quirks. In one viral demo, the AI casually mentioned craving a peanut butter and pickle sandwich—a bizarrely specific comment that only added to the illusion of personality.

The tech behind the voice

So how does Sesame’s CSM achieve such eerily lifelike conversations?

  • A multimodal approach: Unlike conventional AI speech models that process text and audio separately, Sesame’s system interleaves them. This single-stage processing allows for more fluid, context-aware speech.
  • High-parameter training: The largest version of the model has 8.3 billion parameters and was trained on over one million hours of spoken dialogue.
  • Meta’s influence: The model’s architecture builds on Meta’s Llama architecture, pairing a backbone model with a decoder for nuanced speech generation.
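
To make the idea of interleaving concrete, here is a minimal sketch of how text and audio tokens from a conversation might be flattened into a single sequence for one model to read. It is an illustration only, not Sesame's code: the `Turn` type, the `interleave` function, and every token ID below are hypothetical, and a real system would get its tokens from a text tokenizer and an audio codec.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Turn:
    """One conversational turn (hypothetical structure, for illustration)."""
    speaker: int
    text_tokens: List[int]   # made-up IDs standing in for text tokenizer output
    audio_tokens: List[int]  # made-up IDs standing in for audio codec output


def interleave(turns: List[Turn]) -> List[Tuple[str, int, int]]:
    """Flatten turns into one sequence of (modality, speaker, token) entries.

    Each turn contributes its text tokens followed by its audio tokens, so
    a single model sees both modalities in conversational order.
    """
    sequence: List[Tuple[str, int, int]] = []
    for turn in turns:
        sequence.extend(("text", turn.speaker, t) for t in turn.text_tokens)
        sequence.extend(("audio", turn.speaker, a) for a in turn.audio_tokens)
    return sequence


# Two short turns with made-up token IDs.
turns = [
    Turn(speaker=0, text_tokens=[11, 12], audio_tokens=[501, 502, 503]),
    Turn(speaker=1, text_tokens=[21], audio_tokens=[601, 602]),
]
seq = interleave(turns)
modalities = [modality for modality, _, _ in seq]
```

In this toy layout, the earlier turns of a conversation, both what was said and how it sounded, sit in the same sequence as the turn being generated, which is one way a model could condition its tone on context.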

Blind tests have revealed that, in isolated speech samples, human evaluators couldn’t reliably distinguish Sesame’s AI voices from real ones. However, when placed in full conversational context, human speech still won out—suggesting AI has not yet mastered the full complexity of interactive dialogue.

A mixed reception

Not everyone is thrilled by how human this AI sounds.

Technology journalist Mark Hachman described his experience with the voice model as “deeply unsettling.” He compared it to talking with an old friend he hadn’t seen in years, noting that the AI’s voice bore an eerie resemblance to someone he had once dated.

Others have likened Sesame’s model to OpenAI’s Advanced Voice Mode for ChatGPT, with some preferring Sesame’s realism and willingness to roleplay in more dramatic or even angry scenarios—something OpenAI’s models tend to avoid.

One particularly striking demo showcased the AI arguing with a “boss” over an embezzlement scandal. The conversation was so dynamic that listeners struggled to determine which speaker was the human and which was the AI.

The risks of a perfect voice

As with all AI breakthroughs, hyper-realistic voice synthesis brings both promise and peril.

  • Fraud & scams: With AI voices that are hard to tell apart from human speech in short samples, voice phishing scams could become far more convincing. Criminals could impersonate family members, corporate executives, or government officials with near-perfect accuracy.
  • Social engineering: Unlike basic robocalls, AI-powered deception could adapt in real time, responding naturally to questions and suspicion.
  • Unintended emotional impact: Some users have reported their children forming attachments to the AI voices. One parent noted that their 4-year-old cried after being denied further conversation with the model.

While Sesame’s CSM does not clone real voices, the possibility of similar open-source projects emerging remains a concern. OpenAI has already delayed the wider release of its voice technology over fears of misuse.

What’s next?

Sesame AI plans to open-source key components of its research under the Apache 2.0 license, allowing developers to build upon its work. The company’s roadmap includes:

  • Scaling up model size to increase realism further.
  • Expanding to 20+ languages, broadening its conversational reach.
  • Developing “fully duplex” models, enabling true back-and-forth, interruption-capable conversations.

For now, the demo remains available on Sesame’s website, though demand has at times overwhelmed its servers. Whether you find it astonishing or unsettling, one thing is clear: the days of robotic, monotone AI voices are over.

From here on, you may never be quite sure who—or what—you’re talking to.


Featured image credit: Kerem Gülen/Imagen 3

Tags: AI, Featured, sesame


COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
