Dataconomy

Google Gemini now transcribes audio files

The new feature processes up to 10 minutes of recordings, turning voice memos, meetings, and lectures into searchable text.

by Aytun Çelebi
September 11, 2025
in Artificial Intelligence

Google’s Gemini AI assistant now accepts audio file uploads, enabling users to transcribe, summarize, and extract key information from recordings. The feature converts recordings of up to 10 minutes, such as voice memos, meetings, lectures, and interviews, into searchable text directly within Gemini.

Audio file uploads are supported on both the web and mobile applications. Users can access the feature through the standard file-upload interface. This differs from Gemini Live’s real-time voice command processing, as the new function processes pre-recorded audio for data extraction and analysis.

Josh Woodward, Google’s VP of Gemini, stated that audio file upload was the most requested feature from Gemini users. This demand highlights a need for streamlined audio processing within the AI assistant.

Transcription accuracy and feature integration

During testing, Gemini accurately transcribed various audio types, including comedy album sketches and phone conversations, with only minor errors in name recognition. The system also effectively identified key elements and generated to-do lists from the audio content.

The addition of audio processing follows other recent Gemini updates, including integrations into various apps, testing of a card-based visual interface, and expanded personalization options. Together, these updates expand Gemini’s functionality and improve the user experience.

Comparison with other AI assistants

Gemini’s audio capabilities are not unique; competitors offer comparable features. ChatGPT uses OpenAI’s Whisper transcription model, Anthropic’s Claude supports audio processing in certain developer tools, and Perplexity can extract data from YouTube videos. Gemini’s version focuses on everyday use cases for a broad user base.

Advanced audio data processing

Beyond simple transcription, Gemini allows users to request language simplification, extract speaker-specific comments, generate questions from audio content, or create study guides from recorded discussions. These options provide tools to efficiently manipulate and repurpose audio information.

Limitations of the audio feature

The current 10-minute limit on audio file uploads makes the feature less useful for longer recordings. Free-tier users also face daily usage limits on audio processing. Both restrictions may affect users with extensive audio processing needs.

Google has not released specific pricing for high-volume audio processing. Instead, audio processing counts toward the regular Gemini quota, so users should track their usage to avoid exceeding their allocation.


Tags: google gemini

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
