
Implicit caching aims to slash Gemini API costs by 75%

This new system automatically enables cost savings when a Gemini API request to the 2.5 Pro or 2.5 Flash models shares a common prefix with a prior request.

By Kerem Gülen
May 9, 2025
in Artificial Intelligence, News

Google has launched a new feature in its Gemini API called “implicit caching,” which the company claims can reduce costs by 75% for third-party developers using its latest AI models, Gemini 2.5 Pro and 2.5 Flash.

The feature applies cost savings automatically whenever a Gemini API request hits a cache, removing the manual configuration that the previous explicit caching method required. According to Google, implicit caching is triggered when a request shares a common prefix with an earlier request, with a minimum prompt size of 1,024 tokens for 2.5 Flash and 2,048 tokens for 2.5 Pro.

Logan Kilpatrick, a member of the Gemini team, announced the launch on May 8, 2025, stating that the feature can deliver significant cost savings for developers. Google recommends that developers place repetitive context at the beginning of requests and append changing context at the end to increase the chances of implicit cache hits.
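For readers who want to see what that request shape looks like in practice, here is a minimal sketch using the google-genai Python SDK; the model name, the placeholder manual.txt document, and the printed usage metadata are illustrative assumptions rather than details from Google's announcement.

```python
# Sketch: structuring Gemini API requests so repeated calls share a common
# prefix, which is what implicit caching keys on (assumed SDK usage).
from google import genai

client = genai.Client()  # assumes the API key is set in the environment

# Stable, repetitive context goes first so consecutive requests share a prefix.
SHARED_PREFIX = (
    "You are a support assistant for ExampleCo.\n"
    "Product manual (unchanged between requests):\n"
    + open("manual.txt").read()  # placeholder document; the combined prompt
)                                # should exceed the 1,024 / 2,048 token minimum

def ask(question: str) -> str:
    # Content that changes per call is appended after the shared prefix,
    # in line with Google's recommendation.
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=SHARED_PREFIX + "\n\nUser question: " + question,
    )
    # Usage metadata is assumed to indicate when tokens were served from cache.
    print(response.usage_metadata)
    return response.text

ask("How do I reset the device?")
ask("What is the warranty period?")  # shares the prefix, so it may hit the cache
```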

Caching is a widely adopted practice in the AI industry that reuses frequently accessed or pre-computed data to cut down on computing requirements and costs. Google’s previous explicit caching method required developers to define high-frequency prompts manually, which often resulted in extra work and sometimes surprisingly large API bills for some users.
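For contrast, below is a rough sketch of the manual step that explicit caching involves, again using the google-genai Python SDK; the specific config fields, TTL value, and placeholder content are assumptions about typical usage, not details confirmed in the announcement.

```python
# Sketch: the older explicit-caching flow, where the developer must create
# and reference a cache by hand (field names assumed from typical SDK usage).
from google import genai
from google.genai import types

client = genai.Client()

# Manually register the high-frequency prompt content as a cache.
cache = client.caches.create(
    model="gemini-2.5-flash",
    config=types.CreateCachedContentConfig(
        system_instruction="You are a support assistant for ExampleCo.",
        contents=[open("manual.txt").read()],  # placeholder document
        ttl="3600s",  # the cache is billed for as long as it is kept alive
    ),
)

# Every subsequent request must reference the cache explicitly.
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="How do I reset the device?",
    config=types.GenerateContentConfig(cached_content=cache.name),
)
print(response.text)
```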

Some developers had expressed dissatisfaction with the explicit caching implementation for Gemini 2.5 Pro, prompting the Gemini team to apologize and pledge to make changes. The new implicit caching feature addresses these concerns by automating the caching process and passing on cost savings to developers when a cache hit occurs.

While Google claims that implicit caching can deliver 75% cost savings, the company did not provide third-party verification of the feature’s effectiveness. As such, the actual cost savings may vary depending on how developers use the feature.


Tags: API, Gemini, Google
