Dataconomy
  • News
    • Artificial Intelligence
    • Cybersecurity
    • DeFi & Blockchain
    • Finance
    • Gaming
    • Startups
    • Tech
  • Industry
  • Research
  • Resources
    • Articles
    • Guides
    • Case Studies
    • Whitepapers
  • AI toolsNEW
  • Newsletter
  • + More
    • Glossary
    • Conversations
    • Events
    • About
      • About
      • Contact
      • Imprint
      • Legal & Privacy
      • Partner With Us
Subscribe
No Result
View All Result
  • AI
  • Tech
  • Cybersecurity
  • Finance
  • DeFi & Blockchain
  • Startups
  • Gaming
Dataconomy
  • News
    • Artificial Intelligence
    • Cybersecurity
    • DeFi & Blockchain
    • Finance
    • Gaming
    • Startups
    • Tech
  • Industry
  • Research
  • Resources
    • Articles
    • Guides
    • Case Studies
    • Whitepapers
  • AI toolsNEW
  • Newsletter
  • + More
    • Glossary
    • Conversations
    • Events
    • About
      • About
      • Contact
      • Imprint
      • Legal & Privacy
      • Partner With Us
Subscribe
No Result
View All Result
Dataconomy
No Result
View All Result

Implicit caching aims to slash Gemini API costs by 75%

This new system automatically enables cost savings when a Gemini API request to the 2.5 Pro or 2.5 Flash models shares a common prefix with a prior request.

byKerem Gülen
May 9, 2025
in Artificial Intelligence, News
Home News Artificial Intelligence
Share on FacebookShare on TwitterShare on LinkedInShare on WhatsAppShare on e-mail

Google has launched a new feature in its Gemini API called “implicit caching,” which the company claims can reduce costs by 75% for third-party developers using its latest AI models, Gemini 2.5 Pro and 2.5 Flash.

The feature automatically enables cost savings when a Gemini API request to a model hits a cache, eliminating the need for manual configuration required by the previous explicit caching method. According to Google, implicit caching is triggered when a request shares a common prefix with a previous request, and the minimum prompt token count required is 1,024 for 2.5 Flash and 2,048 for 2.5 Pro.

Logan Kilpatrick, a member of the Gemini team, announced the launch on May 8, 2025, stating that the feature can deliver significant cost savings for developers. Google recommends that developers place repetitive context at the beginning of requests and append changing context at the end to increase the chances of implicit cache hits.

Stay Ahead of the Curve!

Don't miss out on the latest insights, trends, and analysis in the world of data, technology, and startups. Subscribe to our newsletter and get exclusive content delivered straight to your inbox.

Caching is a widely adopted practice in the AI industry that reuses frequently accessed or pre-computed data to cut down on computing requirements and costs. Google’s previous explicit caching method required developers to define high-frequency prompts manually, which often resulted in extra work and sometimes surprisingly large API bills for some users.

Some developers had expressed dissatisfaction with the explicit caching implementation for Gemini 2.5 Pro, prompting the Gemini team to apologize and pledge to make changes. The new implicit caching feature addresses these concerns by automating the caching process and passing on cost savings to developers when a cache hit occurs.

While Google claims that implicit caching can deliver 75% cost savings, the company did not provide third-party verification of the feature’s effectiveness. As such, the actual cost savings may vary depending on how developers use the feature.


Featured image credit

Tags: APIgeminiGoogle

Related Posts

Xbox Developer Direct returns January 22 with Fable and Forza Horizon 6

Xbox Developer Direct returns January 22 with Fable and Forza Horizon 6

January 9, 2026
Dell debuts disaggregated infrastructure for modern data centers

Dell debuts disaggregated infrastructure for modern data centers

January 9, 2026
TikTok scores partnership with FIFA for World Cup highlights

TikTok scores partnership with FIFA for World Cup highlights

January 9, 2026
YouTube now lets you hide Shorts in search results

YouTube now lets you hide Shorts in search results

January 9, 2026
Google transforms Gmail with AI Inbox and natural language search

Google transforms Gmail with AI Inbox and natural language search

January 9, 2026
Disney+ to launch TikTok-style short-form video feed in the US

Disney+ to launch TikTok-style short-form video feed in the US

January 9, 2026

LATEST NEWS

Xbox Developer Direct returns January 22 with Fable and Forza Horizon 6

Dell debuts disaggregated infrastructure for modern data centers

TikTok scores partnership with FIFA for World Cup highlights

YouTube now lets you hide Shorts in search results

Google transforms Gmail with AI Inbox and natural language search

Disney+ to launch TikTok-style short-form video feed in the US

Dataconomy

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.

  • About
  • Imprint
  • Contact
  • Legal & Privacy

Follow Us

  • News
    • Artificial Intelligence
    • Cybersecurity
    • DeFi & Blockchain
    • Finance
    • Gaming
    • Startups
    • Tech
  • Industry
  • Research
  • Resources
    • Articles
    • Guides
    • Case Studies
    • Whitepapers
  • AI tools
  • Newsletter
  • + More
    • Glossary
    • Conversations
    • Events
    • About
      • About
      • Contact
      • Imprint
      • Legal & Privacy
      • Partner With Us
No Result
View All Result
Subscribe

This website uses cookies. By continuing to use this website you are giving consent to cookies being used. Visit our Privacy Policy.