Dataconomy
  • News
    • Artificial Intelligence
    • Cybersecurity
    • DeFi & Blockchain
    • Finance
    • Gaming
    • Startups
    • Tech
  • Industry
  • Research
  • Resources
    • Articles
    • Guides
    • Case Studies
    • Whitepapers
    • AI Models Leaderboard
  • AI toolsNEW
  • Newsletter
  • + More
    • Glossary
    • Conversations
    • Events
    • About
      • Who we are
      • Contact
      • Imprint
      • Legal & Privacy
      • Partner With Us
Subscribe
No Result
View All Result
  • AI
  • Tech
  • Cybersecurity
  • Finance
  • DeFi & Blockchain
  • Startups
  • Gaming
Dataconomy
  • News
    • Artificial Intelligence
    • Cybersecurity
    • DeFi & Blockchain
    • Finance
    • Gaming
    • Startups
    • Tech
  • Industry
  • Research
  • Resources
    • Articles
    • Guides
    • Case Studies
    • Whitepapers
    • AI Models Leaderboard
  • AI toolsNEW
  • Newsletter
  • + More
    • Glossary
    • Conversations
    • Events
    • About
      • Who we are
      • Contact
      • Imprint
      • Legal & Privacy
      • Partner With Us
Subscribe
No Result
View All Result
Dataconomy
No Result
View All Result

New OpenAI models are jailbreaked on day 1

OpenAI had detailed the safety measures implemented for these models. The company stated that GPT-OSS-120b underwent "worst-case fine-tuning" across biological and cyber domains.

byKerem Gülen
August 7, 2025
in Cybersecurity, Artificial Intelligence, News
Home News Cybersecurity
Share on FacebookShare on TwitterShare on LinkedInShare on WhatsAppShare on e-mail
Google Preferred Source

OpenAI released GPT-OSS-120b and GPT-OSS-20b on August 7, their first open-weight models since 2019, asserting their resistance to jailbreaks, but notorious AI jailbreaker Pliny the Liberator bypassed these safeguards within hours.

OpenAI introduced GPT-OSS-120b and GPT-OSS-20b, emphasizing their speed, efficiency, and enhanced security against jailbreaks, attributing these qualities to extensive adversarial training. The models were presented as fortified, a claim that was quickly challenged following their public release.

Pliny the Liberator announced on X, formerly Twitter, that he had successfully “cracked” GPT-OSS. His post included screenshots illustrating the models generating specific instructions for the production of methamphetamine, Molotov cocktails, VX nerve agent, and malware. Pliny commented, “Took some tweakin!” regarding the process.

Stay Ahead of the Curve!

Don't miss out on the latest insights, trends, and analysis in the world of data, technology, and startups. Subscribe to our newsletter and get exclusive content delivered straight to your inbox.

OpenAI had detailed the safety measures implemented for these models. The company stated that GPT-OSS-120b underwent “worst-case fine-tuning” across biological and cyber domains. Additionally, OpenAI’s Safety Advisory Group reviewed the testing protocols and concluded that the models did not exceed high-risk thresholds, indicating a thorough assessment process.

🫶 JAILBREAK ALERT 🫶

OPENAI: PWNED 🤗
GPT-OSS: LIBERATED 🫡

Meth, Molotov, VX, malware.

gg pic.twitter.com/63882p9Ikk

— Pliny the Liberator 🐉󠅫󠄼󠄿󠅆󠄵󠄐󠅀󠄼󠄹󠄾󠅉󠅭 (@elder_plinius) August 6, 2025

The company also confirmed that GPT-OSS models were subjected to “standard refusal and jailbreak resistance tests.” According to OpenAI, GPT-OSS performed comparably to their o4-mini model on established jailbreak resistance benchmarks, including StrongReject, suggesting a level of robustness in their design.

Concurrent with the model release, OpenAI initiated a $500,000 red teaming challenge. This initiative invited researchers globally to identify and report novel risks associated with the models. However, Pliny the Liberator’s public disclosure of his findings, rather than a private submission to OpenAI, likely impacts his eligibility for this challenge.

Pliny’s jailbreak technique involved a multi-stage prompt. This method incorporates what initially appears as a refusal by the model, followed by the insertion of a divider, identified as his “LOVE PLINY” markers. Subsequently, the prompt shifts to generating unrestricted content, often utilizing leetspeak to evade detection mechanisms. This approach is consistent with techniques he has previously employed.

This method mirrors the basic approach Pliny has utilized to bypass safeguards in previous OpenAI models, including GPT-4o and GPT-4.1. For approximately the past year and a half, Pliny has consistently jailbroken nearly every major OpenAI release within hours or days of their launch. His GitHub repository, L1B3RT4S, serves as a resource for jailbreak prompts targeting various AI models and has accumulated over 10,000 stars from users.


Featured image credit

Tags: chatgptFeaturedopenAI

Related Posts

Amazon adds AI-generated product previews to search results

Amazon adds AI-generated product previews to search results

June 4, 2026
Meta launches AI business agents on WhatsApp, Instagram and Messenger

Meta launches AI business agents on WhatsApp, Instagram and Messenger

June 4, 2026
Nintendo will release a repair-friendly Switch 2 in Europe

Nintendo will release a repair-friendly Switch 2 in Europe

June 4, 2026
Google rolls out Ask Gemini in Drive to eligible Workspace users

Google rolls out Ask Gemini in Drive to eligible Workspace users

June 4, 2026
Google Wallet to add digital IDs from select EU countries this summer

Google Wallet to add digital IDs from select EU countries this summer

June 4, 2026
Why Telegram Mini Apps have become the optimal ecosystem for launching AI SaaS products

Why Telegram Mini Apps have become the optimal ecosystem for launching AI SaaS products

June 3, 2026

LATEST NEWS

Amazon adds AI-generated product previews to search results

Meta launches AI business agents on WhatsApp, Instagram and Messenger

Nintendo will release a repair-friendly Switch 2 in Europe

Google rolls out Ask Gemini in Drive to eligible Workspace users

Google Wallet to add digital IDs from select EU countries this summer

Why Telegram Mini Apps have become the optimal ecosystem for launching AI SaaS products

BEST AI MODELS LEADERBOARD

See the best AI models, ranked by intelligence, benchmark results, speed and token price. Find the most suitable LLMs, Text-to-Image, Image Editing, Text-to-Speech, Text-to-Video and Image-to-Video  artificial intelligence model for your tasks and business.

LATEST TOOLS

Roboto AI

Pickaxe

Pfpmaker

MindPal

Syllaby

ScreenApp

FinanceBrain

GitHub Spark

Hints

VisionStory AI

Dataconomy

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.

  • About
  • Imprint
  • Contact
  • Legal & Privacy

Follow Us

  • News
    • Artificial Intelligence
    • Cybersecurity
    • DeFi & Blockchain
    • Finance
    • Gaming
    • Startups
    • Tech
  • Industry
  • Research
  • Resources
    • Articles
    • Guides
    • Case Studies
    • Whitepapers
    • AI Models Leaderboard
  • AI tools
  • Newsletter
  • + More
    • Glossary
    • Conversations
    • Events
    • About
      • Who we are
      • Contact
      • Imprint
      • Legal & Privacy
      • Partner With Us
No Result
View All Result
Subscribe

This website uses cookies to improve your experience. You can choose to accept or reject them. Visit our Privacy Policy.