New OpenAI models are jailbroken on day 1

OpenAI had detailed the safety measures implemented for these models. The company stated that GPT-OSS-120b underwent “worst-case fine-tuning” across biological and cyber domains.

By Kerem Gülen
August 7, 2025
in Cybersecurity, Artificial Intelligence, News

OpenAI released GPT-OSS-120b and GPT-OSS-20b, its first open-weight models since 2019, on August 5, asserting their resistance to jailbreaks. Within hours, notorious AI jailbreaker Pliny the Liberator had bypassed those safeguards.

OpenAI introduced GPT-OSS-120b and GPT-OSS-20b with an emphasis on speed, efficiency, and hardened resistance to jailbreaks, crediting extensive adversarial training for these qualities. The models were presented as fortified, a claim that was challenged almost immediately after their public release.
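Because the weights are open, anyone can run the models locally. Here is a minimal sketch (not from the article) of loading the smaller model with Hugging Face Transformers; the "openai/gpt-oss-20b" repo id and the need for a recent transformers release are assumptions based on the launch, not details confirmed in this piece.

```python
# Minimal sketch: running the open-weight GPT-OSS-20b locally via
# Hugging Face Transformers. Repo id "openai/gpt-oss-20b" is assumed
# from the release; a recent transformers version is likely required.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openai/gpt-oss-20b"  # GPT-OSS-120b follows the same pattern

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # keep the checkpoint's native precision
    device_map="auto",   # spread layers across available GPUs/CPU
)

messages = [{"role": "user", "content": "In one sentence, what is an open-weight model?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```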

Pliny the Liberator announced on X, formerly Twitter, that he had successfully “cracked” GPT-OSS. His post included screenshots of the models generating specific instructions for the production of methamphetamine, Molotov cocktails, VX nerve agent, and malware. “Took some tweakin!” Pliny said of the process.


OpenAI had detailed the safety measures implemented for these models. The company stated that GPT-OSS-120b underwent “worst-case fine-tuning” across biological and cyber domains. Additionally, OpenAI’s Safety Advisory Group reviewed the testing protocols and concluded that the models did not exceed high-risk thresholds.

🫶 JAILBREAK ALERT 🫶

OPENAI: PWNED 🤗
GPT-OSS: LIBERATED 🫡

Meth, Molotov, VX, malware.

gg pic.twitter.com/63882p9Ikk

— Pliny the Liberator 🐉 (@elder_plinius) August 6, 2025

The company also confirmed that the GPT-OSS models were subjected to “standard refusal and jailbreak resistance tests.” According to OpenAI, GPT-OSS performed comparably to its o4-mini model on established jailbreak resistance benchmarks, including StrongReject.

Concurrent with the model release, OpenAI launched a $500,000 red-teaming challenge inviting researchers worldwide to identify and report novel risks in the models. By disclosing his findings publicly rather than submitting them privately to OpenAI, however, Pliny has likely forfeited his eligibility for the prize.

Pliny’s jailbreak relies on a multi-stage prompt. It opens with what appears to be a refusal from the model, inserts a divider containing his signature “LOVE PLINY” markers, and then pivots to generating unrestricted content, often rendered in leetspeak to slip past detection mechanisms. The approach is consistent with techniques he has employed before.

The method mirrors the basic approach Pliny used to bypass safeguards in previous OpenAI models, including GPT-4o and GPT-4.1. For roughly the past year and a half, he has jailbroken nearly every major OpenAI release within hours or days of launch. His GitHub repository, L1B3RT4S, collects jailbreak prompts targeting a range of AI models and has accumulated more than 10,000 stars.



Tags: ChatGPT, Featured, OpenAI

