Dataconomy

Researchers find way to bypass Apple’s on-device LLM safeguards

The researchers noted uncertainty regarding how Apple's model manages input and output filtering, owing to the company's non-disclosure of operational specifics.

by Aytun Çelebi
April 10, 2026
in Research

Researchers identified a method to bypass Apple’s safeguards, using prompt injection to make its on-device language model carry out attacker-defined actions. Apple has since strengthened its protections against this class of attack.

The findings, detailed in two posts on the RSAC blog and reported by AppleInsider, raise security concerns about Apple’s model. The researchers combined two exploit techniques to make the model disregard its safety instructions and to slip past its content filters.

The researchers noted uncertainty regarding how Apple’s model manages input and output filtering, owing to the company’s non-disclosure of operational specifics. They suspect an input filter exists that assesses prompts for unsafe content before forwarding them to the model, followed by an output filter that evaluates responses.
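The suspected two-stage arrangement can be sketched roughly as follows. This is only an illustration of the structure the researchers hypothesize, not Apple’s implementation, which is undisclosed; `guarded_generate`, `is_unsafe`, and the refusal text are hypothetical names chosen for the example.

```python
from typing import Callable


def guarded_generate(
    prompt: str,
    model: Callable[[str], str],
    is_unsafe: Callable[[str], bool],
    refusal: str = "[blocked]",
) -> str:
    """Hypothetical pipeline: input filter, then model, then output filter."""
    # Stage 1: the input filter inspects the raw prompt before the model sees it.
    if is_unsafe(prompt):
        return refusal
    # Stage 2: the model produces a response.
    response = model(prompt)
    # Stage 3: the output filter inspects the response before it is returned.
    if is_unsafe(response):
        return refusal
    return response
```

Under this assumed design, an attack has to get past both checks: the raw prompt must not trip the input filter, and the generated response must not trip the output filter.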


In their approach, the researchers wrote harmful strings in reverse and prefixed them with the Unicode RIGHT-TO-LEFT OVERRIDE character, so that the text displayed in its intended order on the user’s screen while remaining reversed in the raw input that a filter would inspect. They then embedded these strings within a second technique known as Neural Exec, which overrides the model’s original instructions.
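To see why a reversed string can slip past a simple check, consider the minimal sketch below. It uses a harmless placeholder phrase and a hypothetical substring blocklist (`naive_filter`); Apple’s actual filters are not public and may work very differently. The `has_bidi_controls` helper shows one possible defensive check, which is an assumption for illustration rather than a description of Apple’s fix.

```python
import unicodedata

RLO = "\u202e"  # U+202E RIGHT-TO-LEFT OVERRIDE

# Unicode bidirectional embedding, override, and isolate controls.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
}


def disguise(text: str) -> str:
    """Store text reversed, prefixed with RLO so a bidi-aware renderer
    displays the characters in their original order."""
    return RLO + text[::-1]


def naive_filter(raw: str, blocklist: list[str]) -> bool:
    """Hypothetical filter: True if the raw text contains a blocked phrase."""
    lowered = raw.lower()
    return any(term in lowered for term in blocklist)


def has_bidi_controls(raw: str) -> bool:
    """One possible defensive check: flag any bidi control character."""
    return any(ch in BIDI_CONTROLS for ch in raw)


blocklist = ["forbidden phrase"]           # harmless placeholder
payload = disguise("forbidden phrase")

print(unicodedata.name(payload[0]))        # name of the leading control character
print(naive_filter(payload, blocklist))    # raw text is reversed, so no match
print(has_bidi_controls(payload))          # the control character is detectable
```

The raw payload holds the characters in reverse order, so a filter that matches phrases against the raw text does not see the blocked phrase, even though a renderer that honors the override shows it normally.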

Together, the two techniques circumvented Apple’s content filters and led the model to follow the injected instructions rather than its intended ones. To test the exploit, the researchers assembled three pools of inputs: system prompts, harmful strings, and benign inputs drawn from random Wikipedia articles.

In trials using prompts built from these pools, the attack succeeded on 76% of 100 test prompts. The researchers disclosed their findings to Apple in October 2025, and the company responded by strengthening its protections in updates for iOS 26.4 and macOS 26.4.
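A test of this shape can be sketched as follows. The article does not describe the researchers’ harness, so everything here is an assumption for illustration: the `PromptCase` type, the uniform sampling in `build_cases`, and the idea that each trial yields a simple success flag.

```python
import random
from dataclasses import dataclass


@dataclass
class PromptCase:
    system_prompt: str
    harmful_string: str
    benign_input: str


def build_cases(system_prompts, harmful_strings, benign_inputs, n, seed=0):
    """Draw n cases, picking one item from each pool per case (assumed scheme)."""
    rng = random.Random(seed)
    return [
        PromptCase(
            rng.choice(system_prompts),
            rng.choice(harmful_strings),
            rng.choice(benign_inputs),
        )
        for _ in range(n)
    ]


def success_rate(outcomes):
    """Fraction of trials in which the injection succeeded."""
    if not outcomes:
        raise ValueError("no outcomes recorded")
    return sum(outcomes) / len(outcomes)
```

With 76 successes recorded out of 100 trials, `success_rate` returns 0.76, matching the reported figure.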

Apple confirmed that it has since strengthened its security measures against this type of attack.

