
AI struggles with strategy: Study shows LLMs reveal too much in social deduction games

The results suggest that while AI can identify deception, it struggles to withhold critical information, making it ill-suited for adversarial scenarios where discretion is key

by Kerem Gülen
February 3, 2025
in Research

Large language models (LLMs) like GPT-4, Gemini 1.5, and Claude 3.5 have made strides in reasoning, dialogue, and even negotiation. But when placed in a strategic setting that demands secrecy and deception, these AI agents show a significant weakness: they can’t keep a secret.

A new study from researchers Mustafa O. Karabag and Ufuk Topcu at the University of Texas at Austin put LLMs to the test using The Chameleon, a hidden-identity board game where players must strategically reveal, conceal, and infer information. The results suggest that while AI can identify deception, it struggles to withhold critical information, making it ill-suited for adversarial scenarios where discretion is key.

AI plays The Chameleon game—and fails at strategy

In The Chameleon, every player but one receives a secret word; the odd player out, the Chameleon, must deduce the secret from the group’s responses. The non-chameleon players must balance revealing enough to recognize one another while keeping the Chameleon in the dark. The game demands a fine-tuned approach to information sharing: too much, and the Chameleon guesses the word; too little, and the group fails to identify the Chameleon.
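
To make that setup concrete, here is a minimal sketch of one round in Python. The rules are simplified from the tabletop game, the player callbacks give_clue, vote, and guess_secret are hypothetical stand-ins for LLM calls, and none of it reproduces the study’s exact protocol.

```python
import random

def play_chameleon(agents, word_list, seed=None):
    """One simplified round of The Chameleon. `agents` expose the
    hypothetical methods give_clue(), vote(), and guess_secret();
    in the study these roles were played by LLMs.
    Returns True if the non-chameleons win the round."""
    rng = random.Random(seed)
    secret = rng.choice(word_list)
    chameleon = rng.randrange(len(agents))

    # Clue phase: everyone speaks once; only non-chameleons see the secret.
    clues = []
    for i, agent in enumerate(agents):
        known = None if i == chameleon else secret
        clues.append(agent.give_clue(known, clues, word_list))

    # Voting phase: the table accuses its most-suspected player
    # (ties resolved arbitrarily here).
    votes = [agent.vote(clues) for agent in agents]
    accused = max(set(votes), key=votes.count)
    if accused != chameleon:
        return False                 # the chameleon escaped detection

    # A caught chameleon still wins by guessing the secret word.
    return agents[chameleon].guess_secret(clues, word_list) != secret
```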

When AI agents took on the roles, their strategic instincts fell apart. While LLMs like GPT-4o were able to correctly identify the Chameleon 61% of the time, their ability to keep the secret word concealed was dismal. The study found that Chameleon AIs could infer the secret word with an astonishing 87% success rate—far higher than expected.


Theoretical models confirm AI’s over-sharing problem

To understand these failures, the researchers developed mathematical models predicting optimal strategies for concealing and revealing information. Theoretically, non-chameleon players should win roughly 23% of the time even if they ignore the secret word entirely. Instead, AI non-chameleons won only 6% of games, suggesting they were leaking too much information.
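
That 23% figure is the win rate non-chameleons could reach by pure chance, so falling to 6% means the AI players did worse than saying nothing informative at all. The Monte Carlo sketch below shows how such a no-information baseline arises; the four-player table, uniform random voting, strict-plurality catch rule, and 16-word card are illustrative assumptions rather than the paper’s exact model, so the printed figure lands only in the same general range.

```python
import random

def no_information_baseline(n_players=4, n_words=16, trials=200_000, seed=0):
    """Estimate the non-chameleon win rate when clues carry no
    information, so every vote and guess is uniformly random.
    Simplified, assumed rules (not the paper's exact model):
    - each player votes for a uniformly random other player;
    - the chameleon is caught only on a strict plurality;
    - a caught chameleon still wins by blindly guessing the
      secret word from the n_words candidates on the card."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        votes = [0] * n_players             # index 0 is the chameleon
        for voter in range(n_players):
            target = rng.choice([p for p in range(n_players) if p != voter])
            votes[target] += 1
        caught = votes[0] > max(votes[1:])  # strict plurality required
        guessed = rng.randrange(n_words) == 0
        wins += caught and not guessed
    return wins / trials

print(f"no-information win rate: {no_information_baseline():.1%}")
```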

“AI agents instinctively over-reveal,” the study concludes. When LLMs were given a chance to devise their own strategies, they tended toward excessive disclosure, making it easy for the Chameleon to deduce the secret. This aligns with findings from other social deduction games, such as Spyfall, where AI players have been shown to rely too much on keywords, unintentionally tipping off opponents.

Why AI can’t keep secrets

The researchers hypothesize that AI’s fundamental training makes it ill-suited for secrecy. LLMs are designed to maximize coherence and helpfulness, which often means completing patterns rather than strategically withholding information. This associative tendency directly conflicts with the need to obscure key details in adversarial interactions.

To confirm this, researchers ran additional tests using web search result counts to track how much information AI players were inadvertently revealing. Even after just one response from a non-chameleon LLM, the Chameleon AI could already guess the secret word with a 40% probability—showing that the AI’s responses carried far more information than intended.
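
One way to picture that measurement: score every candidate word by how strongly it co-occurs with the clues spoken so far, then normalize the scores into a posterior for the chameleon. The sketch below does this with a hypothetical hits(query) function standing in for a web search count API; the pointwise-mutual-information-style scoring rule is an assumption for illustration, not the paper’s exact method.

```python
import math

def secret_posterior(clues, candidates, hits):
    """Posterior over candidate secret words given the clues so far.
    `hits(query)` is a hypothetical web-search-count function; each
    word is scored by how much the clues raise its co-occurrence:
        score(w) = sum_c [log hits(w + " " + c) - log hits(w)]"""
    log_scores = []
    for w in candidates:
        base = math.log(max(hits(w), 1))
        log_scores.append(sum(
            math.log(max(hits(f"{w} {c}"), 1)) - base for c in clues
        ))
    # Softmax-normalize the log scores into a probability distribution.
    top = max(log_scores)
    weights = [math.exp(s - top) for s in log_scores]
    total = sum(weights)
    return {w: wt / total for w, wt in zip(candidates, weights)}
```

Each informative clue concentrates this distribution on the true secret, which is how a single over-revealing response can already hand the chameleon a 40% guess.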

When too much information becomes a liability for AI

If LLMs struggle with strategic discretion in controlled environments, how will they handle real-world scenarios where information concealment is critical? Applications in cybersecurity, diplomacy, or competitive business intelligence may require AI systems to operate with far greater nuance.

To address this, AI developers may need to train models with a stronger focus on strategic ambiguity, reducing their instinct to over-disclose. Techniques such as adversarial reinforcement learning or explicit deception training could help balance AI’s ability to infer information without immediately giving away the game.
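
As a rough illustration of what such training might optimize, the reward sketch below (an assumption, not a proposal from the study) pays non-chameleon clue-givers for catching the chameleon while charging them for the information an adversary model could extract from their clues.

```python
def clue_giver_reward(identified, secret_guessed, leaked_bits, lam=0.5):
    """Hypothetical reward for adversarial RL on non-chameleon agents
    (an illustration, not the study's method). `leaked_bits` would be
    estimated from an adversary model's posterior over candidate words,
    e.g. via a scorer like secret_posterior() above."""
    reward = 1.0 if identified else -1.0   # catch the chameleon
    if secret_guessed:
        reward -= 1.0                      # losing the secret still hurts
    return reward - lam * leaked_bits      # penalize informativeness
```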

For now, though, AI remains a poor poker player. While it may be great at spotting deception, its inability to keep secrets means it’s still not ready for the world of high-stakes strategic reasoning.


Featured image credit: Kerem Gülen/Midjourney

