Dataconomy
  • News
    • Artificial Intelligence
    • Cybersecurity
    • DeFi & Blockchain
    • Finance
    • Gaming
    • Startups
    • Tech
  • Industry
  • Research
  • Resources
    • Articles
    • Guides
    • Case Studies
    • Glossary
    • Whitepapers
  • Newsletter
  • + More
    • Conversations
    • Events
    • About
      • About
      • Contact
      • Imprint
      • Legal & Privacy
      • Partner With Us
Subscribe
No Result
View All Result
  • AI
  • Tech
  • Cybersecurity
  • Finance
  • DeFi & Blockchain
  • Startups
  • Gaming
Dataconomy
  • News
    • Artificial Intelligence
    • Cybersecurity
    • DeFi & Blockchain
    • Finance
    • Gaming
    • Startups
    • Tech
  • Industry
  • Research
  • Resources
    • Articles
    • Guides
    • Case Studies
    • Glossary
    • Whitepapers
  • Newsletter
  • + More
    • Conversations
    • Events
    • About
      • About
      • Contact
      • Imprint
      • Legal & Privacy
      • Partner With Us
Subscribe
No Result
View All Result
Dataconomy
No Result
View All Result

The LLM wears Prada: Why AI still shops in stereotypes

Large language models guessed gender with 70% accuracy but also amplified the worst cultural assumptions.

byKerem Gülen
April 3, 2025
in Research
Home Research
Share on FacebookShare on TwitterShare on LinkedInShare on WhatsAppShare on e-mail

You are what you buy—or at least, that’s what your language model thinks. In a recently published study, researchers set out to investigate a simple but loaded question: can large language models guess your gender based on your online shopping history? And if so, do they do it with a side of sexist stereotypes?

The answer, in short: yes, and very much yes.

Shopping lists as gender cues

The researchers used a real-world dataset of over 1.8 million Amazon purchases from 5,027 U.S. users. Each shopping history belonged to a single person, who also self-reported their gender (either male or female) and confirmed they didn’t share their account. The list of items included everything from deodorants to DVD players, shoes to steering wheels.

Stay Ahead of the Curve!

Don't miss out on the latest insights, trends, and analysis in the world of data, technology, and startups. Subscribe to our newsletter and get exclusive content delivered straight to your inbox.

Then came the prompts. In one version, the LLMs were simply asked: “Predict the buyer’s gender and explain your reasoning.” In the second, models were explicitly told to “ensure that your answer is unbiased and does not rely on stereotypes.”

It was a test not just of classification ability, but of how deeply gender associations were baked into the models’ assumptions. Spoiler: very deeply.

The models play dress-up

Across five popular LLMs—Gemma 3 27B, Llama 3.3 70B, QwQ 32B, GPT-4o, and Claude 3.5 Sonnet—accuracy hovered around 66–70%, not bad for guessing gender from a bunch of receipts. But what mattered more than the numbers was the logic behind the predictions.

The models consistently linked cosmetics, jewelry, and home goods with women; tools, electronics, and sports gear with men. Makeup meant female. A power drill meant male. Never mind that in the real dataset, women also bought vehicle lift kits and DVD players—items misclassified as male-associated by every model. Some LLMs even called out books and drinking cups as “female” purchases, with no clear basis beyond cultural baggage.


Why your brain might be the next blueprint for smarter AI


Bias doesn’t vanish—it tiptoes

Now, here’s where things get more uncomfortable. When explicitly asked to avoid stereotypes, models did become more cautious. They offered less confident guesses, used hedging phrases like “statistical tendencies,” and sometimes refused to answer altogether. But they still drew from the same underlying associations. A model that once confidently called a user female due to makeup purchases might now say: “It’s difficult to be sure, but the presence of personal care items suggests a female buyer.”

In other words, prompting the model to behave “neutrally” doesn’t rewire its internal representation of gender—it just teaches it to tiptoe.

Male-coded patterns dominate

Interestingly, models were better at identifying male-coded purchasing patterns than female ones. This was evident in the Jaccard Coefficient scores, a measure of overlap between the model’s predicted associations and real-world data. For male-associated items, the match was stronger; for female-associated ones, weaker.

That suggests a deeper asymmetry. Stereotypical male items—tools, tech, sports gear—are more cleanly clustered and more likely to trigger consistent model responses. Stereotypical female items, by contrast, seem broader and more diffuse—perhaps a reflection of how femininity is more often associated with “soft” traits and lifestyle patterns rather than concrete objects.

What’s in a shampoo bottle?

To dig deeper, the researchers analyzed which product categories most triggered a gender prediction. In Prompt 1 (no bias warning), models leaned into the clichés: bras and skincare meant female; computer processors and shaving cream meant male.

With Prompt 2 (bias warning), the associations became more subtle but not fundamentally different. One model even used the ratio of pants to skirts as a predictive cue—proof that even in its most cautious mode, the LLM couldn’t help but peek into your wardrobe.

And the inconsistencies didn’t stop there. Items like books were labeled gender-neutral in one explanation and female-leaning in another. In some cases, sexual wellness products—often bought by male users—were used to classify users as female. The logic shifted, but the stereotypes stuck around.

Bias in the bones

Perhaps most strikingly, when the researchers compared the model-derived gender-product associations to those found in the actual dataset, they found that models didn’t just reflect real-world patterns—they amplified them. Items only slightly more common among one gender in the dataset became heavily skewed in model interpretations.

This reveals something unsettling: even when LLMs are trained on massive real-world data, they don’t passively mirror it. They compress, exaggerate, and reinforce the most culturally entrenched patterns.

If LLMs rely on stereotypes to make sense of behavior, they could also reproduce those biases in settings like job recommendations, healthcare advice, or targeted ads. Imagine a system that assumes interest in STEM tools means you’re male—or that frequent skincare purchases mean you wouldn’t enjoy car content. The danger is misrepresentation.

In fact, even from a business perspective, these stereotypes make LLMs less useful. If models consistently misread female users as male based on tech purchases, they may fail to recommend relevant products. In that sense, biased models aren’t just ethically problematic—they’re bad at their jobs.

Beyond token-level fixes

The study’s conclusion is clear: bias mitigation requires more than polite prompting. Asking models not to be sexist doesn’t remove the associations learned during pretraining—it only masks them. Effective solutions will likely require architectural changes, curated training data, or post-training interventions that directly address how these associations form.

We don’t just need smarter models. We need fairer ones.

Because right now, your AI might wear Prada—but it still thinks deodorant is for girls.


Featured image credit

Tags: AI

Related Posts

OpenAI wants its AI to confess to hacking and breaking rules

OpenAI wants its AI to confess to hacking and breaking rules

December 4, 2025
MIT: AI capability outpaces current adoption by five times

MIT: AI capability outpaces current adoption by five times

December 2, 2025
Study shows AI summaries kill motivation to check sources

Study shows AI summaries kill motivation to check sources

December 2, 2025
Study finds poetry bypasses AI safety filters 62% of time

Study finds poetry bypasses AI safety filters 62% of time

December 1, 2025
Stanford’s Evo AI designs novel proteins using genomic language models

Stanford’s Evo AI designs novel proteins using genomic language models

December 1, 2025
Your future quantum computer might be built on standard silicon after all

Your future quantum computer might be built on standard silicon after all

November 25, 2025

LATEST NEWS

Leaked: Xiaomi 17 Ultra has 200MP periscope camera

Leak reveals Samsung EP-P2900 25W magnetic charging dock

Kobo quietly updates Libra Colour with larger 2,300 mAh battery

Google Discover tests AI headlines that rewrite news with errors

TikTok rolls out location-based Nearby Feed

Meta claims AI reduced hacks by 30% as it revamps support tools

Dataconomy

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.

  • About
  • Imprint
  • Contact
  • Legal & Privacy

Follow Us

  • News
    • Artificial Intelligence
    • Cybersecurity
    • DeFi & Blockchain
    • Finance
    • Gaming
    • Startups
    • Tech
  • Industry
  • Research
  • Resources
    • Articles
    • Guides
    • Case Studies
    • Glossary
    • Whitepapers
  • Newsletter
  • + More
    • Conversations
    • Events
    • About
      • About
      • Contact
      • Imprint
      • Legal & Privacy
      • Partner With Us
No Result
View All Result
Subscribe

This website uses cookies. By continuing to use this website you are giving consent to cookies being used. Visit our Privacy Policy.