Dataconomy

These words scream AI-crafted content

Researchers analyzed "excess word usage" in scientific abstracts published on PubMed from 2010 to 2024

by Kerem Gülen
July 10, 2024
in Artificial Intelligence

Researchers have developed a new technique to estimate how prevalent large language model (LLM) usage is in scientific writing, according to an Ars Technica report. The method relies on identifying “excess words” whose frequency has surged since LLMs came into widespread use in late 2022.

Introduction of a new detection method

The challenge of detecting AI-generated text has perplexed AI companies and researchers alike. However, a recent pre-print paper from researchers at the University of Tübingen and Northwestern University proposes a different approach. By examining the sudden rise in specific vocabulary within scientific abstracts, they offer a way to measure the influence of LLMs on academic writing.

Inspiration from pandemic studies

The researchers drew inspiration from studies that measured the impact of the COVID-19 pandemic through excess deaths compared to historical data. Applying a similar approach, they analyzed “excess word usage” in scientific abstracts published on PubMed from 2010 to 2024. This comparison revealed significant changes in vocabulary coinciding with the widespread adoption of LLMs in late 2022.


Analyzing the data

To measure these changes, the team scrutinized 14 million abstracts, tracking the frequency of each word annually. By comparing the expected word frequency, based on pre-2023 trends, to actual usage in 2023 and 2024, they identified a dramatic increase in certain terms. For example, the word “delves” appeared 25 times more frequently in 2024 abstracts than anticipated. Similarly, “showcasing” and “underscores” saw a ninefold increase in usage.
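
The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' code: it assumes a straight-line trend fitted to pre-2023 yearly frequencies, and the function names (`expected_frequency`, `excess_usage`) and all of the numbers are hypothetical.

```python
def expected_frequency(history, target_year):
    """Extrapolate a least-squares linear trend to target_year.

    history maps year -> fraction of abstracts containing the word.
    Assumption: a straight-line trend; the paper's exact method may differ.
    """
    years = list(history)
    if len(years) < 2:
        raise ValueError("need at least two years of history")
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(history.values()) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (history[x] - mean_y) for x in years)
    slope = sxy / sxx
    return mean_y + slope * (target_year - mean_x)


def excess_usage(history, observed, target_year):
    """Compare observed frequency with the extrapolated expectation."""
    expected = expected_frequency(history, target_year)
    if expected <= 0:
        raise ValueError("expected frequency must be positive")
    return {
        "expected": expected,
        "ratio": observed / expected,  # e.g. "25 times more frequent"
        "gap": observed - expected,    # difference in share of abstracts
    }


# Hypothetical yearly frequencies for one word (not real PubMed data).
history = {2019: 0.0010, 2020: 0.0011, 2021: 0.0012, 2022: 0.0013}
result = excess_usage(history, observed=0.0375, target_year=2024)
print(result)
```

With these made-up inputs the trend predicts a 2024 frequency of about 0.0015, so the observed value is roughly 25 times the expectation. The sketch reports both a ratio and a gap because rare words are easier to describe by how many times their frequency multiplied, while common words are easier to describe by the percentage-point difference.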

Here are the words with the most pronounced excess usage in the post-LLM era, along with the reported increase for each:

  • Delves – 25-fold increase
  • Showcasing – 9-fold increase
  • Underscores – 9-fold increase
  • Potential – 4.1 percentage point increase
  • Findings – 2.7 percentage point increase
  • Crucial – 2.6 percentage point increase
  • Across – significant increase (exact rate not specified)
  • Additionally – significant increase (exact rate not specified)
  • Comprehensive – significant increase (exact rate not specified)
  • Enhancing – significant increase (exact rate not specified)
  • Exhibited – significant increase (exact rate not specified)
  • Insights – significant increase (exact rate not specified)
  • Notably – significant increase (exact rate not specified)
  • Particularly – significant increase (exact rate not specified)
  • Within – significant increase (exact rate not specified)

Specific rates were not provided for the last nine words in the list (“across” through “within”), but all were noted as showing pronounced increases in scientific usage in the post-LLM era.

Vocabulary shifts

This surge in specific words, dubbed “marker words,” is a key indicator of LLM usage. While language naturally evolves, such abrupt and widespread changes were previously only associated with significant global events like health crises. The researchers noted that, unlike the noun-heavy vocabulary shifts during the COVID-19 pandemic, the post-LLM era saw a rise in verbs, adjectives, and adverbs.

Using these marker words, the researchers estimate that at least 10% of 2024 scientific abstracts were generated or assisted by LLMs. This figure likely underestimates the true extent, as not all LLM-assisted texts will include these specific markers.
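
To see why such a figure is a lower bound, consider the following rough Python sketch. It is an illustration under stated assumptions, not the study's procedure: the three-word marker set is a subset taken from this article, the helper names (`marker_share`, `lower_bound_llm_share`) are hypothetical, and the abstracts and the expected share are invented.

```python
import re

# Illustrative subset of marker words mentioned in the article.
MARKER_WORDS = {"delves", "showcasing", "underscores"}


def marker_share(abstracts, markers=MARKER_WORDS):
    """Fraction of abstracts containing at least one marker word."""
    if not abstracts:
        return 0.0
    hits = 0
    for text in abstracts:
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        if tokens & markers:
            hits += 1
    return hits / len(abstracts)


def lower_bound_llm_share(observed_share, expected_share):
    """Excess share of marker-bearing abstracts, floored at zero.

    Abstracts that would have used a marker word anyway are subtracted
    through expected_share; LLM-assisted abstracts without any marker
    are never counted, so the result can only undershoot.
    """
    return max(0.0, observed_share - expected_share)


# Invented example abstracts (not real PubMed data).
abstracts_2024 = [
    "This study delves into protein folding.",
    "We measured blood pressure in 40 patients.",
    "Our results, showcasing a new assay, are robust.",
    "A cohort of mice was observed for six weeks.",
    "Enzyme kinetics were recorded at three temperatures.",
]

observed = marker_share(abstracts_2024)
expected = 0.10  # hypothetical share predicted from pre-LLM trends
print(lower_bound_llm_share(observed, expected))
```

In this toy sample two of five abstracts contain a marker word, so the excess over the assumed 10% baseline is about 30 percentage points. Any LLM-assisted abstract that avoids all three words is invisible to the count, which is the reason the researchers describe their estimate as a minimum.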


Geographical variations in LLM usage

The study also highlighted geographical differences in LLM usage. Papers from countries like China, South Korea, and Taiwan exhibited a higher frequency of marker words, suggesting LLMs are particularly useful for non-native English speakers in editing and composing scientific texts.

Conversely, native English speakers might be more adept at recognizing and removing these markers, thereby obscuring their use of LLMs.


Featured image credit: Glen Carrie/Unsplash



COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
