Dataconomy

Stanford’s Evo AI designs novel proteins using genomic language models

Trained on bacterial genomes to predict the next base in a sequence, Evo was prompted with 1.7 million individual genes and produced 120 billion base pairs of DNA.

by Emre Çıtak
December 1, 2025
in Research

Stanford University researchers have developed Evo, a genomic language model trained on bacterial genomes, capable of designing novel proteins and nucleic acid sequences.

Evo’s design exploits a common feature of bacterial genomes: genes with related functions tend to cluster together. These clusters are often transcribed into a single messenger RNA, which lets bacteria regulate an entire biochemical pathway at once.

The researchers trained Evo using an extensive collection of bacterial genomes. Similar to large language models, Evo was tasked with predicting the next base in a sequence and rewarded for accurate predictions. This generative model can produce novel sequences from prompts, introducing a degree of randomness in its outputs.

This setup allows Evo to link nucleotide-level patterns to kilobase-scale genomic context. When prompted with a large segment of genomic DNA, Evo interprets it and generates an appropriate genomic output.
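The training objective described above, predicting the next base and then sampling from those predictions, can be illustrated with a deliberately small sketch. This is not Evo's architecture: a k-mer count table stands in for the neural network, and every name here (`train_kmer_model`, `next_base_probs`, `generate`, the `temperature` parameter) is a hypothetical chosen for illustration.

```python
import math
import random
from collections import Counter, defaultdict

BASES = "ACGT"

def train_kmer_model(sequences, k=3):
    """Count which base follows each k-base context in the training sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(len(seq) - k):
            counts[seq[i:i + k]][seq[i + k]] += 1
    return counts

def next_base_probs(counts, context, temperature=1.0):
    """Probability of each base after `context`, with add-one smoothing."""
    observed = counts.get(context, Counter())
    # Temperature rescales log-counts: low values sharpen, high values flatten.
    weights = [math.exp(math.log(observed[b] + 1) / temperature) for b in BASES]
    total = sum(weights)
    return {b: w / total for b, w in zip(BASES, weights)}

def generate(counts, prompt, length, k=3, temperature=1.0, seed=None):
    """Extend `prompt` by `length` bases, sampling one base at a time."""
    rng = random.Random(seed)
    seq = prompt
    for _ in range(length):
        probs = next_base_probs(counts, seq[-k:], temperature)
        seq += rng.choices(BASES, weights=[probs[b] for b in BASES])[0]
    return seq

# Toy usage: train on one short repeat, then extend a three-base prompt.
model = train_kmer_model(["ACGTACGTACGT"], k=3)
print(next_base_probs(model, "ACG"))
print(generate(model, "ACG", 10, seed=0))
```

Because each base is sampled rather than always taking the most likely one, the same prompt can yield different continuations, which is the source of the randomness mentioned above. A count table only sees the last few bases, whereas the article describes Evo as relating single-nucleotide patterns to kilobase-scale context.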

The team hypothesized that providing Evo with a known gene as a prompt would result in outputs encoding proteins with related functions. A key question was whether Evo would generate sequences for already known proteins or produce less predictable, novel outputs.

Initial testing involved prompting Evo with fragments of known protein genes. Given 30 percent of a known protein gene sequence, Evo completed 85 percent of the remainder. With 80 percent of the sequence, it restored all of the missing sequence. When a single gene was deleted from a functional cluster, Evo accurately identified and restored the missing gene.
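One way to score a completion test like this is to hold out the end of a gene, prompt with the rest, and count how much of the held-out part comes back. The sketch below assumes a simple position-by-position comparison; the article does not say how the 85 percent figure was measured, and the function names and toy sequences are hypothetical.

```python
def split_prompt(gene, prompt_percent):
    """Split a gene into a prompt (first prompt_percent of bases) and the held-out rest."""
    cut = len(gene) * prompt_percent // 100
    return gene[:cut], gene[cut:]

def recovered_fraction(held_out, completion):
    """Fraction of held-out positions that the completion reproduces exactly."""
    if not held_out:
        return 1.0
    matches = sum(a == b for a, b in zip(held_out, completion))
    return matches / len(held_out)

# Toy usage with a made-up 20-base sequence and a made-up completion.
gene = "ATGGCTAGCTAGGATCCGTA"
prompt, held_out = split_prompt(gene, 30)
completion = "AGCTAGGATCCGAA"  # hypothetical model output, one base wrong
print(prompt, held_out, recovered_fraction(held_out, completion))
```

A position-wise score penalizes any insertion or deletion heavily, so a real evaluation would more likely align the two sequences first.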

Evo’s extensive training data allowed it to identify the regions of a protein that are critical to its function. Where its output differed from the original sequence, the changes typically fell in areas where variability is tolerated, indicating that the system had absorbed evolutionary limits on genetic change.

To test Evo’s ability to generate novel outputs, researchers turned to bacterial toxins, which are often encoded alongside an antitoxin. They prompted Evo with a toxin that is only mildly related to known ones and has no known antitoxin, then filtered out any responses resembling known antitoxin genes.
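The filtering step can be pictured as a similarity cutoff: discard any candidate that looks too much like something already known. The sketch below uses Python's `difflib.SequenceMatcher` as a crude stand-in for a real homology search; the article does not say which tool or threshold the researchers used, so `filter_novel` and the `max_similarity` cutoff of 0.5 are assumptions.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Crude 0-to-1 similarity score; a stand-in for a real homology search."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()

def filter_novel(candidates, known, max_similarity=0.5):
    """Keep candidates whose best match against the known sequences is below the cutoff."""
    novel = []
    for cand in candidates:
        best = max((similarity(cand, k) for k in known), default=0.0)
        if best < max_similarity:
            novel.append(cand)
    return novel

# Toy usage with made-up sequences.
known = ["AAAAAAAA"]
candidates = ["AAAAAAAA", "AAAAAAAC", "CCCCGGGG"]
print(filter_novel(candidates, known))
```

Only candidates that pass such a filter would go on to lab testing, which is what makes any working antitoxin among them a novel one by construction.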

Of 10 Evo outputs tested, five partially countered the toxin, and two fully restored growth in bacteria producing it. These two antitoxins showed only about 25 percent sequence identity to known antitoxins. They also could not be explained as simple patchworks of known proteins: they appeared to be assembled from parts of 15 to 20 individual proteins, and one example would require patching together pieces of 40 known proteins.
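For readers unfamiliar with the metric, percent sequence identity is the share of aligned positions at which two sequences carry the same residue. A minimal sketch follows, assuming the two sequences have already been aligned (with `-` marking gaps) and counting every column that holds at least one residue. Conventions for the denominator vary between tools, and the article does not state which one the researchers used.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns in which both sequences have the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue  # a column of two gaps carries no information
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Toy usage with made-up four-residue peptides: one match in four columns.
print(percent_identity("MKTA", "MRSG"))
```

At around 25 percent identity, three of every four aligned positions differ, a level at which sequence comparison alone often struggles to establish that two proteins are related.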

Evo’s capabilities extended beyond proteins. When applied to a toxin with an RNA-based inhibitor, the system generated DNA encoding RNAs with correct structural features, despite having sequences unrelated to known RNA inhibitors.

A similar test involved inhibitors of the CRISPR system. The team filtered outputs to include only protein-encoding sequences dissimilar to known proteins. Of these, 17 percent inhibited CRISPR function. Two of these inhibitors had no similarity to any known proteins and confounded software designed for 3D protein structure prediction.

Evo appears capable of generating entirely novel, functional proteins without considering protein structure.

The researchers prompted Evo with 1.7 million individual genes from bacteria and their viruses, resulting in 120 billion base pairs of AI-generated DNA, including both known and potentially novel genetic material.

This approach may not translate to more complex genomes such as those of vertebrates, which typically do not cluster genes with related functions and have more intricate gene structures. It also addresses a different problem than directed design efforts, such as developing plastic-digesting enzymes. The findings were published in Nature in 2025.


Tags: AI, Evo, genome

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
