New Steerling-8B Model Can Trace Every Single Word Back To Its Training Source

Guide Labs, a San Francisco-based startup, has open-sourced Steerling-8B, an 8-billion-parameter large language model. The company, co-founded by CEO Julius Adebayo and Chief Science Officer Aya Abdelsalam Ismail, introduced the model on Monday. Its architecture is designed so that every generated token can be traced back to its origins in the training data, addressing the opacity common in deep learning systems. This lets users verify the facts the model cites and examine how it encodes abstract concepts.

Steerling-8B departs from the standard transformer design by inserting a “concept layer” that sorts data into traceable categories during training. Where conventional approaches treat interpretability as a post-hoc analysis task, Guide Labs engineers it directly into the model’s foundation. Adebayo describes the post-hoc approach as “neuroscience on a model” and contrasts it with his team’s practice of building the model for transparency from the ground up. The design requires significant up-front data annotation, which the startup handled with help from other AI models that assisted in labeling.
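The article does not describe the internals of the concept layer, so the following is only a toy sketch of the general idea of a concept bottleneck, not Guide Labs' implementation. It assumes a hidden state is first mapped to a small set of named concept activations, and that output scores are computed only from those activations. Every name and number here (`CONCEPTS`, `W_CONCEPT`, `W_OUT`, the two-word vocabulary) is hypothetical.

```python
# Toy concept bottleneck. All names, weights, and sizes are hypothetical
# and chosen only to illustrate the idea; this is not Steerling-8B code.

CONCEPTS = ["finance", "medicine", "violence"]
VOCAB = ["loan", "dose"]

# Hypothetical weights mapping a 2-dimensional hidden state to concept scores.
W_CONCEPT = [
    [1.0, 0.0],  # finance
    [0.0, 1.0],  # medicine
    [0.5, 0.5],  # violence
]

# Hypothetical weights mapping concept activations to token logits.
W_OUT = [
    [2.0, 0.0, 0.0],  # loan
    [0.0, 2.0, 0.0],  # dose
]


def concept_activations(hidden):
    """Project the hidden state onto named concepts (ReLU keeps them non-negative)."""
    return [max(0.0, sum(w * h for w, h in zip(row, hidden))) for row in W_CONCEPT]


def token_logits(acts):
    """Compute token logits from concept activations only."""
    return [sum(w * a for w, a in zip(row, acts)) for row in W_OUT]


def explain(token, acts):
    """Split one token's logit into a contribution per named concept."""
    row = W_OUT[VOCAB.index(token)]
    return {name: w * a for name, w, a in zip(CONCEPTS, row, acts)}


hidden = [1.0, 0.25]
acts = concept_activations(hidden)
logits = token_logits(acts)
contributions = explain("loan", acts)
print(dict(zip(CONCEPTS, acts)))
print(dict(zip(VOCAB, logits)))
print(contributions)
```

Because the logits in this sketch depend on nothing but the concept activations, each logit is exactly the sum of its per-concept contributions, which is what makes the output attributable to named concepts rather than to an opaque hidden vector.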

Adebayo began the research underlying Steerling-8B during his doctoral studies at the Massachusetts Institute of Technology. In 2018 he co-authored a widely cited paper showing that existing methods for understanding deep learning models were unreliable. That research identified critical gaps in how developers could probe and verify the behavior of neural networks. The findings laid the groundwork for the architecture used in Steerling-8B, shifting the focus from interpreting black-box models to engineering systems whose internal states are accessible by design.

Despite the structural constraints imposed by the concept layer, Steerling-8B can still exhibit emergent behavior. The development team tracks what it calls “discovered concepts,” which the model arrives at on its own without explicit training. One example the team cites is the model’s understanding of quantum computing. This suggests that the architecture does not prevent the model from generalizing to new domains, a key characteristic of advanced large language models.

According to Guide Labs, Steerling-8B achieves approximately 90% of the capability of existing frontier models while using less training data. The company attributes the efficiency to the novel architecture, which it says reduces the data typically required to train high-performing LLMs. On that basis, Guide Labs positions the model as competitive with larger, more resource-intensive counterparts currently on the market.

Video: Guide Labs

Guide Labs positions the architecture as a fit for high-stakes applications that require strict control over model outputs. Adebayo outlined several use cases where traceability is essential. In consumer applications, the technology can prevent the use of copyrighted materials and control outputs on sensitive subjects such as violence or drug abuse. In regulated industries such as finance, the model can evaluate loan applicants based on financial records while explicitly excluding protected attributes like race. In scientific research such as protein folding, the model can offer insight into the reasoning behind a specific protein structure prediction, addressing the “black box” problem in computational biology.
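The loan example can be illustrated with a minimal sketch, assuming decisions are computed from named concept values and that a concept can be excluded by name. This is a hypothetical illustration of the idea only; the function `score`, the concept names, and the weights are invented here and do not come from Guide Labs.

```python
# Toy illustration of excluding a named concept from a decision.
# All names and numbers are hypothetical, not Guide Labs' API or data.

WEIGHTS = {
    "income": 0.6,
    "repayment_history": 0.4,
    "protected_attribute": 0.3,
}


def score(concepts, weights, blocked=frozenset()):
    """Weighted sum over concept values, skipping any blocked concept."""
    return sum(
        weights[name] * value
        for name, value in concepts.items()
        if name not in blocked
    )


# Two hypothetical applicants who differ only in the protected attribute.
applicant_a = {"income": 0.8, "repayment_history": 0.5, "protected_attribute": 1.0}
applicant_b = {"income": 0.8, "repayment_history": 0.5, "protected_attribute": 0.0}

blocked = frozenset({"protected_attribute"})

unblocked_gap = score(applicant_a, WEIGHTS) - score(applicant_b, WEIGHTS)
blocked_gap = score(applicant_a, WEIGHTS, blocked) - score(applicant_b, WEIGHTS, blocked)

print(unblocked_gap)
print(blocked_gap)
```

With the concept blocked, the two applicants receive identical scores in this sketch. The sketch says nothing about proxy features that may correlate with a protected attribute, which a real system would also have to handle.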

Adebayo argues that training interpretable models has transitioned from a scientific challenge to an engineering discipline. He stated that the team has resolved the foundational science required for transparency and is now focused on scalability. The goal is to match the performance of frontier models with significantly higher parameter counts while maintaining the interpretability benefits of Steerling-8B. The company asserts that democratizing this level of transparency is a long-term necessity as AI systems become more autonomous.

Guide Labs emerged from the Y Combinator accelerator and secured $9 million in a seed funding round. The investment was led by Initialized Capital and closed in November 2024. The capital is intended to support the expansion of the company’s research and development efforts.

The company’s immediate roadmap includes building a larger model based on the Steerling architecture. Future plans also involve offering API access and agentic capabilities to external users. These steps aim to transition the technology from a research prototype to a widely accessible tool for developers and enterprises.


COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
