AI is good at pattern recognition but struggles with reasoning. Meanwhile, human cognition is deeply rooted in logic and coherence. What if we could combine the best of both worlds—the raw processing power of Large Language Models (LLMs) and the structured, rule-based thinking of symbolic AI?
This is the goal behind Neurosymbolic AI, an approach that merges deep learning with coherence-driven inference (CDI). In the study “Neurosymbolic Artificial Intelligence via Large Language Models and Coherence-Driven Inference,” Steve Huntsman and Jewell Thomas of Cynnovative propose a method that lets LLMs construct logical relationships from natural language, opening the door to better decision-making, problem-solving, and even AI-driven legal reasoning.
Their algorithm transforms natural-language propositions into structured coherence graphs, and the study benchmarks how accurately different AI models can reconstruct the logical relationships those graphs encode.
What is coherence-driven inference?
Coherence-driven inference (CDI) is a way of making decisions based on how well a set of propositions fit together. Instead of simply accepting or rejecting individual facts, CDI builds a graph of relationships, assigning weights to consistent and contradictory statements.
For example, consider three propositions: “Amsterdam is the capital of the Netherlands,” “The Hague is the capital of the Netherlands,” and “Amsterdam is the largest city in the Netherlands.” The first two contradict each other, while the third lends mild support to the first. A coherence graph connects these propositions and scores their mutual consistency, helping AI determine which set of statements is most likely to be true.
The problem? Until now, these graphs had to be manually built—a painstaking and impractical process. The new research proposes an algorithm that can automatically generate these graphs from natural language input and test how well LLMs can reconstruct them.
Teaching LLMs to build logical structures
The researchers’ method involves two key steps:
- Generating Propositions: A set of statements is created in natural language, reflecting a logical structure.
- Reconstructing Coherence Graphs: LLMs are then prompted to analyze these statements and rebuild the underlying graph structure (a rough sketch follows this list).
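The paper’s actual prompts and output format aren’t reproduced in this article, so here is only a minimal sketch of what the reconstruction step could look like in Python. The prompt wording, the JSON edge schema, and the `query_llm` placeholder are illustrative assumptions, not the authors’ implementation; only the overall shape (propositions in, signed edges out) follows their description.

```python
import json

# Hypothetical helper: send a prompt to whatever LLM client you use and
# return its text completion. This is a stub, not part of the paper's code.
def query_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def reconstruct_coherence_graph(propositions: list[str]) -> list[tuple[int, int, int]]:
    """Ask an LLM to label each pair of propositions as supporting (+1),
    contradicting (-1), or unrelated (0), and return the signed edges."""
    numbered = "\n".join(f"{i}: {p}" for i, p in enumerate(propositions))
    prompt = (
        "For every pair of the numbered propositions below, output a JSON "
        'list of objects {"i": ..., "j": ..., "sign": ...}, where sign is 1 '
        "if the propositions support each other, -1 if they contradict, and "
        "0 otherwise.\n\n" + numbered
    )
    edges = json.loads(query_llm(prompt))
    # Keep only the informative (nonzero) edges of the coherence graph.
    return [(e["i"], e["j"], e["sign"]) for e in edges if e["sign"] != 0]
```

Because the ground-truth graph that generated the propositions is known, the edge list an LLM returns can be scored directly against it; that comparison is the heart of the benchmark.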
By doing this, AI models are forced to think more like humans, evaluating not just individual facts but how they connect to a broader web of knowledge.
Can AI get it right?
The study tested various LLMs, from GPT-4o and Claude 3.5 to open models like QwQ-32B and Llama-3.3. The results were surprisingly promising: some models reconstructed coherence graphs with high accuracy, even under uncertain or ambiguous conditions.
Interestingly, models optimized for reasoning, like o1-mini and QwQ-32B, performed the best. This suggests that AI systems specifically trained for structured problem-solving can outperform general-purpose LLMs when handling complex reasoning tasks.
At the core of CDI is the idea that knowledge isn’t just a collection of isolated facts; it’s a network of interdependent truths. The method introduced by Huntsman and Thomas structures knowledge as a coherence graph, where:
- Nodes represent propositions (e.g., “The Hague is the capital of the Netherlands”).
- Edges represent consistency or inconsistency between those propositions.
If a proposition supports another, it gets a positive connection. If two statements contradict, they receive a negative connection. The goal? To maximize coherence by separating true and false statements into different clusters.
The problem of finding the most coherent partition of such a graph turns out to be mathematically equivalent to MAX-CUT, a well-known NP-hard optimization problem. Neurosymbolic AI tackles this by combining LLMs’ natural language understanding with CDI’s graph-based reasoning.
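To make the idea concrete, here is a minimal sketch of coherence maximization on a toy signed graph, reusing the capital-city propositions from earlier. The weights and the brute-force search are illustrative assumptions for small graphs, not the paper’s formulation or solver.

```python
from itertools import product

# Toy coherence graph: nodes are propositions; signed weights encode
# consistency (+) or contradiction (-) between pairs of propositions.
propositions = [
    "Amsterdam is the capital of the Netherlands",       # 0
    "The Hague is the capital of the Netherlands",       # 1
    "Amsterdam is the largest city in the Netherlands",  # 2
]
edges = [
    (0, 1, -1.0),  # rival capital claims contradict each other
    (0, 2, +0.5),  # a country's largest city is often its capital
]

def coherence(labels: tuple[bool, ...]) -> float:
    """Score a true/false labeling: a positive edge is satisfied when its
    endpoints share a label, a negative edge when they differ."""
    total = 0.0
    for i, j, w in edges:
        same = labels[i] == labels[j]
        total += abs(w) if (w > 0) == same else -abs(w)
    return total

# Brute force over all 2^n labelings. Maximizing coherence is equivalent
# to MAX-CUT on the signed graph, so exact search only scales to tiny n.
best = max(product([True, False], repeat=len(propositions)), key=coherence)
for prop, label in zip(propositions, best):
    print(label, "-", prop)
```

On this toy graph, the maximizer labels the two Amsterdam propositions true and the rival capital claim false, which is exactly the kind of cluster separation described above; realistic instances need approximate MAX-CUT solvers rather than brute force.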
The researchers’ approach takes inspiration from both psychology and computer science. CDI has been used to model human decision-making, legal reasoning, and even causal inference in science. But until now, CDI graphs had to be manually constructed.
To automate this process, the study proposes an algorithm (sketched in simplified form after this list) that:
- Transforms natural language propositions into a structured coherence graph.
- Prompts LLMs to reconstruct these graphs, testing their ability to identify relationships between facts.
- Benchmarks performance across different AI models, analyzing how well they preserve logical consistency.
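The generation side can be sketched in the same spirit. Everything below is an illustrative assumption rather than the paper’s code: a random signed graph stands in for the authors’ synthetic coherence graphs, and simple edge-sign agreement stands in for their benchmark metrics.

```python
import random

def random_coherence_graph(n: int, p: float = 0.4, seed: int = 0) -> dict:
    """Sample a random signed graph: each pair of nodes gets a +1 (support)
    or -1 (contradiction) edge with probability p."""
    rng = random.Random(seed)
    return {
        (i, j): rng.choice([+1, -1])
        for i in range(n) for j in range(i + 1, n)
        if rng.random() < p
    }

def edge_sign_accuracy(truth: dict, predicted: dict) -> float:
    """Fraction of ground-truth edges whose sign was recovered; a simple
    stand-in for a graph-reconstruction benchmark score."""
    if not truth:
        return 1.0
    hits = sum(predicted.get(pair) == sign for pair, sign in truth.items())
    return hits / len(truth)

truth = random_coherence_graph(n=6)
# In the real pipeline, `predicted` would come from an LLM that has read
# natural-language propositions generated from `truth`.
predicted = dict(truth)
print(edge_sign_accuracy(truth, predicted))  # 1.0 for a perfect model
```

A model’s score then reflects how much of the hidden logical structure it recovered from the text alone, rather than how fluent its answers sound.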
To test how well LLMs handle coherence-driven inference, the researchers generated synthetic coherence graphs and fed them into various AI models. These graphs contained both consistent and contradictory statements, challenging the models to identify logical structures rather than just regurgitate information.
They tested:
- Claude 3.5 and GPT-4o (high-end commercial LLMs)
- o1-mini and QwQ-32B (models optimized for reasoning; QwQ-32B is open-weight)
- Llama-3.3 (an open-weight general-purpose model)
- Phi-4 and Gemini 1.5/2.0 (smaller and mid-sized models)
The results showed that:
- Models optimized for reasoning (like o1-mini and QwQ-32B) significantly outperformed general-purpose LLMs.
- Some models successfully reconstructed the original coherence graph—even when faced with uncertain or ambiguous information.
- LLMs struggled with more complex logical structures, particularly when multiple propositions were interdependent.
This research is an important step toward truly intelligent AI. Instead of treating language as a statistical guessing game, coherence-driven inference forces AI to evaluate logical consistency, leading to:
- More reliable AI outputs (less hallucination, more accurate reasoning)
- Better explainability (AI decisions based on explicit logical structures)
- Enhanced problem-solving (applicable to law, science, and governance)
Featured image credit: Tara Winstead/Pexels