According to a study conducted by Michael Walters (Gaia Lab, Nuremberg, Germany), Rafael Kaufmann (Primordia Co., Cascais, Portugal), Justice Sefas (University of British Columbia, B.C., Canada), and Thomas Kopinski (Gaia Lab, Fachhochschule Südwestfalen, Meschede, Germany), a new physics-inspired approach to AI safety could make multi-agent systems, such as fleets of autonomous vehicles, significantly safer.
Their paper, “Free Energy Risk Metrics for Systemically Safe AI: Gatekeeping Multi-Agent Study”, introduces a new risk measurement method that improves decision-making in AI systems by predicting risks ahead of time and taking preventive action.
What is the Free Energy Principle (FEP) and why does it matter?
At the heart of their research is the Free Energy Principle (FEP), a concept from computational neuroscience that borrows the mathematics of free energy from statistical physics. In simple terms, the FEP describes how a system balances accuracy (an “energy” term that rewards explaining its observations well) against simplicity (an “entropy” term that keeps its internal beliefs from becoming overcommitted) when making predictions.
Think of it like this: an AI system trying to navigate the world must strike a balance between gathering detailed information and acting efficiently. If the system is too complex, it becomes difficult to manage; if it’s too simple, it may overlook critical risks. The authors use this principle to create a new risk metric that avoids the need for vast amounts of data or overly complicated models, making AI safety more practical and transparent.
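For readers who want the underlying mathematics, the trade-off can be written down explicitly. The display below is the standard variational free energy decomposition from the FEP literature, shown for orientation rather than quoted from the paper:

```latex
% Variational free energy for beliefs q(s) about hidden states s, given observations o.
% Minimising F balances fitting the data (energy / accuracy) against keeping
% beliefs simple (entropy / complexity).
F = \underbrace{\mathbb{E}_{q(s)}\left[-\ln p(o, s)\right]}_{\text{energy}}
    - \underbrace{\mathrm{H}\left[q(s)\right]}_{\text{entropy}}
  = \underbrace{D_{\mathrm{KL}}\left[q(s)\,\|\,p(s)\right]}_{\text{complexity}}
    - \underbrace{\mathbb{E}_{q(s)}\left[\ln p(o \mid s)\right]}_{\text{accuracy}}
```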
Cumulative Risk Exposure (CRE) is a smarter way to measure risk
The researchers propose a new risk measurement system called Cumulative Risk Exposure (CRE).
How is CRE different?
- Unlike traditional risk models, which rely on extensive world models, CRE lets stakeholders define what “safe” means by specifying preferred outcomes.
- This makes decision-making transparent and flexible, as the system adapts to different environments and needs.
- Instead of relying on large volumes of sensor data, CRE estimates risk through predictive simulations over short time horizons.
CRE provides a more efficient and adaptable way to assess risk in AI-driven systems, reducing reliance on resource-intensive calculations.
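As a rough illustration of the idea, the Python sketch below estimates a CRE-style score by sampling short simulated futures and counting how often they leave a stakeholder-defined set of preferred outcomes. Every name here (the rollout model, the preference test, the horizon and rollout counts) is a hypothetical placeholder, not the paper's actual formulation:

```python
import random

def rollout(state, step, horizon):
    """Simulate one short future by repeatedly applying a (stochastic)
    one-step transition model. Both arguments are illustrative placeholders."""
    trajectory = []
    for _ in range(horizon):
        state = step(state)
        trajectory.append(state)
    return trajectory

def cumulative_risk_exposure(state, step, is_preferred, horizon=5, n_rollouts=100):
    """Estimate risk as the fraction of simulated short-horizon futures
    that leave the stakeholder-defined set of preferred outcomes."""
    unsafe = sum(
        any(not is_preferred(s) for s in rollout(state, step, horizon))
        for _ in range(n_rollouts)
    )
    return unsafe / n_rollouts  # 0.0 = always within preferences, 1.0 = never

# Toy example: stakeholders define "preferred" as keeping a minimum gap to the car ahead.
if __name__ == "__main__":
    def noisy_gap_model(gap):
        return gap + random.uniform(-2.0, 1.0)  # toy dynamics: the gap tends to shrink

    risk = cumulative_risk_exposure(
        state=10.0,
        step=noisy_gap_model,
        is_preferred=lambda gap: gap > 2.0,
    )
    print(f"Estimated CRE: {risk:.2f}")
```

The design point this captures is that stakeholders only have to say which outcomes they prefer; the risk score then comes from cheap, short-horizon simulation rather than from an exhaustive world model.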
Gatekeepers: AI that steps in before things go wrong
To apply the CRE metric in real-world scenarios, the researchers introduce gatekeepers—modules that monitor AI decisions and intervene when necessary.
How do gatekeepers work?
- In the case of autonomous vehicles, gatekeepers constantly simulate possible future scenarios to determine risk.
- If they detect an unsafe outcome, they override the vehicle’s current driving mode and switch it to a safer behavior (see the sketch after this list).
- This allows AI systems to anticipate dangers before they happen rather than reacting after the fact.
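A minimal sketch of that decision loop, assuming a generic short-horizon risk estimator such as the CRE function sketched above, might look as follows. The threshold value and function names are illustrative, not the authors’ implementation:

```python
RISK_THRESHOLD = 0.2  # illustrative cut-off, not a value from the paper

def gatekeeper_step(state, nominal_behaviour, cautious_behaviour, estimate_risk):
    """Score the nominal behaviour with a short-horizon risk estimate and
    fall back to the cautious behaviour when predicted risk is too high."""
    risk = estimate_risk(state, nominal_behaviour)
    if risk > RISK_THRESHOLD:
        return cautious_behaviour(state)  # intervene before the unsafe outcome materialises
    return nominal_behaviour(state)       # otherwise the nominal policy stays in control
```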
Simulating safer roads with autonomous vehicles
The study tested this model in a simulated driving environment. The researchers divided vehicles into two groups:
- “Egos” – Vehicles that could be monitored and controlled by gatekeepers.
- “Alters” – Background vehicles with fixed, pre-set driving behavior.
In this highway simulation, only some of the Ego vehicles were actually placed under gatekeeper control, while the rest were left to drive without it.
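The structure of such an experiment can be sketched in a few lines: vary the share of Ego vehicles that receive a gatekeeper, run many episodes, and compare collision rates. The simulator interface below is hypothetical and simply stands in for the dedicated driving simulation used in the study:

```python
def run_experiment(simulator, gatekeeper_fractions, episodes=100):
    """For each adoption level, run repeated episodes and record the
    average number of collisions (hypothetical simulator interface)."""
    results = {}
    for fraction in gatekeeper_fractions:
        collisions = 0
        for _ in range(episodes):
            episode = simulator.run(gatekeeper_fraction=fraction)
            collisions += episode.collision_count
        results[fraction] = collisions / episodes
    return results  # maps adoption level -> mean collisions per episode
```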
Key findings:
- Even when only a small number of vehicles were under gatekeeper control, overall road safety improved.
- Fewer collisions occurred, showing that proactive intervention made a measurable difference.
- Vehicles maintained high speeds when safe but switched to cautious driving when risk levels rose.
The results suggest that even partial adoption of gatekeeper-controlled AI could lead to safer traffic conditions without compromising efficiency. While the study focused on autonomous vehicles, the CRE and gatekeeper model could apply to many other AI-driven fields.
Potential applications include:
- Robotics: Ensuring that AI-powered robots work safely alongside humans.
- Financial trading systems: Predicting high-risk market movements and adjusting strategies.
- Industrial automation: Preventing AI-controlled machinery from making unsafe decisions.
Featured image credit: Kerem Gülen/Midjourney