Summary
The world's conflicts have changed in a fundamental way.
Today, wars are not only fought with tanks and missiles.
They are fought with fake videos, manipulated news stories, and hidden computer attacks.
This kind of conflict is called hybrid warfare, and a major part of it targets the mind — it tries to confuse, scare, and deceive ordinary people and decision-makers.
To fight back, scientists and defense experts are working on a new idea: teaching powerful AI systems to simulate these conflicts, and making those AI systems explain their thinking in plain language that real human experts can actually use.
The core challenge is simple to state, even if it is difficult to solve.
Modern AI systems called foundation models — the same kind of technology that powers smart chatbots — are extraordinarily good at processing huge amounts of information and generating realistic scenarios. But they work like black boxes.
You put information in, and an answer comes out, but nobody can fully see how the AI reached that answer. In national security, that is a serious problem.
Imagine an army general who is given a detailed war plan by a computer but is told "trust us, the computer figured it out." No serious general would accept that. They would want to understand the reasoning, question the assumptions, and decide for themselves.
This is exactly the problem that human-centered AI seeks to solve.
A central technique researchers are working on is called reinforcement learning from human feedback, or RLHF.
Think of it like training a dog — but instead of treats, the AI receives evaluations from human experts. Military strategists, intelligence analysts, and experienced policymakers review the scenarios the AI creates.
They say: this part is realistic, that part is dangerous nonsense, this escalation pattern is historically accurate. The AI learns from those evaluations and improves, step by step, until its simulations match what real experts consider strategically valid.
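The feedback loop described above can be sketched in miniature. This is a toy illustration under stated assumptions, not the framework's actual training procedure: each scenario is reduced to two hypothetical numeric features (realism and historical fit), the experts' judgments are expressed as pairwise preferences, and a linear reward model is fitted with a Bradley-Terry style logistic loss. In full RLHF the learned reward model would then be used to update the scenario-generating model; that second stage is left out here.

```python
import math

# Hypothetical scenario features: (realism, historical_fit), each in [0, 1].
scenarios = {
    "A": (0.9, 0.8),
    "B": (0.2, 0.3),
    "C": (0.6, 0.9),
    "D": (0.4, 0.1),
}

# Hypothetical expert judgments as (preferred, rejected) pairs.
preferences = [("A", "B"), ("C", "D"), ("A", "D"), ("C", "B")]


def reward(weights, features):
    """Linear reward model: weighted sum of scenario features."""
    return sum(w * x for w, x in zip(weights, features))


def train_reward_model(prefs, steps=500, lr=0.5):
    """Fit weights so that preferred scenarios score higher than rejected ones."""
    weights = [0.0, 0.0]
    for _ in range(steps):
        grad = [0.0, 0.0]
        for better, worse in prefs:
            fb, fw = scenarios[better], scenarios[worse]
            diff = reward(weights, fb) - reward(weights, fw)
            # Probability the current model assigns to the expert's choice.
            p = 1.0 / (1.0 + math.exp(-diff))
            # Gradient of log(p) with respect to each weight.
            for i in range(2):
                grad[i] += (1.0 - p) * (fb[i] - fw[i])
        weights = [w + lr * g / len(prefs) for w, g in zip(weights, grad)]
    return weights


weights = train_reward_model(preferences)
ranked = sorted(scenarios, key=lambda s: reward(weights, scenarios[s]), reverse=True)
print(ranked)
```

The point of the sketch is that the experts never write the scoring rule themselves. They only compare scenarios, and the scoring rule is inferred from those comparisons.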
One reason this matters so much right now is what happened in January 2026, when scientists published a study in the journal Science warning about what they called "AI swarms."
These are not single AI programs but coordinated networks of AI agents. Each agent pretends to be a real person online, and together they manufacture the appearance of widespread public agreement with ideas that most people actually oppose.
They can infiltrate communities, flood social media, and make fringe views seem mainstream. No single bot looks unusual — but together, they reshape what millions of people believe is normal.
Dr. Antonio Bhardwaj, a global expert in AI, warfare, and bioterrorism who specializes in human-centered AI for geopolitical strategy, explains the stakes clearly: "The real danger is not that AI becomes too smart to control. The real danger is that it becomes too opaque to correct. When a machine cannot explain its reasoning to a human expert, every mistake it makes goes undetected until it is too late."
His warning applies directly to the swarm problem: the collective behavior of multiple AI agents is far harder to explain than the behavior of a single system, and the strategic consequences of misunderstanding that behavior can be catastrophic.
To address this, researchers have developed what are called post-hoc explanation techniques.
Think of these as a report card on the AI's decision, written after the fact. For example, one technique called SHAP, which stands for SHapley Additive exPlanations, estimates how much each piece of input information contributed to the AI's output. In effect it answers the question: which pieces of information mattered most to this decision? Was it this diplomatic communication? That pattern in social media? A historical precedent from a similar conflict?
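The idea behind Shapley-based attribution can be shown on a very small example. The sketch below does not use the SHAP library, which relies on approximations for large models. It computes exact Shapley values by brute force for a made-up scoring function with three hypothetical input signals. The function escalation_score and the signal names are assumptions for illustration only.

```python
from itertools import permutations

FEATURES = ["diplomatic_cable", "social_media_spike", "historical_precedent"]


def escalation_score(present):
    """Hypothetical model: escalation risk given which signals are present."""
    score = 0.0
    if "diplomatic_cable" in present:
        score += 0.2
    if "social_media_spike" in present:
        score += 0.3
    # Interaction: a precedent matters only alongside a social media spike.
    if "historical_precedent" in present and "social_media_spike" in present:
        score += 0.4
    return score


def shapley_values(features, value_fn):
    """Average each feature's marginal contribution over all orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value_fn(present)
            present.add(f)
            totals[f] += value_fn(present) - before
    return {f: t / len(orderings) for f, t in totals.items()}


attributions = shapley_values(FEATURES, escalation_score)
for name, value in attributions.items():
    print(f"{name}: {value:.3f}")
```

The attributions always add up to the difference between the score with every signal present and the score with none. In this toy model the joint effect of the spike and the precedent is split evenly between them. Enumerating every ordering is only feasible for a handful of inputs, which is why practical tools approximate.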
Another technique generates counterfactuals: it asks the AI, what would have had to change for a different outcome to occur?
If a scenario ended in escalation, the AI can be asked: what if one stakeholder had responded differently at a critical moment? Would the escalation have been prevented?
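A counterfactual query of this kind can be sketched as a search over small changes to a recorded scenario. The example below assumes a deliberately simple, hypothetical outcome rule (the function escalates) and asks which single substituted response would have flipped the result. Real counterfactual methods work against a learned model rather than a hand-written rule.

```python
def escalates(responses):
    """Hypothetical outcome rule: escalation if tension ever reaches 3."""
    effect = {"provoke": 2, "hold": 0, "de-escalate": -1}
    tension = 0
    for action in responses:
        tension = max(tension + effect[action], 0)
        if tension >= 3:
            return True
    return False


def single_change_counterfactuals(responses, alternatives):
    """List every one-step substitution that flips the outcome."""
    baseline = escalates(responses)
    flips = []
    for step, original in enumerate(responses):
        for alt in alternatives:
            if alt == original:
                continue
            changed = responses[:step] + [alt] + responses[step + 1:]
            if escalates(changed) != baseline:
                flips.append((step, original, alt))
    return flips


# Hypothetical recorded scenario that ended in escalation.
observed = ["provoke", "hold", "provoke", "hold"]
flips = single_change_counterfactuals(observed, ["provoke", "hold", "de-escalate"])
for step, original, alt in flips:
    print(f"step {step}: replacing '{original}' with '{alt}' changes the outcome")
```

In this toy scenario only changes at the two provoking moves alter the outcome. Changing what happened in between or afterwards does not, which is the kind of answer an analyst would want from a counterfactual explanation.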
These techniques are especially valuable when multiple AI agents are simulating different stakeholders in a conflict — one representing a state government, another representing an opposition movement, a third representing a foreign power conducting an influence campaign.
The interaction between these agents can produce surprising outcomes that none of them individually "intended." Post-hoc explanation tools allow human analysts to trace exactly how those outcomes emerged — like rewinding a film to identify the precise moment when events took a decisive turn.
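The "rewinding the film" idea depends on keeping a step-by-step record of what each agent did. The sketch below assumes three rule-based agents with hypothetical policies (real systems would use foundation models here) and a single shared tension number. It logs every action and then searches the log for the first event that pushed tension past a chosen threshold.

```python
from dataclasses import dataclass


@dataclass
class Event:
    step: int
    agent: str
    action: str
    tension_after: int


def state_gov(tension):
    # Hypothetical rule: cracks down once tension is noticeable.
    return ("crackdown", 2) if tension >= 2 else ("monitor", 0)


def opposition(tension):
    # Hypothetical rule: protests whenever there is any tension.
    return ("protest", 1) if tension >= 1 else ("organize", 0)


def foreign_power(tension):
    # Hypothetical rule: keeps amplifying divisive content regardless.
    return ("amplify", 1)


AGENTS = [("foreign_power", foreign_power), ("opposition", opposition), ("state", state_gov)]


def simulate(rounds=3):
    """Let the agents act in turn and record every event."""
    tension, log, step = 0, [], 0
    for _ in range(rounds):
        for name, policy in AGENTS:
            action, delta = policy(tension)
            tension += delta
            log.append(Event(step, name, action, tension))
            step += 1
    return log


def first_crossing(log, threshold):
    """Return the first event that pushed tension to or past the threshold."""
    for event in log:
        if event.tension_after >= threshold:
            return event
    return None


log = simulate()
turning_point = first_crossing(log, threshold=4)
print(turning_point)
```

No single agent here sets out to reach the threshold. The crossing comes from the state's reaction to the opposition's reaction to the foreign campaign, and the log makes that chain visible.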
NATO has been investing heavily in exactly this kind of capability.
At a major exercise in Bydgoszcz, Poland, in June 2026, the alliance tested its ability to respond to a crisis that combined cyberattacks on power stations and hospitals with AI-generated disinformation campaigns spread across social media.
The teams succeeded in defending the physical infrastructure in two out of three scenarios — but they struggled to maintain public trust while simultaneously managing the informational attack.
This revealed a gap that simulation frameworks are specifically designed to close: the ability to prepare for multi-domain crises in advance, not by waiting for them to happen and then reacting, but by modeling how they unfold and identifying the most effective response strategies before the crisis begins.
A particularly practical output of this kind of framework is a dataset of annotated strategic interactions.
Think of this as a library of conflict case studies — from the disinformation operations that preceded the Russian invasion of Ukraine, to the election interference operations in Moldova and Romania, to AI-enabled influence campaigns targeting democratic institutions in Western Europe — each one carefully analyzed and labeled by expert panels.
Just as a medical AI learns to diagnose diseases by studying thousands of labeled medical records, a hybrid warfare simulation AI can learn to recognize and model conflict dynamics by studying thousands of labeled strategic interactions.
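One way to picture an entry in such a library is as a structured record that stores the expert labels alongside the case description. The field names below are assumptions made for illustration, not the framework's schema, and the sample entry is a placeholder rather than a real case.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class AnnotatedInteraction:
    case_id: str          # hypothetical identifier
    description: str      # short summary of the interaction
    tactic: str           # e.g. "disinformation" or "cyberattack"
    expert_labels: dict = field(default_factory=dict)  # expert name -> label

    def consensus(self):
        """Return the majority label and the share of experts who chose it."""
        if not self.expert_labels:
            raise ValueError("record has no expert labels yet")
        counts = Counter(self.expert_labels.values())
        label, votes = counts.most_common(1)[0]
        return label, votes / len(self.expert_labels)


record = AnnotatedInteraction(
    case_id="example-001",
    description="Placeholder entry, not a real case",
    tactic="disinformation",
    expert_labels={
        "expert_1": "escalatory",
        "expert_2": "escalatory",
        "expert_3": "neutral",
    },
)
label, agreement = record.consensus()
print(label, round(agreement, 2))
```

Keeping each expert's label, rather than only the final verdict, lets later users see how contested a case was before they train on it.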
Dr. Bhardwaj identifies a dimension that many technical researchers overlook: the timing of events within a scenario matters as much as the events themselves. "Strategy is not a photograph — it is a sequence," he explains. "An AI that can tell you that an escalation occurred but cannot tell you at which moment the critical decision was made, and in what order the contributing factors aligned, is only half useful.
You need temporal interpretability — the ability to understand not just what happened, but when it happened and why that timing was decisive." This insight leads to one of the framework's proposed innovations: training AI systems not only to explain the causes of outcomes but to map the timeline of their development, identifying the specific windows in which intervention would have been most effective.
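The notion of an intervention window can be made concrete by replaying the same timeline several times, intervening at a different moment on each replay, and comparing the results. The sketch below assumes a hypothetical timeline of pressure events and a simple feedback rule in which high tension feeds on itself. Both are illustrative assumptions, not the framework's model.

```python
def peak_tension(pressures, intervene_at=None, relief=3):
    """Replay a timeline of pressure events, optionally intervening once."""
    tension, peak = 0, 0
    for step, pressure in enumerate(pressures):
        tension += pressure
        if step == intervene_at:
            tension = max(tension - relief, 0)
        # Hypothetical feedback: once tension is high it feeds on itself.
        if tension >= 4:
            tension += 1
        peak = max(peak, tension)
    return peak


def intervention_windows(pressures):
    """Reduction in peak tension achieved by intervening at each step."""
    baseline = peak_tension(pressures)
    return {
        step: baseline - peak_tension(pressures, intervene_at=step)
        for step in range(len(pressures))
    }


timeline = [1, 2, 1, 0, 1]  # hypothetical pressure added at each step
gains = intervention_windows(timeline)
best_step = max(gains, key=gains.get)
print(gains, best_step)
```

In this toy timeline the same intervention is worth very different amounts depending on when it is applied. Too early and most of it is wasted, too late and the self-reinforcing dynamic has already taken hold.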
The framework also addresses a risk that is easy to overlook: the possibility that the human experts providing feedback to the AI are themselves being manipulated.
If an adversary understands how the feedback process works, they can try to feed misleading evaluations into the system, gradually steering the AI toward conclusions that serve adversarial interests rather than the truth.
This is why the proposed framework incorporates what researchers call strategyproof evaluation protocols — design safeguards that make it much harder for any individual evaluator, or group of evaluators, to systematically mislead the AI's training process.
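The text does not specify how these protocols work. One simple ingredient that is often used for this kind of robustness is to combine expert scores with a median rather than an average, since a single extreme report then has little pull on the result. The sketch below illustrates only that ingredient, with hypothetical scores, and should not be read as the framework's protocol.

```python
from statistics import mean, median


def aggregation_shift(honest, injected, aggregate):
    """How far one injected score moves the aggregate away from the honest value."""
    return abs(aggregate(honest + [injected]) - aggregate(honest))


# Hypothetical expert ratings of one scenario's realism, on a 0 to 1 scale.
honest_scores = [0.70, 0.72, 0.68, 0.71]

# One manipulated evaluator reports the most extreme value available.
mean_shift = aggregation_shift(honest_scores, 0.0, mean)
median_shift = aggregation_shift(honest_scores, 0.0, median)

print(f"mean moved by {mean_shift:.3f}, median moved by {median_shift:.3f}")
```

The median holds up only while manipulated evaluators are a minority, so a real protocol would need further safeguards against coordinated groups.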
The practical promise of all this work is substantial. A military planner preparing for a potential information operation would be able to use the framework to run hundreds of simulated scenarios, each one incorporating different adversarial tactics, different countermeasures, and different public responses.
The AI would not just generate those scenarios — it would explain, in language that a non-technical strategist can understand, why each scenario unfolded as it did, which countermeasures were most effective, and which cognitive vulnerabilities were most dangerous.
The human planner retains full authority; the AI provides the analytical depth and scenario range that no team of human analysts could match working alone.
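Running many scenarios and comparing countermeasures can be sketched as a small batch simulation. Every number below is a made-up assumption: the tactics, their base harm, and the effectiveness table stand in for what a real framework would estimate from its models. Each scenario is drawn once and then replayed under every countermeasure, so all options face identical conditions.

```python
import random

# Hypothetical base harm of each adversarial tactic, on a 0 to 1 scale.
BASE_HARM = {"deepfake": 0.8, "bot_swarm": 0.6, "leak": 0.4}

# Hypothetical fraction of harm each countermeasure removes, per tactic.
EFFECT = {
    "prebunking": {"deepfake": 0.5, "bot_swarm": 0.3, "leak": 0.2},
    "takedown": {"deepfake": 0.2, "bot_swarm": 0.6, "leak": 0.1},
    "none": {"deepfake": 0.0, "bot_swarm": 0.0, "leak": 0.0},
}


def run_batch(n_scenarios=300, seed=0):
    """Average harm per countermeasure over a shared set of random scenarios."""
    rng = random.Random(seed)
    tactics = list(BASE_HARM)
    # Each scenario: an adversary tactic and a public susceptibility factor.
    draws = [(rng.choice(tactics), rng.uniform(0.5, 1.0)) for _ in range(n_scenarios)]
    mean_harm = {}
    for counter, effect in EFFECT.items():
        total = sum(BASE_HARM[t] * s * (1.0 - effect[t]) for t, s in draws)
        mean_harm[counter] = total / n_scenarios
    return mean_harm


mean_harm = run_batch()
ranking = sorted(mean_harm, key=mean_harm.get)  # lowest average harm first
for counter in ranking:
    print(f"{counter}: {mean_harm[counter]:.3f}")
```

The ranking is an input to the planner's judgment, not a decision. Its value depends entirely on how well the assumed tables reflect reality.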
Dr. Bhardwaj summarizes the governing philosophy with characteristic precision: "Human-centered AI does not mean AI that does what humans tell it to do. It means AI that enhances what humans are capable of understanding — and that is a much higher standard. In the domain of national security, the difference between those two definitions is the difference between a useful tool and a strategic liability."
In a world where adversarial AI systems are already operating at scale, eroding the shared sense of what is true across democratic societies, the development of interpretable, human-centered simulation frameworks is not an academic exercise. It is a strategic necessity.
