08 December, 2025

AI agents are already transforming how organizations operate: automating tasks, making decisions, and scaling processes. But as they move from copilots to production, they introduce new risks, especially hallucinations: outputs that are confidently wrong, misleading, or invented.
Current operational risk frameworks aren’t built to capture this behavior. To keep risk at acceptable levels, we need to treat hallucination as a distinct risk factor, just like human error or system failure. Only then can we design targeted controls that detect, contain, and mitigate this new class of operational risk.
In this article, I explore how existing risk frameworks should adapt to the realities of AI-driven operations: why non-determinism changes the picture, where hallucination fits as a risk factor, how to assess it, which controls reduce it, and what level of residual risk is worth accepting.
IT systems have traditionally relied on deterministic behavior. Processes are designed to be predictable, repeatable, and traceable. Failures can usually be reproduced and contained because the underlying logic doesn’t change.
AI agents shift that foundation. With large language models, non-determinism is built in. The same input can yield different outputs. Behavior varies depending on training data, prompts, or context. For the first time, non-deterministic IT behavior is being embedded at scale into core operations, challenging assumptions about control, auditability, and reliability. Traditional risk controls weren’t designed for this.
To manage the shift, we may need to treat non-determinism as an operational characteristic in its own right and tune the existing risk management frameworks accordingly.
ISO 31000 provides a universal foundation for enterprise risk management. It defines how to integrate risk thinking across the organization. Additional regulations and frameworks add specificity, particularly around risk factors and controls. These frameworks were designed around deterministic systems and human-driven processes. They don’t account for AI agents generating variable outputs, or for hallucination as a systemic failure mode. To address these emerging risks, the frameworks need to adapt. The structure is there; we need to extend it to cover the risks introduced by non-deterministic AI in production environments.
Operational risk is the risk of loss resulting from inadequate or failed internal processes, people, systems, or external events. It covers what happens when the organization’s operations do not perform as intended. This includes human error, system failure, process design flaws, and external disruptions. In mature (and heavily regulated) sectors like banking, it sits alongside credit and market risk as one of the core risk management categories. The focus is on continuity, reliability, and control of operations.
Until now, AI-related failures have been treated primarily as a security (integrity) problem. The assumption has been that an incorrect or fabricated output represents a data quality issue. This framing, in my opinion, is incomplete. Hallucination is not a breach, nor a manipulation of data. It is an operational failure that occurs inside a functioning system. The risk emerges not from external interference, but from how the system reasons and produces outcomes within normal operation.
The Basel regulatory framework defines how operational risk is quantified and managed for capital adequacy purposes. It already includes technological failures as a recognized category. This typically covers system outages, software bugs, hardware faults, and other infrastructure issues. These failures are usually deterministic, traceable, and can be addressed with traditional IT controls. In most cases, they involve clear disruptions to service or identifiable defects in system behavior. This category has worked well for managing risk in conventional, rule-based environments.
Hallucination does not fit cleanly into this model. It occurs when an AI system generates an output that is factually wrong but delivered with high confidence. There are no system crashes, no errors, and no alerts. From an infrastructure perspective, the system has not failed. But from an operational standpoint, the output can be misleading or inaccurate. This kind of failure is probabilistic, not deterministic. Grouping hallucination under the existing category of technological failure underestimates its nature and impact. To manage it effectively, maybe it needs to be treated as a distinct risk factor with its own controls, detection methods, and thresholds.
The more you integrate AI into operations, the more you realize that hallucination behaves like human error. It produces outputs that are wrong, not because the system is broken, but because it lacks full understanding or grounding, much like a human making an incorrect judgment call with incomplete information. The model is functioning as intended, just as a trained employee might follow procedure and still reach the wrong conclusion. In both cases, the failure is not structural, but cognitive.
This is ironic, because AI agents are often marketed as a way to reduce human error: improving consistency, scaling decision-making, and automating tasks previously handled by people. Yet in doing so, they introduce a new form of error that mimics the very thing they aim to eliminate. If operational risk frameworks already account for human error, there’s a case to be made that hallucination should be analyzed and mitigated under a similar lens.
Let's review how the standard risk assessment method applies to AI hallucination.
Probability x Impact = Inherent Risk
This establishes the baseline exposure to risk before applying any controls.
Consider a task like summarizing a report. A human analyst might misread a section or overlook a detail. Industry studies in fields such as data entry, auditing, and spreadsheet analysis suggest that error rates for routine cognitive tasks often fall between 1% and 5%, depending on task complexity and fatigue. [1] [2]
Now replace that analyst with an AI agent powered by a language model. On the surface, it performs the task instantly and fluently. But when left unchecked, the model might generate summaries that include fabricated details, misattribute causes, or omit key facts. [3]
These errors are not rare. Depending on how the model is used and whether retrieval or grounding is applied, hallucination rates can range from 5 to 30 percent or more. In some cases, the output can be entirely wrong while still appearing confident and well-structured. [4]
The key difference is detection. Human errors are often easier to catch in peer review because they are subtle or based on known behavior patterns. Hallucinations, on the other hand, often look correct unless explicitly verified. This makes them harder to catch and potentially more dangerous in automated workflows.
If operational risk assessments treat human error as a standard risk factor with controls like training, oversight, and review, then AI hallucinations must be given equivalent treatment. Not just technically, but procedurally. Otherwise, the illusion of automation may hide a higher net error rate than when humans were in the loop.
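To make that comparison concrete, here is a minimal sketch of the inherent risk calculation. The error rates are illustrative figures taken from the ranges above, and the impact score on a 1-5 scale is an assumption, not a benchmark.

```python
# Illustrative inherent risk calculation (Probability x Impact).
# Error rates come from the ranges cited above; the impact score
# on a 1-5 scale is an assumed value, not a benchmark.

def inherent_risk(probability: float, impact: float) -> float:
    """Baseline exposure before any controls are applied."""
    return probability * impact

impact_of_bad_summary = 4        # assumed severity of a flawed report summary
human_error_rate = 0.03          # ~1-5% for routine cognitive tasks
hallucination_rate = 0.15        # ~5-30% for unchecked, ungrounded LLM output

print("Human analyst:", inherent_risk(human_error_rate, impact_of_bad_summary))   # 0.12
print("AI agent:     ", inherent_risk(hallucination_rate, impact_of_bad_summary)) # 0.60
```

Before controls, the unchecked agent carries several times the exposure of the human analyst, even though each individual output looks fluent.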
In traditional operations, responsibility for errors is typically clear. The owner of the process is accountable for both the outcomes and the performance of the human workforce. If something goes wrong, it’s usually a question of whether the task was delegated correctly, supervised appropriately, or executed within expectations. Operational risk is managed close to the business, where the process and the people are aligned under the same owner.
AI agents complicate that model. Once a process is automated with a non-human agent, the business owner may no longer understand the system in detail. Responsibility begins to shift away from operations and toward IT, data, or AI platform teams. These teams manage the models, the orchestration logic, and the deployment pipeline. But they don’t own the business outcome. This disconnect creates a gap in accountability.
Without clear governance, no one owns the full picture. The business expects the AI to behave like an employee. The technical team sees it as infrastructure. In practice, both are partially right, and both are wrong. Organizations need to define new shared ownership models, where the process owner, the AI team, and control functions are aligned on how risk is managed, who approves deployment, and who intervenes when failure occurs.
Once the inherent risk of hallucination is established, the next step is treatment. In risk management terms, this means identifying which controls can reduce either the probability of occurrence, the impact of the event, or both.
Probability (after controls) x Impact (after controls) = Residual Risk
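As a rough illustration, continuing the numbers from the summarization example, the sketch below shows how controls that reduce the probability and controls that reduce the impact combine into residual risk. The reduction factors are assumptions for illustration only, not measured control effectiveness.

```python
def residual_risk(probability: float, impact: float,
                  prob_reduction: float, impact_reduction: float) -> float:
    """Exposure remaining after controls act on probability and impact."""
    return (probability * (1 - prob_reduction)) * (impact * (1 - impact_reduction))

# Hypothetical control effectiveness: grounding and constrained prompts cut
# the hallucination rate by 60%; review and rollback cut the impact by 50%.
print(residual_risk(0.15, 4, prob_reduction=0.60, impact_reduction=0.50))  # 0.12
# Inherent risk was 0.15 * 4 = 0.60; residual risk is 0.06 * 2.0 = 0.12.
```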
Managing hallucination risk requires a shift in how we apply controls. Traditional IT controls focus on access, availability, and failure containment. They assume the system either works or it doesn’t. With AI, especially non-deterministic systems like language models, the system works but the output may still be wrong. This calls for a different control strategy, focused on the behavior and consequences of model-generated outputs. Applying classic risk management theory, we can structure these controls into three categories: preventive, detective, and corrective.
Preventive controls reduce the likelihood that hallucination will occur. One of the most effective is retrieval augmented generation. RAG forces the model to generate answers based on externally retrieved, trusted information instead of relying solely on pre-trained knowledge. This grounds the response in real data and constrains the model’s freedom to invent content.
Another preventive method is context engineering. This includes careful design of prompts, structured inputs, defined task boundaries, and use of system messages to reinforce constraints. The goal is to reduce ambiguity and push the model toward precision. Unlike static programming, context becomes a dynamic risk control layer that guides the model’s behavior.
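To show what these two preventive controls can look like in practice, here is a minimal sketch that grounds a prompt in retrieved passages and wraps it in constraining instructions. The document store, the retrieve() helper, and the wording of the constraints are hypothetical placeholders; a real deployment would use its own retrieval layer and model client.

```python
# Sketch of a grounded, constrained prompt. TRUSTED_DOCS and retrieve()
# are hypothetical stand-ins for a real retrieval layer (e.g. a vector index).

TRUSTED_DOCS = {
    "Q3 report": "Revenue grew 4% quarter over quarter; churn was flat at 2.1%.",
    "Review policy": "Summaries of financial reports require a second reviewer.",
}

def retrieve(query: str) -> list[str]:
    """Naive keyword retrieval; a production system would use semantic search."""
    words = query.lower().split()
    return [text for text in TRUSTED_DOCS.values()
            if any(word in text.lower() for word in words)]

def build_grounded_prompt(task: str) -> str:
    context = "\n".join(f"- {passage}" for passage in retrieve(task))
    constraints = (
        "Answer ONLY from the context below. "
        "If the context does not contain the answer, reply 'Not found in sources.' "
        "Do not add figures, names, or facts that are not in the context."
    )
    return f"{constraints}\n\nContext:\n{context}\n\nTask: {task}"

print(build_grounded_prompt("Summarize revenue growth in the Q3 report"))
```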
Agentic workflows introduce procedural boundaries for how AI agents operate in production environments. These workflows define task delegation, escalation rules, fallback options, and approval points. Preventive control here is achieved by limiting the scope of autonomous actions and ensuring that high-impact steps are either verified or deferred. This makes hallucination less likely to enter critical paths.
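A minimal sketch of such a procedural boundary, assuming a hypothetical list of high-impact actions and a human approval hook; real orchestration frameworks expose their own equivalents.

```python
# Sketch of an approval gate inside an agentic workflow. The impact tiers and
# the approve() hook are hypothetical; real orchestrators provide their own.

HIGH_IMPACT_ACTIONS = {"send_customer_email", "post_journal_entry", "update_contract"}

def requires_human_approval(action: str) -> bool:
    return action in HIGH_IMPACT_ACTIONS

def execute_step(action: str, payload: dict, approve=input) -> str:
    if requires_human_approval(action):
        answer = approve(f"Approve '{action}' with {payload}? [y/N] ")
        if answer.strip().lower() != "y":
            return "escalated"   # deferred to the process owner
    # Low-impact steps run autonomously within their defined scope.
    return "executed"

print(execute_step("draft_internal_note", {"topic": "Q3 summary"}))  # executed
```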
Detective controls are designed to identify when a hallucination has occurred. The most basic form is human review. In any workflow where accuracy matters, inserting human oversight can help catch outputs that appear fluent but are incorrect. This is especially important in early-stage deployments or high-risk processes. The second is proper process documentation and design: how can you verify an agentic workflow built on a process that nobody fully understands?
Automated output analysis is another form of detection. Models can be monitored for certain linguistic or structural patterns that indicate low confidence or speculative output. In some cases, secondary models or classifiers can be used to detect anomalies in generated content. These detectors do not eliminate hallucination, but they flag it for review before the impact spreads.
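One simple, non-authoritative way to implement such a detector is a grounding check that flags generated sentences with little lexical overlap with the retrieved sources. The 0.5 threshold below is an arbitrary assumption; production systems would more likely use an entailment model or a trained classifier.

```python
# Naive grounding detector: flags generated sentences whose words barely
# overlap with the retrieved sources. The 0.5 threshold is an arbitrary choice.

def ungrounded_sentences(output: str, sources: list[str],
                         threshold: float = 0.5) -> list[str]:
    source_words = set(" ".join(sources).lower().split())
    flagged = []
    for sentence in output.split("."):
        words = set(sentence.lower().split())
        if not words:
            continue
        overlap = len(words & source_words) / len(words)
        if overlap < threshold:
            flagged.append(sentence.strip())
    return flagged

sources = ["Revenue grew 4% quarter over quarter; churn was flat at 2.1%."]
output = ("Revenue grew 4% quarter over quarter. "
          "The growth was driven by the new Berlin office.")
print(ungrounded_sentences(output, sources))
# ['The growth was driven by the new Berlin office']  <- candidate hallucination
```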
Logging and traceability are also key. Capturing model inputs, outputs, and retrieval sources allows for post hoc analysis and root cause investigation. When a hallucination leads to operational impact, logs enable fast identification of what the model said, why it said it, and what conditions triggered the error. This builds accountability and transparency into the system.
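A minimal sketch of what such a trace record could contain; the field names are illustrative, not a standard schema.

```python
# Sketch of a trace record for post hoc analysis; field names are illustrative.
import json
import uuid
from datetime import datetime, timezone

def log_generation(task: str, prompt: str, output: str,
                   sources: list[str], model_version: str) -> dict:
    record = {
        "trace_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "task": task,
        "prompt": prompt,                # what the model was asked
        "output": output,                # what the model said
        "retrieval_sources": sources,    # what the answer was grounded on
        "model_version": model_version,  # needed to reproduce behavior
    }
    print(json.dumps(record))            # in practice: append to a log store
    return record

log_generation("report_summary", "Summarize the Q3 report",
               "Revenue grew 4%.", ["Q3 report"], "model-v1")
```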
Corrective controls limit the impact of hallucinations after they occur. One approach is to design workflows that allow for rollback, interruption, or downstream correction. For example, if an AI-generated summary or decision is found to be flawed, there should be a defined procedure to reverse or override it, ideally before it affects customers or compliance outcomes.
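As a sketch of that idea, the step below keeps its prior state so an AI-generated result can be rolled back before it propagates; the structure is illustrative, not a prescribed pattern.

```python
# Sketch of a reversible workflow step: keep the prior state so a flawed
# AI-generated result can be rolled back or overridden before it propagates.

class ReversibleStep:
    def __init__(self, current: str):
        self.current = current
        self.history: list[str] = []

    def apply(self, ai_output: str) -> None:
        self.history.append(self.current)
        self.current = ai_output              # AI result goes live provisionally

    def rollback(self) -> None:
        if self.history:
            self.current = self.history.pop() # restore the last known-good state

summary = ReversibleStep(current="Draft pending review")
summary.apply("Q3 revenue fell 12%")          # later found to be hallucinated
summary.rollback()
print(summary.current)                        # Draft pending review
```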
Another corrective measure is feedback integration. Systems should be built to incorporate feedback loops where users can flag incorrect outputs. This information should feed into model updates, prompt tuning, or trigger rule adjustments in the agentic layer. Over time, this reduces the recurrence of similar hallucinations.
Finally, corrective action must include policy-level adjustments. If hallucination is not treated as a one-off error but as a repeatable risk, then controls like access restrictions, retraining thresholds, or scope limitations are useful. These policies help ensure that when hallucination is detected, the organization responds systematically: not just by fixing the instance, but by reducing the future exposure.
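Bringing the last two points together, here is a sketch of a feedback loop that drives a policy-level response: once the flagged rate for a task exceeds a tolerance threshold, the task is restricted until the process owner re-approves it. The counters, the 20-output minimum, and the 5% threshold are illustrative assumptions.

```python
# Sketch of a policy-level response: track user flags per task and restrict
# the agent's scope once the flagged rate exceeds an (illustrative) tolerance.
from collections import defaultdict

TOLERANCE = 0.05                  # assumed acceptable flagged-output rate
outputs = defaultdict(int)        # task -> outputs generated
flags = defaultdict(int)          # task -> outputs flagged as incorrect
restricted_tasks = set()

def record_feedback(task: str, flagged: bool) -> None:
    outputs[task] += 1
    if flagged:
        flags[task] += 1
    if outputs[task] >= 20 and flags[task] / outputs[task] > TOLERANCE:
        restricted_tasks.add(task)    # require human review until re-approved

for i in range(30):
    record_feedback("report_summary", flagged=(i % 10 == 0))  # ~10% flagged
print(restricted_tasks)           # {'report_summary'}
```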
No control eliminates risk entirely. The goal is to reduce it to a level the organization is willing to accept. AI systems based on LLMs will always produce some level of error. Even with RAG, prompt tuning, agentic constraints, and validation layers, hallucinations are not fully preventable. The issue is not whether they happen, but how frequently, under what conditions, and with what impact.
With human workers, organizations already accept a degree of error. Mistakes are expected, tracked, and often normalized based on task complexity or volume. Yet AI is held to a higher standard. A single visible hallucination can trigger distrust, even if the overall performance is better than human averages. This asymmetry creates unrealistic expectations and distorts risk decisions. We tolerate human inconsistency but expect machines to be flawless.
That mindset is counterproductive. If AI agents are going to operate at scale, we need to manage their risk with the same standards we apply to humans: impact, reversibility, frequency, and control coverage. Risk criteria need to be explicit and revisited as systems evolve. Otherwise, we create a double standard: holding AI to perfection while running human operations on managed imperfection. That disconnect blocks innovation.
The long-term value of AI does not come from novelty or automation alone. It comes from controlled scale. Once hallucination is managed with the right controls, AI agents can execute repetitive, high-volume tasks with consistency and speed far beyond human capacity. This unlocks a new level of operational efficiency that isn’t possible with human labor alone.
Unlike humans, AI agents do not fatigue, do not introduce variability due to distraction or stress, and can operate 24/7 across languages, time zones, and systems. When grounded and constrained properly, they can produce structured outputs, interact with APIs, and follow defined workflows without deviation. This is not theoretical. It is already happening in areas like support triage, internal reporting, and system orchestration.
The key is trust. Without proper controls, AI agents are unpredictable. But once outputs are observable, verifiable, and recoverable, their operational utility becomes clear.
AI risk management isn’t only about tuning models or adding guardrails; it’s about managing the agents’ role within operations. When hallucination is treated as part of the organization’s operational reality, AI becomes dependable infrastructure, not experimental technology.
[1] https://arxiv.org/abs/0801.0715
[2] https://www.nrc.gov/docs/ML0425/ML042590193.pdf
[3] https://fortune.com/2025/10/07/deloitte-ai-australia-government-report-hallucinations-technology-290000-refund/
[4] https://arxiv.org/html/2502.12769v2