The Agent Has the Keys. Do You Know Which Doors It Can Open?

AI agents aren't scary because they're smart. They're scary because nobody has quantified what they can reach. Here's how to fix that.

Jason Walker

6 min read

Picture this: your security team flags an AI agent in a procurement workflow. It reads vendor data, pulls external threat intelligence, and sends follow-up emails autonomously. The CISO in the room says "that sounds risky." The business owner says "it saves forty hours a week." Everyone nods, nobody has a number, and the meeting ends with a vague commitment to "add some guardrails."

That is not a security decision. That is a coin flip with documentation.

I run enterprise cybersecurity across a large, complex state government. We have dozens of agencies, hundreds of thousands of devices, and a threat surface that would make most private-sector security leaders lose sleep. When AI agents started appearing in our environment, I watched the same pattern play out that I see everywhere: fear-based rejection on one side, reckless enthusiasm on the other, and almost nothing in the middle that resembled actual risk analysis.

The problem is not AI agents. The problem is that we have not built the habit of asking the right question. The right question is not "is this agent safe?" The right question is: "what can this agent reach, what can it do with that reach, and what does a bad outcome actually cost?"

Those are quantifiable questions. Let me show you what that looks like.

The reach problem

Every agent lives somewhere. It has a host environment, a set of tools it can call, data it can read, and actions it can take. Before you touch a risk framework, you have to map that geography honestly.

The FAIR Institute calls the assets at stake your crown jewels. I call what the agent can reach the blast radius. Before you evaluate any agent, you need to answer two questions: what data does this agent touch, and can it change state in the real world?

An agent that reads masked internal records and returns a ranked list has a fundamentally different risk profile from an agent that reads external web data, accesses your vendor database, and autonomously sends email on your behalf. The first one has a limited blast radius. The second one is wired directly into your legal exposure, your partner relationships, and your reputational standing. If it hallucinates or gets prompt-injected, that damage lands in the real world before any human sees it.

This is why the "Rule of Two" heuristic from Meta's AI safety work is one of the most useful thinking tools I have encountered. The principle is simple: an agent operating in a single session should satisfy no more than two of these three properties: processing untrustworthy external inputs, accessing sensitive internal data, and changing state or communicating externally. Hit all three simultaneously without a human checkpoint, and a successful prompt injection has everything it needs: a way in, something worth taking, and a way out. At that point, compromise is not a tail risk. It is the outcome you should plan for.
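A minimal sketch of that check in Python, assuming you have already mapped each agent's reach. The `AgentReach` record, its field names, and the example agents are hypothetical and used only for illustration. They are not part of Meta's published guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentReach:
    """Hypothetical record of what one agent session can touch."""
    name: str
    reads_untrusted_input: bool      # e.g. external web pages, inbound email
    accesses_sensitive_data: bool    # e.g. vendor database, unmasked records
    acts_externally: bool            # e.g. sends email, writes to external APIs
    human_checkpoint: bool = False   # a person approves state-changing actions


def rule_of_two_count(agent: AgentReach) -> int:
    """Count how many of the three properties the session satisfies."""
    return sum([
        agent.reads_untrusted_input,
        agent.accesses_sensitive_data,
        agent.acts_externally,
    ])


def needs_redesign(agent: AgentReach) -> bool:
    """True when all three properties hold and no human checkpoint exists."""
    return rule_of_two_count(agent) == 3 and not agent.human_checkpoint


# Illustrative agents only; the names and flags are assumptions.
ranker = AgentReach("internal-ranker", False, True, False)
procurement = AgentReach("procurement-agent", True, True, True)
procurement_gated = AgentReach(
    "procurement-agent-gated", True, True, True, human_checkpoint=True
)

for agent in (ranker, procurement, procurement_gated):
    print(agent.name, rule_of_two_count(agent), needs_redesign(agent))
```

The value here is less in the code than in the inventory it forces. You cannot fill in the three flags without knowing what the agent actually reads, touches, and sends.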

Aviation safety culture built its entire incident-reduction framework around this idea. You do not let a single point of failure propagate unchecked through the whole system. You build the interruption into the architecture. AI agents need the same thinking, and the good news is it is much simpler to implement than most engineers make it sound.

Fear is not a risk analysis

Here is what I keep seeing in government and enterprise environments alike: security teams produce a list of things that could go wrong with an AI agent, present it to leadership, and call it a risk assessment. It is not. It is a threat inventory. Those are completely different things.

A threat inventory tells you what bad things could happen. A risk analysis tells you how likely those bad things are, how much they would cost, and whether the cost of preventing them is justified by the reduction in expected loss.

When you run the numbers on an agentic workflow, the conversation changes completely. Say a third-party risk management (TPRM) agent scans hundreds of vendors daily, your estimated hallucination or injection rate produces a handful of malformed communications per week, and each one carries legal exposure in the range of fifty to two hundred fifty thousand dollars. You can now compute an annualized loss expectancy that a board member can evaluate against the cost of a human-in-the-loop control. That control might cost eighty thousand dollars to implement. The math is not hard. The math is the point.
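Here is one way that arithmetic could be sketched in Python. The exposure range and the control cost come from the scenario above. The event rate of three per week, the 1% chance that a malformed communication turns into an actual loss, and the 90% control effectiveness are assumptions I made up for illustration. Replace them with your own estimates.

```python
WEEKS_PER_YEAR = 52


def annualized_loss(events_per_week: float, p_loss: float, loss_per_event: float) -> float:
    """Annual event frequency x probability an event becomes a loss x loss magnitude."""
    return events_per_week * WEEKS_PER_YEAR * p_loss * loss_per_event


def net_benefit(ale: float, control_effectiveness: float, control_cost: float) -> float:
    """Expected loss avoided by the control minus what the control costs."""
    return ale * control_effectiveness - control_cost


# From the scenario in the text.
LOSS_LOW, LOSS_HIGH = 50_000, 250_000   # legal exposure per event, dollars
CONTROL_COST = 80_000                   # human-in-the-loop control, dollars

# Assumptions for illustration only (not from the text).
EVENTS_PER_WEEK = 3                     # "a handful" of malformed communications
P_LOSS = 0.01                           # share of events that become a real loss
CONTROL_EFFECTIVENESS = 0.9             # share of expected loss the control removes

for label, loss in (("low", LOSS_LOW), ("high", LOSS_HIGH)):
    ale = annualized_loss(EVENTS_PER_WEEK, P_LOSS, loss)
    net = net_benefit(ale, CONTROL_EFFECTIVENESS, CONTROL_COST)
    print(f"{label}: ALE ${ale:,.0f}, net benefit of control ${net:,.0f}")
```

Under these placeholder inputs the annualized loss runs from about $78,000 to $390,000. The control falls slightly short of paying for itself in the first year at the low end and clearly pays for itself at the high end. Change any assumption and the answer moves, which is exactly the conversation you want to be having.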

Without the math, you get one of two bad outcomes. Either the agent gets blocked because someone told a scary story, and you lose the productivity gain, or the agent gets deployed because someone told an optimistic story, and you absorb an avoidable loss. Neither outcome is defensible. Both of them happen every day.

Autonomy is a spectrum, not a switch

Not all agents carry the same risk, and treating them uniformly is how you get bad policy. A simple reflex agent that executes a single conditional rule is almost trivially auditable. A utility-based agent that selects actions to maximize a utility function across a complex environment is genuinely hard to predict, especially when it has access to tools that touch production systems.

The risk management posture for each of those should be different, and the controls you apply should be proportionate to the actual quantified exposure, not to how futuristic the agent sounds.

When I evaluate an AI deployment, I want to know the agent type before I know anything else. A goal-based agent reading from an internal CRM and producing a ranked list is not in the same category as a learning agent that evolves its behavior and has write access to external APIs. Conflating them in your policy is like applying the same access controls to a read-only reporting tool and a financial transaction system. The stakes are different. The controls should be different.

What good governance actually looks like

Good AI agent governance does three things well.

First, it maps the environment before evaluating the agent. You cannot assess risk in the abstract. You need to know the data, the tools, the access, and the downstream blast radius.

Second, it quantifies the exposure in dollars. Not in severity levels, not in red-yellow-green traffic lights. Dollars. Because dollars convert to budget conversations, and budget conversations produce decisions.

Third, it matches controls to the specific loss paths it identified, not to a generic "AI security checklist." If your highest-probability loss comes from tool call hijacking, you address execution guardrails. If it comes from internal misuse of a data-access agent, you address role-based access and PII masking at the data layer. The treatment follows the risk, not the other way around.
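To make the third point concrete, here is a rough Python sketch that ranks candidate controls by the expected loss they remove on the specific path they treat, net of their cost. The `LossPath` and `Control` types and every dollar figure and effectiveness value are hypothetical placeholders. Only the loss-path and control names come from the examples above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LossPath:
    name: str
    annual_expected_loss: float  # dollars per year, from your own analysis


@dataclass(frozen=True)
class Control:
    name: str
    addresses: str        # name of the loss path this control treats
    annual_cost: float    # dollars per year
    effectiveness: float  # fraction of that path's expected loss removed (0..1)


def rank_controls(paths, controls):
    """Return (control name, net annual benefit) pairs, best first."""
    loss_by_path = {p.name: p.annual_expected_loss for p in paths}
    scored = [
        # A control aimed at a path you never identified removes no expected loss.
        (c.name, loss_by_path.get(c.addresses, 0.0) * c.effectiveness - c.annual_cost)
        for c in controls
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Placeholder figures for illustration only.
paths = [
    LossPath("tool call hijacking", 400_000),
    LossPath("internal misuse of data-access agent", 120_000),
]
controls = [
    Control("execution guardrails", "tool call hijacking", 90_000, 0.7),
    Control("role-based access and PII masking",
            "internal misuse of data-access agent", 60_000, 0.8),
    Control("generic AI security checklist", "unmapped", 40_000, 0.5),
]

for name, net in rank_controls(paths, controls):
    print(f"{name}: net annual benefit ${net:,.0f}")
```

In this toy data the generic checklist lands at the bottom with a negative net benefit, because it is not tied to any loss path the analysis identified. That is the failure mode the third principle is meant to prevent.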

The discipline that separates effective risk management from security theater is the same one that separates profitable decision-making from noise: you have to be willing to put a number on what you are worried about, defend that number, and update it when you are wrong.

AI agents are not inherently dangerous. Ungoverned autonomy with reach into sensitive systems is dangerous. Quantify the reach. Map the blast radius. Apply proportionate controls. Then make the call.

That is the job.
