The Single-Point Failure Problem: Why Facial Recognition Needs an Aviation Checklist Before It Can Authorize an Arrest
A state CISO and risk researcher argues law enforcement AI needs aviation-style independent verification before any algorithm can deprive someone of liberty.
Jason Walker
State CISO, Florida
Picture the scenario: you are flying a helicopter at night, low altitude, and your barometric altimeter disagrees with your radar altimeter by 200 feet. You do not pick the one you prefer. You do not act on either until you resolve the discrepancy. You cross-check. You verify. You treat the unresolved conflict as a hard stop until you have independent confirmation. The reason is simple: the cost of being wrong is irreversible, and irreversibility fundamentally changes the acceptable decision threshold.
That principle is not just good airmanship. It is the foundational logic that every serious governance framework for high-stakes automated decisions should borrow directly, and almost none of them do.
Here is what most people get wrong about the facial recognition debate. They frame it as a technology accuracy problem. The conversation defaults to error rates, demographic bias in training data, and vendor accountability. All of that matters, but it treats the symptom. The actual problem is a chain-of-custody failure at the decision layer. The algorithm is not making the arrest. A human authorized an arrest warrant based on an unverified algorithmic output, and the process that surrounded that human allowed a single-point failure to propagate all the way to a person spending six months in a jail cell in a state she had never visited.
In aviation, we call this a single-point failure in a safety-critical system. Airworthy systems are designed so no single failed component causes a catastrophic outcome. The design principle is not "let's build better altimeters." It is "let's architect a system where one bad reading cannot kill anyone before another layer catches it." The difference is architectural, not technical.
Managing security and risk posture across 35 state agencies gives you a specific kind of education in this problem. Every agency I work with generates automated outputs that inform decisions: intrusion detection flags, access anomaly scores, vulnerability scan results. Every one of those outputs has a false positive rate. Every one of them requires a verification step before a human takes an action that has consequences beyond the keyboard. We built that into our operating procedures not because we distrust the tools, but because we respect what it means to act irreversibly on imperfect information.
The FAIR risk quantification framework I research operationalizes this intuition. When you model loss exposure, you are always dealing with probability distributions, not point estimates. A single output from any model sits somewhere on that distribution. Acting on it as though it is a confirmed fact collapses the uncertainty that should be governing your decision threshold. In formal risk terms, you are treating a probability as a certainty, which is exactly the kind of error that produces catastrophic tail outcomes. The question FAIR forces you to ask is: what is the loss magnitude if this estimate is wrong, and have I gathered enough independent evidence to justify accepting that residual risk?
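To see why a point estimate hides the danger, here is a minimal Monte Carlo sketch in the spirit of FAIR-style loss modeling. The distributions and every number in it are invented for illustration only; they are not drawn from any real assessment or from the FAIR standard itself. The point is the gap between the "typical" output and the tail.

```python
import numpy as np

rng = np.random.default_rng(seed=7)
N = 100_000  # Monte Carlo trials

# Hypothetical inputs, invented purely for illustration: loss event
# frequency (events per year) and loss magnitude (dollars per event)
# are modeled as distributions, never as single numbers.
frequency = rng.lognormal(mean=np.log(2.0), sigma=0.6, size=N)
magnitude = rng.lognormal(mean=np.log(250_000), sigma=1.0, size=N)

annual_loss = frequency * magnitude

point_estimate = np.median(annual_loss)   # what a single output "says"
tail_95 = np.percentile(annual_loss, 95)  # what the distribution warns about
tail_99 = np.percentile(annual_loss, 99)

print(f"median annual loss exposure: ${point_estimate:,.0f}")
print(f"95th percentile:             ${tail_95:,.0f}")
print(f"99th percentile:             ${tail_99:,.0f}")
```

Run it and the tail percentiles land far above the median. Acting on the single "most likely" number is exactly the collapse of uncertainty described above: the decision threshold should be set by the tail of the loss distribution, not by the point you happened to observe.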
For a deprivation of liberty, the loss magnitude is as high as it gets for any individual. You are taking years of someone's life, separating them from their children, destroying their employment, and in many cases creating collateral damage that never fully repairs. That magnitude demands a verification threshold that no single algorithmic output can clear on its own. Full stop.
So here is what I would build into any state-level AI governance framework that has teeth. Borrow the aviation concept of required independent verification, and make it structural rather than advisory.
First, classify decisions by irreversibility and loss magnitude. Arrest warrants, child removal orders, benefit terminations, license revocations: these are high-severity, high-irreversibility decisions. Any algorithmic output that feeds into one of these decisions must be explicitly labeled as unverified until a second, independent evidentiary source confirms it. The word independent matters. A visual comparison by the same officer who requested the algorithmic scan is not independent. It is a confirmation bias loop wearing a verification costume.
Second, require documented chain-of-custody for every algorithmic input to a high-stakes decision. Who ran the model? What version? What was the confidence score? What was the known false positive rate for the demographic profile of the subject? If that documentation does not exist, the output is inadmissible as a basis for the decision. This is not a technology requirement. It is a process requirement, the same kind of process discipline that governs how a flight crew documents a maintenance discrepancy before they accept an aircraft.
Third, build escalation triggers for any algorithmic output that originates outside the agency's own verified infrastructure. If a law enforcement investigator receives a facial recognition result from an unaffiliated third-party network, that result starts with a trust score of zero. It is a lead, not evidence. The investigative process begins there; it does not end there.
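None of these three requirements is exotic from an engineering standpoint. As a rough sketch only, with every field and function name invented for the example rather than taken from any real system, the policy can be expressed as a chain-of-custody record plus a structural gate:

```python
from dataclasses import dataclass
from enum import Enum

class Severity(Enum):
    ROUTINE = 1             # reversible, low loss magnitude
    HIGH_IRREVERSIBLE = 2   # arrest warrants, child removal, benefit termination

@dataclass
class AlgorithmicOutput:
    """Chain-of-custody record for any model output feeding a decision."""
    model_name: str
    model_version: str
    operator_id: str                  # who ran the model
    confidence_score: float
    demographic_fpr: float            # known false positive rate for the subject's cohort
    source_is_internal: bool          # produced on the agency's own verified infrastructure
    independent_corroboration: bool   # confirmed by a second, independent evidentiary source

def may_authorize(decision: Severity, output: AlgorithmicOutput) -> bool:
    """Return True only if this output may serve as a basis for the decision.

    For high-irreversibility decisions the output is a lead, never evidence,
    unless the chain of custody is complete and an independent source
    corroborates it. External results start at zero trust.
    """
    if decision is not Severity.HIGH_IRREVERSIBLE:
        return True
    if not output.source_is_internal:
        # Third-party result: escalate and investigate, but never act on it alone.
        return False
    chain_complete = bool(
        output.model_name and output.model_version and output.operator_id
    )
    return chain_complete and output.independent_corroboration
```

The field names are not the point. The point is that the gate is structural: a missing record or a missing independent source makes the output inadmissible as a basis for action, no matter how confident the model was.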
None of this requires banning the technology. It requires treating it the way we treat every other instrument in a high-stakes decision environment: as one input that must be corroborated before it can authorize an irreversible action.
The practical objection I hear from agency heads is capacity. They do not have the personnel to run full independent verification on every automated flag. I understand the constraint. But that objection actually proves the architectural point. If you lack the capacity to verify, you lack the capacity to act. You do not solve a verification capacity shortage by lowering your verification standards. You solve it by being selective about which automated outputs you act on at all, and by being transparent with the public about the difference between a lead and a confirmed identification.
Legislators working on AI governance bills right now are mostly focused on disclosure requirements and vendor audits. Those are necessary. They are not sufficient. The gap they leave is the procedural layer between an algorithmic output and a consequential decision. That gap is where the real harm lives.
Every pilot who has sat in a cockpit at night with conflicting instruments knows the feeling of resisting the urge to just pick one and go. The training teaches you that the discomfort of holding the uncertainty is preferable to the consequences of resolving it prematurely. That same discipline, institutionalized as policy, is what responsible AI governance in law enforcement actually looks like. Not better algorithms. Better decision architecture around the algorithms we already have.
The standard should be simple to state even if it takes work to implement: no algorithm authorizes a deprivation of liberty. A human does, after independent verification. If the process cannot support that standard, the process is not ready for the tool.