The Gap Between "Exists" and "Connected"
In complex AI agent systems, the most dangerous failures are not in what breaks. They are in what was never wired.
Jason Walker
State CISO, Florida
I asked a simple question in a working session a few weeks back: "Does /deep-research get invoked automatically when a research task comes in?"
The answer was no.
That took a minute to sit with. /deep-research is a skill I built months ago. It launches Claude.ai's built-in Research mode, which runs live multi-source searches across the web, synthesizes results with citations, and produces far richer output than a standard agent doing the same task. I tested it. It worked. I was proud of it. And then I filed it in the skill registry and moved on.
For months, every research task that came through my AI agent system quietly routed to lesser tools. Not because /deep-research was broken. Not because it was missing. It was sitting in the registry, functional and ready, pointing nowhere. Nothing called it. No workflow triggered it. The skill existed. It was not connected.
That gap, between "this capability exists" and "this capability gets invoked," is where silent failures live.
What Makes This Type of Failure So Dangerous
The obvious failure modes in complex systems are loud. A broken API call returns a 500. A missing environment variable causes an immediate crash. A syntax error stops the build. These announce themselves. They demand attention. They get fixed.
Silent failures are different. They leave the system running. The system looks healthy. Work continues. Outputs are produced. Nothing trips an alarm. The only evidence that something is wrong is what never happened: the research that never got the best tool, the cybersecurity analysis that never reached the right agent, the routing rule that never became policy.
I found four examples of this pattern in a single session of auditing my own system.
Example one: /deep-research, already described above. Months of research tasks routed to inferior tools while the best tool sat dormant in the registry. Discovered only because I thought to ask the question.
Example two: A routing table in my pmo-charter skill referenced an agent named ciso-advisory. The actual file on disk is ciso-advisor.md. One character. Every cybersecurity domain project that ran through the pmo workflow would have attempted to invoke an agent that did not exist. The invocation would fail, silently, or fall through to an error that I would only see at runtime. The skill was built. The agent was built. The wire between them was wrong.
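A one-character mismatch between a routing table and the files on disk is exactly the kind of gap a small cross-check can catch before runtime. Here is a minimal sketch of that check. It assumes, hypothetically, that the routing table is a dict mapping domains to agent names and that agent files live in a directory as `.md` files; the names and layout are illustrative, not my actual system.

```python
from pathlib import Path

def find_broken_routes(routing_table: dict[str, str], agents_dir: str) -> list[str]:
    """Return routing entries that point at agent files that do not exist on disk."""
    # Collect the agent names that actually exist, by filename stem.
    on_disk = {p.stem for p in Path(agents_dir).glob("*.md")}
    return [
        f"{domain} -> {agent}"
        for domain, agent in routing_table.items()
        if agent not in on_disk
    ]
```

Run against a table containing `"cybersecurity": "ciso-advisory"` while the directory holds only `ciso-advisor.md`, the mismatch surfaces immediately as a broken route instead of a silent runtime failure.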
Example three: When I designed the multi-domain coordinator for my planning workflow, the initial version embedded the coordinator logic inside the pmo-charter skill itself. Functionally, it worked. For that one skill. But it was trapped there. No other skill could call it. It had no dedicated system prompt, so judgment-heavy synthesis tasks had no persistent instructions guiding them. If I wanted to change the coordinator's behavior, I had to touch pmo-charter. The capability existed. It was just buried where the rest of the system could not reach it.
Example four: After I wired /deep-research into five workflows, I realized the decision logic about when to use deep-research versus the standard research team existed only in my memory and in the conversation we had just finished. It was not written anywhere in the system. No skill file referenced it. No routing standard documented it. Any future agent, skill, or session would default to the research team because that was the only documented path. The routing rule existed in my head. It was not connected to anything that would actually make the decision.
Four gaps. Four different shapes. One root cause.
This Is Not an AI Problem. It's a System Problem.
I want to be precise here because it's easy to file this under "LLM quirks" and move on. That would be a mistake.
The pattern I'm describing is a property of any complex system where components are built in layers, by different people or at different times, with no single person holding the full picture.
Take software. A library gets imported at the top of the file. The import works. The tests pass. But the function inside the library that handles a specific edge case is never actually called in production code. The capability exists. It is not connected. You find out when the edge case hits in production.
Take organizational process. A policy is written, reviewed, approved, and published in the employee handbook. The language is correct. The guidance is sound. But no one added it to the onboarding checklist. New employees never see it in their first 90 days. The policy exists. It is not connected to the moment when following it actually matters.
Take infrastructure. An alert is configured for a critical service. Thresholds are set correctly. The alert fires. It fires to an email inbox that used to belong to an engineer who left the team three months ago. The monitoring exists. It is not connected to anyone who can act on it.
Take an org chart. A role is created to own a function. It is documented with clear responsibilities. But no process, meeting, or reporting line ever puts that role in a position to exercise its responsibilities. The accountability exists. It is not connected to any actual decision or workflow.
The pattern is always the same. The builder verifies that the component works. They rarely verify that the component gets called.
The Verification Discipline
Here is the question that changes how you audit a system.
Not: "does this work?"
Instead: "what calls this?"
Those are not the same question. The first is about intrinsic function. The second is about connection. A component can pass every test and still be completely isolated from the system it is supposed to serve.
When I audit any capability I have built, I now map two directions explicitly.
Outbound: what does this do, what does it produce, where does the output go?
Inbound: what invokes this, under what conditions, and is that invocation path explicitly written somewhere or just assumed?
If the inbound answer is "I think it gets called when..." or "it should be triggered by...", then the component is built but not connected. The difference between "should be triggered" and "is triggered" is a sentence in a skill file, a line in a routing table, a condition in a workflow. It is small. It is also the entire difference between a capability that helps you and one that sits dormant for months.
I now treat "document the inbound path" as a mandatory step whenever I add a capability to any system I run. Not optional documentation. A gate. If I cannot answer the inbound question, the capability is not done. It is a draft.
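The gate can be enforced mechanically at registration time rather than left to discipline. A sketch of the idea, assuming a simple in-memory registry keyed by skill name; the registry shape and field names here are my illustration, not a real API.

```python
REGISTRY: dict[str, dict] = {}

def register_skill(name: str, description: str, invoked_by: list[str]) -> None:
    """Refuse to register a capability whose inbound path is undocumented."""
    if not invoked_by:
        # No documented inbound path: the skill is a draft, not done.
        raise ValueError(f"{name}: no inbound path documented")
    REGISTRY[name] = {"description": description, "invoked_by": invoked_by}
```

Registering a skill with an empty `invoked_by` list raises instead of succeeding, which turns "should be triggered by" into a hard stop at build time.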
Why The Discovery Mechanism Matters
Here is the part that should make any engineer uncomfortable.
All four gaps I described were found the same way: I asked a question.
Not through a system health check. Not through a log review. Not through automated testing. Not through a skill registry audit. A human asked "does X happen automatically?" and the answer revealed a gap.
The system had no mechanism for flagging unconnected capabilities. A skill can exist in the registry without any pointer to what invokes it. An agent file can sit in a directory without any connection to the routing tables that are supposed to call it. From the system's perspective, everything is fine. From the user's perspective, the capability never fires.
This means the only defense I had, for months, was whether I happened to think to ask the right question at the right time. That is a fragile defense. I got lucky that the question came up organically during a session where I was already thinking about system architecture. Most gaps never get asked about. They just stay dark.
The design implication is uncomfortable but direct. Systems should make their invocation paths legible. Not just their capabilities. A skill registry that lists what each skill does is a catalog of things that might be connected. A skill registry that also lists what triggers each skill, under what conditions, and through which routing path, is a map of a connected system.
The difference is one additional field per entry. It is not a lot of overhead. It is the difference between a catalog and a wiring diagram.
What I Actually Changed
After this session, here is what I did.
I updated the skill registry to require an "invoked by" field for every entry. Every skill now documents what triggers it: which workflow, which agent, which user command, which condition. If that field is empty or marked "none," that is a signal worth investigating.
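That signal can also be swept across an existing registry in one pass. A sketch, assuming each entry is a dict whose "invoked_by" value may be a list, the string "none," or absent entirely; the structure is hypothetical.

```python
def dormant_skills(registry: dict[str, dict]) -> list[str]:
    """List skills whose 'invoked_by' field is missing, empty, or marked 'none'."""
    flagged = []
    for name, entry in registry.items():
        invoked_by = entry.get("invoked_by")
        if not invoked_by or invoked_by == "none":
            flagged.append(name)
    return flagged
```

Run periodically, this turns the audit from a question I have to remember to ask into a report the system produces on its own.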
I moved the routing decision for deep-research versus the research team out of my memory and into the routing standard for the research cluster. It is now a documented rule that any agent can read.
I fixed the ciso-advisory typo.
I extracted the coordinator logic from pmo-charter into a standalone agent with its own system prompt, its own file, and its own entry in the agent registry.
Small changes. Each one took less than ten minutes. The compound effect is a system where connection is visible, not assumed. Where I can look at a capability and follow the chain: who calls this, under what conditions, and where does the output go.
That chain should be readable without needing to ask me to explain it. If it isn't, the system is not built. It is half-built.
The Question That Matters
If you are building any kind of AI agent system, multi-step workflow, organizational process, or complex infrastructure, I want to leave you with one diagnostic question to run on your own setup.
Pick any capability you have built. Anything you are proud of. Something you tested, verified, and considered complete.
Now ask: what calls this?
Not what should call it. Not what you intended to call it. What actually, explicitly, documentably calls it right now.
If you can trace that chain without hesitation, the capability is connected.
If you pause, or qualify, or say "I think it gets invoked when..." then you have found the gap. The capability exists. It is not connected.
That is where silent failures live. And unlike the loud failures, no alarm will tell you they are there.