Agent vs. Automaton: Who Is Responsible When AI Fails the Law?

The distinction between an automaton and an agent is operationally useful before it becomes legally necessary. An automaton executes a defined rule. Its behavior is deterministic, its scope is bounded, and its failures trace back to a human decision: the person who wrote the rule, approved it, or deployed it without adequate testing. An agent pursues a goal. Its behavior is variable, context-dependent, and often non-transparent. The same inputs do not reliably produce the same outputs.

In financial market infrastructure, both types of system coexist. Automated order routing, pre-trade risk controls, and clearing instructions are automata. When they fail, ownership is traceable. AI agents operating in surveillance, compliance screening, or position risk assessment are different. They infer. They generalize. They produce outputs that cannot always be reconstructed from the inputs.

The governance frameworks regulating these environments were not built for that difference.

Where Accountability Breaks Down

When an automated system makes a wrong decision, there is a clear failure chain: specification, implementation, testing, or oversight. Each node has an owner.

When an AI agent makes a wrong decision, the failure chain fragments. The model may have generalized incorrectly from training data. The data itself may have been biased, incomplete, or mislabeled. The threshold for action may have been set by someone who did not understand the model’s behavior at the margin. The human reviewer may have accepted the model output without independent analysis. The governance process may have failed to define what a wrong decision looks like before deployment.

Each failure mode can be distributed across teams, vendors, and time. That distribution is structural. It is how AI systems are built and deployed inside large institutions. When accountability is diffuse, it is functionally absent.

The Regulatory Problem

FINMA’s supervisory requirements in Switzerland and the EU’s Digital Operational Resilience Act (DORA) were written to govern systems with identifiable decision chains. They require that firms demonstrate who made a decision, on what basis, and with what authorization. They require that automated systems be tested, documented, and audited. They impose liability where firms use technology to execute regulated activities.

Neither framework cleanly resolves what happens when an AI agent makes a consequential decision and the reasoning cannot be fully reconstructed. DORA’s ICT risk requirements demand that firms identify critical functions, assess dependencies, and test resilience. They do not specify how to handle a model whose behavior changes between versions without a formal release event, or a surveillance flag that is acted on without the flagging logic being retained in the audit record.

FINMA’s operational risk guidance requires control over outsourced and automated processes. An AI agent with variable outputs is not the same risk category as a deterministic outsourced system. The frameworks are being applied to a problem they were not designed for. That gap is the firm’s risk to manage, not the regulator’s to absorb.

How Firms Are Managing the Gap

Human in the loop

The first response is to require human review of every AI-generated output. This preserves accountability formally but eliminates much of the operational value. If an analyst must review every surveillance flag, credit recommendation, or compliance screen, the AI has become an expensive pre-filter. The reviewer’s cognitive load may increase if the model generates more false positives than the rule-based system it replaced. Accountability is retained. Effectiveness is reduced.

Output as recommendation

The second response is to define all AI output as a recommendation rather than a decision. The firm argues that no automated system is making a binding choice. This holds until a regulator asks what the expected override rate is, what criteria the reviewer is applying, and whether the review is substantive or nominal. If reviewers routinely accept model outputs without independent analysis, the governance structure is a form without substance.
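The regulator's questions can be turned into measurements. The sketch below is one way to do that, under stated assumptions: the `Review` fields, the `looks_nominal` helper, and both thresholds are hypothetical names and values chosen for illustration, not figures drawn from any framework. It computes the override rate from review records and flags a process in which reviewers almost never depart from the model and spend very little time on each case.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Review:
    """One human review of one model output (hypothetical schema)."""
    model_action: str      # what the model recommended
    final_action: str      # what the reviewer decided
    review_seconds: float  # time the reviewer spent on the case


def override_rate(reviews):
    """Share of reviewed cases where the reviewer departed from the model."""
    if not reviews:
        raise ValueError("no reviews to measure")
    overridden = sum(1 for r in reviews if r.final_action != r.model_action)
    return overridden / len(reviews)


def looks_nominal(reviews, min_override_rate=0.02, min_median_seconds=30.0):
    """Flag a review process as possibly nominal.

    Both thresholds are illustrative assumptions; a firm would have to
    set and justify its own before deployment.
    """
    too_few_overrides = override_rate(reviews) < min_override_rate
    too_fast = median(r.review_seconds for r in reviews) < min_median_seconds
    return too_few_overrides and too_fast
```

A low override rate alone does not prove that review is nominal, since the model may simply be accurate. That is why the sketch combines it with a second signal, and why the expected rate needs to be written down before the regulator asks.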

What Accountability Actually Requires

Accountability in AI systems is not a policy document. It is an operational architecture. Before an AI agent is deployed in a regulated workflow, the following must be resolved.

A named decision owner. Not the model vendor, not the data science team as a function, not the technology organization. A named individual or role with authority and liability for each consequential output the model produces.

An auditable decision record. The model input, the model version, the threshold applied, and the action taken must be logged in retrievable, tamper-evident form. This is the minimum required to respond to a regulator or reconstruct a failure.

A formal model change process. A model that retrains, or whose parameters change, is a different system from the one that was approved. Each version must pass through the same approval and documentation process as a software release.

A pre-defined failure mode. The firm must define, before deployment, what a model failure looks like in operational terms: what a false positive triggers, what the consequence of a false negative is, and who decides when the model output is overridden.
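Some of these requirements can be enforced in code rather than in policy. What follows is a minimal sketch, assuming an in-memory log and hypothetical names throughout (`DecisionRecord`, `DecisionLog`, the version strings). Each record carries the named owner, the model version, the input, the threshold, and the action. Records are chained by SHA-256 hash so that a later alteration is detectable, and a record from a model version that has not been approved is refused.

```python
import hashlib
import json
from dataclasses import dataclass, asdict

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


@dataclass(frozen=True)
class DecisionRecord:
    """Fields mirror the requirements above; the schema is hypothetical."""
    owner: str          # named individual or role accountable for the output
    model_version: str  # exact version that produced the score
    model_input: dict   # input as seen by the model (JSON-serializable)
    threshold: float    # threshold in force at decision time
    score: float        # model output
    action: str         # action actually taken


def _digest(prev_hash, record):
    """Hash the record together with the previous entry's hash."""
    payload = json.dumps(asdict(record), sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


class DecisionLog:
    def __init__(self, approved_versions):
        self.approved_versions = set(approved_versions)
        self.entries = []  # list of (record, entry_hash) tuples

    def append(self, record):
        """Refuse unapproved model versions, then chain the record."""
        if record.model_version not in self.approved_versions:
            raise ValueError(
                f"model version {record.model_version!r} is not approved"
            )
        prev = self.entries[-1][1] if self.entries else GENESIS
        entry_hash = _digest(prev, record)
        self.entries.append((record, entry_hash))
        return entry_hash

    def verify(self):
        """Recompute the chain; False means an entry was altered."""
        prev = GENESIS
        for record, stored in self.entries:
            if _digest(prev, record) != stored:
                return False
            prev = stored
        return True
```

A hash chain held in memory only demonstrates the idea of tamper evidence. In practice the chain would need to sit on storage the decision owner cannot silently rewrite, and the approved-version list would be fed by the formal change process rather than passed in by hand.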

The Accountability Gap Is a Design Choice

When a firm cannot answer who is responsible for an AI agent’s decision, that is not a gap in the regulatory framework. It is a gap in the firm’s governance design. The framework requires accountability. The firm has chosen to deploy a system without engineering it in.

The question of who is responsible when AI fails the law is answerable. It requires that someone accept responsibility before deployment. Not after the model misbehaves, not after the regulator asks, and not after the harm is visible. Before.
