
On April 17, the Fed, OCC, and FDIC issued a revamp of the model risk rulebook that had governed the largest U.S. banks since 2011. The new guidance reorganized how examiners evaluate banking models. The most consequential decision was not what the guidance covered. It was what it left out.

Generative and agentic artificial intelligence were excluded from the new framework. The agencies promised a future request for information on AI governance but provided no timeline. Until then, firms are told to rely on existing model risk practices.

Just as banks are embedding AI into lending decisions, trading operations, and risk management, regulators updated the rulebook and carved AI out of it. The result is a rulebook that no longer covers what banks actually do.

Why Traditional Rules Don't Fit

The model risk framework continues to rest on three assumptions. First, models behave consistently when given similar inputs. Second, they can be validated before deployment. Third, their decisions can be explained after the fact.

Credit scoring models, stress tests, and trading risk calculations generally fit this structure. Because they do, bank examiners have developed clear standards for validating model design, testing outputs, and assigning accountability when problems arise.

However, generative AI and autonomous agents violate all three assumptions. Large language models can produce different answers to the same prompt, so consistency breaks down. Pre-launch validation captures only a snapshot, because these systems keep learning after deployment. Agentic systems act on their own -- adjusting hedges, flagging counterparty exposures, reallocating capital -- based on pattern recognition that resists explanation. And when AI systems at different banks interact, no one is clearly accountable for a market-wide failure.

Regulators had a choice: force new technology into old boxes, or acknowledge that the old ways were obsolete. They chose the latter. The choice was intellectually honest. But it left a void where new guardrails belong.

The Oversight Gap Is Widening Fast

On April 7, Treasury Secretary Scott Bessent and Fed Chair Jerome Powell convened a closed-door meeting with the CEOs of the largest U.S. banks. The subject was a warning from Anthropic that its newest model, Claude Mythos, could discover and exploit security flaws across U.S. banks' core operating systems, from payments and trading to accounts and risk monitoring. An attacker with similar capability could compromise bank systems faster than security teams could detect and patch the flaws. Anthropic delayed the model's public release to keep the capability out of attackers' hands while defenses were built.

The agencies knew what Claude Mythos could do before they issued the guidance. They had options: delay, revise, or commit to an interim AI standard. They chose none. They left the void by choice.

The harm is not theoretical. The exclusion converts ambiguous coverage into formal policy, costing examiners the room they had to apply model risk standards to AI by analogy. It directs banks to use a toolkit built for a different problem, producing the appearance of governance without the substance. And it spent the one rulemaking vehicle that could have addressed AI directly; the next planned step is an RFI, undated and non-binding.

Meanwhile, banks are not waiting for regulatory clarity. Major institutions including JPMorgan, Bank of America, and Citigroup have moved AI from pilot programs to production systems, from chatbot customer service to agentic operations in trading and credit. The technology operates in customer-facing platforms and back-office infrastructure. It affects who gets credit, how markets function, and where operational failures could cascade.

The guidance bank examiners rely on to assess model risk explicitly does not cover these systems.

The predictable result is variation. Some supervisors will stretch existing standards through analogy, treating AI systems as if they were traditional models and imposing requirements designed for different technology. Others will invoke broad safety-and-soundness authority -- the catchall power to require prudent practices -- but will lack clear benchmarks for adequate governance. Many will focus examination resources on what they understand -- conventional statistical models -- and defer scrutiny of systems they cannot easily assess.

This creates perverse incentives. A bank using a standard credit scorecard faces validation requirements and examiner expectations. A bank deploying an autonomous agent to manage collateral or initiate hedges operates in a gray zone where standards remain unclear and oversight is uneven. Institutions that adopt advanced AI aggressively may face less regulatory friction than those using transparent, explainable models.

Riskier Activities Should Attract Greater Scrutiny, Not Less

The exclusion does more than create uncertainty. It distorts incentives. In the absence of clear supervisory expectations, banks face a collective action problem: institutions that invest in governance and cautious deployment fall behind rivals moving faster with fewer controls. The result is that the most cautious institutions lose market share to the most aggressive.

This resembles the breakdown in mortgage underwriting standards before the 2008 financial crisis. Clear, enforced rules reward prudence. When regulatory expectations dissolve, excessive risk-taking becomes advantageous: profits are privatized while systemic costs are socialized. Lenders who maintained underwriting discipline in 2005 lost business to those who didn't. By the time the system collapsed, the race to the bottom had already determined the outcome.

The current AI exclusion produces the same pattern in a different domain. Without clear standards, first-mover advantages flow to institutions willing to deploy systems that regulators don't yet know how to assess.

There is also a regulatory arbitrage problem. Banks operate under federal supervision. Many fintech lenders, payment processors, and digital asset platforms do not. A non-bank institution deploying AI for credit decisions or liquidity management may face no model risk oversight whatsoever. If prudential regulation is meant to contain risks to financial stability, the gap pushes those risks beyond its reach.

The beneficiaries are not the institutions building serious AI governance. They bear costs with no regulatory credit. The beneficiaries are those deploying fastest with the least oversight.

Europe Kept AI Inside the Frame. The U.S. Did Not.

The UK's Prudential Regulation Authority includes AI within its model risk framework, which applies to all bank models, including AI and machine learning systems.

The European Central Bank covers AI through supervisory review, operational resilience, and the EU AI Act. The Digital Operational Resilience Act applies to banks' technology risk management and therefore reaches many AI-related systems.

Neither approach is perfect. But both keep AI subject to supervisory expectations: banks in the UK and Europe deploying AI know they remain under established standards.

American banks face uncertainty about which prudential supervisory standards apply, and how.

Three Steps That Don't Require Omniscience

The relevant policy question is not whether AI in banking should be regulated, but whether the costs of waiting exceed the costs of acting with incomplete information. Premature regulation imposes real costs: rules that don't map to actual risks, innovation slowed by guidance written for older systems, and procedural compliance disconnected from risk reduction. Those costs are genuine and should not be dismissed.

Some commentators find the exclusion defensible on the grounds that AI is evolving too fast for fixed rules. But the costs of waiting compound. Each month without supervisory expectations tilts the field toward aggressive deployment over careful governance. The deeper the largest banks embed AI, the harder it becomes for regulators to impose effective controls.

Three steps would narrow the gap without requiring regulators to solve every technical challenge.

An inventory and accountability rule. To assess AI risk, supervisors need to know what's deployed. Institutions must maintain inventories of AI systems used in material financial functions, document governance and accountability structures, and preserve auditable records of automated decisions. Bank boards need regular briefings on high-impact deployments: AI systems that drive credit decisions, capital allocation, or counterparty risk assessments. Examiners need the authority to ask basic questions about what systems are deployed, who is accountable, and how failures would be detected.

Critics will object that even basic inventory rules add compliance burden without proportionate risk reduction. The counter is that examiners cannot assess what they cannot see. Silence on these basics reads as regulatory acquiescence.

Annual public disclosure. Transparency need not reveal proprietary algorithms, but investors, counterparties, and policymakers should be able to see where autonomous systems are making decisions that affect capital allocation, market liquidity, and credit availability. Public disclosure lets market discipline supplement supervisory oversight. It also flags when risks concentrate.

A timeline that forces closure. The agencies have promised a request for information on AI but have not committed to a publication date. They have also stated explicitly that the guidance is non-binding: non-compliance will not result in supervisory criticism. A six-month deadline on the RFI would convert a vague commitment into a binding one. The current open-ended deferral is the gap.

These steps do not require regulatory omniscience. They do not demand that supervisors understand every algorithm or predict every failure mode. But they would establish that AI in banking remains subject to some general supervisory framework rather than none.

The Cost of Waiting

The alternative is to wait until failure forces action. That failure could take many forms. An algorithmic interaction could freeze liquidity in a critical market. A cascade of automated credit decisions could amplify a downturn. An operational breakdown could cut off access to financial services.

By the time such an event occurs, AI will be embedded across the financial system and competitive pressures will have locked in aggressive practices. Political pressure will then produce rules far more restrictive than the guardrails that are necessary today.

The window for moderate rules is closing.

Richard Roberts is a former Federal Reserve official and professor of economics at Monmouth University.
