The financial services industry is going through a transformation that most people outside it do not fully appreciate. Machine learning models now make or heavily influence decisions about who gets a mortgage, who pays what interest rate, who gets approved for a car loan, and who gets flagged as a fraud risk. These are not edge cases or pilot programs. They are the operating reality of how American financial institutions make decisions about hundreds of millions of people today.
And most of these models are black boxes.
Not in the sense that they are intentionally opaque. In the sense that even the teams that built them often cannot fully explain why a specific individual was denied or approved. The model learned patterns from historical data that map inputs to outputs through mathematical relationships too complex for human intuition to follow. It works in aggregate. It falls apart as an explanation for any individual case.
That is a problem. It is a growing legal problem, a regulatory problem, and most importantly, a fairness problem.
The adverse action notice gap
When a bank denies your mortgage application, you have a legal right to know why. The Equal Credit Opportunity Act and the Fair Credit Reporting Act both require lenders to provide adverse action notices: specific, substantive reasons for the denial. Not “your application did not meet our criteria.” Specific reasons.
This requirement was written for a world of human underwriters and rule-based scorecards. It is being applied to a world of gradient-boosted ensembles with 500 features, deep learning models trained on alternative data, and stacked model architectures where the output of one model becomes the input to another.
The gap between what the law requires and what current model architectures can easily provide is not a small one. Regulators at the CFPB, OCC, and Federal Reserve are actively grappling with it. Banks are navigating it with varying degrees of rigor. And the individuals being denied credit are left with adverse action notices that often technically comply with the letter of the law while failing entirely to provide the substantive explanation they are entitled to.
What explainability actually means in practice
Explainability in AI is not one thing. It is a spectrum of techniques with different uses, different costs, and different audiences. Understanding this matters because “we need explainable AI” is often invoked without clarity on what level of explainability is actually needed for what purpose.
For regulatory compliance, local interpretability and counterfactual explanations are what matter most. Consumers and examiners do not need to understand the full model architecture. They need to understand specific decisions and what, concretely, could change them.
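To make the counterfactual half of that concrete, here is a minimal sketch of finding the smallest change in one input that flips a credit decision. The synthetic data, feature names, logistic regression model, and single-feature search are illustrative assumptions for this example, not a production recourse method.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic applicants: debt-to-income ratio, employment tenure (years), credit utilization.
X = rng.uniform([0.05, 0.0, 0.0], [0.65, 20.0, 1.0], size=(5000, 3))
# Synthetic approvals: lower DTI and utilization and longer tenure make approval more likely.
y = (0.5 - X[:, 0] + 0.02 * X[:, 1] - 0.3 * X[:, 2]
     + rng.normal(0, 0.1, 5000) > 0).astype(int)

model = LogisticRegression().fit(X, y)

# One denied applicant: high DTI, short tenure, high utilization.
applicant = np.array([[0.52, 1.5, 0.80]])
print("current decision:", "approved" if model.predict(applicant)[0] else "denied")

# Counterfactual search over a single feature: step debt-to-income down
# until the model's decision flips, and report the smallest such change.
for new_dti in np.arange(applicant[0, 0], 0.0, -0.01):
    candidate = applicant.copy()
    candidate[0, 0] = new_dti
    if model.predict(candidate)[0] == 1:
        print(f"decision flips if debt-to-income falls from "
              f"{applicant[0, 0]:.2f} to roughly {new_dti:.2f}")
        break
```

A real recourse system would search over multiple features at once and respect which inputs an applicant can actually change, but the shape of the explanation is the same: here is the decision, and here is what would alter it.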
This is where SHAP (Shapley additive explanations) values become practically important. SHAP allows you to decompose a model’s prediction into the contribution of each input feature for a specific observation. You can tell a specific applicant that their application was denied primarily because their debt-to-income ratio exceeded a threshold, that a shorter employment tenure also contributed, and that improving those two factors would likely change the outcome. That is what a substantive adverse action notice looks like.
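Here is a minimal sketch of that decomposition, assuming an XGBoost classifier and the open-source shap package; the synthetic data, feature names, and decision framing are illustrative, and output conventions can vary slightly between shap versions.

```python
import numpy as np
import shap
import xgboost as xgb

rng = np.random.default_rng(42)
feature_names = ["debt_to_income", "employment_years", "credit_utilization"]

# Synthetic applicants and approval labels (same illustrative setup as above).
X = rng.uniform([0.05, 0.0, 0.0], [0.65, 20.0, 1.0], size=(5000, 3))
y = (0.5 - X[:, 0] + 0.02 * X[:, 1] - 0.3 * X[:, 2]
     + rng.normal(0, 0.1, 5000) > 0).astype(int)

model = xgb.XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.1)
model.fit(X, y)

# Decompose one denied applicant's score into per-feature contributions
# (in log-odds) and list the features pushing the decision toward denial.
applicant = np.array([[0.52, 1.5, 0.80]])
explainer = shap.TreeExplainer(model)
contributions = explainer.shap_values(applicant)[0]   # one value per feature

print("approval probability:", model.predict_proba(applicant)[0, 1])
for name, value in sorted(zip(feature_names, contributions), key=lambda kv: kv[1]):
    if value < 0:
        print(f"{name}: pushed the score down by {abs(value):.2f} log-odds")
```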
The accuracy-explainability tradeoff is mostly a myth
One of the most persistent misunderstandings in this space is the idea that you have to choose between accurate models and explainable models. That if you want a model regulators can understand, you have to sacrifice performance. This was somewhat true ten years ago. It is largely not true now.
A well-tuned logistic regression or scorecard model, with careful feature engineering, often comes within two or three AUC points of a complex ensemble on credit scoring tasks, a difference frequently smaller than the uncertainty in the training data itself. Meanwhile, the cost of compliance, explainability, and regulatory defense for the complex model can be substantially higher. Sometimes the most accurate model, once total cost is accounted for, is the interpretable one.
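The comparison itself is easy to run. A rough sketch on synthetic data follows; the dataset and hyperparameters are illustrative, and real portfolios will show their own gaps, but the exercise is the same one a model risk team would perform.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced, credit-like classification task.
X, y = make_classification(n_samples=20000, n_features=20, n_informative=8,
                           weights=[0.85], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

models = {
    "logistic scorecard": make_pipeline(StandardScaler(),
                                        LogisticRegression(max_iter=1000)),
    "boosted ensemble": GradientBoostingClassifier(n_estimators=300, max_depth=3),
}

# Fit both models and compare discrimination on held-out data.
for name, model in models.items():
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: test AUC = {auc:.3f}")
```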
Where complexity is genuinely needed, XGBoost with SHAP-based explainability has become a practical standard for financial applications precisely because it delivers near-state-of-the-art performance while remaining explainable at the individual decision level.
What I am building toward
At IBM, my work embeds explainability into the model development lifecycle from the beginning. That means feature engineering with explicit regulatory justification for every input variable. It means model selection that accounts for interpretability requirements alongside accuracy benchmarks. It means automated adverse action reason generation that produces compliant, substantive explanations at scale.
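To give a flavor of that last piece, here is a simplified sketch of turning per-applicant SHAP contributions (like those in the earlier example) into adverse action reasons in batch. The mapping from features to reason language is hypothetical and for illustration only, not a compliant reason-code library.

```python
import numpy as np

# Hypothetical mapping from model features to adverse action reason language.
REASON_TEXT = {
    "debt_to_income": "Debt-to-income ratio too high",
    "employment_years": "Length of employment too short",
    "credit_utilization": "Proportion of balances to credit limits too high",
}

def adverse_action_reasons(shap_matrix, feature_names, denied_mask, top_n=2):
    """For each denied applicant, return up to top_n reasons drawn from the
    features with the most negative SHAP contributions (pushing toward denial)."""
    notices = []
    for contributions, denied in zip(shap_matrix, denied_mask):
        if not denied:
            notices.append([])
            continue
        order = np.argsort(contributions)            # most negative first
        picked = [feature_names[i] for i in order[:top_n] if contributions[i] < 0]
        notices.append([REASON_TEXT[name] for name in picked])
    return notices

# Illustrative usage, continuing the SHAP sketch above:
#   shap_matrix = explainer.shap_values(X_batch)
#   denied = model.predict_proba(X_batch)[:, 1] < 0.5
#   notices = adverse_action_reasons(shap_matrix, feature_names, denied)
```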
The regulatory direction here is clear. The EU AI Act classifies credit scoring as high-risk AI. The CFPB has issued guidance on algorithmic adverse action notices. The OCC’s model risk guidance is increasingly explicit about explainability requirements. The United States will follow Europe’s lead on this. The question is timing, not direction.
Financial institutions that invest in explainable AI now are not just managing regulatory risk. They are building systems that are more trustworthy, more auditable, and fundamentally more fair. That alignment between compliance and ethics is not always available in this field. When it is, you should take it.
The argument for explainable AI in finance is not primarily a compliance argument. It is an argument about what it means to make consequential decisions about people’s lives responsibly. The compliance requirement is the floor. The actual goal is higher than that.
This is the first in a series on AI governance in financial services. Next: building risk intelligence systems that institutions can actually trust in production.