compliance July 29, 2025

Alternative Data and Fair Lending: What You Need to Know

By Omar Hassan

Every lender considering alternative data for credit decisioning eventually runs into the same question from their compliance team: does using this data create fair lending exposure? The question is right. The framing is sometimes wrong. The goal isn't to ask "does alternative data create risk?" — it's to ask "which alternative data, used how, creates what specific type of risk under which regulatory framework?"

This article walks through the fair lending analysis that should accompany any alternative data credit model: which laws apply, what the specific compliance failure modes look like, and how to design a model that expands credit access to underserved populations without creating new discriminatory patterns.

The statutory framework: ECOA and Regulation B

The Equal Credit Opportunity Act (ECOA) and its implementing regulation, Regulation B (12 CFR Part 1002), are the primary federal fair lending laws governing credit decisioning. They prohibit discrimination against credit applicants on the basis of race, color, religion, national origin, sex, marital status, age (provided the applicant has the capacity to contract), receipt of income from public assistance programs, or the good-faith exercise of rights under the Consumer Credit Protection Act. The prohibition applies to any aspect of a credit transaction: not just the final approve/decline decision, but also terms, pricing, and the amount of credit extended.

Two distinct legal theories of discrimination apply under ECOA:

Disparate treatment occurs when a lender treats an applicant differently on a prohibited basis. It can be overt: a policy that explicitly considers national origin in a credit decision is facially illegal. It can also be shown through comparative evidence that similarly situated applicants were treated differently. A policy that doesn't mention national origin but relies on criteria that function as national origin proxies may constitute disparate treatment depending on the facts.

Disparate impact occurs when a facially neutral policy has a disproportionate adverse effect on a protected class, regardless of intent, unless the policy meets a legitimate business need that cannot reasonably be achieved by means that have less disparate impact. Statistical significance is typically how the disproportionate effect is evidenced, but it is not the legal test itself. The disparate impact theory under ECOA is well-established through regulatory guidance and enforcement history even where specific statutory language has been debated.

Alternative data models implicate both theories. Getting this right matters for building a credit model that can actually be deployed, not just demonstrated in a sandbox.

The proxy variable problem: what not to use

The most direct fair lending risk in alternative data is proxy variables — features that are correlated with protected class membership and that therefore function as indirect discriminators even without explicit demographic inputs.

The classic example in the alternative data context: zip code or geography as a model feature. Home location is highly correlated with race due to historical patterns of residential segregation. A model that penalizes applicants from certain zip codes will, in most US markets, disproportionately penalize applicants of color — even if the zip code correlates legitimately with financial stress indicators that are themselves predictive. That's the core of the problem: the variable can be predictive AND disparately impactful, and predictiveness does not immunize against fair lending liability. The business necessity test requires demonstrating that the predictive benefit cannot be achieved through a less discriminatory alternative feature.

Features we explicitly exclude from Lendiro's model for this reason: geographic identifiers below the state level, employer name or employer type (which can correlate with national origin), telecom carrier (which can be a proxy for income level correlated with protected class), and any spending category that is heavily concentrated in a particular demographic group.

We're not saying geography has no predictive value. We are saying that using geographic features without a rigorous disparate impact analysis and a defensible business necessity argument creates a compliance posture that most lenders should not accept.
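As a first-pass illustration of how a proxy screen might work, the sketch below computes the correlation between each candidate feature and a 0/1 protected-class indicator and flags features above a cutoff. The feature names, the data, and the 0.3 cutoff are all hypothetical, and the group indicator is assumed to come from whatever proxy or self-reported data is available. A linear correlation check will miss non-linear proxies and proxies formed by combinations of features, so this is a triage step and not a substitute for the disparate impact analysis.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no association
    return cov / sqrt(var_x * var_y)


def screen_for_proxies(features, group_indicator, threshold=0.3):
    """Return {feature: r} for features whose |r| with the indicator
    meets the (assumed, arbitrary) threshold."""
    flagged = {}
    for name, values in features.items():
        r = pearson(values, group_indicator)
        if abs(r) >= threshold:
            flagged[name] = r
    return flagged


# Illustrative data only: 1 = member of the protected class being tested.
group_indicator = [1, 1, 1, 0, 0, 0]
candidate_features = {
    "zip_median_rent": [900, 950, 1000, 1500, 1600, 1550],
    "months_since_account_open": [12, 30, 24, 12, 30, 24],
}
flagged = screen_for_proxies(candidate_features, group_indicator)
```

In this toy data the geographic feature is flagged and the account-age feature is not. A flagged feature is a prompt for the business necessity and less-discriminatory-alternative review, not an automatic exclusion.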

Cash-flow features and the disparate impact analysis

Bank transaction-based features (income consistency, balance behavior, obligation payment regularity) are not neutral with respect to protected class membership. Lower-income populations, which are disproportionately composed of people of color due to documented historical inequality, also tend to show higher income variability. Features that penalize income variability will disproportionately impact this population.

This means a well-designed cash-flow model does not eliminate disparate impact exposure by virtue of using "financial behavior" rather than "demographic data." The disparate impact analysis is required regardless of how the features are constructed.

The defensible path: conduct an adverse impact ratio analysis on each model feature and on the overall model output, segmented by race and other protected class attributes where proxy data is available. Document the business necessity of each feature contributing to adverse impact — typically by demonstrating its predictive contribution in lift analysis. Evaluate whether a less disparately impactful feature can substitute for each high-impact feature.
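The adverse impact ratio itself is simple arithmetic: each group's approval rate divided by the reference group's approval rate. The sketch below is a minimal version under stated assumptions. The group labels and counts are made up, and the 0.8 cutoff is the "four-fifths" rule of thumb borrowed from employment-selection practice, used here only as a screening heuristic and not as an ECOA or Regulation B threshold.

```python
from collections import Counter


def approval_rates(decisions):
    """decisions: iterable of (group, approved) pairs -> {group: approval rate}."""
    totals, approved = Counter(), Counter()
    for group, was_approved in decisions:
        totals[group] += 1
        if was_approved:
            approved[group] += 1
    return {g: approved[g] / totals[g] for g in totals}


def adverse_impact_ratios(decisions, reference_group):
    """Each group's approval rate divided by the reference group's rate."""
    rates = approval_rates(decisions)
    ref_rate = rates[reference_group]
    if ref_rate == 0:
        raise ValueError("reference group has no approvals")
    return {g: rate / ref_rate for g, rate in rates.items()}


# Illustrative data only: 60% approval for group_a, 42% for group_b.
sample = (
    [("group_a", True)] * 60 + [("group_a", False)] * 40
    + [("group_b", True)] * 42 + [("group_b", False)] * 58
)
ratios = adverse_impact_ratios(sample, reference_group="group_a")

# Assumed screening cutoff (four-fifths heuristic), not a legal standard.
flagged = sorted(g for g, r in ratios.items() if r < 0.8)
```

The same function can be pointed at a single feature by treating "passes the feature's cutoff" as the outcome instead of the final approval, which is one way to attribute adverse impact to individual features.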

For income variability specifically: the analysis needs to distinguish between structural variability (gig workers, seasonal employees) and behavioral variability (irregular financial management). Applying the same penalty to both groups may produce disparate impact on protected class members who are concentrated in gig and seasonal employment. Feature design that separates income type (salaried vs. variable) and evaluates variability within type is both more predictive and less disparately impactful.
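One way to express "variability within type" is to compute each applicant's coefficient of variation (standard deviation of monthly income divided by its mean) and then standardize it against peers with the same income type. The sketch below assumes that approach. The field names, the two income types, and the sample figures are hypothetical, and a production feature would need far more than two peers per type.

```python
from statistics import mean, pstdev


def coefficient_of_variation(monthly_income):
    """Population standard deviation of income divided by mean income."""
    avg = mean(monthly_income)
    if avg <= 0:
        raise ValueError("mean income must be positive")
    return pstdev(monthly_income) / avg


def within_type_variability(applicants):
    """Return {id: z-score of the applicant's CV relative to peers
    sharing the same income_type}."""
    cvs = {a["id"]: coefficient_of_variation(a["monthly_income"]) for a in applicants}
    peers_by_type = {}
    for a in applicants:
        peers_by_type.setdefault(a["income_type"], []).append(cvs[a["id"]])
    scores = {}
    for a in applicants:
        peers = peers_by_type[a["income_type"]]
        spread = pstdev(peers)
        # If every peer has the same CV there is nothing to standardize.
        scores[a["id"]] = 0.0 if spread == 0 else (cvs[a["id"]] - mean(peers)) / spread
    return scores


# Illustrative data only.
applicants = [
    {"id": "s1", "income_type": "salaried", "monthly_income": [4000, 4000, 4000, 4000]},
    {"id": "s2", "income_type": "salaried", "monthly_income": [4000, 4200, 3800, 4000]},
    {"id": "v1", "income_type": "variable", "monthly_income": [2000, 4000, 3000, 3000]},
    {"id": "v2", "income_type": "variable", "monthly_income": [1000, 5000, 3000, 3000]},
]
scores = within_type_variability(applicants)
```

In this toy data, applicant v1 has a much higher raw coefficient of variation than salaried applicant s2, yet scores lower on the within-type measure because v1 is the steadier of the two variable-income applicants. Whether a feature built this way actually reduces adverse impact on a given applicant population still has to be tested.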

CFPB guidance and the "credit-building" framing

The CFPB has issued guidance and requests for information on alternative data in credit decisions, most notably the 2017 Request for Information Regarding Use of Alternative Data and Modeling Techniques in the Credit Process and subsequent guidance on responsible use of alternative data. The agency's posture has been cautiously supportive of alternative data that expands credit access to underserved populations, while emphasizing the applicability of existing fair lending obligations.

The CFPB's stated view: alternative data can help lenders make more accurate decisions on thin-file applicants, which is consistent with fair lending goals of expanding credit access. But "expanding access" is not a compliance defense. A model that approves more thin-file applicants overall while producing adverse impact on a protected class sub-segment within that population does not earn fair lending credit for the aggregate expansion.

What this means practically: segment your analysis. Don't evaluate disparate impact only at the overall model level. Evaluate it within the thin-file sub-population, the credit-invisible sub-population, and any other sub-populations you're specifically trying to serve. Adverse impact in the sub-population you're targeting is a harder compliance problem than adverse impact in the overall population.
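To make the segmentation point concrete, the sketch below computes approval-rate ratios separately for each sub-population and compares them with the pooled figure. The segment names, group labels, and counts are invented for illustration, and the ratio is the same approval-rate ratio used in a standard adverse impact analysis.

```python
from collections import defaultdict


def segmented_air(records, reference_group):
    """records: iterable of (segment, group, approved).
    Returns {segment: {group: approval rate / reference group's rate}}."""
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # [approved, total]
    for segment, group, approved in records:
        cell = counts[segment][group]
        cell[0] += int(approved)
        cell[1] += 1
    result = {}
    for segment, groups in counts.items():
        rates = {g: a / n for g, (a, n) in groups.items()}
        ref_rate = rates.get(reference_group)
        if not ref_rate:
            continue  # reference group absent or never approved in this segment
        result[segment] = {g: r / ref_rate for g, r in rates.items()}
    return result


def expand(segment, group, approved, total):
    """Helper for building illustrative records from summary counts."""
    return [(segment, group, True)] * approved + [(segment, group, False)] * (total - approved)


# Illustrative data only.
records = (
    expand("thick_file", "group_a", 80, 100)
    + expand("thick_file", "group_b", 80, 100)
    + expand("thin_file", "group_a", 30, 50)
    + expand("thin_file", "group_b", 20, 50)
)
by_segment = segmented_air(records, reference_group="group_a")
pooled = segmented_air(
    [("overall", group, ok) for _, group, ok in records], reference_group="group_a"
)["overall"]
```

With these made-up counts the pooled ratio for group_b is about 0.91, while the thin-file ratio is about 0.67. An analysis run only at the overall level would not surface the gap in the sub-population the model is meant to serve.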

Adverse action: reason codes for alternative data decisions

When a lender declines a credit application or takes an adverse action on credit terms, Regulation B requires providing the principal reasons for the action. For bureau-based decisions, standard adverse action reason codes are well-established. For alternative data decisions, the reason code framework is less mature and requires more deliberate design.

The Regulation B standard for adverse action notices applies regardless of what data source the model uses. If a cash-flow model declines an applicant primarily because their income stability score is below threshold, the adverse action notice must convey the substantive reason in a way that helps the applicant understand what factor drove the decision and, implicitly, what they might be able to change to improve their outcome in the future.

A reason code of "insufficient financial data" does not satisfy this standard if the actual driver was income variability, not data absence. "Irregular income pattern" is more specific and more useful. "Insufficient account history in the reviewed period" is appropriate if the lookback window was too short. The reason codes must map to the actual model factors, not to generic descriptions of the data type.

For lenders building alternative data models: build your adverse action reason code framework before deployment, not after. Map each model output band to the features that are driving the score at that band, and translate those features into plain-language reason codes that satisfy the Regulation B specificity requirement. Reason code quality is a compliance audit item, and "we were still working out the reason codes" is not a viable response to an examiner.
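A minimal sketch of that mapping, assuming the model can report a per-feature contribution to the score relative to some baseline applicant (how those contributions are derived depends on the model and is not shown). The feature names, the cap of four reasons, and two of the four reason texts are hypothetical. The other two reuse the wording discussed above.

```python
# Hypothetical feature names mapped to plain-language reasons.
REASON_TEXT = {
    "income_stability": "Irregular income pattern",
    "account_history_months": "Insufficient account history in the reviewed period",
    "overdraft_frequency": "Frequent overdrafts or negative balances",
    "obligation_payment_regularity": "Irregular payment of recurring obligations",
}


def principal_reasons(contributions, max_reasons=4):
    """contributions: {feature: score points relative to a baseline applicant}.
    Negative values lowered the score. Returns reason texts for the features
    that lowered it most, most damaging first."""
    negatives = sorted(
        (points, feature) for feature, points in contributions.items() if points < 0
    )
    reasons = []
    for _, feature in negatives[:max_reasons]:
        if feature not in REASON_TEXT:
            # Fail loudly: every scored feature needs a mapped reason before launch.
            raise KeyError(f"no reason code mapped for feature {feature!r}")
        reasons.append(REASON_TEXT[feature])
    return reasons


# Illustrative contributions for one declined applicant.
contributions = {
    "income_stability": -42.0,
    "account_history_months": -5.0,
    "overdraft_frequency": 8.0,
    "obligation_payment_regularity": -17.5,
}
reasons = principal_reasons(contributions, max_reasons=2)
```

Raising an error for an unmapped feature is a deliberate choice in this sketch: it turns a missing reason code into a pre-deployment test failure instead of a generic fallback message on a live adverse action notice.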

Model governance and documentation requirements

The OCC, Federal Reserve, and FDIC guidance on model risk management (SR 11-7 / OCC Bulletin 2011-12, adopted by the FDIC in FIL-22-2017) requires thorough documentation of model design, validation, and ongoing monitoring for any model used in credit decision-making at supervised institutions. Alternative data models are not exempt.

The documentation expectations for an alternative data credit model include: feature selection rationale and empirical support for predictiveness, validation methodology and performance metrics on holdout populations, disparate impact analysis documentation, adverse action reason code framework, ongoing monitoring plan for model drift and disparate impact drift over time, and change management process for model updates.

For a growing lender using a third-party model API (such as the Lendiro platform), the model governance obligation does not transfer entirely to the vendor. The lender retains responsibility for validating that the third-party model performs as represented, conducting independent disparate impact analysis on their applicant population, and maintaining documentation of their own due diligence. Vendor-provided validation documentation is a starting point, not a compliance substitute.

The access-expansion and compliance objectives are aligned, not opposed

The premise that fair lending compliance and thin-file credit expansion are in tension is wrong. They point in the same direction when the alternative data model is properly designed.

The credit-invisible population in the US is disproportionately composed of people of color, recent immigrants, and lower-income individuals. Systematically declining these applicants without evaluating alternative creditworthiness data creates adverse impact under ECOA by any reasonable definition — it just creates that impact through the absence of a decision rather than through a discriminatory model. The better compliance posture is to build models that accurately distinguish creditworthy from uncreditworthy applicants within the thin-file population, using features that are both predictive and designed with disparate impact in mind.

That is exactly the design brief we built Lendiro around. Credit expansion into underserved populations and fair lending compliance should reinforce each other when the model is built with both objectives explicitly in scope from the start.
