Loan Think

VantageScore's 'future of credit' rests on shaky math

By Tobias Peter and Sissi Li, October 03, 2025, 1:43 p.m. EDT

VantageScore says it's building the future of credit scoring. But based on our analysis, the foundation it's building on is shaky at best.

Earlier this month, we published a study showing that VantageScore 4.0's claimed performance gains over Classic FICO are overstated and based on flawed methodology. In response, VantageScore released two rebuttals — neither of which directly addresses the core problems we identified.

Instead of engaging with the evidence, VantageScore doubles down on narrative. But this isn't a branding contest. It's about the integrity of mortgage risk management. And when you dig into VantageScore's analysis, the flaws are too big to ignore.

1. Apples-to-oranges score aggregation

VantageScore's analysis relies on an apples-to-oranges comparison. Its white paper evaluates VantageScore 4.0 using a tri-merge average (the average of all three bureau scores, an aggregation method that has not been approved), while Classic FICO is measured using the tri-merge middle (the middle of the three bureau scores, the industry standard used by the GSEs).

This matters. When we re-ran the analysis using the same aggregation method for both scores — tri-merge middle — the supposed performance advantage of VantageScore dropped from 11% to 3%.
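To make the distinction concrete, here is a minimal Python sketch of the two aggregation methods. It assumes all three bureau scores are available for a borrower; the function names and the example scores are hypothetical and are not taken from either study.

```python
from statistics import mean, median


def tri_merge_average(scores):
    """Average of the three bureau scores."""
    return mean(scores)


def tri_merge_middle(scores):
    """Middle (median) of the three bureau scores."""
    return median(scores)


# Hypothetical borrower with one bureau score well below the other two.
bureau_scores = (640, 700, 705)

avg_score = tri_merge_average(bureau_scores)
mid_score = tri_merge_middle(bureau_scores)

# The average is pulled toward the outlying score; the middle is not.
print(f"tri-merge average: {avg_score:.1f}")
print(f"tri-merge middle:  {mid_score}")
```

Because the two methods can assign the same borrower noticeably different values, a comparison that aggregates one model's scores by average and the other's by middle mixes the effect of the aggregation with the effect of the model.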

Despite our initial critique, VantageScore continues to tout comparisons based on a score aggregation method that the GSEs have not adopted. Unless VantageScore has access to unannounced regulatory changes, this is either a methodological oversight or a deliberate attempt to cherry-pick the best possible outcome.

2. Selection bias by design

VantageScore's "stress testing" is a textbook case of selection bias. The model was tested on loans with Classic FICO scores between 620 and 720, but the VantageScore 4.0 values were allowed to span the full 383–850 range. This asymmetric filtering gives VantageScore 4.0 more room to rank-order risk, while artificially compressing the Classic FICO distribution.

When we flipped the filter—holding VantageScore 4.0 to ≤720 and allowing Classic FICO its full range—the results reversed. A model that only shows an advantage when the scoring range is tilted in its favor cannot credibly claim predictive superiority.
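The mechanism can be illustrated with a small simulation. The sketch below is not our analysis and uses no real loan data: it assumes two hypothetical scores, A and B, that are equally noisy readings of the same underlying risk, and every parameter (score scale, noise level, default curve, the 620 to 720 band) is an illustrative assumption. Rank-ordering power is measured with the Gini coefficient, computed as 2 × AUC − 1.

```python
import math
import random


def gini(scores, defaults):
    """Gini = 2*AUC - 1, where AUC is the share of (defaulted, non-defaulted)
    loan pairs in which the defaulted loan has the lower score.
    Assumes continuous scores, so ties are not handled."""
    pairs = bads_seen = 0
    for i in sorted(range(len(scores)), key=scores.__getitem__):
        if defaults[i]:
            bads_seen += 1
        else:
            pairs += bads_seen  # defaulted loans scored below this good loan
    n_bad = sum(defaults)
    n_good = len(defaults) - n_bad
    return 2 * pairs / (n_bad * n_good) - 1


def simulate(n=100_000, seed=0):
    """Two equally informative hypothetical scores built from the same risk."""
    rng = random.Random(seed)
    a, b, d = [], [], []
    for _ in range(n):
        risk = rng.gauss(0, 1)  # latent risk, higher = riskier
        a.append(700 - 60 * risk + rng.gauss(0, 40))
        b.append(700 - 60 * risk + rng.gauss(0, 40))
        p_default = 1 / (1 + math.exp(2.0 - 1.5 * risk))
        d.append(rng.random() < p_default)
    return a, b, d


def ginis_within_band(filter_on, a, b, d, lo=620, hi=720):
    """Keep loans whose *filtering* score is in [lo, hi]; score both models."""
    keep = [i for i, s in enumerate(filter_on) if lo <= s <= hi]
    pick = lambda xs: [xs[i] for i in keep]
    return gini(pick(a), pick(d)), gini(pick(b), pick(d))


a, b, d = simulate()

# Filter on A only: A's range is compressed, B keeps its full range.
a_when_a_filtered, b_when_a_filtered = ginis_within_band(a, a, b, d)
# Flip the filter: now B is compressed and A keeps its full range.
a_when_b_filtered, b_when_b_filtered = ginis_within_band(b, a, b, d)

print(f"filter on A: Gini A={a_when_a_filtered:.3f}, B={b_when_a_filtered:.3f}")
print(f"filter on B: Gini A={a_when_b_filtered:.3f}, B={b_when_b_filtered:.3f}")
```

By construction neither score is better than the other, yet in this setup whichever score is left unfiltered shows the higher Gini within the sample. That is the pattern to expect from asymmetric filtering alone, independent of any real difference between the models.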

3. Misleading headline metrics

In its rebuttal, VantageScore sidestepped our core methodological concerns. Instead, it repeatedly cites a +48.5% improvement in default prediction and an 11% advantage in "head-to-head" comparisons. But both figures stem from flawed methodologies: the 48.5% from the biased stress test described above, and the 11% from the apples-to-oranges score aggregation.

When we corrected both issues, the performance advantage fell to just 3% in one metric: default capture in the bottom decile. And on the other two of VantageScore's preferred metrics, the Gini coefficient and the Kolmogorov–Smirnov (KS) statistic, Classic FICO came out ahead.
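For readers less familiar with these three metrics, the following Python sketch shows one common way to compute them. The definitions are assumptions for illustration: bottom-decile capture is taken to be the share of all defaults found among the lowest-scoring 10% of loans, Gini is computed as 2 × AUC − 1, and KS is the largest gap between the cumulative shares of defaulted and non-defaulted loans as the score rises. The ten loans are made up, and the code assumes no tied scores.

```python
def rank_metrics(scores, defaults, bottom_share=0.10):
    """Compute Gini, KS and bottom-share default capture for one score.
    Lower score = riskier. Assumes all scores are distinct."""
    order = sorted(range(len(scores)), key=scores.__getitem__)
    n_bad = sum(defaults)
    n_good = len(defaults) - n_bad
    cutoff = max(1, int(len(scores) * bottom_share))

    bads = goods = pairs = 0
    ks = 0.0
    capture = 0.0
    for rank, i in enumerate(order, start=1):
        if defaults[i]:
            bads += 1
        else:
            goods += 1
            pairs += bads  # defaulted loans ranked below this good loan
        # KS: largest gap between the two cumulative distributions.
        ks = max(ks, abs(bads / n_bad - goods / n_good))
        if rank == cutoff:
            capture = bads / n_bad  # share of all defaults caught so far
    gini = 2 * pairs / (n_bad * n_good) - 1
    return {"gini": gini, "ks": ks, "bottom_capture": capture}


# Ten hypothetical loans: 1 = defaulted, 0 = performed.
scores = [580, 600, 620, 640, 660, 680, 700, 720, 740, 760]
defaults = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

metrics = rank_metrics(scores, defaults)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The three metrics answer different questions: capture looks only at the riskiest tail, while Gini and KS summarize rank-ordering across the whole score range. That is why a model can lead on one and trail on the others.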

As we have pointed out repeatedly, VantageScore's performance advantage is best characterized as modest, not transformational.

4. Segment-level analysis built on the same flaws

VantageScore also criticizes us for not replicating its segment-level findings (e.g., by score tier or payment amount). But these analyses suffer from the same flawed assumptions as the headline results: using a tri-merge average and applying biased filtering.

When we re-ran those breakdowns using the proper methodology, the results fell flat. In some cases, VantageScore's claimed advantage disappeared entirely. In others, Classic FICO performed better.

5. Mischaracterized "independent" studies

VantageScore claimed its results are backed by other independent studies. But two of the four studies cited appear to suffer from the same methodological flaws we identified in VantageScore's white paper. The other two studies, in fact, reinforce our findings.

JPMorgan's report, for example, found only a 3% lift for VantageScore in capturing 60+ day delinquencies, identical to our findings. Kroll Bond Rating Agency concluded that both models performed effectively, with only "slight" advantages for VantageScore in certain segments.

This isn't overwhelming evidence of superiority. It's confirmation that VantageScore's edge—if it exists at all—is modest.

6. The wrong fix for the real problem

Perhaps VantageScore's most compelling argument is that it will expand access to homeownership. But the primary barrier facing many prospective homebuyers today is not an outdated scoring system; it is a chronic shortage of supply. Simply giving more borrowers a credit score doesn't make homes more affordable. And pushing more borrowers into a tight market with looser credit can backfire, leading to higher prices and riskier loans.

(For a more detailed point-by-point rebuttal of VantageScore's claims, see here.)

Proceed with caution

Ultimately, this debate isn't about clinging to the past. It's about not rushing into a flawed two-score regime, especially when those flaws are hidden behind marketing spin and methodological sleight of hand.

As we noted in a recent op-ed, a rushed move to a dual-score regime, particularly one shaped by commercial interests, introduces serious challenges, including complexity in pricing through new LLPA matrices, opportunities for score shopping and model gaming, and potential misallocation of credit.

Before overhauling the mortgage credit scoring system, FHFA must insist on rigorous, transparent, and replicable analysis—not self-serving white papers or cherry-picked comparisons.

Otherwise, we risk destabilizing the very system we're trying to improve.

Tobias Peter

Co-Director, American Enterprise Institute’s Housing Center

Sissi Li

Senior Data and Analytics Manager, American Enterprise Institute