The data mess crippling SMB lending
Sarvesh Baveja is vice president and chief risk officer at Fundbox.
Small businesses drive our economy, creating over 70% of net new jobs since 2019. Yet nearly half remain underfunded: only 52% of financing applicants are fully approved, according to the Federal Reserve. Lending to SMBs is tough, not because of one major issue but because of a series of persistent challenges. Systems built for consumer lending don’t translate well to the small business landscape.
The old playbook doesn’t work for SMB lending. Despite years of fintech innovation, lenders still struggle to meet small businesses where they are. The problem isn’t a lack of data — it’s that the data is scattered, delayed, and hard to make sense of. Effective underwriting requires not just better signals, but smarter ways to combine them across messy, fragmented sources. In the sections that follow, we’ll explore what’s broken, why cash flow underwriting is only part of the solution, and what it will take to finally build models that reflect the real-world complexity of small businesses.
Core pain points
Three structural challenges continue to block progress, and data sits at the center of them all.
Data gaps: Unlike consumer lending, where most lenders report into centralized credit bureaus, small business credit data is sparse and fragmented. Many lenders don’t report at all. Others report only to select bureaus, each with its own standards and limited coverage. As a result, there is no established credit score like FICO or VantageScore for small businesses, making it difficult to assess a business’s credit history or repayment behavior.
Business identity: Verifying a business is not as straightforward as verifying a person. People have biometrics and government-issued IDs. Businesses, on the other hand, can be created, dissolved, renamed, or relocated with relative ease. A single person can own multiple businesses, and two businesses can share the same name. Matching a loan application to the right entity — and the right data — isn’t easy.
Diversity: Small businesses come in all shapes and flavors — from restaurants and retailers to construction firms and consultants. Even within the same industry, performance and risk profiles vary by geography, customer base, and seasonality. That diversity makes it nearly impossible to underwrite small businesses using a standardized model. What looks “healthy” for a landscaping company in Texas may look completely different for a SaaS startup in New York.
We’ll focus on the first challenge — data — and break down why it remains so hard to access, interpret, and trust in the context of SMB lending. We’ll lay out a broad canvas of the data types available today — and explore when, how, and why each can be useful.
Credit bureaus: The illusion of a central source
In consumer lending, credit bureaus like Equifax, Experian, and TransUnion provide a centralized, trusted view of someone’s borrowing history. The system isn’t perfect, but it works well enough to support a massive lending industry.
In small business lending, there’s no equivalent.
Most people point to the Small Business Financial Exchange (SBFE) as the closest small business equivalent to the consumer credit bureaus. But it’s not a true credit bureau — it’s more like a private data co-op. Only about 150 financial institutions contribute data, and to get access, you have to be a contributor. That means most fintech lenders, alternative finance providers, and even many traditional banks are left out. As a result, coverage is thin.
Even for those with access, SBFE data has major limitations:
You can’t see which lender reported the tradeline.
There’s no concept of “soft” vs “hard” inquiries.
You can’t use it for pre-screening.
The SBFE shares its data with certified vendors — including Dun & Bradstreet, Equifax, and Experian — which add their own datasets to create business credit reports and scores. However, each vendor operates its own separate system, which leads to even more fragmentation. Here’s a quick snapshot:
Dun & Bradstreet offers Paydex (focused on vendor payment history) and SBRI (used for risk assessment).
Equifax acquired PayNet, which focuses on term loans and equipment leases.
Experian developed ‘Bizaggs’, a business credit file built from lenders, suppliers, and public records.
Each of these reports can be useful, but none provides a complete picture of a business’s creditworthiness, financial health, and repayment risk. Many lenders only report to one bureau, if they report at all. SBFE is dominated by commercial cards, with limited term loans or lines of credit (LOC) trades. Vendor tradelines — like Net 30 or Net 60 payment terms — are included in some reports but not others. And unlike consumer credit, there’s no standardized format or structure.
TL;DR — There’s no single, reliable source of credit bureau data for SMBs. Lenders are often forced to stitch together multiple sources — and still come up short.
Business financials: The traditional gold standard
Before modern data tools, business financials were the foundation of SMB underwriting — and for many lenders, they still are. These include tax returns, profit and loss (P&L) statements, balance sheets, and bank statements.
This data provides the most complete view of a business’s performance. It’s detailed, sometimes audited, and grounded in established accounting principles. For larger loan amounts — especially those above $500,000 — lenders still rely heavily on these documents. SBA loans, which offer some of the most favorable terms for small businesses, are also primarily underwritten using financials.
But here’s the catch: this data is hard to process at scale.
Every business formats its financials differently. Many are out of date. And even when digitized, they still require time and manual review to validate and interpret. That makes underwriting costly — too costly for smaller loans where the economics don’t add up.
Some lenders have tried to move away from financials, especially for lower loan amounts. But for larger loans or higher-risk situations, there’s still no better source of truth.
In short, business financials are rich, accurate, and insightful — but expensive to use.
Cash flow data
As we discussed in the previous section, traditional business financials are the gold standard in underwriting — but they’re also manual, costly, and often out of date.
Cash flow underwriting offers a more scalable alternative.
At its core, cash flow underwriting means using real-time data on how money flows in and out of a business’s accounts to evaluate credit risk. It’s not a new idea — banks used to do this manually by reviewing stacks of bank statements — but what’s changed is the way the data is accessed and analyzed.
Thanks to open banking, accounting integrations, and payments APIs, this process can now be done digitally, quickly, and often without human involvement.
Alex Johnson puts it aptly: “Cash flow underwriting is the use of cash flow data to evaluate and price the risk of credit default in a manner that is compliant with applicable laws and regulations.”
The appeal is obvious: it provides many of the same insights as traditional financials, but with far less friction. And for smaller loan sizes — where the economics don’t support heavy manual underwriting — it’s often the only viable path forward.
Let’s break down the main sources of this cash flow data — and what each one brings to the table:
Payment platforms: This is the speed lane. Embedded lenders like Stripe Capital, Square Loans, and Shopify Capital sit in the flow of funds. They know what’s selling and how fast. They can take repayment automatically. But their view is narrow. They don’t see costs. They don’t see external liabilities. And they definitely don’t see if the business is running up debt elsewhere — which means this type of underwriting needs built-in safeguards. Repayment needs to be automatic, and loan amounts need to be kept conservative.
Open banking: Now we’re getting somewhere. Open banking taps directly into business bank accounts. This gives you a broader look — both sides of the ledger, to a degree. You can see inflows, outflows, trends in balances, and the timing of deposits. You still don’t get perfect categorization (it’s messy), and you don’t get intent — but it’s richer than payments alone. Think of it as a high-resolution snapshot of daily liquidity.
Open accounting: This is where it gets deep. Platforms like QuickBooks and Xero don’t just show you movement; they show you meaning. You get customer-level granularity, AR aging, margins, payroll cycles, vendor terms, and tax status. You can tell whether revenue is recurring or project-based, whether payables are growing faster than receivables, and whether the business is barely scraping by or building operational leverage. The trade-off? Only larger or more organized SMBs have clean books. And even then, you have to trust that the data has been maintained properly.
The takeaway: Cash flow data isn’t new, but the technology to access and analyze it is. What’s changed is the cost. New tools are making it cheaper and faster to underwrite loans using real-time financial activity, helping lenders expand their credit box — especially for smaller loan amounts where traditional underwriting was never economical.
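To make the open banking view concrete, here is a minimal sketch of how a raw transaction feed might be turned into the inflows, outflows, and balance trends described above. The transaction list, the opening balance, and the function names are all hypothetical; a real feed would arrive through an aggregator and would need cleaning and categorization first.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (posting date, signed amount). Positive = inflow.
transactions = [
    (date(2024, 1, 5), 12000.0),
    (date(2024, 1, 20), -9500.0),
    (date(2024, 2, 3), 10000.0),
    (date(2024, 2, 18), -11000.0),
    (date(2024, 3, 7), 15000.0),
    (date(2024, 3, 25), -9000.0),
]

def monthly_cash_flow(txns):
    """Aggregate signed transactions into inflow, outflow, and net per month."""
    months = defaultdict(lambda: {"inflow": 0.0, "outflow": 0.0})
    for posted, amount in txns:
        key = (posted.year, posted.month)
        if amount >= 0:
            months[key]["inflow"] += amount
        else:
            months[key]["outflow"] += -amount
    summary = {}
    for key in sorted(months):  # chronological order
        m = months[key]
        summary[key] = {**m, "net": m["inflow"] - m["outflow"]}
    return summary

def ending_balances(opening_balance, summary):
    """Roll an assumed opening balance forward to get month-end balances."""
    balances = {}
    balance = opening_balance
    for key, m in summary.items():
        balance += m["net"]
        balances[key] = balance
    return balances

summary = monthly_cash_flow(transactions)
balances = ending_balances(5000.0, summary)  # 5000.0 is an assumed opening balance
negative_months = [key for key, m in summary.items() if m["net"] < 0]
```

Even these simple aggregates (net flow per month, month-end balance, count of negative months) are the kind of signal that once required someone to read through stacks of bank statements.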
Alternate data: Filling in the blanks
Even with financials, cash flow data, and credit bureau reports, lenders are often left with an incomplete view of a small business, especially for newer businesses or those with thin files. This is where alternate data can help. It won’t replace core underwriting inputs, but it can sharpen risk assessments, catch edge cases, and add helpful context when the picture isn’t fully clear.
A few examples of where alternate data plays a role:
Online reviews and digital presence: Platforms like Google, Yelp, or Trustpilot can provide a signal on business activity and customer engagement. Matching a loan application to a business’s online footprint can help verify legitimacy and stability — or raise flags if something doesn’t add up.
Credit card transaction data: For B2C businesses, data from networks like Visa and Mastercard (or providers like Enigma) can offer a window into sales volumes and customer foot traffic — helping lenders understand real-world revenue patterns over time.
Industry-specific datasets: In regulated or specialized industries, lenders can tap public sources — like the Department of Transportation’s SAFER database (for trucking), licensing registries (for contractors), or inspection logs (for restaurants) — to add another layer of risk insight.
Putting it together: From data to decisions
Collecting better data is one part of the battle. The real challenge is making sense of it and building models that predict credit outcomes. That is possible, but three obstacles stand in the way.
First, model training requires time-aligned data — that is, information that reflects what was known at the time a lending decision was made. This is difficult with certain sources like business registrations or web presence, which may only show the current state rather than historical snapshots.
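A minimal sketch of what time alignment means in practice, assuming the lender has archived dated snapshots of a source (here, a hypothetical review rating). The lookup returns only what was known on or before the decision date, so the training data never leaks information from the future. The snapshot values and the function name are illustrative.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical archived snapshots of one business's review rating,
# sorted by the date each snapshot was captured.
snapshots = [
    (date(2023, 1, 15), 3.9),
    (date(2023, 6, 1), 4.2),
    (date(2024, 2, 10), 4.6),
]

def value_as_of(history, decision_date):
    """Return the latest value captured on or before decision_date, else None."""
    captured = [captured_on for captured_on, _ in history]
    idx = bisect_right(captured, decision_date)
    if idx == 0:
        return None  # nothing was known yet; do not backfill from the future
    return history[idx - 1][1]

rating_at_decision = value_as_of(snapshots, date(2023, 9, 1))
```

If only the current state is stored, this lookup is impossible, which is exactly the problem with sources that expose no history.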
Second, effective models require overlapping data from multiple sources to understand how signals interact. For example, what does it mean when strong cash flow trends, improving Yelp reviews, and a solid business credit score all appear together? Most lenders lack sufficient historical overlap across these datasets to train models that can detect those patterns.
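The overlap problem is easy to quantify. The sketch below, built on a hypothetical record of which sources exist for each past loan, counts how many loans have every source in a given combination; the loan IDs and source names are made up for illustration.

```python
# Hypothetical record of which data sources are available for each past loan.
coverage = {
    "loan_001": {"cash_flow", "bureau", "reviews"},
    "loan_002": {"cash_flow", "bureau"},
    "loan_003": {"cash_flow"},
    "loan_004": {"bureau", "reviews"},
    "loan_005": {"cash_flow", "bureau", "reviews"},
}

def overlap_count(coverage, required):
    """Count loans for which every source in `required` is available."""
    required = set(required)
    return sum(1 for sources in coverage.values() if required <= sources)

one_source = overlap_count(coverage, {"cash_flow"})
two_sources = overlap_count(coverage, {"cash_flow", "bureau"})
all_three = overlap_count(coverage, {"cash_flow", "bureau", "reviews"})
```

In this toy portfolio, usable history shrinks from four loans to two as sources are added, and the shrinkage only gets steeper as more datasets are required at once.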
Third, traditional modeling techniques like logistic regression struggle to handle 10 or more partially overlapping datasets with varying formats, update frequencies, and signal strengths. Lenders need to develop creative ensembling techniques, but these require infrastructure, expertise, and clean data pipelines, resources most lenders don’t yet have.
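As one illustration of what ensembling over partially overlapping sources can look like, the sketch below blends default probabilities from separate per-source sub-models and renormalizes the weights over whichever sources are present. This is a deliberately simple stand-in, not a description of any lender's production method; the weights, source names, and scores are assumptions.

```python
# Hypothetical weights reflecting how much each source's sub-model is trusted.
WEIGHTS = {"cash_flow": 0.5, "bureau": 0.3, "reviews": 0.2}

def blended_default_probability(scores, weights=WEIGHTS):
    """Weighted average of sub-model scores over the sources that are present.

    `scores` maps source name -> predicted default probability, or None when
    that source is missing for this applicant. Weights are renormalized so
    that a missing source does not drag the blend toward zero.
    """
    available = {
        source: p for source, p in scores.items()
        if p is not None and source in weights
    }
    if not available:
        return None  # no usable signal; route to manual review or decline to score
    total_weight = sum(weights[source] for source in available)
    return sum(weights[source] * p for source, p in available.items()) / total_weight

full_file = blended_default_probability(
    {"cash_flow": 0.10, "bureau": 0.20, "reviews": 0.30}
)
thin_file = blended_default_probability(
    {"cash_flow": 0.10, "bureau": 0.20, "reviews": None}
)
```

A fixed-weight blend like this ignores interactions between signals, which is why richer ensembling approaches, and the overlapping history needed to train them, matter.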
Summing up
Small business lending doesn’t suffer from a lack of data; it suffers from data that’s scattered, delayed, and hard to trust. New technologies are making it faster and cheaper to pull signals from cash flow, accounting, and payment systems. That’s unlocking a new wave of underwriting — one built for smaller loans, real-time decisions, and significantly broader access to capital.