Mizan · how it works

The forward-affordability layer, explained.

Everything behind the demo: what the thesis is and what holds it up, how the engine reaches a number, how it helps a lender lend with more confidence — right-sizing and better-fitting plans instead of all-or-nothing calls — and what this prototype deliberately does not claim.

The thesis: compliant isn’t affordable

In Saudi Arabia, a BNPL purchase can clear everything the regulator checks and still tip the borrower past what they can actually carry a month or two later. It can sit inside SAMA’s debt-burden limits, stay under the SAR 10,000 BNPL cap, and carry a clean credit-bureau (SIMAH) file. Compliant and affordable are not the same test.

A point-in-time ratio misses two things by construction:

Timing. It reads one instant. When several “pay-in-4” plans overlap, the heaviest month can sit just ahead of the snapshot — the stack peaks, then eases as plans roll off. A ratio taken today can pass while next month breaks.
Volatility. It divides by a single income figure — one month, or an average. For a salaried borrower that’s fair; for a variable-income earner, the dependable month sits well below the average one, so the same ratio overstates real capacity.
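To make the two blind spots concrete, here is a minimal Python sketch. Every figure is invented for illustration and the function name is hypothetical; the 45% limit is the consumer-credit debt-burden ratio this prototype models as its point-in-time gate. Taking the lowest observed month as "the dependable month" is a crude stand-in, not the engine's haircut.

```python
from statistics import mean

DBR_LIMIT = 0.45  # consumer-credit debt-burden limit modelled by the prototype

def monthly_load(plans, horizon=6):
    """Sum the installments due in each future month (month 0 = the snapshot month)."""
    load = [0.0] * horizon
    for plan in plans:
        for k in range(plan["remaining"]):
            month = plan["first_due"] + k
            if month < horizon:
                load[month] += plan["installment"]
    return load

# Hypothetical stack: one plan already billing, two that start billing next month.
plans = [
    {"installment": 900.0, "first_due": 0, "remaining": 3},
    {"installment": 1200.0, "first_due": 1, "remaining": 4},
    {"installment": 1100.0, "first_due": 1, "remaining": 4},
]
other_debt = 1500.0  # assumed non-BNPL monthly debt service
incomes = [9000.0, 4000.0, 8500.0, 3500.0, 9500.0, 4500.0]  # assumed variable earner

load = monthly_load(plans)
avg_income = mean(incomes)

# Timing: the ratio read today versus the ratio in the heaviest month ahead.
snapshot_ratio = (other_debt + load[0]) / avg_income
peak_month = max(range(len(load)), key=lambda m: load[m])
peak_ratio = (other_debt + load[peak_month]) / avg_income

# Volatility: the same snapshot, divided by a low month instead of the average.
dependable_income = min(incomes)
dependable_ratio = (other_debt + load[0]) / dependable_income

print(f"snapshot {snapshot_ratio:.2f}, peak (month {peak_month}) {peak_ratio:.2f}, "
      f"dependable-income snapshot {dependable_ratio:.2f}")
```

With these made-up inputs the snapshot ratio sits under the limit, while both the peak-month ratio and the dependable-income ratio sit above it.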

Mizan adds a forward, volatility-aware carrying-capacity check on top of the rules; it never replaces them. Affordable means the installment load in every one of the next six months stays within what the borrower can carry: income after a volatility haircut, less real fixed essentials and a safety buffer. When the full purchase breaches a near month, Mizan returns the largest amount that clears, or a down payment that brings it back inside.
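The right-sizing step can be sketched as follows. This is an illustration under stated assumptions, not the engine's code: installments are equal, monthly and interest-free, the down payment is paid upfront, and all names and numbers are hypothetical.

```python
def largest_amount_that_clears(capacity, existing_load, n_installments, start_month=0):
    """Largest purchase whose equal installments fit every month they touch.

    existing_load must cover at least start_month + n_installments months.
    """
    months = range(start_month, start_month + n_installments)
    headroom = min(capacity - existing_load[m] for m in months)
    return max(0.0, headroom) * n_installments

def right_size(purchase, capacity, existing_load, n_installments=4):
    """Approve in full, counter-offer with a down payment, or decline."""
    ceiling = largest_amount_that_clears(capacity, existing_load, n_installments)
    if purchase <= ceiling:
        return {"verdict": "approve", "financed": purchase, "down_payment": 0.0}
    if ceiling == 0.0:
        return {"verdict": "decline", "financed": 0.0, "down_payment": None}
    return {"verdict": "counter_offer", "financed": ceiling,
            "down_payment": purchase - ceiling}

# Hypothetical borrower: capacity of SAR 2,500 a month and an existing stack.
existing = [900.0, 1700.0, 1700.0, 800.0, 800.0, 0.0]
offer = right_size(4000.0, capacity=2500.0, existing_load=existing)
print(offer)
```

Here the full SAR 4,000 would add 1,000 a month and breach months 1 and 2, so the sketch finances the tightest month's headroom times four and asks for the rest as a down payment.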

What holds the thesis up — and what doesn’t. The mechanism is logically forced: timing and volatility are real, and a snapshot of a ratio cannot see them. The regulator already acts on the same risk — SAMA made SIMAH reporting mandatory for BNPL and enforces a DBR and an aggregate cap precisely because cross-provider stacking and over-extension are recognised hazards. And the direction is evidenced, not just argued — the published findings below all point the same way. What this prototype does not have is KSA-specific, proprietary default data validating Mizan’s exact stack-timing features at scale; that is what production re-calibration provides. The synthetic backtest proves only the measurement pipeline (§5).

The evidence — published research and the live KSA market — points the same way:

Cash-flow signal is real. FinRegLab’s independent study found cash-flow variables predictive of credit risk: at least as strong as traditional credit scores, and improving prediction among borrowers those scores rate as similar risk (FinRegLab, 2019).
The cross-provider stack is real. The CFPB found 63% of BNPL borrowers carried simultaneous loans and 32% held them across different firms, invisible to any single provider, and BNPL borrowers were more than twice as likely to be delinquent on another account (18% vs 7%) (CFPB, 2023).
Income volatility is pervasive. The JPMorgan Chase Institute found the median individual’s month-to-month income swung by about a third (≈36%) over a year, so reading a single month, or an average, overstates the income a volatile earner can actually depend on (JPMorgan Chase Institute).
And it’s already live in the Kingdom. Tamara, a Saudi BNPL provider, already reads consented bank cash-flow through SAMA Open Banking (via Lean) to verify income and affordability, lifting approval rates ~32% overall and ~60% for the non-salaried, thin-file customers it otherwise couldn’t assess. The rails Mizan runs on are live and the market is adopting them; the forward, stack-timed view is what Mizan adds on top (The Fintech Times).

How the engine decides — the LLM never does the math

It starts with the borrower’s consent. At checkout, the shopper connects their bank through SAMA Open Banking — a read-only, time-limited, revocable authorisation — and the lender pulls the cash-flow the steps below run on. Nothing is uploaded by hand. Most applications auto-decide; for the exceptions that escalate, that same consented data is exactly what lands on the underwriter’s console.

From there the decision is split so every number is auditable by construction:

The model tags. The agent reads the raw, bilingual transaction records (from the Open Banking feed) and classifies each line: category, direction, and for a BNPL debit the entity-resolved provider, each with a confidence score. It classifies; it never counts or sums. Because that job is narrow, it is model-agnostic: any frontier LLM can do the read-and-tag, and this build runs Claude.
The engine computes. Deterministic arithmetic over those tags does the rest: group recurring debits into plans, infer each installment and how many payments likely remain, derive income, stability and the fixed-essentials floor — then run the forward simulation and the decision. Every figure the verdict rests on traces to a tool call and is replayable from the tags.
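The split between tagging and arithmetic can be illustrated with a short sketch of the grouping step. The record fields, the provider names, the confidence threshold and the "four payments per plan" heuristic are all assumptions made for this example; the engine's actual grouping rules are not described here.

```python
from collections import defaultdict

def reconstruct_plans(tagged, total_payments=4, min_confidence=0.8):
    """Group tagged BNPL debits into plans and infer the payments likely remaining.

    Assumption: debits from the same provider for the same amount belong to one
    plan, and every plan has total_payments installments.
    """
    groups = defaultdict(list)
    for tx in tagged:
        is_bnpl_debit = tx["category"] == "bnpl" and tx["direction"] == "debit"
        if is_bnpl_debit and tx["confidence"] >= min_confidence:
            groups[(tx["provider"], tx["amount"])].append(tx["date"])
        # Low-confidence tags are simply skipped in this sketch.
    plans = []
    for (provider, amount), dates in sorted(groups.items()):
        remaining = max(0, total_payments - len(dates))
        if remaining:
            plans.append({"provider": provider, "installment": amount,
                          "paid": len(dates), "remaining": remaining,
                          "last_seen": max(dates)})  # ISO dates sort as strings
    return plans

# Hypothetical output of the tagging step.
tagged = [
    {"date": "2025-01-05", "amount": 250.0, "category": "bnpl", "direction": "debit",
     "provider": "ProviderA", "confidence": 0.97},
    {"date": "2025-02-05", "amount": 250.0, "category": "bnpl", "direction": "debit",
     "provider": "ProviderA", "confidence": 0.95},
    {"date": "2025-02-12", "amount": 400.0, "category": "bnpl", "direction": "debit",
     "provider": "ProviderB", "confidence": 0.91},
    {"date": "2025-02-01", "amount": 3000.0, "category": "rent", "direction": "debit",
     "provider": None, "confidence": 0.99},
    {"date": "2025-02-20", "amount": 400.0, "category": "bnpl", "direction": "debit",
     "provider": "ProviderB", "confidence": 0.55},
]
plans = reconstruct_plans(tagged)
```

Because the function is pure arithmetic over the tags, running it again on the same tags gives the same plans, which is the sense in which a figure is replayable.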

Carrying capacity = volatility-haircut income − fixed essentials − a safety buffer. Essentials are non-discretionary fixed obligations only (rent, utilities, committed transfers). Discretionary spend flexes inside the buffer, so counting it as a hard floor would wrongly push borderline borrowers toward decline. A purchase is affordable if and only if the installment load in no month of the horizon exceeds capacity.
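A minimal sketch of that formula follows. The haircut here is linear in the coefficient of variation of monthly income, with a slope and cap picked purely for illustration; the prototype's actual coefficients are not given in this document, and all figures below are invented.

```python
from statistics import mean, pstdev

def volatility_haircut(incomes, slope=0.5, cap=0.5):
    """Linear haircut: grows with income irregularity, capped (hypothetical coefficients)."""
    cv = pstdev(incomes) / mean(incomes)  # coefficient of variation
    return min(cap, slope * cv)

def carrying_capacity(incomes, fixed_essentials, safety_buffer):
    """Volatility-haircut income minus fixed essentials minus a safety buffer."""
    dependable_income = mean(incomes) * (1 - volatility_haircut(incomes))
    return dependable_income - fixed_essentials - safety_buffer

def is_affordable(monthly_load, capacity):
    """Affordable only if no month of the horizon exceeds capacity."""
    return all(load <= capacity for load in monthly_load)

# Two hypothetical borrowers with the same average income of SAR 8,000.
salaried = [8000.0] * 6
variable = [12000.0, 4000.0, 12000.0, 4000.0, 12000.0, 4000.0]
load = [900.0, 3200.0, 3200.0, 2300.0, 2300.0, 0.0]  # assumed six-month stack

cap_salaried = carrying_capacity(salaried, fixed_essentials=3000.0, safety_buffer=500.0)
cap_variable = carrying_capacity(variable, fixed_essentials=3000.0, safety_buffer=500.0)
print(is_affordable(load, cap_salaried), is_affordable(load, cap_variable))
```

The two borrowers would pass the same average-income ratio, but under these assumptions only the salaried one carries the stack through every month.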

It’s agentic, not a form-filler. The agent runs this whole chain at application: it tags and entity-resolves the transactions, reconstructs the stack, and simulates forward. It also reasons about whether more data is worth requesting. It can confirm employment and insured salary via GOSI, rent via Ejar, or pull a longer history, but it asks for one only when the verdict actually hinges on it; usually it doesn’t, and it decides on the consented cash-flow alone. Clear cases auto-decide; when it’s genuinely unsure it escalates to a human, with the full reasoning recorded.
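One way to read "only when the verdict actually hinges on it" is a sensitivity test: evaluate the verdict at both ends of the plausible range for the unconfirmed figure and request confirmation only if the two ends disagree. The sketch below does this for rent. It is an assumption about how such a rule could look, not a description of the agent's reasoning; the function names and verdict labels are hypothetical, and no real GOSI or Ejar call is made.

```python
def verdict(monthly_load, income_after_haircut, essentials, buffer):
    """Approve if every month fits under capacity, otherwise right-size."""
    capacity = income_after_haircut - essentials - buffer
    return "approve" if all(l <= capacity for l in monthly_load) else "right_size"

def next_step(monthly_load, income_after_haircut, rent_range, other_essentials, buffer):
    """Decide on current data unless the rent uncertainty could flip the verdict."""
    low_rent, high_rent = rent_range
    # Capacity falls as rent rises, so checking the two endpoints is enough.
    best = verdict(monthly_load, income_after_haircut, other_essentials + low_rent, buffer)
    worst = verdict(monthly_load, income_after_haircut, other_essentials + high_rent, buffer)
    if best == worst:
        return {"action": "decide", "verdict": best}
    return {"action": "request_confirmation", "source": "Ejar"}

load = [1200.0, 2000.0, 2000.0, 800.0]  # assumed forward stack
narrow = next_step(load, 7000.0, (2000.0, 2500.0), other_essentials=1000.0, buffer=500.0)
wide = next_step(load, 7000.0, (2000.0, 4800.0), other_essentials=1000.0, buffer=500.0)
print(narrow, wide)
```

With the narrow rent range both ends approve, so nothing is requested; with the wide range the ends disagree, so the sketch asks for the rent confirmation first.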

That’s why the underwriter opens a case and the work is already done — and why every run is replayable. The value isn’t a black-box score: a reviewer (or a regulator) can watch the agent reason step by step, and every figure traces to a tool call. Six hours of analyst work — read the messy bilingual feed, find the cross-provider stack, project it forward — compressed to seconds, and fully auditable.

Read it as a method, not a calibrated scorecard. The engine is deliberately transparent and coarse: monthly buckets (real pay-in-4 plans are often biweekly), a linear volatility haircut, a heuristic remaining-count, and an interest-free assumption. It demonstrates carrying-capacity reasoning; it is not a production credit model. Read a cliff as “within about a month or two,” not an exact date; production re-calibrates the coefficients on real data.
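The effect of monthly bucketing on a biweekly plan can be seen in a few lines. This sketch assumes a pay-in-4 plan with the first payment on the purchase date and one every 14 days after; the dates and amount are made up.

```python
from collections import Counter
from datetime import date, timedelta

def monthly_buckets(purchase_date, amount, n_payments=4, interval_days=14):
    """Collapse a biweekly installment schedule into calendar-month totals."""
    installment = amount / n_payments
    buckets = Counter()
    for k in range(n_payments):
        due = purchase_date + timedelta(days=k * interval_days)
        buckets[due.strftime("%Y-%m")] += installment
    return dict(buckets)

early = monthly_buckets(date(2025, 1, 3), 2000.0)   # three payments land in January
later = monthly_buckets(date(2025, 1, 10), 2000.0)  # two in January, two in February
print(early, later)
```

The same SAR 2,000 plan loads its first month with 1,500 or with 1,000 depending on a one-week shift in the purchase date, which is why a monthly-bucket cliff is best read as approximate.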

The commercial case — lose less, lend with confidence

Mizan adds a forward-affordability check on top of the rules, so it is deliberately more careful than a point-in-time ratio — it will sometimes right-size a loan the rules would have approved in full. That is the commercial point, not a cost: every forward cliff caught at origination is a default that never reaches the book, and the slice it defers is margin, not principal. A book that doesn’t blow up is one you can keep lending into with confidence.

Lose less. That’s the engine. Every default starts as an approval. Catching a forward cliff at origination prevents a heavy, hard-to-recover loss on unsecured credit; the cost is only the margin on a deferred slice. That asymmetry is the whole economic case, quantified in the measured backtest.
A right-sized yes beats a flat no. When a purchase doesn’t fit, the alternative to turning the customer away is a counter-offer or down payment they can carry. The sale still closes, at an amount that holds; affordability keeps the relationship instead of ending it.
Plans that fit complete and repeat. An installment sized to real carrying capacity is one the borrower actually finishes. Completion and repeat purchases beat a charged-off balance — and a borrower who isn’t pushed over the edge keeps buying.
Room to grow. Fewer forward defaults means more headroom under your loss budget and risk limits — capacity to lend more, confidently, into a book that performs.
Governance & exposure. A standardised, auditable affordability layer supports a responsible-lending and PDPL (Personal Data Protection Law) posture and, for BNPL players heading toward public markets, gives a board-and-regulator-grade control rather than per-analyst judgment.
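The asymmetry in the first point above can be written as simple expected-value arithmetic. The sketch below is illustrative only: the margin rate, loss-given-default and default probability are invented inputs, not figures from the backtest, and the function names are hypothetical.

```python
def break_even_default_probability(margin_rate, loss_given_default):
    """Default probability above which deferring a slice has positive expected value."""
    return margin_rate / (margin_rate + loss_given_default)

def net_benefit_of_deferring(slice_amount, p_default, margin_rate, loss_given_default):
    """Expected loss avoided minus expected margin forgone on the deferred slice."""
    avoided_loss = p_default * loss_given_default * slice_amount
    forgone_margin = (1 - p_default) * margin_rate * slice_amount
    return avoided_loss - forgone_margin

# Hypothetical inputs: 6% margin on the slice, 80% loss given default.
p_star = break_even_default_probability(margin_rate=0.06, loss_given_default=0.80)
gain = net_benefit_of_deferring(1000.0, p_default=0.20, margin_rate=0.06,
                                loss_given_default=0.80)
print(f"break-even default probability: {p_star:.3f}")
```

Because the loss on a default is a large share of principal while the forgone margin is a small one, the break-even probability under these assumptions is low, so deferring pays off even when most flagged borrowers would not have defaulted.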

Beyond the regulatory floor — and beyond good current practice

The sharpest objection isn’t from the regulation; it’s from a good risk team: “we already do cash-flow affordability with buffers.” Fair — and the honest concession is that amortization scheduling and manual buffers exist today. What isn’t standard is the combination, as one embeddable, auditable layer:

Volatility-aware. A haircut sized to income regularity, not a flat percentage buffer.
Stack-timed. The full cross-provider stack, reconstructed from cash flow and simulated forward month by month — not a single provider’s view, and ahead of SIMAH’s lagged snapshot.
Standardised & auditable. One policy with replayable, tool-sourced figures — not per-underwriter variance an examiner can’t reconstruct.
Decision-agnostic. The lender sets the thresholds and owns the call; Mizan supplies the forward view and the reasons, escalating what a human should bless.

The measured validation

A reproducible backtest on a labelled synthetic population establishes that the measurement pipeline survives noise: from messy, mis-tagged transactions it recovers the true hidden stack and reaches the same forward verdict it would on the truth. It does not prove the thesis. The oracle there is Mizan’s own forward model on the true stack, so the catch-rate measures reconstruction fidelity, not real-default prediction. The backtest also reports precision alongside recall and puts the cost asymmetry in monetary terms.
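How precision and recall are read in this setting can be sketched as below. The field names are hypothetical and the five rows are toy data, not backtest results: the "oracle" flag is the forward verdict on the true stack, and the "pipeline" flag is the verdict on the stack reconstructed from noisy tags.

```python
def precision_recall(cases):
    """Precision and recall of pipeline breach flags against the oracle's flags."""
    tp = sum(1 for c in cases if c["oracle_breach"] and c["pipeline_breach"])
    fp = sum(1 for c in cases if not c["oracle_breach"] and c["pipeline_breach"])
    fn = sum(1 for c in cases if c["oracle_breach"] and not c["pipeline_breach"])
    precision = tp / (tp + fp) if tp + fp else None
    recall = tp / (tp + fn) if tp + fn else None
    return precision, recall

# Toy rows for illustration only.
cases = [
    {"oracle_breach": True, "pipeline_breach": True},
    {"oracle_breach": True, "pipeline_breach": True},
    {"oracle_breach": True, "pipeline_breach": False},   # missed breach
    {"oracle_breach": False, "pipeline_breach": True},   # false alarm
    {"oracle_breach": False, "pipeline_breach": False},
]
precision, recall = precision_recall(cases)
```

Recall here is the catch-rate (how many true-stack breaches the reconstruction still flags), and precision is how many of its flags the true stack agrees with; neither says anything about real defaults.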

See the measured backtest →

Honest scope & caveats

Synthetic. No real borrowers or bank data. The cases are authored to be realistic; the population is stress-weighted to test the engine.
Coarse model. A method demonstration (monthly buckets, linear haircut, heuristic remaining-count), not a calibrated production scorecard.
Mocked identity. Nafath sign-in is mocked; there is no real auth or data store. Decisions persist in the browser only.
Sources. SAMA’s Rules for Regulating BNPL Companies — a SAR 10,000 cap on a consumer’s total outstanding BNPL (Art. 22) and mandatory credit-bureau registration, with consent (Art. 19(3)) — and the Responsible Lending Principles they require, which cap debt service at 33.33% of salary for salary-deducted finance and up to 45% of total income for other consumer credit. BNPL is the latter, so the prototype models the point-in-time gate at the 45%-of-income consumer limit. Plus the carrying-capacity mechanism and the external evidence in §1 (FinRegLab, CFPB, JPMorgan Chase Institute; Tamara × Lean for the live KSA market). KSA-specific default-outcome calibration is out of scope.

The production path. Going live swaps the mocked sources for licensed ones — bank data via SAMA Open Banking (a licensed AISP such as Lean or Tarabut) in place of the fixtures, SIMAH for the bureau snapshot, Nafath for identity, GOSI/Ejar for the optional employment and rent confirmations, and the agent running live per application (here it’s a captured run, replayable). The engine then re-calibrates its coefficients on real KSA default outcomes. The thesis, the method and the auditability are unchanged — only the inputs become real.