Mizan · how it works

The forward-affordability layer, explained.

Everything behind the demo: what the thesis is and what holds it up, how the engine reaches a number, how it helps a lender lend with more confidence — right-sizing and better-fitting plans instead of all-or-nothing calls — and what this prototype deliberately does not claim.

1

The thesis: compliant isn’t affordable

In Saudi Arabia, a BNPL purchase can clear every check the regulator sets: it can sit inside SAMA’s debt-burden limits, stay under the SAR 10,000 BNPL cap, and carry a clean SIMAH credit-bureau file. It can still tip the borrower past what they can actually carry a month or two later. Compliant and affordable are not the same test.
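As a rough illustration of the point-in-time test described above, the following Python sketch checks a debt-burden ratio and the aggregate BNPL cap at a single instant. The function name, its parameters and the sample figures are hypothetical. The 45% limit is the figure the Sources note at the end of this page says the prototype models, and the SAR 10,000 cap is the one quoted above.

```python
BNPL_CAP_SAR = 10_000   # aggregate outstanding-BNPL cap quoted on this page
DBR_LIMIT = 0.45        # consumer limit the prototype models (see Sources)


def point_in_time_gate(monthly_income, monthly_debt_service,
                       outstanding_bnpl, purchase, new_installment):
    """Hypothetical snapshot check: one income figure, one instant."""
    dbr = (monthly_debt_service + new_installment) / monthly_income
    return {
        "dbr": dbr,
        "dbr_ok": dbr <= DBR_LIMIT,
        "cap_ok": outstanding_bnpl + purchase <= BNPL_CAP_SAR,
    }


# Illustrative figures only, not real borrower data.
gate = point_in_time_gate(
    monthly_income=9000,
    monthly_debt_service=2000,
    outstanding_bnpl=4000,
    purchase=3600,
    new_installment=900,
)
compliant = gate["dbr_ok"] and gate["cap_ok"]
```

Both checks pass for these figures, yet the result says nothing about what falls due next month or how dependable the income figure is.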

A point-in-time ratio misses two things by construction:

  • Timing. It reads one instant. When several “pay-in-4” plans overlap, the heaviest month can sit just ahead of the snapshot — the stack peaks, then eases as plans roll off. A ratio taken today can pass while next month breaks.
  • Volatility. It divides by a single income figure — one month, or an average. For a salaried borrower that’s fair; for a variable-income earner, the dependable month sits well below the average one, so the same ratio overstates real capacity.

Mizan adds a forward, volatility-aware carrying-capacity check on top of the rules — it never replaces them. Affordable means every month of the next six stays under what the borrower can carry after a volatility haircut and real fixed essentials. When the full purchase breaches a near month, it returns the largest amount that clears, or a down payment that brings it back inside.
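The timing effect can be sketched in a few lines of Python. This is a minimal illustration, not the prototype's code: the `Plan` fields, the `forward_load` helper and the three sample plans are assumptions made for the example, and months are treated as whole buckets.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    installment: float  # SAR due per month
    remaining: int      # payments still owed
    starts_in: int = 0  # months until the next payment falls due (0 = this month)


def forward_load(plans, horizon=6):
    """Total installments falling due in each month of the horizon."""
    load = [0.0] * horizon
    for p in plans:
        last = min(p.starts_in + p.remaining, horizon)
        for month in range(p.starts_in, last):
            load[month] += p.installment
    return load


# Hypothetical stack: two running plans plus one whose first payment is next month.
plans = [
    Plan(installment=600, remaining=2),
    Plan(installment=500, remaining=3),
    Plan(installment=900, remaining=4, starts_in=1),
]
load = forward_load(plans)
peak_month = load.index(max(load))
# load == [1100.0, 2000.0, 1400.0, 900.0, 900.0, 0.0]; the peak is month 1, not today.
```

A snapshot taken in month 0 sees 1,100 SAR of installments; the heaviest month, at 2,000 SAR, sits one month ahead and then eases as plans roll off.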

What holds the thesis up — and what doesn’t. The mechanism is logically forced: timing and volatility are real, and a snapshot of a ratio cannot see them. The regulator already acts on the same risk — SAMA made SIMAH reporting mandatory for BNPL and enforces a DBR and an aggregate cap precisely because cross-provider stacking and over-extension are recognised hazards. And the direction is evidenced, not just argued — the published findings below all point the same way. What this prototype does not have is KSA-specific, proprietary default data validating Mizan’s exact stack-timing features at scale; that is what production re-calibration provides. The synthetic backtest proves only the measurement pipeline (§5).

The evidence — published research and the live KSA market — points the same way:

  • Cash-flow signal is real. FinRegLab’s independent study found cash-flow variables predictive of credit risk: at least as strong as traditional credit scores, and improving prediction among borrowers those scores rate as similar risk. (FinRegLab, 2019)
  • The cross-provider stack is real. The CFPB found 63% of BNPL borrowers carried simultaneous loans and 32% held them across different firms, invisible to any single provider, and BNPL borrowers were more than twice as likely to be delinquent on another account (18% vs 7%). (CFPB, 2023)
  • Income volatility is pervasive. The JPMorgan Chase Institute found the median individual’s month-to-month income swung by about a third (≈36%) over a year, so reading a single month, or an average, overstates the income a volatile earner can actually depend on. (JPMorgan Chase Institute)
  • And it’s already live in the Kingdom. Tamara, a Saudi BNPL provider, already reads consented bank cash-flow through SAMA Open Banking (via Lean) to verify income and affordability, lifting approval rates ~32% overall and ~60% for the non-salaried, thin-file customers it otherwise couldn’t assess. The rails Mizan runs on are live and the market is adopting them; the forward, stack-timed view is what Mizan adds on top. (The Fintech Times)

2

How the engine decides — the LLM never does the math

It starts with the borrower’s consent. At checkout, the shopper connects their bank through SAMA Open Banking (a read-only, time-limited, revocable authorisation), and the lender pulls the cash-flow data that the steps below run on. Nothing is uploaded by hand. Most applications auto-decide; for the exceptions that escalate, that same consented data is exactly what lands on the underwriter’s console.

From there the decision is split so every number is auditable by construction:

  • The model tags. The agent reads the raw, bilingual transaction records (from the Open Banking feed) and classifies each line — category, direction, and for a BNPL debit the entity-resolved provider, with a confidence. It classifies; it never counts or sums. Because that job is narrow, it is model-agnostic — any frontier LLM can do the read-and-tag; this build runs Claude.
  • The engine computes. Deterministic arithmetic over those tags does the rest: group recurring debits into plans, infer each installment and how many payments likely remain, derive income, stability and the fixed-essentials floor — then run the forward simulation and the decision. Every figure the verdict rests on traces to a tool call and is replayable from the tags.
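The split between tagging and arithmetic can be sketched as follows. This is an assumption-laden illustration in Python: the tag schema (`category`, `direction`, `provider`, `amount`, `date`), the provider names, and the rule that every plan is a pay-in-4 are all hypothetical stand-ins for whatever the engine actually uses. The point is only that the grouping step is plain, replayable arithmetic over tags.

```python
from collections import defaultdict

TOTAL_PAYMENTS = 4  # assumption for this sketch: every plan is a pay-in-4


def reconstruct_plans(tagged):
    """Group tagged BNPL debits into plans and infer payments remaining."""
    groups = defaultdict(list)
    for t in tagged:
        if t["category"] == "bnpl" and t["direction"] == "debit":
            # Same provider and same amount are treated as one recurring plan.
            groups[(t["provider"], t["amount"])].append(t["date"])

    plans = []
    for (provider, amount), dates in sorted(groups.items()):
        remaining = max(0, TOTAL_PAYMENTS - len(dates))  # heuristic remaining-count
        if remaining:
            plans.append({
                "provider": provider,
                "installment": amount,
                "paid": len(dates),
                "remaining": remaining,
            })
    return plans


# Hypothetical tags, as the model might emit them for one account.
tagged = [
    {"category": "income", "direction": "credit", "provider": None, "amount": 9000.0, "date": "2024-01-27"},
    {"category": "bnpl", "direction": "debit", "provider": "provider_a", "amount": 250.0, "date": "2024-01-05"},
    {"category": "bnpl", "direction": "debit", "provider": "provider_a", "amount": 250.0, "date": "2024-02-05"},
    {"category": "bnpl", "direction": "debit", "provider": "provider_b", "amount": 400.0, "date": "2024-02-10"},
    {"category": "rent", "direction": "debit", "provider": None, "amount": 3000.0, "date": "2024-02-01"},
]
plans = reconstruct_plans(tagged)
```

Because the function reads only the tags, rerunning it on the same tags always yields the same plans, which is what makes the figures replayable.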

Carrying capacity = volatility-haircut income − fixed essentials − a safety buffer. Essentials are non-discretionary fixed obligations only (rent, utilities, committed transfers). Discretionary spend flexes inside the buffer, so counting it as a hard floor would wrongly push borderline borrowers toward decline. A purchase is affordable if and only if total obligations stay at or under capacity in every month of the horizon.
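A minimal Python sketch of the capacity formula and the affordability test, with a simple right-sizing step. The function names, the equal-installment assumption, the pay-in-4 default and all figures are hypothetical; the production engine may size offers differently.

```python
def carrying_capacity(haircut_income, fixed_essentials, safety_buffer):
    """Capacity = volatility-haircut income - fixed essentials - safety buffer."""
    return haircut_income - fixed_essentials - safety_buffer


def spread(amount, n_payments, horizon):
    """Assumption: equal installments in the first n_payments months."""
    per_month = amount / n_payments
    return [per_month if m < n_payments else 0.0 for m in range(horizon)]


def is_affordable(existing_load, new_load, capacity):
    """True only if no month of the horizon exceeds capacity."""
    return all(e + n <= capacity for e, n in zip(existing_load, new_load))


def largest_amount_that_clears(existing_load, capacity, n_payments=4):
    """Largest purchase whose equal installments fit under capacity every month."""
    headroom = min(capacity - e for e in existing_load[:n_payments])
    return max(0.0, headroom) * n_payments


# Illustrative figures only.
capacity = carrying_capacity(7000.0, 3500.0, 500.0)          # 3000.0
existing = [1100.0, 1100.0, 500.0, 0.0, 0.0, 0.0]            # current stack, six months

full_ok = is_affordable(existing, spread(8000.0, 4, 6), capacity)        # False
right_sized = largest_amount_that_clears(existing, capacity)             # 7600.0
resized_ok = is_affordable(existing, spread(right_sized, 4, 6), capacity)  # True
```

Here the full 8,000 SAR purchase breaches months 0 and 1 by 100 SAR, while 7,600 SAR clears every month. The 400 SAR gap is the kind of amount a down payment could cover, assuming it is paid outside the monthly load.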

It’s agentic, not a form-filler. The agent runs this whole chain at application: tag and entity-resolve the transactions, reconstruct the stack, simulate forward. It also reasons about whether more data is worth requesting. It can confirm employment and insured salary via GOSI, confirm rent via Ejar, or pull a longer history, but it asks for one only when the verdict actually hinges on it (usually it doesn’t, and it decides on the consented cash-flow alone). Clear cases auto-decide; when the agent is genuinely unsure, it escalates to a human with the full reasoning recorded.

That’s why the underwriter opens a case and the work is already done — and why every run is replayable. The value isn’t a black-box score: a reviewer (or a regulator) can watch the agent reason step by step, and every figure traces to a tool call. Six hours of analyst work — read the messy bilingual feed, find the cross-provider stack, project it forward — compressed to seconds, and fully auditable.

Read it as a method, not a calibrated scorecard. The engine is deliberately transparent and coarse: monthly buckets (real pay-in-4 plans are often biweekly), a linear volatility haircut, a heuristic remaining-count, and an assumption that plans are interest-free. It demonstrates carrying-capacity reasoning; it is not a production credit model. Read a cliff as “within about a month or two,” not an exact date; production re-calibrates the coefficients on real data.
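One way a linear volatility haircut could look is sketched below in Python. The exact form is an assumption: here the haircut scales with the coefficient of variation of monthly income, and the slope `k` and the `floor` are hypothetical coefficients, not values taken from the prototype.

```python
from statistics import mean, pstdev


def haircut_income(monthly_incomes, k=1.0, floor=0.5):
    """Average income scaled down linearly with its variability.

    k and floor are placeholder coefficients for illustration only.
    """
    avg = mean(monthly_incomes)
    cv = pstdev(monthly_incomes) / avg if avg else 0.0  # coefficient of variation
    factor = max(floor, 1.0 - k * cv)
    return avg * factor


# Illustrative income histories (SAR per month).
salaried = [9000, 9000, 9000, 9000, 9000, 9000]
variable = [4000, 9000, 6000, 12000, 5000, 6000]

steady = haircut_income(salaried)      # no variability, so no haircut
dependable = haircut_income(variable)  # well below the 7,000 average
```

A salaried history passes through unchanged, while the variable earner's dependable figure lands well under the average, which is the gap a single-figure ratio cannot see.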

3

The commercial case — lose less, lend with confidence

Mizan adds a forward-affordability check on top of the rules, so it is deliberately more careful than a point-in-time ratio — it will sometimes right-size a loan the rules would have approved in full. That is the commercial point, not a cost: every forward cliff caught at origination is a default that never reaches the book, and the slice it defers is margin, not principal. A book that doesn’t blow up is one you can keep lending into with confidence.

  • Lose less. That’s the engine. Every default starts as an approval. Catching a forward cliff at origination prevents a heavy, hard-to-recover loss on unsecured credit; the cost is only the margin on a deferred slice. That asymmetry is the whole economic case, and the backtest (§5) quantifies it.
  • A right-sized yes beats a flat no. When a purchase doesn’t fit, the alternative to turning the customer away is a counter-offer or down payment they can carry. The sale still closes, at an amount that holds; affordability keeps the relationship instead of ending it.
  • Plans that fit complete and repeat. An installment sized to real carrying capacity is one the borrower actually finishes. Completion and repeat purchases beat a charged-off balance — and a borrower who isn’t pushed over the edge keeps buying.
  • Room to grow. Fewer forward defaults means more headroom under your loss budget and risk limits — capacity to lend more, confidently, into a book that performs.
  • Governance & exposure. A standardised, auditable affordability layer supports a responsible-lending and PDPL posture and, for BNPL players heading toward public markets, provides a board-and-regulator-grade control rather than per-analyst judgment.

4

Beyond the regulatory floor — and beyond good current practice

The sharpest objection isn’t from the regulation; it’s from a good risk team: “we already do cash-flow affordability with buffers.” Fair — and the honest concession is that amortization scheduling and manual buffers exist today. What isn’t standard is the combination, as one embeddable, auditable layer:

  • Volatility-aware. A haircut sized to income regularity, not a flat percentage buffer.
  • Stack-timed. The full cross-provider stack, reconstructed from cash flow and simulated forward month by month — not a single provider’s view, and ahead of SIMAH’s lagged snapshot.
  • Standardised & auditable. One policy with replayable, tool-sourced figures — not per-underwriter variance an examiner can’t reconstruct.
  • Decision-agnostic. The lender sets the thresholds and owns the call; Mizan supplies the forward view and the reasons, escalating what a human should bless.

5

The measured validation

A reproducible backtest on a labelled synthetic population establishes that the measurement pipeline survives noise: from messy, mis-tagged transactions it recovers the true hidden stack and reaches the same forward verdict it would on the truth. It does not prove the thesis: the oracle there is Mizan’s own forward model on the true stack, so the catch-rate measures reconstruction fidelity, not real-default prediction. It also reports precision alongside recall and states the cost asymmetry in money terms.
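For readers unfamiliar with the two metrics, here is a small Python sketch of how precision and recall could be computed when the verdict on the noisy reconstruction is compared with the verdict on the true stack. The helper name and the eight sample verdicts are hypothetical and are not results from the backtest.

```python
def precision_recall(truth_flags, recon_flags):
    """Compare 'cliff flagged' verdicts on the truth vs. on the reconstruction."""
    pairs = list(zip(truth_flags, recon_flags))
    tp = sum(1 for t, r in pairs if t and r)        # cliff exists and was caught
    fp = sum(1 for t, r in pairs if not t and r)    # flagged, but no cliff on the truth
    fn = sum(1 for t, r in pairs if t and not r)    # cliff on the truth, missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up verdicts for eight synthetic cases (True = forward cliff flagged).
truth = [True, True, True, False, False, False, True, False]
recon = [True, True, False, False, True, False, True, False]
precision, recall = precision_recall(truth, recon)
```

Recall here is the catch-rate in the sense used above: it measures how often the reconstruction reaches the oracle's verdict, not how often real borrowers default.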

See the measured backtest →

6

Honest scope & caveats

  • Synthetic. No real borrowers or bank data. The cases are authored to be realistic; the population is stress-weighted to test the engine.
  • Coarse model. A method demonstration (monthly buckets, linear haircut, heuristic remaining-count), not a calibrated production scorecard.
  • Mocked identity. Nafath sign-in is mocked; there is no real auth or data store. Decisions persist in the browser only.
  • Sources. SAMA’s Rules for Regulating BNPL Companies — a SAR 10,000 cap on a consumer’s total outstanding BNPL (Art. 22) and mandatory credit-bureau registration, with consent (Art. 19(3)) — and the Responsible Lending Principles they require, which cap debt service at 33.33% of salary for salary-deducted finance and up to 45% of total income for other consumer credit. BNPL is the latter, so the prototype models the point-in-time gate at the 45%-of-income consumer limit. Plus the carrying-capacity mechanism and the external evidence in §1 (FinRegLab, CFPB, JPMorgan Chase Institute; Tamara × Lean for the live KSA market). KSA-specific default-outcome calibration is out of scope.

The production path. Going live swaps the mocked sources for licensed ones: bank data via SAMA Open Banking (a licensed AISP such as Lean or Tarabut) in place of the fixtures, SIMAH for the bureau snapshot, Nafath for identity, GOSI/Ejar for the optional employment and rent confirmations, and the agent running live per application (here it’s a captured run, replayable). The engine then re-calibrates its coefficients on real KSA default outcomes. The thesis, the method and the auditability are unchanged; only the inputs become real.