Credit underwriting is a craft. A pure science can’t lend.

The industry sold artificial intelligence in credit risk as better science. It is better science. That was never the problem. Lending is not a science, and a pure science cannot lend. This is the long version of an argument about craft, calibrated loss, and who is left to answer when the machine decides.

Computer says no. Most of us made the joke, more than once. A bank clerk, a screen, a verdict she could not explain and would not question. Little Britain first aired the sketch in 2004, and it was funny precisely because it was absurd: no institution would really hand a human being a decision that mattered and leave nobody able to say why it had been reached.

It has stopped being absurd. The industry is now building, at scale and with considerable pride, the thing the sketch was mocking. And it is calling the result progress.

This essay is an argument that the pride is misplaced, though not for the reason the sceptics usually give. The problem is not that machines are bad at credit decisions. In many respects they are extraordinarily good at them. The problem is that we have misunderstood what a credit decision is. We have treated underwriting as a science with a right answer waiting to be computed, when it has always been a craft: a judgement made under irreducible uncertainty, wrong a calculated share of the time, on purpose. Get that distinction wrong and every clever thing you then build sits on a cracked foundation.

Start with the pounds and pence

It is worth being concrete about how a bank actually makes money, because the whole argument rests on it and the abstraction usually hides the point.

Strip a bank back to its mechanics. It pays a saver a return for the use of their money. It charges a borrower a higher rate to use that same money. It lives on the gap between the two. That gap, the difference between what the borrower pays and what the saver is paid, is the margin. The saver’s return has already been paid out of the borrower’s interest; the margin is what remains. But the margin is not profit. Not yet, and not by a distance. Out of it the bank must carry its own overheads, meet the cost of the capital it has to hold, and absorb the loans that are not repaid. Only what survives all of that is profit.

This is the fact that reframes everything that follows. A lender that is determined never to suffer a loss will lend only to the unimpeachable, and there are not enough of them to sustain a business. Lend too cautiously and the defaults do indeed vanish. But so does the volume, and with it the margin. The book starves. A spotless loan book is not a triumph of prudence; it is a symptom of a bank that has stopped doing the thing banks exist to do. A perfectly clean book is a dead bank.

It follows that a certain amount of bad debt is not a failure of the system. It is a designed feature of it. The losses are the cost of reaching the borrowers whose custom makes the margin worth having. The discipline of lending is not the elimination of loss; it is the calibration of it, finding the level of accepted default that maximises what is left after everything the margin must cover. That number is not discoverable by arithmetic alone. It is a judgement about an uncertain future, revisited as conditions change. It is, in the truest sense, a matter of craft.
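The arithmetic half of that calibration can be sketched in a few lines. What follows is a minimal illustration, not a pricing model: every figure in it (the five risk bands, the 4 per cent margin, the 1 per cent capital cost, the fixed overheads) is an assumption invented for the example, and the names `Band`, `profit` and `best_cutoff` are hypothetical. Rates are held in basis points so the sums stay exact. The judgement the essay is describing lives in choosing those inputs, which the code simply takes as given.

```python
from dataclasses import dataclass

BPS = 10_000  # basis points in 100 per cent


@dataclass(frozen=True)
class Band:
    """One slice of applicants; bands are listed from safest to riskiest."""
    volume: int       # pounds lent to this band (assumed figure)
    default_bps: int  # expected share never repaid, in basis points (assumed)


def profit(bands, margin_bps, capital_bps, overheads):
    """What is left of the margin after capital cost, bad debt and overheads."""
    lent = sum(b.volume for b in bands)
    losses = sum(b.volume * b.default_bps for b in bands)
    net = lent * (margin_bps - capital_bps) - losses
    return net // BPS - overheads


def best_cutoff(bands, margin_bps, capital_bps, overheads):
    """Return (number of bands accepted, profit) for the most profitable cut-off."""
    options = [
        (profit(bands[:k], margin_bps, capital_bps, overheads), k)
        for k in range(len(bands) + 1)
    ]
    best_profit, k = max(options)
    return k, best_profit


# Illustrative book: five bands of £50m each, with rising expected default.
BANDS = [Band(50_000_000, d) for d in (0, 100, 200, 500, 1000)]
MARGIN_BPS, CAPITAL_BPS, OVERHEADS = 400, 100, 1_000_000

spotless = profit(BANDS[:1], MARGIN_BPS, CAPITAL_BPS, OVERHEADS)
everyone = profit(BANDS, MARGIN_BPS, CAPITAL_BPS, OVERHEADS)
cutoff, calibrated = best_cutoff(BANDS, MARGIN_BPS, CAPITAL_BPS, OVERHEADS)
```

On these assumed figures the spotless book, lending only to the band with no expected default, earns £500,000. Accepting the first three bands, and with them a blended expected default of 1 per cent, earns £2,000,000. Lending to all five loses £2,500,000. The most profitable book is neither the cleanest nor the largest.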

Hold on to that word. Almost everything that goes wrong when we automate lending goes wrong because we forget it.

More data, more reasons to say no

Now point a modern model at the problem, and watch what it does with abundance.

The intuition most people carry is that more data makes a lender more generous: that with a richer picture the machine can find reasons to approve the borrower a cruder system would have turned away. Sometimes it does. But the deeper tendency runs the other way. Give a model enough signal and it does not, on balance, get better at saying yes. It gets better at finding reasons to say no. Examine anyone closely enough and there is always something: a thin patch of history, an irregular income, a correlation with a cohort that defaulted three years ago. The more data you pour in, the more such reasons surface, and the more the system drifts toward refusal.

Left unchecked, this is a drift toward what one might charitably call “analysis paralysis”: a system that, armed with everything, says no to almost everyone, and cannot give an account of itself that a human would recognise as a reason. The regulator has seen a version of this problem from the other side. When the Financial Conduct Authority examined cost-of-credit disclosure in 2026, it found that simply giving people more information did not reliably help them judge better. Borrowers reached for a rough rule of thumb — that a low APR meant a low total cost — and applied it even where it did not hold; and some kinds of additional explanation had only a limited effect on their ability to compare. The lesson was not that people are careless. It was that more disclosure is no substitute for better information design. That is a narrow finding about consumer communications, but it rhymes with a wider truth: past a certain point, piling on data does not sharpen judgement, and the instinct to resolve uncertainty by adding more of it is the same instinct that, in lending, quietly manufactures refusal.

What underwriting actually is

So let us say plainly what underwriting is, because the whole argument turns on it.

Underwriting is not a science, and it never was. It is a craft: a discipline that braids together science, art, and human skill, held in tension by experience. The science narrows the field: it rules out the clearly uncreditworthy and clears the clearly sound, and it does so faster and more consistently than any human panel could. But there is always a remainder that the science cannot settle, and reading that remainder is art. It is the capacity to weigh a life that does not fit the template, to see the difference between a borrower who is genuinely risky and one who is merely illegible to the scorecard.

This is why a pure science cannot lend. Lending is not a problem with a correct answer sitting behind the data, waiting to be found. It is a judgement about an uncertain future, one that will be wrong a calculated share of the time by design, and whose quality is measured not by whether any single decision was right but by whether the whole book was calibrated well. You do not compute that. You learn it, the way any craft is learned, by doing it, getting it wrong, and being answerable for the result.

This is not an argument against the machine

None of which is a case for switching the machines off, and it is important to be clear about that, because the craft argument is easily mistaken for nostalgia.

The machine should own the obvious. The clear approvals and the clear declines, the two ends of the distribution where the science genuinely does settle the matter, are exactly where automation belongs, and it handles them at a volume and speed no human desk could approach. To insist that an underwriter review every one of those by hand would be indulgent, not prudent. Most decisions are of this kind, and handing them to the machine is the right use of it.

The danger has never been at the easy ends. It is in the middle: the borderline file that the rules cannot settle, the applicant the scorecard cannot see clearly, the case where the craft actually lives. And here the industry has been making a quiet, consequential error. The instinct, when a file is borderline, is to route it back to a human for judgement. That instinct feels responsible. It does not scale, and it never will. You cannot hand-review your way through borderline cases at the volume a modern lender operates. There are too many, arriving too fast.

So the craft cannot stay where we have always imagined it, at the desk, in the head of the underwriter deciding one file at a time. If it is to survive at scale, it has to move. It has to be encoded into the design of the system itself: the policies, the thresholds, the escalation rules, the definition of what “good” looks like. The judgement does not disappear. It relocates, from the moment of decision to the architecture of decision. From the desk to the design.
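One way to picture judgement held in the design is as a policy object that the system reads, rather than a habit of mind at a desk. The sketch below is an assumption from end to end: the `Policy` fields, the score thresholds, the exposure cap and the rule names are all hypothetical, chosen only to show the shape. The point it illustrates is that the borderline band is settled by a rule someone deliberately wrote and versioned, with escalation kept for a small, defined remainder.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    APPROVE = "approve"
    DECLINE = "decline"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class Policy:
    """The encoded judgement: thresholds and escalation rule (all hypothetical)."""
    version: str                  # which edition of the policy was in force
    approve_at_or_above: int      # scores here and above are clear approvals
    decline_below: int            # scores under this are clear declines
    max_borderline_exposure: int  # pounds; largest loan auto-approved in the middle


def decide(score: int, amount: int, policy: Policy) -> tuple[Outcome, str]:
    """Return the outcome and the name of the rule that produced it."""
    if score >= policy.approve_at_or_above:
        return Outcome.APPROVE, "clear_approve"
    if score < policy.decline_below:
        return Outcome.DECLINE, "clear_decline"
    # The middle band: settled by a designed rule, not by an ad hoc desk review.
    if amount <= policy.max_borderline_exposure:
        return Outcome.APPROVE, "borderline_within_exposure_cap"
    return Outcome.ESCALATE, "borderline_over_exposure_cap"


# Illustrative policy; every number here is an assumption.
POLICY = Policy(version="example-1", approve_at_or_above=700,
                decline_below=500, max_borderline_exposure=5_000)
```

Changing how the middle is treated then means publishing a new `Policy` with a new `version`, which is itself a recorded act of judgement.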

A concrete case, and the line that matters

It helps to make this concrete, because it is easy to argue in the abstract that automation is fine “with the right safeguards” and never say where the safeguard actually sits.

Consider a vendor such as Zest AI, whose published case studies describe credit unions automating a large majority of their consumer loan decisions. At one credit union, by the vendor’s own account, between 70 and 83 per cent of consumer loan decisions are automated: the model returns an approve or decline outcome with no human underwriter involved in those cases. It is worth pausing on what that figure does and does not prove, because the setting flatters it. A US credit union is a membership body, and its borrowers are not the general public. They are members, typically savers with the institution first, often for years, who have already demonstrated a willingness and an ability to manage money before they ever apply to borrow. Much of the hard underwriting has therefore been done long before the model sees the file, by the membership structure itself. Auto-deciding four in five applications from a population that has already been pre-selected for creditworthiness is a far less remarkable feat than auto-deciding four in five from the population at large. The craft has not been removed here so much as relocated upstream: the act of deciding who may become a member, and who may borrow, is itself an exercise of underwriting judgement, made by humans, in the design of the institution. The model is automating what is left after that judgement has already done most of the work.

The vendor’s marketing tends to emphasise the approval side, reaching deeper into the credit spectrum than a traditional scorecard would. But an automated decision cuts both ways, and the harder cases to think about are the declines: a borrower turned down, with no underwriter in the loop for that decision. The pitch, in effect, is that the model can take the whole file, middle included, and not merely the easy ends.

The reflex, for anyone who has taken the craft argument to heart, is to recoil: no human saw it, therefore it has been abandoned. But that reflex is wrong, and getting it right is the crux of the whole essay. Automating the middle is not the sin. A well-designed system may make better, fairer, more consistent borderline decisions than a tired human on a Friday afternoon. The sin is not that no human saw the decision. The sin is committed only when no one can reconstruct it.

That is the line that matters. The question is never simply whether a person laid eyes on a particular decision. It is whether the firm can still rebuild the account of that decision — the data it rested on, the reasoning it followed, the policy in force at the time — and stand behind it when it is challenged, months or years later. If it can, the automation is legitimate, human eyes or not. If it cannot, then the decision has not been automated. It has been abandoned, and the decline letter is the only evidence that anyone ever chose at all.
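What rebuilding the account might require can be sketched as a record written at the moment of decision. This is an illustration under stated assumptions, not a description of any vendor’s system: the `DecisionRecord` fields, the `fingerprint` helper and the `replay` callback are all hypothetical names. The record holds the three things named above (the data, the reasoning, the policy in force), and a SHA-256 fingerprint taken at decision time lets a later reader check that the record has not been altered.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class DecisionRecord:
    """Everything needed to rebuild one decision later (hypothetical schema)."""
    application_id: str
    decided_at: str      # ISO 8601 timestamp of the decision
    inputs: dict         # the data the decision rested on
    policy_version: str  # the policy in force at the time
    model_version: str
    outcome: str
    reasons: tuple       # the reasoning followed, as named rules or factors


def fingerprint(record: DecisionRecord) -> str:
    """Stable SHA-256 digest of the record's canonical JSON form."""
    payload = json.dumps(asdict(record), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def can_reconstruct(record: DecisionRecord, stored_fingerprint: str, replay) -> bool:
    """True if the record is unaltered and replaying it yields the same outcome.

    `replay` is a caller-supplied function (an assumption of this sketch) that
    re-runs the recorded inputs under the recorded policy version.
    """
    if fingerprint(record) != stored_fingerprint:
        return False
    return replay(record.inputs, record.policy_version) == record.outcome
```

A decision that fails `can_reconstruct`, whether because the record was never written, was changed afterwards, or no longer replays to the same answer, is the abandoned decision described above.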

On the hook, not in the loop

Which brings the argument to the point that matters most, and the one the industry finds least comfortable.

If the craft has moved into the design, then accountability has to move with it. The comfortable answer, the one that satisfies a committee, is to put a human “in the loop”: a person positioned beside the machine to review its decisions. But we have already seen that this does not scale. A human who is nominally reviewing thousands of decisions is reviewing none of them meaningfully. Presence of that kind is theatre; it produces the appearance of oversight and the substance of a rubber stamp with good attendance.

The honest position is harder. It is not a human in the loop, checking what cannot be checked at volume, but a human on the hook: a named individual accountable for the design of the system, able to explain why it lends and why it declines, and answerable when it is wrong. Not accountable for each decision, which is impossible, but accountable for the thing that makes the decisions, and for its capacity to give an account of itself. The craft moves from the desk to the design; and the human moves from making the call to owning the machine that makes it.

This is not a soft or notional accountability. In UK financial services it has a name and an owner. The regulatory architecture already assumes that behind any consequential decision stands a person who can be asked to justify it. That assumption is precisely what automated lending strains. The frameworks do not need to be rewritten so much as they need firms whose systems can actually answer to them. That is a design problem before it is a compliance one.

The regulator is asking the right question, one layer too shallow

The regulator is, to its credit, circling this. The FCA’s Mills Review, led by Sheldon Mills and examining the long-term impact of artificial intelligence on retail financial services out to 2030 and beyond, with credit decisioning within its scope, is due to report on 6 July 2026. That is welcome, and it is the right question. My only quarrel is that, as the debate is usually framed, it stops one layer too shallow.

The debate has settled on explainability: can the decision be explained? That is necessary, but it is not sufficient, and it is not quite the binding question. A decision can be explainable in principle — the model can produce a set of contributing factors — and still leave nobody actually answerable for it. The deeper question is not whether the decision can be explained, but who owns it: who is on the hook when the explanation has to be given, and whether the system was built so that they can give it. Explainability is a property of the model. Accountability is a property of the firm. The second is the harder thing, and it is the thing that will separate the lenders who endure from the ones who merely automated.

What winning actually looks like

The banks that win the coming decade will not be the ones with the cleverest models. The models will commoditise; everyone will have access to much the same capability, and cleverness at the level of the algorithm will cease to be a differentiator. The winners will be the ones who remembered that lending is a craft, and who built the machine to serve that craft rather than to replace it.

That means the volume goes to the machine, without apology and without a human bottleneck pretending to add assurance it cannot add. It means the judgement — the accumulated craft of knowing where to draw the line and how much loss to accept — is encoded, deliberately and legibly, into the design. And it means that behind the whole apparatus stands a named human who can be asked why it lends as it lends, and who can answer. Decision at machine speed; account of the decision for as long as anyone might reasonably ask.

The bank clerk in the sketch could not say why. That was the joke. We are now within reach of building it for real, at a scale and speed the sketch never imagined, and calling it a breakthrough. The craft is the thing that stops us. Let us not build the unknowing bank clerk.

Francis Hellawell writes on banking change, credit risk, and the architecture of accountability in regulated financial services. This is the long-form version of an argument first published in shorter form on LinkedIn. A companion essay, on agentic lending and the reconstructability of automated decisions, follows.
