The Judgment Problem
AI gives lawyers and executives more capability than they have ever had. The question is whether it makes them wiser.
A room full of deputy general counsel gathered recently to work through something that sounds simple: how to use artificial intelligence more effectively in their day-to-day work. They were not there to debate AI policy or governance. They were not evaluating platforms. They were trying to understand a more practical and more human question. How do I get better at using this tool? And how do I know when to trust what it tells me?
It was, in that sense, less a technology session than a leadership one.
The facilitator, Alex Denniston of Factor Law, has trained more than four thousand lawyers on AI in the past year. He has watched very capable professionals make a mistake that has nothing to do with technology. They outsource their judgment too quickly. They take a confident-looking answer from a capable-looking machine and stop there.
The results have occasionally made headlines. In a courtroom. In front of a judge.
The deeper issue is not hallucination. Hallucination is a useful technical term for when an AI fabricates a date, invents a case citation, or fills a gap in its knowledge with something plausible but false. That problem is real, and it is well documented. But what I observed in the room was something more interesting. The lawyers were not worried about obvious errors. They were worried about the errors they would not catch. The ones that look right.
That is a leadership problem, not a software problem.
There is something familiar about this concern. The stronger leaders I have worked with over the years have always known that the most dangerous advice is advice that arrives with confidence and without caveat. A junior analyst who hedges, who flags uncertainty, who says “I think but I am not sure” is more useful than one who presents a clean answer built on assumptions they did not disclose. The same logic applies here. AI is optimized, as Denniston put it, to produce responses that feel helpful and pleasing. That is not always the same thing as accurate.
What he was describing is an accountability gap. And accountability gaps in leadership are never really new.
Every significant technology adoption in the executive suite has created some version of this moment. The CFO who trusted the financial model without understanding its inputs. The general counsel who relied on outside counsel’s summary without reading the underlying document. The CEO who took the consultant’s recommendation without interrogating its assumptions. In each case, a tool or an intermediary did work that the leader then signed off on. In each case, the leader’s judgment was the last line of defense. And in each case, when things went wrong, it was the leader who was accountable.
AI compresses that cycle. It can do in seconds what used to take hours. Which means the temptation to skip the judgment step is greater, not lesser.
The discussion in that room kept returning to a concept Denniston called the verification tax. The idea is straightforward. AI saves you time in the doing, but it costs you time in the checking. When the checking is perfunctory, you risk sending out something wrong. When the checking is thorough, you sometimes wonder whether you would have been faster doing it yourself. The tax is real. The question is how to pay it efficiently.
What struck me was how directly this maps to the challenge every senior leader faces when building and delegating to a team. You hire people to extend your capacity. You give them work you would otherwise do yourself. You review what they produce. The question is never whether to review. The question is how much context to provide going in, and how intelligent to make the review coming out. Experienced leaders know that a good brief reduces errors on the front end. They know that a review that only checks for obvious problems misses the subtle ones. They know that asking someone to challenge their own work before you see it is not weakness. It is efficiency.
None of that changes with AI. It just runs faster.
What also does not change is the distinction between access and judgment. There was a time when the senior partner’s advantage in a room was knowing the law better than anyone else. Depth of knowledge was the differentiator. That advantage is eroding, not because lawyers are less knowledgeable, but because the cost of accessing knowledge has collapsed. A well-constructed prompt can now surface answers that once required hours of research. The lawyers in that room already know this. Several of them were operating at what Denniston’s framework calls levels four and five. They were building their own agents, automating multi-step processes, experimenting in ways that would have seemed exotic two years ago.
But the question they kept asking was not about capability. It was about trust. How do I know this is right? How do I develop an instinct for when to push back on what the machine gives me? How do I stay accountable for work I did not do myself?
Those are not questions that AI can answer. They are leadership questions. They require the same things that leadership has always required: intellectual honesty about what you know and what you do not, the discipline to slow down when speed creates risk, and enough self-awareness to recognize when a confident answer deserves a skeptical eye.
One of the participants made a comparison that stayed with me. Training yourself to use AI well, she suggested, is not unlike training a junior associate. You invest more time at the front end. You are specific about what you want and how you want it. You do not accept the first draft as final. And over time, as the relationship matures, you develop a working understanding of where the associate is strong and where they need supervision. You calibrate your trust.
The analogy is imperfect. An associate learns. An associate pushes back. An associate tells you when they do not know something, at least if they are any good. AI will sometimes do the opposite. It will fill the silence with something that sounds authoritative.
That is the judgment problem. It does not resolve itself through better software. It resolves through leaders who understand exactly where their accountability lies, and who do not mistake confidence for competence, in a machine or in themselves.
The tools will keep improving. The responsibility will not transfer.


