21/9/2025 ☼ not-knowing ☼ critical thinking ☼ learning ☼ education ☼ AI
tl;dr: Students now have access to LLMs that can write essays for them, yet those same students seem to be losing the capacity to think critically. I address this problem by reconsidering the interaction logic between human users and the AI tools they use. I’ve developed an AI tool that inverts the usual logic of the empty, unconstrained chat box: the goal is to help users learn to think critically and do the meaningmaking work that only humans can do. Initial testing of the mechanism shows students going from vague statements to sharp arguments in under two hours. This tool represents a scalable approach to critical thinking education and an alternative to current AI tools that make students passive consumers of machine-generated content.
If you’re interested in testing the tool and/or learning more about the course as it develops, please sign up here.
We’re confronting a learning and teaching conundrum. Students have access to LLMs that seem able to write essays and answer questions with remarkable sophistication — yet those same students are systematically losing the ability to think critically about what those LLMs produce. Educators are fighting a losing battle to teach critical thinking, as students increasingly rely on LLMs to do the work that should be developing their reasoning capacity.
I propose a fundamentally different approach, one that uses AI tools for what they do well while explicitly teaching students to do the one thing machines can’t do at all: Make meaningful judgements about value. I’ve now built a tool that takes this approach. The early results are remarkable.
Most educational AI tools treat students as passive consumers of machine-generated content. Students type prompts into ChatGPT or Claude, receive polished outputs, and submit work they barely understand and have little basis for evaluating. This creates what I call “AI’s seductive mirage” — the illusion that sophisticated output means that meaningmaking has occurred.
The result is a slow-burning educational catastrophe. Students become dependent on these AI tools without developing the judgement needed to use them well. They lose practice in making the decisions that prepare them for leadership roles, where human reasoning about what matters most cannot be delegated to machines.
This is a pedagogical problem, but it’s also a civilisational one. The skills students are failing to develop (making value judgements, surfacing assumptions, understanding audiences, evaluating evidence) are precisely those that stable democratic societies require of their citizens.
Critical thinking fundamentally requires making subjective value judgements about what matters and why. This includes surfacing hidden assumptions, understanding different audiences and what might persuade them, and evaluating whether evidence is appropriate and believable in context. These skills apply whether you’re deciding where to go for dinner or whether a particular legal precedent should be upheld or overturned.
This is what I call “meaningmaking” — the distinctly human work of deciding what has value and why. Machines excel at pattern recognition, information retrieval, and following specified rules. They cannot make judgements about subjective value. Only humans can do meaningmaking, and it’s precisely these skills that separate effective leaders from those who simply execute instructions.
Meaningmaking drives human action, social change, and innovation. It’s what happens when Supreme Court justices write decisions that create or overturn precedent, when suffragette movements form to challenge voting restrictions, or when Apple’s leadership decides to build the iPhone despite widespread industry belief that consumers won’t want it.
My criticism of AI tools is really a criticism of the superficial way we think about the interaction logics designed into those tools. If we understand the difference between what humans must do (meaningmaking) and what machines can do better than humans, we can build that distinction into AI tools that help humans learn to do meaningmaking better.
As it happens, I’ve built one such tool.
That tool is now fully functional, and student-testers can access it remotely. The test version is called CONFIDENCE INTERVAL.
It is designed to help students learn to think critically about an argument they are trying to make. (An argument, here, is any claim that isn’t objectively true but is supported by logical reasoning and appropriate evidence.) This tool does not produce content for students. Instead, it guides them through a structured process of eliciting aspects of their argument, mirroring what they’ve written back to them, and pushing them to develop more compelling, evidence-based claims. In the process, the tool creates opportunities for students to self-model the act of developing stronger arguments.
The tool is, in essence, what I’ve come to call an “iterative scaffold.” Students respond to carefully sequenced prompts that build on their previous answers. Each step pushes them to refine their thinking, surface their assumptions, consider their audience, and strengthen their evidence. The process makes explicit what skilled critical thinkers do intuitively — but systematises it so students can learn it deliberately. (I’ve also used the same iterative scaffolding method to help leadership teams develop better corporate strategy, risk teams better understand the unknowns they must manage, innovation teams think more rigorously about the unknown terrains they must navigate, and product teams move more quickly toward product-market fit.)
The AI components in the tool act as a “Socratic mirror.” They reflect back what the user has written, explicitly reframed so the user can see their reasoning from a different perspective. This creates opportunities for the user to spot weaknesses in logic, evidence, and clarity.
The AI systems are explicitly not used as content generators. The LLMs working in the background may retrieve and present information to prompt the user’s thinking, but they don’t think for the user. Students maintain agency over all value judgements about their argument, while receiving structured support for information-processing tasks that machines handle well.
This design explicitly separates what humans must do (decide what matters and why) from what machines can help with (organise information and provide systematic prompting). It helps students learn to use AI support appropriately whilst developing the critical reasoning that only humans can do.
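To make that interaction logic concrete, here is a minimal sketch of how an iterative scaffold with a Socratic mirror might be wired up. The step prompts, the helper names, and the call_llm placeholder are illustrative assumptions for this sketch, not the actual implementation of CONFIDENCE INTERVAL.

```python
# A sketch of the iterative-scaffold loop: each step builds on the student's
# previous answers, and the model only mirrors and reframes; it never supplies
# claims, evidence, or conclusions of its own.

SCAFFOLD_STEPS = [
    "State the argument you want to make, in one sentence.",
    "Who is your audience, and what would persuade them?",
    "What assumptions must be true for your argument to hold?",
    "What evidence supports it, and why is that evidence appropriate here?",
]


def call_llm(prompt: str) -> str:
    """Placeholder for whatever LLM backs the tool; wire in a real model here."""
    raise NotImplementedError


def mirror_prompt(step: str, answer: str, history: list[str]) -> str:
    # The "Socratic mirror": restate the student's own words from a sceptical
    # reader's perspective, surfacing gaps rather than filling them.
    return (
        "Restate the student's answer in different words, as a sceptical reader "
        "would hear it. Do not add new arguments or evidence; only point out "
        "what is unclear, unsupported, or assumed.\n\n"
        f"Question: {step}\n"
        f"Student's answer: {answer}\n"
        f"Earlier answers: {' | '.join(history) if history else '(none yet)'}"
    )


def run_scaffold() -> list[str]:
    history: list[str] = []
    for step in SCAFFOLD_STEPS:
        answer = input(f"\n{step}\n> ")        # the student does the thinking
        reflection = call_llm(mirror_prompt(step, answer, history))
        print(f"\nSocratic mirror:\n{reflection}")
        revised = input("\nRevise your answer (Enter to keep it):\n> ")
        history.append(revised or answer)       # each step builds on the last
    return history
```

The important design choice is where judgement sits: the student supplies and revises every answer, and the model’s output is only ever a reframing of what the student has already written.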
I tested this approach with first-year undergraduates facing a high-stakes academic decision: Designing and justifying a custom major. Students had been working on their proposals for months, but their initial submissions lacked focus, failed to justify why a custom major was necessary, and offered unclear value propositions.
After a single two-hour session using the structured worksheets that ultimately became CONFIDENCE INTERVAL, students emerged with proposals that were sharply focused, clearly justified, and compellingly argued. The transformation was extraordinary: they could now identify why their proposed major was valuable, how it differed from existing programmes, and which audiences would be interested in someone who completed it.
The qualitative difference was striking. In post-workshop debriefs, students reported transformative changes in their reasoning capacity: “I finally understand what it means to think critically and force myself to ask questions to refine my thinking,” and “The process is so concrete and step-by-step that it feels so manageable, yet by the end you’re so far ahead of where you started.”
This wasn’t just better writing — it was evidence of improved critical thinking. Students had learned to articulate and systematise the reasoning process, building on their own outputs to develop increasingly sophisticated arguments.
This tool also addresses a fundamental challenge in education: How to teach critical thinking at scale when expert instruction is scarce. Traditional critical thinking instruction requires intensive one-on-one mentoring that doesn’t scale. CONFIDENCE INTERVAL provides structured guidance that students can use independently, with minimal instructor oversight, while developing crucial reasoning skills.
The tool multiplies teaching expertise by encoding effective pedagogical approaches into a system students can access directly. Instructors can facilitate the process for multiple students simultaneously while the tool provides the scaffolding for learning one approach to critical thinking.
This represents an alternative to the uncritical embrace of AI that’s particularly damaging in educational contexts. Rather than implicitly endorsing the replacement of human thinking with machine output, it doubles down on enhancing human reasoning capacity while maintaining clear boundaries about what humans must do themselves. Current AI tools focus on generating outputs that look human-made rather than helping users decide what outputs to make — CONFIDENCE INTERVAL inverts this logic.
CONFIDENCE INTERVAL is a testbed for design principles that extend well beyond education. It shows how to build AI tools that preserve human agency over subjective judgements while leveraging machine capabilities for appropriate support tasks.
This matters because the critical thinking skills students develop — making value judgements, surfacing assumptions, understanding audiences, evaluating evidence — are precisely what humans need to thrive in an AI-saturated world. As AI handles more routine work, sophisticated human meaningmaking becomes the rate-limiting resource for innovation, strategy, and social progress.
The tool is part of my ongoing research, supported by the Future of Life Foundation’s AI for Human Reasoning programme, into how humans can work effectively alongside AI tools whilst preserving the meaningmaking work that remains irreplaceably human. I’ve also written a research note about AI interaction logics that preserve and enhance meaningmaking, which demonstrates how to design AI systems that clearly separate human and machine capabilities.
CONFIDENCE INTERVAL is a web application ready for beta testing. The tool requires no setup and no involvement by instructors — students can access it directly over the internet.
For educators, this represents an opportunity to see immediate improvements in student critical thinking and argumentative writing. My research demonstrates that students can develop sophisticated critical thinking skills rapidly when given appropriate structure and support. CONFIDENCE INTERVAL makes this possible at scale, providing a foundation for the kind of education students need to thrive in an AI-powered world.
The choice is clear: we can continue building AI tools that make students passive consumers of machine-generated content, or we can build and invest in tools like CONFIDENCE INTERVAL that enhance the uniquely human capacity for meaningmaking.
If you’re interested in testing the tool and/or learning more about the course as it develops, please sign up here.
For the last few years, I’ve been wrestling with the practical challenges of meaningmaking in our increasingly AI-saturated world, developing frameworks for how humans can work effectively alongside these powerful tools while preserving the meaningmaking work that is the irreplaceably human part of the reasoning we do. I’ve published this as a short series of essays on meaningmaking as a valuable but overlooked lens for understanding and using AI tools.
I’ve also been working on turning discomfort into something productive. idk is the first of these tools for productive discomfort.
And I’ve spent the last 15 years investigating how organisations can succeed in uncertain times. The Uncertainty Mindset is my book about how to design organisations that thrive in uncertainty and treat it as clearly distinct from risk.