The phrase critical thinking is everywhere in the conversation about artificial intelligence and education. It appears in nearly every policy paper, every keynote, every school district’s new AI policy — usually as the reassuring counterweight to a long list of worries. Yes, the argument goes, AI will change everything, but we will protect critical thinking. The trouble is that the phrase is almost never defined in ways that a teacher, a curriculum designer, a parent, or a student could actually practice on a Tuesday morning.
This essay is an attempt to repair that gap. It describes, with practical depth, what critical thinking in partnership with AI really means, how it is learned, how it is practiced inside actual AI interactions, and how it should be taught. It is written for the people who will shape how education evolves in the coming decade: school and district leaders, university administrators, curriculum developers for secondary and higher education, philanthropic funders, and the policymakers and foundations who choose where to point the resources.
The essay starts with my own experience: working with AI, watching people use AI in business, and, years ago, introducing personal computing to educators around the world. But given the distance between that experience and today’s classrooms, it draws more importantly on articles, research, and work by people far more genuinely practiced in this topic than I am. The sources I found useful are listed at the end.
The stakes are only beginning to surface, but they already appear very clear. A 2025 study from the MIT Media Lab, led by Nataliya Kosmyna and her colleagues, used EEG to monitor fifty-four participants as they wrote essays in three conditions — with an AI assistant, with a search engine, and with no tools at all. Brain connectivity systematically weakened as external support increased. The AI-assisted group showed the lowest neural engagement during writing, produced the most homogeneous output, and — most strikingly — when later asked to write without assistance, still exhibited reduced engagement. The researchers named this effect cognitive debt. That’s another term you will hear quite often, for good reason.
A separate Microsoft Research and Carnegie Mellon study presented at CHI 2025 surveyed three hundred and nineteen knowledge workers across nine hundred and thirty-six real-world AI use cases and found that the more users trusted AI, the less they critically engaged with its output. And in a 2025 paper in the journal Societies, Michael Gerlich found that frequent AI use correlated with weaker critical thinking scores, particularly among younger users, mediated by what cognitive scientists call cognitive offloading.
I have experienced this personally. In my business I waded through AI-slop pieces: blog posts, marketing materials, and internal “evaluations” whose hallucinations, exaggerations, and prompt-pleasing content cost me hours of correction and editing.
I have also read much from accomplished writers who, for example, look for signals that AI has helped with content: stray hashes, stock phrases, certain logical approaches. Maybe you can find your favorite signals here! This essay itself was produced in “partnership” with AI; AI gathered the research in minutes. In similar fashion, most of my work in business was produced in partnership with colleagues who drafted or edited content with me.
So no, I am not arguing against the use of AI. I learned a ton and honed my views by critically reading the “AI slop.” Included here are my findings about this partnership — how we use AI, so far. It will change.
In every one of the studies referenced here, the users who retained — and even sharpened — their cognitive engagement were those who treated AI as an object of scrutiny rather than a source of answers. The question for educators is not whether students will use AI. They will. To go further, they should. The question is whether they will use it in a way that builds their minds or quietly borrows against them.
Part I · What critical thinking actually is in the age of AI
For most of the twentieth century, the cleanest working definition of critical thinking I could find came from the 1990 Delphi Report, an American Philosophical Association consensus project led by Peter Facione that polled forty-six scholars from philosophy, education, and the social sciences. They identified six core cognitive skills: interpretation, analysis, evaluation, inference, explanation, and self-regulation. A skilled critical thinker, in their framing, was someone who could make sense of information, break it into components, weigh its credibility, draw defensible conclusions, show their reasoning, and monitor their own thinking for error.
Every one of those six skills is still required in the age of AI. But each now operates on a new kind of surface — one that produces confident, articulate, well-formatted outputs at a speed and scale no individual human can match. That changes what it means to practice critical thinking in seven concrete ways, and those elements are what educators and curriculum designers should be teaching toward.
1. Framing and goal-setting
When AI systems can generate plausible answers to almost any question, the quality of thinking is largely determined before the question is asked. What is the actual problem? What input or context is important to the topic? What would count as a good answer? What am I trying to decide, understand, or create? A student who can frame a question well will extract vastly more value from an AI partner than one who cannot — and framing is not something the AI can do for you, because it depends on your judgment about what matters. There is a whole new industry these days around “prompting,” which is very much about framing and goal-setting. So does the student copy someone else’s prompts and run with them, or edit and develop their own?
2. Source and output evaluation
Students have been taught, for a generation, to evaluate sources: Is this a reputable publisher? Does this author have expertise? Is this study peer-reviewed? A language model is a new kind of source — it often cites no author, has no expertise in the traditional sense, and produces its outputs by probabilistic pattern completion over a training set. The appropriate epistemic stance is therefore different: treat AI output as a hypothesis to be tested, not a conclusion to be accepted.
3. Evidence verification
This is where most cognitive debt accumulates, because it is the place where effort is most easily skipped. An AI can produce a quote, a statistic, a citation, or a historical claim in seconds. The critical thinker asks: Is this real? Where does it come from? When a student receives a confident-sounding reference from an AI, the habit that needs to be built is the reflexive follow-through — click the link, find the source, check the context. This is the muscle that atrophies fastest when it is not exercised.
4. Assumption auditing
Every AI response is built on unstated assumptions about the user’s intent, the framing of the question, the relevance of certain facts, and the irrelevance of others. A critical thinker learns to reverse-engineer those assumptions and ask whether they hold. What would change if one of them were wrong?
5. Alternative-perspective seeking
AI systems trained on the consensus of human text tend to converge on a modal answer — a plausible, middle-of-the-road response that does not offend. Practicing critical thinking means deliberately probing for dissent. What is the strongest argument against this conclusion? Who disagrees, and why? What would a thoughtful minority view look like? The AI is an excellent partner for this, but only if the user knows to ask. Honestly, it is difficult for me to spend any time on this, unless I just flat-out suspect a lie or a gross exaggeration.
6. Metacognitive awareness
Metacognition is thinking about your own thinking. In an AI partnership, it is the practice of noticing when you are accepting an answer because it is correct versus because it is comfortable, confident-sounding, or convenient. Helen Lee Bouygues, founder of the Paris-based Reboot Foundation, describes critical thinking as “like speaking a second language” — a skill that has to be deliberately practiced, and one that feels uncomfortable because challenging your own biases is genuinely uncomfortable. Metacognition is what makes that discomfort productive. Very few people these days, in my experience, are good at this — at least when it comes to certain social and political topics.
7. Intellectual humility and calibrated trust
The final element is a disposition more than a skill: a willingness to say I don’t know, to update when evidence warrants, and to hold the AI’s output and your own intuitions to the same standard. The Microsoft and Carnegie Mellon research found that users with higher confidence in AI and lower self-confidence offloaded the most cognitive work. The inverse pattern — calibrated trust in yourself, calibrated skepticism toward the tool — is the posture that preserves human judgment.
These seven elements are not a sequential pipeline. They are a repertoire, practiced together, each informing the others. A student who can frame a question well will evaluate the answer more carefully. A student who audits assumptions will notice when verification is needed. A student with metacognitive awareness will know when to slow down. Taken together, they describe what it means to be the human in the loop — a phrase Wharton professor Ethan Mollick uses in his book Co-Intelligence to describe the posture educators and workers should adopt with AI: present, critical, and in charge of the judgment calls.
Part II · How critical thinking is learned
Critical thinking is not a trait. It is not a gift some people are born with and others are not. It is a learned capacity, and it is learned the way second languages and musical instruments are learned — through deliberate, repeated, often uncomfortable practice, under the guidance of someone further along the path.
There are three things we know about how it develops. First, it is domain-sensitive. A person who reasons carefully about baseball statistics does not automatically reason carefully about epidemiology. Cognitive scientist Daniel Willingham has argued persuasively that critical thinking is not a free-floating skill that can be bolted onto any subject matter — it is always thinking about something, and the something matters. This is why “teaching critical thinking” as a standalone elective usually fails. It has to be taught inside subjects, with the domain knowledge that gives the thinking something to grip.
Second, it is developed through effortful retrieval and friction. The psychological literature on learning is clear that difficulty, within a supportive structure, is what produces durable skill. The MIT cognitive debt study is a striking demonstration of what happens when the friction is removed: when the AI does the hard work, the brain does not. The implication is not that AI should be withheld, but that the friction has to be put back somewhere — in the framing, in the verification, in the defense of conclusions, in the oral examination.
Third, it is modeled. Students learn to think critically by watching adults they respect do it out loud, and then imitating. A teacher who pauses mid-lesson to say “I’m not sure that’s right — let me check my source” is teaching critical thinking. A parent who challenges a headline at the dinner table and then updates their view when corrected is teaching critical thinking. A professor who changes her mind in the middle of a seminar because a student made a strong argument is teaching critical thinking. The demonstration is often more powerful than the instruction.
Critical thinking is not something you’re born with. You are taught, and you need to exercise it. It’s like speaking a second language. — Helen Lee Bouygues, Reboot Foundation
Part III · How to practice critical thinking in AI interactions
This is the part of the conversation that is usually missing. We tell students to “think critically about AI” without giving them a concrete repertoire of things to do. What follows is a practical set of practices — each one grounded in the seven elements above — that can be taught, assigned, and graded. They work for a high-school junior working on a research paper, a nursing student reviewing a case, or a product manager using AI to scope a launch. They also form a natural scaffold for curriculum design.
1. Form your view first.
Before asking the AI anything, write down your own initial take on the question in two or three sentences. Then ask the AI. Then compare. This two-pass method is the single most protective habit against cognitive debt. It forces your own cognition to show up before the AI’s does. Teachers can build this directly into assignments.
2. Ask the AI to argue the opposite.
After receiving an answer, prompt the AI to give the strongest case against its own conclusion. A good prompt is: “What are the three most serious objections to the position you just took, and who would make them?” This is the single best use of an AI partner for critical thinking, and it is the closest thing to having a Socratic opponent on demand.
3. Require citations and verify them.
Never accept a factual claim from an AI without asking for a source, and never accept a source without checking it. This is non-negotiable. AI systems still hallucinate citations, and a meaningful percentage of the links they produce do not go where they say they go. Verification should be treated as a graded part of the assignment, not a courtesy step.
4. Red-team your own prompt.
Ask yourself: How am I framing this question, and what am I assuming? Then deliberately reframe it and ask again. If an AI gives you a different answer when you phrase the question differently, that is important information about the stability of the conclusion.
5. Use the AI as a tutor, not an answer machine.
Instead of asking the AI to produce the essay, ask it to help you think through the essay. Good prompts: “Ask me five questions that will sharpen my argument.” “Where is my reasoning weakest?” “What would I need to know to have a stronger view?” The tutor posture is where most of the learning happens.
6. Interrogate the confident-sounding answer.
AI output is uniformly confident. That confidence is a stylistic artifact, not evidence of correctness. Train yourself, and your students, to pause longer on smooth, declarative outputs than on hedged ones. The sentences that are hardest to question are the ones most worth questioning.
7. Name your own uncertainty.
After working with an AI, articulate what you still don’t know, what you’re not sure about, and what you would need to find out to be confident. This is the metacognitive finish that turns a transaction into learning.
8. Defend your conclusion to a human.
The final test is oral. If you cannot explain, out loud, to a thoughtful person, why you believe what you believe — citing the evidence, addressing the objections, and acknowledging the uncertainty — then the thinking has not been done, no matter how polished the output looks. This is the single most powerful assessment technique available to educators, and it is robust against every AI tool that exists.
Part IV · How to teach these skills
The practices above can be taught at every educational level, but the scaffolding has to be age-appropriate. The broader the adoption, the more this becomes a curriculum question rather than a classroom one — and the more it becomes a matter for district leaders, administrators, textbook publishers, accreditors, and philanthropic funders.
Elementary grades: unplugged first, then guided
Young children should learn the underlying habits of mind — asking why, checking, comparing, noticing when something sounds off — before encountering generative AI at all. Classic media-literacy exercises, Socratic conversation in circle time, simple source-checking games, and even the deliberate introduction of incorrect information for children to catch, all build the foundational instincts. When AI is introduced in upper elementary grades, it should be used visibly and briefly as a class activity, with the teacher modeling verification aloud: “It said this. Let’s see if that’s right.” In my own experience, hearing “that doesn’t sound right” about one’s own stated views is hard enough for a young child to bear without AI being the source of the challenge.
Middle and secondary grades: partnered practice
It has been a long time since I have been a teacher. I remember the many challenges. But this is where the practices in Part III become a teachable curriculum. Middle and high school students can handle the two-pass method, prompt red-teaming, citation verification, and oral defense. Assessments should increasingly measure the thinking, not the artifact. A research paper that a student cannot defend in a five-minute oral exam is worth very little; a research paper that a student can defend — including the parts where AI helped — is worth a great deal. Oral examinations, process portfolios, in-class timed work, and collaborative defenses are the assessment forms that remain meaningful in an AI era.
Higher education: domain-embedded critical thinking
Universities should stop treating critical thinking as a general-education requirement satisfied by a single course. Willingham’s point about domain-sensitivity is decisive here: critical thinking in medicine is not the same as critical thinking in law, history, or engineering, and the transfer between them is weaker than most curriculum committees assume. Every major should identify the specific forms of reasoning its field requires, map them onto the seven elements and eight practices, and build them into courses explicitly. AI ethics and AI literacy should be universal requirements, not reserved for computer science students — because future nurses, lawyers, managers, engineers, mechanics, marketers, and policymakers will all be managing AI-enabled systems within a few years of graduation.
Workforce and continuing education
Adults already in the workforce are the least supported population on this dimension, and they are the ones currently making the biggest decisions about how AI gets used. Employer-sponsored AI literacy programs — short, practical, grounded in the employee’s actual daily work — are one of the highest-leverage investments a company or a philanthropic funder can make. The best of these programs do not teach people how to use AI tools; they teach people how to think while using AI tools, which is a genuinely different curriculum. I have wanted to make critical editing of AI-generated content a significant part of personnel evaluations.
Teachers and faculty
None of this works without sustained investment in the people doing the teaching. A one-day workshop is not a professional development program. Educators need time, collegial structure, and support to rebuild their own assessment practices, redesign their assignments, and become fluent themselves with the tools their students are already using. The research is clear that teachers who use AI thoughtfully in their own work are the ones who can teach students to do the same. But the tools for doing so, at the moment, may be primarily do-it-yourself.
Part V · A call to educators, philanthropists, and curriculum leaders
The agenda that follows from this framework is specific and fundable. It is not abstract.
First, fund pedagogy, not just tools. Most of the philanthropic and public money flowing into AI and education is going into software. Far less is going into the harder, slower work of redesigning what teachers teach and how learning is assessed. The ratio could be reversed. The classrooms that will produce capable critical thinkers are the ones where the curriculum, the assessments, and the teacher training have all been rebuilt around the seven elements — not the ones with the shiniest AI dashboards, but the ones that track student achievement, as described next.
Second, move “back” to the development and adoption of assessments that AI cannot pass for the student. Oral exams, in-class defenses, process portfolios, live problem-solving, and collaborative work are all robust against generative AI in a way that take-home essays and multiple-choice tests are not. These assessment forms exist; they are well-validated; they are underused because they are expensive in teacher time. That expense is exactly the kind of cost philanthropy can help underwrite.
Third, invest in teacher capacity at a scale that matches the stakes. Every credible analysis of what schools need most right now names the same bottleneck: teachers have been given a generational technological shift and very little support in adapting to it. Multi-year fellowships, paid curriculum-redesign time, peer learning communities, and fluency training — not one-day workshops — are what will move the needle. Funding for educating educators and for reducing class size is notoriously last to arrive and first to go.
Fourth, build a shared research-to-practice infrastructure. The academic literature on critical thinking and AI is exploding — a 2025 bibliometric review in a Scopus-indexed journal identified more than six hundred articles on the topic published since ChatGPT’s release — but almost none of it is reaching classroom teachers in a usable form. Translation work of the kind that organizations like the Reboot Foundation, the AI Education Project, and Stanford’s HAI do needs to be funded at the level of the problem.
Fifth, write the policies now, not after the next news cycle. Districts, universities, and education ministries that wait for a crisis will end up with reactive, restrictive policies that either ban tools students are already using or waive oversight altogether. The institutions that do this well are the ones that are defining, publicly and early, what human judgment is required for and what AI is invited to do — and building the training, assessment, and accountability structures to match.
A closing note
Every technological shift in history has rearranged what human beings need to be able to do. Printing did it. Calculators did it. Search engines did it. I participated in promoting computers and software in classrooms around the world. In each case, the transition was disorienting, the predictions were wrong in both directions, and the skills that ultimately mattered were not always predictable. What is different this time is the speed, and the specific function AI is taking over: not computation, not memory, not navigation, not typing and editing, but reasoning itself, or a performance of it convincing enough to pass for reasoning most of the time.
Critical thinking has always been the premium human capacity. In the age of AI, it stops being a premium and becomes the floor. The students and citizens who can genuinely think — who can frame a question, weigh a claim, audit an assumption, imagine a dissent, notice their own uncertainty, and defend a conclusion — will flourish. The ones who cannot will be carried along by whatever the most confident-sounding output happens to say.
The choice between those two futures will be made, in large part, by the people who find this essay and others like it, because they are in the middle of it now. Curriculum leaders, philanthropists, school and university administrators, and policymakers have more agency in this moment than they often feel they have. Every dollar spent on teaching people to think with AI rather than defer to it, every assessment redesigned to measure the thinking rather than the artifact, every teacher supported through the transition, moves us toward the better version of this future.
That is the work. It is the most important education project of this decade, a very hard one — but it is one we know how to do.
Further reading and sources
Facione, P. A. (1990). Critical Thinking: A Statement of Expert Consensus for Purposes of Educational Assessment and Instruction (The Delphi Report). American Philosophical Association. The canonical definition of critical thinking’s six core skills.
Mollick, E. (2024). Co-Intelligence: Living and Working with AI. Penguin Random House. Mollick’s four principles — invite AI to the table, be the human in the loop, treat AI like a person but tell it what kind, and assume this is the worst AI you will ever use — are the most useful short framework in circulation, and his “One Useful Thing” Substack is the single best ongoing source on practical AI use in education.
Kosmyna, N., Hauptmann, E., Yuan, Y. T., Situ, J., Liao, X.-H., Beresnitzky, A. V., Braunstein, I., and Maes, P. (2025). Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task. MIT Media Lab. The EEG study that made cognitive debt legible.
Microsoft Research and Carnegie Mellon University (2025). The Impact of Generative AI on Critical Thinking: Self-Reported Reductions in Cognitive Effort and Confidence Effects from a Survey of Knowledge Workers. Presented at CHI 2025. The workplace counterpart to the MIT study.
Gerlich, M. (2025). AI Tools in Society: Impacts on Cognitive Offloading and the Future of Critical Thinking. Societies. The first large empirical study linking AI tool frequency to weakened critical thinking scores.
Reboot Foundation (Paris). How to Teach Critical Thinking. A practical guide developed with state and district Teachers of the Year, available at reboot-foundation.org. Helen Lee Bouygues’s regular Forbes column on critical thinking and misinformation is the most accessible ongoing commentary in this space.
Willingham, D. T. (2007). Critical Thinking: Why Is It So Hard to Teach? American Educator. The foundational case for domain-sensitivity in critical thinking instruction.
Dell’Acqua, F., McFowland, E., Mollick, E., et al. (2023). Navigating the Jagged Technological Frontier: Field Experimental Evidence of the Effects of AI on Knowledge Worker Productivity and Quality. Harvard Business School working paper. The most-cited field study on how AI changes professional work — including which tasks become easier and which become harder to do well.
OECD (2023 and ongoing). AI Literacy and the Future of Education working papers. The most comprehensive international policy framework for AI in education currently available.