Educational psychology, like all sciences, depends on its ability to learn from its own missteps. Few theorists have embraced this more openly and fully than John Sweller, Emeritus Professor of Educational Psychology at the University of New South Wales in Australia. In his 2023 article, The Development of Cognitive Load Theory: Replication Crises and Incorporation of Other Theories Can Lead to Theory Expansion, Sweller does something rare: he embraces the very failures he encountered with cognitive load theory, failures which would, in other contexts, cast doubt on a theory’s validity. Instead, he shows how they have served as engines of refinement and sources of conceptual depth.
What emerges is not just a stronger Cognitive Load Theory (CLT), but a masterclass in how educational theory should evolve: through epistemic iteration, the integration of insights from others, and relentless attention to its foundation, human cognitive architecture.
From Replication Crisis to Cognitive Precision
The replication crisis in psychology has prompted much hand-wringing amongst scientists and research institutes. Failed attempts to reproduce findings have exposed weaknesses in research methodology, shotgun empiricism, overreliance on p-values and p-hacking, post-hoc explanations, and the allure of “novel” effects. Yet Sweller argues that many replication failures are not methodological errors but rather conceptual ones.
CLT, he contends, has matured precisely because it failed to replicate in different situations. A worked example effect that held for algebra disappeared in geometry and physics. Modality effects reversed under certain conditions. Split-attention effects proved more elusive than expected. Each failure, instead of invalidating the theory, pointed to new variables that needed to be accounted for. The result was a theory that didn’t collapse under contradiction—but expanded to accommodate it.
Where others saw crisis, Sweller saw a question: What variable are we missing? And more often than not, the answer was hidden in the architecture of human cognition.
Cognitive Load: More Than Just a Limitation
At its core, CLT rests on one deceptively simple premise: human working memory is limited in both capacity and duration when dealing with novel information, but effectively unlimited when dealing with familiar information retrieved from long-term memory. But rather than treating this as a constraint to be overcome, Sweller treats it as the bedrock upon which effective instruction must be built. In its latest version, CLT distinguishes between:
Intrinsic cognitive load – determined by the complexity of the material or, more precisely, by element interactivity: the number of elements in a learning task and how they interact with each other.
Extraneous cognitive load – imposed by the instructional design; poor instructional design adds much extraneous load by increasing element interactivity, while good instructional design adds little.
Germane load – once thought to be a third kind of cognitive load, it is now understood as simply the working memory effort allocated to managing intrinsic load.
These distinctions matter. A lesson or learning task that overwhelms working memory may result not in failed performance, but in failed learning. CLT emphasises that success on a task may be misleading—a momentary performance rather than a lasting cognitive gain.
Element Interactivity and the Power of Expertise
One of CLT’s most powerful refinements has been the concept of element interactivity—the idea that some learning tasks require simultaneous attention to multiple, interacting elements. Crucially, what counts as a single element depends on the learner’s prior knowledge. For a novice, an algebraic equation may consist of many interrelated parts. For an expert, it’s a single chunk transferred from long-term to working memory.
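To make this concrete, here is a small worked illustration in the spirit of the classic CLT algebra example (the equation and the element counts are my own, not taken from Sweller’s article). Ask a learner to solve (a + b)/c = d for a. The solution takes two moves: multiply both sides by c to get a + b = dc, then subtract b from both sides to get a = dc − b. For a novice, every symbol, every operation, and the order in which the operations must be applied are separate elements that interact and must all be held in working memory at once, so element interactivity, and with it intrinsic load, is high. For someone who has automated this procedure, the whole transformation is a single schema retrieved from long-term memory, roughly one element, and the same task imposes almost no load.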
This insight explains why instructional strategies like worked examples are so effective for beginners—but may backfire for experts, who find them redundant or even constraining. This expertise reversal effect shows that CLT is not a fixed formula but a dynamic framework: what works today may fail tomorrow, depending on how much the learner has already internalised.
Such findings reveal an uncomfortable truth: the variability of instructional effectiveness is not noise to be filtered out—it is the very signal that tells us how learning works.
The Paradox of Struggle: When Learning Feels Harder, It Often Works Better
CLT also sheds light on a counterintuitive finding: conditions that induce more cognitive effort, and even more errors, during instruction often produce better long-term retention. This is closely aligned with Robert and Elizabeth Bjork’s concept of desirable difficulties, but CLT grounds it in the mechanics of memory: effortful processing leads to stronger encoding. A note here: desirable difficulties work fine for low element interactivity information but are disastrous for high element interactivity information. Making problem solving harder for someone who is already struggling to solve a high element interactivity problem doesn’t work!
What looks like failure (struggle, slowness, mistake-making) may be exactly what our cognitive architecture needs to build robust schemas. Sweller’s own early studies revealed that learners could solve problems using the correct rules (which they often reached through trial and error) without ever learning those rules, if the problem-solving process overloaded their working memory. This is an example of what Sweller calls an UNdesirable difficulty. The easier way is to show the problem solver the solution and how to get there. Ironically, asking students to solve problems can interfere with their ability to learn from them, or, as Sweller, Richard Clark, and I wrote, solving problems isn’t the best way to learn to solve problems.
Hence the importance of explicit instruction and worked examples: they reduce unnecessary load, making room for genuine learning to occur. But again, context matters. As expertise increases, the balance tips, and more open-ended challenges may be more appropriate.
But it wasn’t only failures that led to the expansion and strengthening of CLT. Sweller also ‘borrowed’ from other theories along the way.
A Biological Theory of Education
Perhaps the most profound expansion of CLT in recent years comes not from educational psychology, but from evolutionary theory. Sweller draws on David Geary’s distinction between:
Biologically primary knowledge – which humans are evolutionarily wired to acquire (e.g., spoken language, facial recognition).
Biologically secondary knowledge – which must be explicitly taught (e.g., algebra, reading, scientific reasoning).
This distinction dissolves a longstanding myth in progressive education: that discovery is always the most “natural” and therefore the most effective form of learning. It might be for biologically primary knowledge. But for biologically secondary knowledge, the bread and butter of school curricula and, indeed, their raison d’être, it often fails.
We evolved to speak without instruction. It was important for our survival as a species; without this ability, a child would have died before being able to pass their genes on to the next generation. Those who could better communicate with their parents and other members of their group had a greater chance of surviving and of passing on their genes to future generations. We did not evolve to be able to read, let alone derive the Pythagorean theorem or learn to program in Python. These skills have no function in our survival as a species (more crudely stated, you don’t need to be literate to procreate; see our world today!). The acquisition of these skills requires instructional scaffolding that respects the limits of working memory and leverages the power of long-term memory.
This evolutionary framing helps CLT reject the false dichotomy between “natural” learning and formal education. It shows that schools exist because biologically secondary knowledge doesn’t come for free. It must be built—and that building requires careful instructional design.
Instruction as an Extension of Nature
Sweller takes this biological analogy further. He draws a striking parallel between evolution by natural selection and human learning:
In evolution, random variation and natural selection gradually shape adaptive traits.
In human cognition, problem solving and direct instruction gradually build usable knowledge.
Just as genetic information is transmitted biologically, cultural and instructional information is transmitted socially.
Both are information processing systems. Both depend on the storage, transfer, and refinement of information. And both generate complexity not from planning, but from iteration. This analogy lends CLT a kind of ontological weight: it’s not just a theory about how we learn—it’s a theory about how intelligent systems evolve.
A Model for Theory Development
Sweller’s retrospective is, at its heart, a call to the field: educational psychology must become comfortable with theoretical failure. Not in the sense of embracing chaos or rejecting rigor—but in recognising that every failed prediction is a potential refinement waiting to be discovered.
CLT has grown not through grand synthesis, but through recursive repair. Failed replications didn’t undermine it; they honed it. Competing theories weren’t rejected out of hand; they were absorbed when compatible (e.g., Geary’s evolutionary lens, or insights from memory science). Sweller’s work exemplifies epistemic humility: the willingness to revise, refine, and reframe in the face of contradictory data. And this process should not be the exception—it should be the expectation.
Beyond Theory—Towards Wisdom
What makes Sweller’s 2023 article so remarkable is not just the theoretical advances it chronicles, but the intellectual posture it models. It treats failure as an opportunity, integration as a necessity, and cognitive architecture as the ultimate constraint and enabler of learning.
For educators, the implications are clear: effective teaching is not about showmanship or trend-chasing. It is about aligning instruction with the reality of how our brains work. And that reality is both fragile and powerful. Working memory is tiny. Long-term memory is vast. The bridge between them is instruction—deliberate, well-designed, and sensitive to the learner’s current state.
In the end, Cognitive Load Theory is not merely a set of instructional heuristics or principles. It’s a philosophy of instruction grounded in the biology of thought. And in a world awash with ephemeral fads, that is something worth cherishing.
Sweller, J. (2023). The development of cognitive load theory: Replication crises and incorporation of other theories can lead to theory expansion. Educational Psychology Review, 35, Article 95. https://doi.org/10.1007/s10648-023-09817-2
I appreciate this perspective on theory evolution, and most importantly the reason why: a strict focus on “best practices” that then get applied (or mandated) for broad segments of students is often a misapplication. That doesn’t call for rejection but for revision: more attempts, and not strict replications but variations based on the same theoretical principle.
The incredible number of variables in education is what makes hard science in education so hard, and why many studies that attempt to include those variables end up rejected for sloppiness. To produce “robust” and replicable research in education, we have to control for (or ignore) so many applicable variables in order to isolate the impact of a particular instructional approach. Then, when a replicable finding gets rolled out and applied as a strategy for learning, there are inconsistencies with the original “data”, as the replication attempts encounter the variables that had been controlled for or left unaccounted for.
But if we don’t judiciously attempt instructional moves, based on promising research, that DO produce results in certain contexts under certain conditions, we risk missing a better revision that could account for that exposed variance or gap.
Totally agree that failure is the way we keep improving teaching and learning, but it also means helping school-level instructional folks see themselves as researchers of their own practice, providing a feedback loop to researchers interested in developing theories flexible enough to make better learning happen. A teacher-as-researcher stance also threatens systems of compliance, so those developing theories that add value and those testing them out in the wild (if they are even able to) are too far apart for better data to inform either.
Sweller’s response to theoretical failure sounds very much like the scientific method: observe - analyse - theorise - test - observe - analyse - adjust theory - test …
Or am I missing something?