Monday, 18 January 2016

An investigation of the evidence John Hattie presents in Visible Learning

At the 2005 ACER conference (p5) Hattie said,
'We must contest the evidence – as that is the basis of a common understanding of progression.'
Then in Visible Learning [VL] he quotes Karl Popper (p4)
'Those amongst us unwilling to expose their ideas to the hazard of refutation do not take part in the scientific game.'
Tom Bennett, the founder of researchED, wrote an influential paper, The School Research Lead, in which he states (p9-10),
'There exists a good deal of poor, misleading or simply deceptive research in the ecosystem of school debate...
Where research contradicts the prevailing experiential wisdom of the practitioner, that needs to be accounted for, to the detriment of neither but for the ultimate benefit of the student or educator.'
Prof Adrian Simpson's detailed analysis of the calculation of effect sizes, The misdirection of public policy: comparing and combining standardised effect sizes states (p451), 
'The numerical summaries used to develop the toolkit (or the alternative ‘barometer of influences’: Hattie 2009) are not a measure of educational impact because larger numbers produced from this process are not indicative of larger educational impact. Instead, areas which rank highly in Marzano (1998), Hattie (2009) and Higgins et al. (2013) are those in which researchers can design more sensitive experiments. 
As such, using these ranked meta-meta-analyses to drive educational policy is misguided.'
Prof Dylan Wiliam writes in 'Getting educational research right', 
'Those ... who focus on ensuring that practice is based on ‘what works’, will find that no educational initiative can be implemented in the same way in every school. Adjustments need to be made, but they need to be made by people who understand the research so that the initiatives do not suffer what Stanford education professor Ed Haertel called “lethal mutations”. Teachers, leaders and policymakers all need to be critical consumers of research.'

The Aim of this Blog:

is to be a critical consumer of research and contest the evidence that Hattie presents in his 2009 book Visible Learning [VL] by using independent peer reviews and by analysing the studies that Hattie used.

The blog is broken up into different pages (menu on the right), designed so you can easily go to what interests you most.

Firstly, a critique of Hattie's methodology: Effect Size, Student Achievement, CLE and other errors, A Year's Progress, and Validity/Reliability.

Then an analysis of particular influences. I recommend starting with what was his highest-ranked influence, 'Self-Report Grades', and then looking at the controversial 'Class Size'.

In his interview with Hanne Knudsen (2017) John Hattie: I’m a statistician, I’m not a theoretician Hattie states,
'What I find fascinating is that since I first published this back in the 1990s, no one has come up with a better explanation for the data... 
I am updating the meta-analysis all the time; I am up to 1400 now. I do that because I want to be the first to discover the error, the mistake' (p7). 
I find these comments hard to reconcile since, as you will see, many scholars have published peer reviews identifying significant problems in Hattie's work and calling his entire model into question.

I also recommend teachers look at the section A Year's Progress? It analyses what I think is Hattie's most dangerous idea: that an effect size of 0.4 = 1 year's student progress.

Contributions are welcome. Many of the controversial influences only have 1-3 meta-analyses to read. I can provide you copies of most of the research used.


The peer reviews have documented significant issues with Hattie's work, ranging from flawed methodology, calculation errors, and misrepresentation to questionable inference and interpretation.

Simpson (2017) and Bergeron (2017) detail methodological errors showing the effect size for an experiment can differ enormously (0 to infinity) depending on how it is calculated. So comparing effect sizes across different studies is meaningless!
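A toy sketch of Simpson's point, using entirely made-up numbers: the same raw difference between groups produces wildly different effect sizes depending on the standardiser used, for example when a study restricts its sample and so shrinks the standard deviation.

```python
# Hypothetical illustration: identical raw gain, very different "effect sizes".
# Cohen's d = (mean difference) / (standard deviation used as the yardstick).
mean_difference = 5.0  # points gained by the treatment group over the control

# A broad, heterogeneous sample has a large spread...
broad_sd = 15.0
# ...while a restricted, homogeneous sample (a more "sensitive" design) has a small one.
narrow_sd = 2.5

d_broad = mean_difference / broad_sd    # ~0.33 -- looks modest
d_narrow = mean_difference / narrow_sd  # 2.00 -- looks enormous

print(round(d_broad, 2), round(d_narrow, 2))
```

Nothing about the teaching changed between the two calculations; only the yardstick did, which is why Simpson argues that rankings built from such numbers reward sensitive experimental designs rather than larger educational impact.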

Misrepresentation, calculation errors, questionable inference, and interpretation occur in a variety of ways. The most serious of these is Hattie's use of studies that do not measure what he claims they do. This occurs in 3 ways:

Firstly, many studies do not measure achievement but something else, e.g., IQ, hyperactivity, behavior, and engagement. See Student Achievement for more details.

Secondly, most studies do not compare groups of students that control for the particular influence that Hattie claims. There is a litany of examples, e.g., self-report grades, reducing disruptive behavior, welfare, diet, Teacher Training, Mentoring, etc.

Thirdly, Hattie used ONE average to represent each meta-analysis, yet each meta-analysis represented anywhere from 4 up to 4,000 studies (Marzano).

But, apart from the problem of giving equal weight to each average, the big question is: what does ONE average mean? (No pun intended.)

The clear example is Class Size:

Gene Glass and Mary Lee Smith (1979), authors of one of the three meta-analyses that Hattie uses for class size, summarise their data in a graph and table:

The trend, and the difference between good- and poor-quality research, are clearly displayed. Gene Glass and Mary Lee Smith conclude,
'A clear and strong relationship between class size and achievement has emerged... There is little doubt, that other things being equal, more is learned in smaller classes' (p15).

Hattie calculated one average from the above table (3rd column), d = 0.09, and used this to represent the whole meta-analysis (though the average of that column appears to be d = 0.25).

In his collaboration with Pearson (2015) - "What Works in Schools - the politics of distraction", he names class size as one of the major distractions.

In previous presentations, he consistently labelled class size a disaster or as going backwards (2005 ACER presentation).

Yet it is ironic that the author of the class size study, Professor Gene Glass, who also pioneered the meta-analysis methodology, co-wrote a book contradicting Hattie: '50 Myths and Lies That Threaten America's Public Schools: The Real Crisis in Education'.

In Myth #17, 'Class size does not matter; reducing class sizes will not result in more learning', Professor Glass says,
'Fiscal conservatives contend, in the face of overwhelming evidence to the contrary, that students learn as well in large classes as in small.'
I contacted Prof Glass to check that I had interpreted his study correctly; he kindly replied,
'Averaging class size reduction effects over a range of reductions makes no sense to me. 
It's the curve that counts. 
Reductions from 40 to 30 bring about negligible achievement effects. From 20 to 10 is a different story.  
But Teacher Workload and its relationship to class size is what counts in my book.'
Bergeron (2017) reiterates,
'Hattie computes averages that do not make any sense.'
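Glass's "it's the curve that counts" objection can be sketched with hypothetical numbers: averaging effect sizes taken from different points of a non-linear curve yields a single figure that describes none of the actual reductions.

```python
# Hypothetical class-size effects at different points on the curve
# (illustrative values only, echoing Glass's 40->30 vs 20->10 distinction).
effects = {
    "40 -> 30 students": 0.05,  # negligible, per Glass
    "30 -> 20 students": 0.20,
    "20 -> 10 students": 0.60,  # "a different story"
}

average = sum(effects.values()) / len(effects)
print(round(average, 2))  # one number, ~0.28, that matches none of the three scenarios
```

On these made-up figures, the single average both understates the large reduction and overstates the small one, which is exactly the objection to representing a whole meta-analysis with ONE number.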
The next major problem is moderating variables:

Prof Dylan Wiliam casts significant doubt on Hattie's entire model by arguing that the age of the students and the time over which each study runs is an important component contributing to the effect size. 

He summarises,
'the effect sizes proposed by Hattie are, at least in the context of schooling, just plain wrong. Anyone who thinks they can generate an effect size on student learning in secondary schools above 0.5 is talking nonsense.'
The massive dataset collected to construct the United States Department of Education effect size benchmarks supports Prof Wiliam's contention.

These benchmarks show a huge variation in effect sizes from younger to older students, which demonstrates that age is a HUGE moderating variable: to compare effect sizes, studies need to control for the age of the students and the time over which the study ran. Otherwise, differences in effect size may simply reflect the ages of the students measured!
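To sketch why this matters, here are hypothetical annual-growth figures shaped like the benchmark pattern (younger students gain far more per year in effect-size units): the same intervention, expressed as a fraction of a year's progress, yields very different effect sizes at different grades.

```python
# Hypothetical annual-growth effect sizes by grade (illustrative numbers only;
# the pattern -- large gains for young students, small for older -- is the point).
annual_growth_d = {1: 1.5, 5: 0.6, 10: 0.2}

# An intervention worth "half a year's progress" at each grade:
for grade, growth in annual_growth_d.items():
    print(f"grade {grade}: d = {0.5 * growth:.2f}")
```

On these numbers the identical intervention scores d = 0.75 in grade 1 but d = 0.10 in grade 10, so a fixed benchmark of 0.4 cannot mean "one year's progress" for students of all ages.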

Blatchford et al (2016, p96) state that Hattie's comparing of effect sizes, 
'is not really a fair test.'
Wecker et al (2016, p35) conclude,
'the methodological claims arising from Hattie's approach, and the overall appropriateness of this approach suggest a fairly clear conclusion: a large proportion of the findings are subject to reasonable doubt.'
Prof Pierre-Jérôme Bergeron writes,
'When taking the necessary in-depth look at Visible Learning with the eye of an expert, we find not a mighty castle but a fragile house of cards that quickly falls apart...

To believe Hattie is to have a blind spot in one’s critical thinking when assessing scientific rigour. To promote his work is to unfortunately fall into the promotion of pseudoscience. Finally, to persist in defending Hattie after becoming aware of the serious critique of his methodology constitutes willful blindness.'
Dr Neil Hooley, in his review of Hattie, discusses the complexity of classrooms and the difficulty of controlling variables:
'Under these circumstances, the measure of effect size is highly dubious' (p44).
Dr Mandy Lupton, on Problem-Based Learning, writes,
'The studies have different effect sizes for different contexts and different levels of schooling, thus averaging these into one metric is meaningless.'

Why has Hattie become so popular?

In his excellent analysis 'School Leadership and the cult of the guru: the neo-Taylorism of Hattie', Professor Scott Eacott says,
'Hattie’s work has provided school leaders with data that appeal to their administrative pursuits' (p3). 
'The uncritical acceptance of his work as the definitive word on what works in schooling, particularly by large professional associations such as ACEL, is highly problematic' (p11).

The Rise of the Policy Entrepreneur:

Science begins with skepticism; however, in the hierarchical leadership structures of educational institutions, skeptical teachers are not valued, although, ironically, the skeptical skills of questioning and analysis are valued in students. This paves the way for the many 'snake oil' remedies and the rise of policy entrepreneurs who 'shape and benefit from school reform discourses'.

Professor John O'Neill in analysing Hattie's influence on New Zealand Education Policy describes the process well:
'public policy discourse becomes problematic when the terms used are ambiguous, unclear or vague' (p1). 
[The] 'discourse seeks to portray the public sector as ‘ineffective, unresponsive, sloppy, risk-averse and innovation-resistant’ yet at the same time it promotes celebration of public sector ‘heroes’ of reform and new kinds of public sector ‘excellence’.
Relatedly, Mintrom (2000) has written persuasively in the American context, of the way in which ‘policy entrepreneurs’ position themselves politically to champion, shape and benefit from school reform discourses' (p2).
Hattie's recent public presentation in the TV documentary 'Revolution School' confirms Professor O'Neill's analysis. Dan Haesler reports Hattie's remedy cost the school around $60,000.

Professor Ewald Terhart (2011, p434) writes,
'A part of the criticism on Hattie condemns his close links to the New Zealand Government and is suspicious of his own economic interests in the spread of his assessment and training programme (asTTle).'

We need to move from evidence to QUALITY of evidence:

There must now be at least some hesitation in accepting Hattie's work as the definitive statement on Teaching.

Beng Huat See, in her paper, ‘Evaluating the evidence in evidence-based policy and practice: Examples from systematic reviews of literature', suggests the direction where educational research must now go,
'This paper evaluates the quality of evidence behind some well-known education programmes ... It shows that much of the evidence is weak, and fundamental flaws in research are not uncommon. This is a serious problem if teaching practices and important policy decisions are made based on such flawed evidence.

Lives may be damaged and opportunities missed.

... funders of research and research bodies need to insist on quality research and fund only those that meet the minimum quality criteria.'
The debate must now shift from Evidence to Quality of Evidence.

The US Dept of Education has done this, developing clearly defined quality criteria in its What Works Clearinghouse.

Most of the meta-analyses that Hattie used would NOT satisfy these quality criteria; see here.

A Teacher's Lament:

Gabbie Stroud resigned from her teaching position and wrote:
'Teaching – good teaching - is both a science and an art. Yet in Australia today [it]… is considered something purely technical and methodical that can be rationalised and weighed.

But quality teaching isn't borne of tiered 'professional standards'. It cannot be reduced to a formula or discrete parts. It cannot be compartmentalised into boxes and 'checked off'. Good teaching comes from professionals who are valued. It comes from teachers who know their students, who build relationships, who meet learners at their point of need and who recognise that there's nothing standard about the journey of learning. We cannot forget the art of teaching – without it, schools become factories, students become products and teachers: nothing more than machinery.'
Whilst it may be simpler and easier to see teaching as a set of discrete influences, the evidence shows that these influences interact in ways which no one, as yet, can quantify. It is the combining of influences in a complex way that defines the 'art' of teaching.