Class Size

Effect Size d = 0.21 (Hattie's Rank = 132)

Hattie uses the following three meta-analyses:

Authors | Year | No. studies | Students | Mean (d) | CLE | Variable
Glass & Smith | 1979 | 77 | 520,899 | 0.09 | 6% | Class size
McGiverin et al | 1999 | 10 | | 0.34 | 24% | Class size
Goldstein, Yang, Omar, & Thompson | 2000 | 9 | 29,440 | 0.20 | 14% | Class size
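As a rough check on where the headline figure comes from, the three mean effect sizes listed above can be averaged directly. The sketch below assumes a simple unweighted mean of the three means, which happens to reproduce d = 0.21; weighting by the number of studies is shown only as an illustrative alternative and is not something Hattie reports doing.

```python
# Mean effect sizes (d) and study counts for the three meta-analyses
# listed in the table above.
meta_analyses = {
    "Glass & Smith (1979)": {"d": 0.09, "studies": 77},
    "McGiverin et al": {"d": 0.34, "studies": 10},
    "Goldstein et al (2000)": {"d": 0.20, "studies": 9},
}


def unweighted_mean(rows):
    """Simple mean of the meta-analysis means (each counts equally)."""
    values = [r["d"] for r in rows.values()]
    return sum(values) / len(values)


def study_weighted_mean(rows):
    """Illustrative alternative: weight each mean by its number of studies."""
    total_studies = sum(r["studies"] for r in rows.values())
    return sum(r["d"] * r["studies"] for r in rows.values()) / total_studies


print(round(unweighted_mean(meta_analyses), 2))      # 0.21
print(round(study_weighted_mean(meta_analyses), 2))  # 0.13
```

The gap between the two results shows how much the single summary number depends on the averaging choice, before any question about what the underlying studies actually compared.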

Does Hattie Misrepresent the 3 studies?


1. Glass and Smith (1979) make a range of comparisons, from class sizes of 40 versus 30 to classes of 1 versus 40. Hattie calculates an average by combining all class size reductions to get a low value of d = 0.09. Given that the class size reductions are totally different, the question must be asked: what does this average mean?

If you look at this meta-analysis in more detail a totally different picture emerges, which is not represented accurately by using this average (Hattie only uses the average).

For example, a summary table from page 11 shows more detail (note: the average effect size seems to be around 0.25, not 0.09 as published by Hattie; Emeritus Professor Ivan Snook (see below) also comments on this).




Then on page 15-



A key observation from the above graph is the difference between well and poorly controlled studies.

Glass and Smith conclude:
"A clear and strong relationship between class size and achievement has emerged... There is little doubt, that other things being equal, more is learned in smaller classes" (p15).

2. McGiverin et al (1989) state that the lack of experimental control and diverse definitions of large and small are among the reasons cited for inconsistent findings regarding class size (p49).

In addition, they are critical of the Glass (1979) study for not using pragmatic class sizes. As a result, their study focused on second-year students with properly controlled studies using experimental and control groups (although not randomly assigned). They decided a more pragmatic definition of a large class size is about 26 and a small class size is about 19 (p49).

They introduce a caveat by quoting Berger (1981). "Focusing on class size alone is like trying to determine the optimal amount of butter in a recipe without knowing the nature of the other ingredients" (p49).

Whilst they get a reasonably high d = 0.34, they advise caution in the interpretation of this result (p54). They also make special mention of the confounding variables: the Hawthorne effect, novelty, and self-fulfilling prophecy.

3. Goldstein, et al (2000) state their aim: "The present paper focuses more on the methodology of meta-analyses than on the substantive issues of class size per se." For a more detailed discussion on class size, they recommend looking at their previous papers (p400).

Summary of results from page 401:

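Study | Year | No. students | Effect size d | Experiment type | Small class | Normal class
1 | 1 | 251 | | non random | 15 | 30
 | 1 | 744 | 0.17 | | |
 | 3 | 656 | | | |
 | 3 | 602 | 0.02 | | |
2 | 4 | 256 | | ? | 16, 23 | 30, 37
 | 4 | 368 | 0.15 | | |
 | 4 | 450 | -0.07 | | |
 | 4 | 555 | -0.07 | | |
3 | 2 | 78 | | correlation | 15 | 30
 | 2 | 542 | 0.32 | | |
 | 3 | 156 | | | |
 | 3 | 555 | 0.27 | | |
 | 4 | 57 | | | |
 | 4 | 441 | 0.21 | | |
 | 5 | 43 | | | |
 | 5 | 413 | 0.44 | | |
 | 6 | 63 | | | |
 | 6 | 374 | 0.36 | | |
4 | 1 | 1127 | | ? | 15 | 25+
 | 2 | 516 | 0.31 | | |
5 | 2 | 57 | | | <15 | 25
 | 2 | 55 | 0.39 | | |
6 | 1 | 368 | | ? | <20 | 30+
 | 1 | 646 | -0.13 | | |
7 | 1 | 371 | | | 20 | 27
 | 1 | 350 | 0.39 | | |
 | 2 | 309 | | | |
 | 2 | 313 | 0.23 | | |
8 | 9 | 2819 | | ? | 20 | 30
 | 9 | 2543 | 0.6 | | |
9 STAR | 1 | 2644 | | randomised | 13-17 | 22-25
 | 1 | 1414 | 0.24 | | |
 | 2 | 3112 | | | |
 | 2 | 1482 | 0.06 | | |
 | 3 | 3353 | | | |
 | 3 | 1357 | 0.03 | | |
ave | | | 0.21 | | |
weighted ave | | | 0.23 | | |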
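The two summary figures at the foot of the table can be checked with a few lines of arithmetic. This is a minimal sketch that reads each effect size together with the student count printed on the same row. Using that count as the weight is an assumption: the source does not say how the weighted average was obtained, although this choice does reproduce 0.23.

```python
# (students, effect size d) for each row of the summary table that carries
# an effect size. Treating the row's student count as the weight is an
# assumption made for this sketch.
rows = [
    (744, 0.17), (602, 0.02),                      # study 1
    (368, 0.15), (450, -0.07), (555, -0.07),       # study 2
    (542, 0.32), (555, 0.27), (441, 0.21),         # study 3
    (413, 0.44), (374, 0.36),                      # study 3 (continued)
    (516, 0.31),                                   # study 4
    (55, 0.39),                                    # study 5
    (646, -0.13),                                  # study 6
    (350, 0.39), (313, 0.23),                      # study 7
    (2543, 0.60),                                  # study 8
    (1414, 0.24), (1482, 0.06), (1357, 0.03),      # study 9 (STAR)
]


def simple_average(pairs):
    """Unweighted mean: every comparison counts equally."""
    return sum(d for _, d in pairs) / len(pairs)


def weighted_average(pairs):
    """Mean weighted by the student count on each effect-size row."""
    total_students = sum(n for n, _ in pairs)
    return sum(n * d for n, d in pairs) / total_students


print(round(simple_average(rows), 2))    # 0.21
print(round(weighted_average(rows), 2))  # 0.23
```

Both figures fold together comparisons ranging from d = -0.13 to d = 0.6, drawn from studies with different designs and different class size definitions.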


A comparison of the studies shows different definitions for small and normal classes, e.g. study 2 defines 23 as a small class whereas in study 9 it is a normal class. So comparing the effect size is not comparing the same thing!

The authors comment about another problem we have seen throughout VL, "we have the additional problem that different achievement tests were used in each study and this will generally introduce further, unknown, variation" (p403).

From Goldstein, et al (2003):

"A reduction in class size from 30 to 20 pupils resulted in an increase in attainment of approximately 0.35 standard deviations for the low attainers, 0.2 standard deviations for the middle attainers, and 0.15 standard deviations for the high attainers" (p17).





So once again, the detail of the study is lost when Hattie uses ONE average effect size d value to represent that study.
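To make the loss of detail concrete, here is a small sketch using the three subgroup effects quoted from Goldstein, et al (2003). Giving the three attainment groups equal weight is an assumption made purely for illustration; the source does not give the group sizes.

```python
# Effect of reducing class size from 30 to 20 pupils, in standard
# deviations, as quoted from Goldstein, et al (2003).
subgroup_effects = {
    "low attainers": 0.35,
    "middle attainers": 0.20,
    "high attainers": 0.15,
}

# Assumption for illustration only: the three groups are weighted equally.
pooled = sum(subgroup_effects.values()) / len(subgroup_effects)

# How far each group's effect sits from the single pooled figure.
gaps = {group: round(d - pooled, 2) for group, d in subgroup_effects.items()}

print(round(pooled, 2))  # 0.23
print(gaps)
```

Under this assumption the pooled figure is about 0.23, which understates the effect for low attainers by roughly 0.12 standard deviations and overstates it for the other two groups.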

Hattie's Interpretation:


In his recent collaboration with Pearson (2015), "What Doesn't Work in Education: The Politics of Distraction", he names class size as one of the major distractions. In previous presentations, he consistently labelled class size a "disaster" or as "going backwards" (Hattie's 2005 ACER presentation):




Yet, in what I think is the most comprehensive peer review of class size so far, Class Size: Eastern and Western Perspectives (2016), Hattie retreats from the above polemic and concedes,

"The evidence is reasonably convincing - reducing class size does enhance student achievement" (p113).

WHAT? Say that again Prof:


"The evidence is reasonably convincing - reducing class size does enhance student achievement" (p113).

Hattie then cleverly shifts the debate to "Why is the (positive) effect so small?" (p105).

However, in the same book, the editor, Prof Blatchford, refocuses the debate on scrutiny of the evidence:

"One reason for the prevalence of the unimportant view are several highly influential reports which have set in motion a set of messages that have generated a life of their own, separate from the research evidence, and have led to a set of taken for granted assumptions about class size effects.

Given the important influence these [Hattie & others] reports seem to be having in government and regional education policies, they need to be carefully scrutinised in order to be sure about the claims that are made" (p93).

Hattie's Interpretation Is Used by Politicians for Public Policy:


In 2015, the Australian Government used Hattie to block the significant funding proposed in the Gonski Review to redress the socio-economic imbalance in Australian schools.

Professor Blatchford comments about this,

"When Christopher Pyne [the then Australian Education Minister] talked about prioritising teacher quality, rather than reducing class sizes, he set up a false and simplistic dichotomy" (p16, AEU News).

A similar example comes from New Zealand, where Professor John O'Neill wrote a significant letter to the NZ Minister of Education on the problem of using Hattie's research for class size policy.

In further commentary about Hattie, Professor O'Neill states,

"Much of the terminology is ambiguous and inconsistently used by politicians, officials and academic advisers. The propositions are not demonstrably true – indeed, there is evidence to suggest they are false in crucial respects. The conclusion is, at best, uncertain because it does not take into account confounding evidence that larger classes do adversely affect teaching, learning and student achievement" (p2).

I am concerned about the unwavering confidence that Hattie displays when he talks about class size, given the caution and reservations expressed by the scholars behind each of his 3 studies, as well as by other reputable scholars around the world. These reservations stem from the lack of quality studies, the inability to control variables, the major differences in how achievement is measured, major confounding variables, and benchmark effect sizes.

The Largest Analysis and Peer Review of the Class Size Research (so far):


Class Size: Eastern and Western Perspectives (2016), edited by Prof Blatchford, et al. Note: Prof Blatchford has a website dedicated to class size research - http://www.classsizeresearch.org.uk

The editors state, "there are in fact relatively few high-quality dedicated studies of class size and this is odd and unfortunate given the public profile of the class size debate and the need for firm evidence based on purposefully designed research fit for purpose" (p275).

"What often gets overlooked in debates about class size is that CSR is not in itself an educational initiative like other interventions with which it is often (and in a sense unfairly) compared, for example, reciprocal teaching, teaching metacognitive strategies, direct instruction and repeated reading programmes; it is just a reduction of the number of pupils in a classroom" (p276).

Prof Blatchford warns again about correlation studies, "Essentially the problem is the familiar one of mistaking correlation for causality. We cannot conclude that a relationship between class size and academic performance means that one is causally related to the other" (p94).

The editors conclude, "the chapters in this book are only a start and much more research is needed on ways in which class size is related to other classroom processes. This has implications for research methods: we need more systematic studies, e.g. which use systematic classroom observations, but also high-quality multi-method studies, in order to capture these less easily measured factors. 

There is some disagreement about which groups are involved but often studies find it is low attaining and disadvantaged students who benefit the most. Blatchford et al (2011) found evidence that smaller classes helped low attaining students at secondary level in terms of classroom engagement. Hattie (Chapter 7) develops the view that we might expect low attaining students to benefit from small classes in terms of developing self regulation strategies" (p278).

Blatchford concludes, "The aim is move beyond the rather tired debates about whether class size affects pupil performance and instead move things on by developing an integrative framework for better understanding the relationships between class size and teaching, with important practical benefits for education world wide" (p102).

Hattie's contribution to the book (Chapter 7):

Hattie appears to be an outlier in this book. Of the 17 scholars who have contributed to the book ONLY Hattie myopically uses the effect size statistic to fully interpret the research. All the others use contextual and detailed features of the research to reach the conclusion that class size is important and significant.

At least the weight of scholarship has caused Hattie to retreat from his polemic on reducing class size as 'a disaster' and 'going backwards' and he finally concedes, "The evidence is reasonably convincing - reducing class size does enhance student achievement" (p113).

But, Hattie cleverly reframes the issue to "Why is the (positive) effect so small?" (p105).

Given the significant amount of critique about Hattie's methodology (the lack of quality studies, the use of disparate measures of student achievement, the use of studies of university students or pre-school children, the use of correlation, the inconsistent definition of small and large class sizes, indiscriminate averaging, benchmark effect sizes, etc.), I was disappointed that Hattie did not address any of these issues, but rather focused on attacking Dr David Zyngier's meta-review: "Zyngier's review misses the elephant in the room" (p106).

But if Zyngier misses the elephant in the room, then so do all the other 16 researchers contributing to the book. For example, in the following chapter (8), Finn & Shanahan display what they believe to be significant findings (p124):



Hattie once again sidesteps the SIGNIFICANT issues raised by Zyngier (+ many others), e.g., the control of variables and the differing definitions of large and small classes. Studies also differ on how to measure class size: some studies use a student/teacher ratio (STR), which includes many non-teaching staff like the principal, welfare staff, library staff, etc. "Past research has too often conflated STR with class size" (p4). Blatchford, et al (2016) also comment on this STR problem: "they are not a valid measure of the number of pupils in a class at a given moment" (p95).

Hattie just restates that meta-analyses provide a reasonably robust estimate (oxymoron?) and myopically focuses on the effect size statistic, but he provides no defence on the validity issues. He concedes that STR and class size are different, but he does not resolve the validity issue of using these disparate measures and just fobs off the argument with a red herring: STR and class size are related (p112) (but no evidence for this claim is provided).

Given the importance of class size research, STR and Class size need to be MORE than just related. They need to be the SAME!!!!

Hattie adds a 4th study to his effect size average, Shin and Chung (2009), with effect size d = 0.20. But he conveniently does not inform the reader that this study re-analysed the same data (the Tennessee STAR study) as the previous meta-analyses that he used.

Ironically, Shin and Chung warn against creating an effect size from repeated use of the same data, "If a study has multiple effect sizes, the same sample can be repeatedly used. Repeated use of the same sample is, however, a violation of the independent assumption" (p14).

They also warn, "we found too many Tennessee STAR studies... We worry about the dependence issue" (p15).

It seems to me Hattie's strategy is to take the focus off the scrutiny of his evidence and re-direct our attention elsewhere - a strategy for politicians, NOT for researchers! Yet in the introduction to VL (2009) he welcomed refutation with the inference that he would address this.

Teacher Morale:

Blatchford et al comment on the associated issue of teacher morale and class size:

"Virtually all class size studies report that teacher morale is higher in small classes than in larger classes. The personal preference for small classes was demonstrated by STAR third-grade teachers interviewed at the end of the school year. Teachers were asked whether they would prefer a small class with 15 students or a $2,500 salary increase. Seventy percent of all teachers and 81 percent of those who had taught small classes chose the small class option over a salary increase" (p129).

PISA

Blatchford et al, challenge the statements of the head of PISA Andreas Schleicher,

"there was reference to ten myths of education, as expressed by Andreas Schleicher, one of which was the myth that smaller classes benefited academic performance. The editors of this book tend to side with Berliner and Glass (2014) who address what they see as the 50 myths and lies which threaten American public schools. Myth no. 17 in their list is the belief that reducing class size will not result in more learning" (p275).

Other Commentary:

The Australian Education Union has published a comprehensive analysis of the class size research. They summarise that reducing class size does seem to improve student outcomes. Also, they highlight the problems with Hattie's methodology:

"The critics have cited the methodological problem of synthesising a whole range of meta-studies each with their own series of primary studies. There is no quality control separating out the good research studies from the bad ones. The different assumptions, definitions, study conditions and methodologies used by these primary studies mean that Hattie’s meta-analysis of the meta-analyses is a homogenisation which may distort the evidence (comparing apples with oranges)" (p13).

"The 0.21 effect he claims for class size is an average so that some studies may have found a significantly higher effect than that. For example, ‘gold standard’ primary research studies (using randomised scientific methodology) such as the Tennessee STAR project recorded a range of effect sizes including some at 0.62, 0.64 and 0.66, clearly well above the ‘hinge-point’ and the same as most variables which Hattie regards as very important" (p14).

In his AMAZING letter, Professor John O'Neill quotes from a detailed case/naturalistic study by Blatchford (2011): "Professor Blatchford makes the point that class size effects are ‘multiple’. For children at the beginning of schooling, there are significant potential gains in reading and maths in smaller classes. Children from ethnic minorities and children who start behind their peers benefit most. There is also a positive effect on behaviour, engagement and achievement, particularly for low achievers, where classes are smaller in the lower secondary school" (p10).

Leading researcher Professor Dylan Wiliam states that the evidence is pretty clear that if you teach smaller classes you get better results. The problem is that smaller classes cost a lot more (7min into full lecture).

Also, many scholars point out the irony in Hattie's view, that class size is a distraction - because the number of students in a class limits the ability of teachers to implement the kinds of changes that Hattie shows have the biggest effect, e.g., formative evaluation, micro teaching, behavior, feedback, teacher-student relationships, etc.

For example, Dr David Zyngier states in his meta-review: "The strongest hypothesis about why small classes work concerns students’ classroom behaviour. Evidence is mounting that students in small classes are more engaged in learning activities, and exhibit less disruptive behaviour" (p17).

Each of these studies also discusses its limitations. In particular, Goldstein, et al (2000) emphasise an issue that has emerged throughout Hattie's synthesis: "... we have the additional problem that different achievement tests were used in each study, and this will generally introduce further, unknown, variation" (p403).

Goldstein, et al (2003) go into detail about the problems of comparing correlation studies with randomised controlled experiments: "… correlational studies that ... examined relationships between class size and children’s achievements at one point in time, are difficult to interpret because of uncertainties over whether other factors (e.g., non-random allocation of pupils to classes) might confound the results" (p3).

Goldstein, et al (1998) point out another major confounding variable: "There is a tendency for schools to allocate lower achieving children to be in smaller classes. This bias means a considerable number of large cross-sectional studies (correlational) need to be ignored due to validity requirements" (p256).

Robert Slavin, in "Best-Evidence Synthesis: An Alternative to Meta-Analytic and Traditional Reviews" (1986), also discusses this issue: a “best evidence synthesis” of any education policy should encourage decision makers to favor results from studies with high internal and external validity, that is, randomized field trials involving large numbers of students, schools, and districts. Note: Glass and Smith graph the difference between high and low-quality studies (see the graph from page 15 above).

Slavin also discusses the major issue of the disparate ways in which achievement is measured; one achievement test consisted of rallying a tennis ball against a wall as many times as possible in 30 seconds. Other studies used a total treatment time of only 30 minutes. Other studies used only post-secondary students (p7).

Dr David Zyngier has published an excellent meta-review on class size - "Class size and academic results, with a focus on children from culturally, linguistically and economically disenfranchised communities."

"Noticeably, of the papers included in this review, only three authors supported the notion that smaller class sizes did not produce better outcomes to justify the expenditure" (p3).

"The highly selective nature of the research supporting current policy advice to both state and federal ministers of education in Australia is based on flawed research. The class size debate should now be more about weighing up the cost-benefit of class size reductions, and how best to achieve the desired outcomes of improved academic achievement for all children, regardless of their background. Further analysis of the cost-benefit of targeted CSR is therefore essential" (p16).

Zyngier concludes: "Findings suggest that smaller class sizes in the first four years of school can have an important and lasting impact on student achievement, especially for children from culturally, linguistically and economically disenfranchised communities" (p1).

He also states, "Recognised in the education research community as the most reliable and valid research on the impact of class size reductions at that time, the Tennessee STAR project was a large series of randomised studies, followed up in Wisconsin by the SAGE project. After four years, it was clear that smaller classes did produce substantial improvement in early learning and cognitive studies, and that the effect of small class size on the achievement of minority children was initially about double that observed for majority children" (p7).

Emeritus Professor Ivan Snook, et al, in their peer review of Hattie, also comment in detail about class size. They also note that effect sizes reported in the STAR study reached 0.66.

They conclude: "The point of mentioning these studies is not to “prove” that Hattie is 'wrong' but to indicate that drawing policy conclusions about the unimportance of class size would be premature and possibly very damaging to the education of children particularly, young children and lower ability children. A much wider and in depth debate is needed" (p10).

Dr Neil Hooley, in his review of Hattie - "Making judgments about John Hattie's effect size" - talks about the complexity of classrooms and the difficulty of controlling variables. On the issue of class size he says, "Under these circumstances, the measure of effect size is highly dubious" (p44).

Dan Haesler has a detailed look at class size and other issues.

Kelvin Smythe provides a detailed review of Hattie's research on Class size.