Self-Report Grades

The highest-ranked influence is self-report grades, with d = 1.44 (Hattie's rank = 1).

Note: Hattie has since added 'student expectation' to this influence (even though none of the meta-analyses measured this).

Hattie averaged these five meta-analyses (six effect sizes, as Kuncel, Crede & Thomas contribute two) to arrive at d = 1.44. They are mostly correlation studies, not true experiments, as discussed in Calculating Effect Sizes.

| Authors | Year | No. of studies | No. of students | Mean (d) | CLE | Variable |
| --- | --- | --- | --- | --- | --- | --- |
| Mabe & West | 1982 | 35 | 13565 | 0.93 | 65% | self evaluation |
| Falchikov & Boud | 1989 | 57 | 5332 | 0.47 | 33% | self assessment (college) |
| Ross | 1998 | 11 |  | 1.63 | 115% | second language |
| Falchikov & Goldfinch | 2000 | 48 | 4271 | 1.91 | 135% | self assessment (college) |
| Kuncel, Crede & Thomas | 2005 | 29 | 56265 | 3.1 | 219% | GPA |
| Kuncel, Crede & Thomas | 2005 | 29 |  | 0.6 | 42% | self vs recorded grades |
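
Two things in this table can be checked numerically. First, Hattie's d = 1.44 is the simple unweighted mean of the six effect sizes; each row counts equally, whether it covers 56265 students or an unreported number. Second, a CLE is a probability and cannot exceed 100%, yet three rows do; the column closely matches d/√2 expressed as a percentage, rather than the standard CLE of Φ(d/√2) (McGraw & Wong, 1992). The Python sketch below is my own reconstruction, not a published calculation:

```python
# Reconstruction (mine, for checking): Hattie's headline d and the CLE column.
from math import erf, sqrt
from statistics import mean

d_values = [0.93, 0.47, 1.63, 1.91, 3.1, 0.6]

# The headline figure is the simple unweighted mean of the six rows.
print(round(mean(d_values), 2))  # 1.44

for d in d_values:
    cle_table = d / sqrt(2)           # reproduces the table, including >100% "probabilities"
    cle_std = 0.5 * (1 + erf(d / 2))  # standard CLE = Phi(d / sqrt(2)), always below 100%
    print(f"d={d:4.2f}  table CLE≈{cle_table:5.0%}  standard CLE={cle_std:5.0%}")
```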

Falchikov & Goldfinch (2000) measured peer assessment, NOT self-report grades. Kuncel et al. (2005) measured students’ memory of their GPA from a year or so previously, which is a measure of memory or honesty, not of students predicting their future scores.

The authors themselves state their aims:

Kuncel et al (2005) “Since it is often difficult to get results transcripts of student PREVIOUS GPA’s from High School or College, the aim of this study is to see whether self-reported grades can be used as a substitute. This obviously has time-saving administration advantages” (p64).

Falchikov & Goldfinch (2000) “We conceive of the present study as an investigation of the validity of peer marking” (p288).

Mabe and West (1982) “The intent of this review is to develop general conclusions about the validity of self-evaluation of ability” (p281). Note: they measured over 20 different categories of achievement, from scholastic, athletic, and managerial to practical skills.

These studies are NOT measuring self-report grades as a treatment, influence, or teaching strategy, as Hattie suggests they are.

Ironically, Kuncel et al. (2005) warn about misinterpretation of their work: “Precise coding and combination of data are critical for the production of a meta-analysis. If data examining fundamentally different samples or variables are unintentionally combined, it may jeopardise the findings. The result would be a mixing of potentially different studies that could yield an uninterpretable blend. Stated simply, this is the old debate about comparing oranges versus apples” (p70).

Falchikov and Boud (1989) also warn about misinterpretation: “Variables are often defined inadequately, and accounts frequently require excessive inference on the part of the reader” (p396).

They conclude, “the greater the effect size, the less the self-marker ratings resemble those of staff markers” (p417).

So a high effect size, in this case, indicates that students are over-rating themselves, not that self-rating improves performance, as Hattie infers.
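
On this reading (my assumption, but the one Falchikov and Boud's quoted conclusion implies), each effect size is a standardized gap between self-assigned and staff-assigned marks, so a large positive d measures the size of the over-rating, not any learning gain. A minimal sketch with hypothetical marks:

```python
# Hypothetical illustration (not Falchikov & Boud's data): if d is the
# standardized gap between self-marks and staff-marks, a bigger d means
# students over-rate themselves more, not that they learned more.
from math import sqrt
from statistics import mean, variance

staff_marks = [62, 58, 70, 65, 60]  # marks awarded by staff (hypothetical)
self_marks = [75, 70, 82, 78, 74]   # marks students gave themselves (hypothetical)

pooled_sd = sqrt((variance(staff_marks) + variance(self_marks)) / 2)
d = (mean(self_marks) - mean(staff_marks)) / pooled_sd
print(round(d, 2))  # ≈ 2.79: a "huge" effect size that only reflects inflated self-marks
```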

I contacted Professor Kuncel to make sure I had interpreted his study correctly. He replied that the conclusion of the study was: "Generally people exaggerate their accomplishments."

There is absolutely no claim by any of the researchers that self-report grades are in any way an influence, treatment, or teaching strategy. Hattie clearly misinterprets these studies and makes an excessive inference in claiming d = 1.44. Hattie does, however, warn us about his claims: “I must emphasise these are clearly speculative” (p4).

Another major issue is that the Ross (1998) study uses English-as-a-second-language students, a small and atypical subset of the total student population, so inferences should NOT be drawn from it about the general student population.

The conclusion most authors draw from these meta-analyses is best stated by Falchikov and Boud (1989): “A close correspondence between individual teacher and student’s marks suggests that the student has a good sense of his or her absolute level of performance” (p426).

Other academics critical of Hattie's misinterpretation of these studies on self-report grades include:

Professor Ivo Arnold:
"The paper [Kuncel (2005)] should not have been included in the analysis. This example does raise questions regarding the remaining average effect sizes" (p220).

Dr Kristen Dicerbo:
"The studies that produced the 1.44 effect size did not study self-report grades as a teaching technique. They looked at the correlation of self-report to actual grades, often in the context of whether self-report could be substituted for other kinds of assessment. None of them studied the effect of changing those self-reports. As we all know, correlation does not imply causation. This research does not imply that self-expectations cause grades."

