I’ve just been reading that the government (in the form of the Education Select Committee) is recommending a return to the idea of performance-related pay for teachers. Now, this is interesting, to say the least – and more than a little political. Because, of course, we all know how well a bonus-led culture worked in banking. So I’m going to sublimate my anger and approach this from a scientific point of view. Not just by looking at the data, but by treating it like a GCSE science problem in experimental design.
Background Research
You can find news reports at the Guardian and the Telegraph, among others. It might be an interesting Politics/Media lesson to compare the reporting of this story in different publications, perhaps? The news stories I’ve seen completely fail to mention that this will presumably only apply to schools governed by national agreements, so academies and free schools may not even care. I’m still checking out research (the actual data that governments like to claim backs up their case) but this from the famous Ted Wragg is interesting.
Confounding Factors
It’s not that long ago that the government stopped collecting what we call ‘contextual value-added’ data – where the students’ circumstances, social background etc. are taken into account. So if we don’t know about all of these things, how can we account for them? An obvious example is that in some schools and areas it’s much more likely that students will access a tutor. And what about kids whose parents help them out, talk them through homework, share study techniques? Who’s responsible for any improvement?
Subjects overlap too. If I teach a student who’s doing badly in Maths, and this affects their Physics scores, who gets the blame? I’m imagining wars between Maths and Science, between English and Humanities, as teachers accuse each other of causing them problems. Not a pretty image. How are we supposed to work together when we’re also competing? Nobody wants to be at the bottom. Will teachers in one department stop sharing resources with each other?
Measuring the Dependent Variable
Is this going to be based solely on exam results? What about subjects which don’t do an external exam, such as PSHE? The equality or otherwise of subjects is always a huge issue, especially when different types of qualifications are considered. Will it apply to all key stages – what about teachers who only or mainly teach at Key Stage 3, for example?
What happens if one class does ‘well’ (although I’m still not sure how we’ll be able to tell) and another doesn’t? What about when a class is shared between two or more teachers? Or when a teacher is ill or on maternity leave? Do good A-level results matter more or less than good GCSEs? Should absolute scores or percentages matter? For example, if I have 14 students at A2 Physics, 7 of whom achieve an A grade, is this better or worse than, say, Spanish, who have 4 students and 3 A grades?
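To make the absolute-versus-percentage question concrete, here is a minimal sketch of the arithmetic for the two classes above. The function and field names are my own, purely for illustration; nothing here is an official measure.

```python
def a_grade_summary(subject, students, a_grades):
    """Summarise one class's A grades as a count, a percentage,
    and the percentage-point weight carried by a single student."""
    return {
        "subject": subject,
        "a_grades": a_grades,                          # absolute score
        "a_grade_percent": 100 * a_grades / students,  # percentage score
        "points_per_student": 100 / students,          # swing if one result changes
    }

# Figures taken from the example in the text.
physics = a_grade_summary("A2 Physics", students=14, a_grades=7)
spanish = a_grade_summary("Spanish", students=4, a_grades=3)

for summary in (physics, spanish):
    print(
        f"{summary['subject']}: {summary['a_grades']} A grades, "
        f"{summary['a_grade_percent']:.0f}% of the class, "
        f"one student is worth {summary['points_per_student']:.1f} percentage points"
    )
```

Physics wins on the absolute count (7 against 3) and Spanish wins on percentage (75% against 50%), but a single Spanish result is worth 25 percentage points, so one student having a bad day would reverse the comparison.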
Bias
Many courses rely to at least some extent on teacher-assessed work. Will the existing pressure on teachers to give students the ‘best possible chance’ be increased? Should only externally-assessed work be used for the judgements? In theory this could lead to ethical teachers being penalised when those colleagues who are more ‘supportive’ – and yes, that was sarcastic – benefit personally from the better results of their students.
What about those students who happen to be taught by their Head of Year? How will their level of support vary compared to others? Or the students mentored by members of SMT, who so often seem to get extra chances or have the rules ‘stretched’ for them? Teaching the children of other staff members may suddenly be a bigger perk than before.
And who chooses which teachers get the more promising students? It’s already true in many schools that timetabling causes problems when particular teachers are perceived to get ‘easier’ classes. Sometimes this is unavoidable – imagine two A-level Physics classes, who due to timetabling are split depending on whether they also study Further Maths. I know which one I’d rather have.
Reproducibility
It’s so easy to forget with the rhetoric from politicians, but at a school level the sample sizes are small. Too small, really, for any such judgements to be made on a class by class basis. If we drew error bars on the results to account for the confounding factors – many of which we don’t know about, let alone have the ability to control – they would be huge. Yes, we can look at the effects of various interventions on students, and many of us are trying to use this data (see the fantastic work by Geoff Petty for example, the What Works Clearinghouse, and Dr Mark Evans’ Teachitso website). Linking research to educators working in the classroom is surprisingly difficult, though see #SciTeachJC for one such effort.
But the useful data comes from large studies, reviews of many classrooms and many teachers. If I have a class of twenty-five (chance would be a fine thing) then every child’s results make up 4% of the total. How many students in the average classroom will lose a relative during exam season? How many will have health problems? You don’t need many to affect the class results hugely, and these factors are unpredictable. As with decaying atoms, we can predict how many of these events will happen – probably with high accuracy – across a whole cohort. But in any one class the number will vary hugely.
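The decaying-atoms comparison can be put into rough numbers. This is a sketch only: the 5% chance of a disruptive event per student and the cohort size of 600,000 are figures I have made up for illustration, and it assumes each student is affected independently (a simple binomial model), which real life won't quite respect.

```python
import math

def event_spread(group_size, event_probability):
    """Binomial model: expected number of affected students,
    its standard deviation, and the spread relative to the expectation."""
    expected = group_size * event_probability
    spread = math.sqrt(group_size * event_probability * (1 - event_probability))
    return expected, spread, spread / expected

ASSUMED_PROBABILITY = 0.05   # hypothetical: 1 in 20 students hit by a serious event
CLASS_SIZE = 25              # from the text
COHORT_SIZE = 600_000        # hypothetical round number for a national year group

share_per_child = 100 / CLASS_SIZE  # each child's share of the class results, in %

class_expected, class_spread, class_relative = event_spread(CLASS_SIZE, ASSUMED_PROBABILITY)
cohort_expected, cohort_spread, cohort_relative = event_spread(COHORT_SIZE, ASSUMED_PROBABILITY)

print(f"Each child is {share_per_child:.0f}% of a class of {CLASS_SIZE}")
print(f"Class:  expect {class_expected:.2f} affected, relative spread {class_relative:.0%}")
print(f"Cohort: expect {cohort_expected:.0f} affected, relative spread {cohort_relative:.2%}")
```

Under these assumptions the cohort figure is predictable to within a fraction of a percent, while the spread for a single class is nearly as large as the expected value itself: one class might see nobody affected and the class next door three students, each of them worth 4% of the results.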
Resolution
Our results aren’t even very detailed. Grade boundaries change, and we can often break it down into more detail than to an A or a B. Will it matter if students meet a decimalised target, or does just the grade matter? How many subjects will we need to look at? If it’s just about meeting a boundary, those who get over it will be ignored even more than we’ve already seen with the wonderfully-named ‘C-chasing’ strategy.
Conclusion
Sadly, it seems to me that performance-related pay fails the test according to what we teach our students. It seems a shame that the MPs haven’t done an ISA recently…