Measurable Outcomes

Following a conversation on twitter about the phonics screening test administered in primary school, I have a few thoughts about how it’s relevant to secondary science. First, a little context – especially for colleagues who have only the vaguest idea of what I’m talking about. I should point out that all I know about synthetic phonics comes from glancing at materials online and helping my own kids with reading.

Synthetic Phonics and the Screening Check

This is an approach to teaching reading which relies on breaking words down into parts. These parts and how they are pronounced follow rules; admittedly in English it’s probably less regular than many other languages! But the rules are useful enough to be a good stepping stone. So far, so good – that’s true of so many models I’m familiar with from the secondary science classroom.

The phonics screen is intended, on the face of it, to check if individual students are able to correctly follow these rules with a sequence of words. To ensure they are relying on the process, not their recall of familiar words, nonsense words are included. There are arguments that some students may try to ‘correct’ those to approximate something they recognise – the same way as I automatically read ‘int eh’ as ‘in the’ because I know it’s one of my characteristic typing mistakes. I’m staying away from those discussions – out of my area of competence! I’m more interested in the results.

Unusual Results

We’d expect most attributes to follow a predictable pattern over a population. Think about height in humans, or hair colour. There are many possibilities but some are more common than others. If the distribution isn’t smooth – and I’m sure there are many more scientific ways to describe it, but I’m using student language because of familiarity – then any thresholds are interesting by definition. They tell us that something interesting is happening here.

The most exciting phrase to hear in science, the one that heralds new discoveries, is not “Eureka!” but “That’s funny …”

Possibly Isaac Asimov. Or possibly not.

It turns out that with the phonics screen, there is indeed a threshold. And that threshold just so happens to be at the nominal ‘pass mark’. Funny coincidence, huh?

The esteemed Dorothy Bishop, better known to me and many others as @deevybee, has written about this several times. A very useful post from 2012 sums up the issue. I recommend you read that properly – and the follow-up in 2013, which showed the issue continued to be of concern – but I’ve summarised my own opinion below.

phonics plot 2013
D Bishop, used with permission.

More kids were being given a score of 32 – just passing – than should have been. We can speculate on the reasons for this, but a few leading candidates are fairly obvious:

  • teachers don’t want pupils who they ‘know’ are generally good with phonics to fail by one mark on a bad day.
  • teachers ‘pre-test’ students and give extra support to those pupils who are just below the threshold – like C/D revision clubs at GCSE.
  • teachers know that the class results may have an impact on them or the school.

This last one is the issue I want to focus on. If the class or school results are used in any kind of judgment or comparison, inside or outside the school, then it is only sensible to recognise that human nature should be considered. And the pass rate is important. It might be a factor when it comes time for internal roles. It might be relevant to performance management discussions and/or pay progression. (All 1% of it.)

“The teaching of phonics (letters and the sounds they make) has improved since the last inspection and, as a result, pupils’ achievement in the end of Year 1 phonics screening check has gradually risen.”

From an Ofsted report

Would the inspector in that case have been confident that the teaching of phonics had improved if the scores had not risen?

Assessment vs Accountability

The conclusion here is obvious, I think. Most of the assessment we do in school is intended to be used in two ways; formatively or summatively. We want to know what kids know so we can provide the right support for them to take the next step. And we want to know where that kid is, compared to some external standard or their peers.

Both of those have their place, of course. Effectively, we can think of these as tools for diagnosis. In some cases, literally that; I had a student whose written work varied greatly depending on where he sat. His writing was good, but words were spelt phonetically (or fonetically) if he was sat anywhere other than the first two rows. It turned out he needed glasses for short-sightedness. The phonics screen is or was intended to flag up those students who might need extra support; further testing would then, I assume, suggest the reason for their difficulty and routes for improvement.

If the scores are also being used as an accountability measure, then there is a pressure on teachers to minimise failure among their students. (This is not just seen in teaching; an example I’m familiar with is ambulance response times, which I first read about in Dilnot and Blastland’s The Tiger That Isn’t, but issues have continued, e.g. this from the Independent.) Ideally, this would mean ensuring a high level of teaching and so high scores. But if a child has an unrecognised problem, it might not matter how well we teach them; they’re still going to struggle. It is only by the results telling us that – and in some cases, telling the parents reluctant to believe it – that we can help them find individual tactics which help.

And so teachers, reacting in a human way, sabotage the diagnosis of their students so as not to risk problems with accountability. Every time a HoD puts on revision classes, every time students were put in for resits because they were below a boundary, every time an ISA graph was handed back to a student with a post-it suggesting a ‘change’, every time their PSA mysteriously changed from an okay 4 to a full-marks 6, we did this. We may also have wanted the best for ‘our’ kids, even if they didn’t believe it! But think back to when league tables changed so BTecs weren’t accepted any more. Did the kids keep doing them or did it all change overnight?

And was that change for the kids?

Any testing which is high-stakes invites participants to try to influence results. It’s worth remembering that GCSE results are not just high-stakes for the students; they make a big difference to us as teachers, too! We are not neutral in this. We sometimes need to remember that.


With thanks to @oldandrewuk, @deevybee and @tom_hartley for the twitter discussion which informed and inspired this post. All arguments are mine, not theirs.


References and Trust

A while back I had an interesting Twitter discussion about the problems with assessment in education, and how different approaches might be useful. The others involved (@richardtibbles and @informededu) were much more organised than I am so have long since moved on, after blogging about it. David suggested this concept originally at his blog and here is Richard’s much prompter response. Belatedly, however, I’ve typed up my own somewhat confused viewpoint. It’s more philosophical than practical…

What We Already Do

I like talking about ideas. Something that’s really important to me is getting my facts straight, and I’m a firm believer in checking what I’m saying and correcting myself when needed, which is fairly often. For example, I expressed surprise to a student who told me that he could download Wikipedia to his phone and access it offline. This seemed unlikely to me, but he was absolutely right. (Let’s take a moment to consider that today’s mobile phones usually have enough memory for millions of pages of text information). So I apologised to him in front of the class, and put the link on my page on the VLE.

What this was leading to is that to check an idea, I look for the facts. I don’t think much of personal authority – that will be the science background and general stubbornness – but I appreciate the value of a personal reputation. And so I always want to check the references. In lessons I’ll do this by regularly including links on my teacher’s page, even though I know relatively few kids will bother to use the VLE to that extent. Online, I love having the ability to include links on my blog posts and more recently in my tweets. Whenever I link to a website, news story or blog I am referring to them. In a way, I am demonstrating that I trust them.

When we suggest other people who might be interesting as part of ‘Follow Friday’ by tagging their IDs with #ff, we are giving them a reference. We possibly, almost certainly in my case, haven’t met them ‘in real life’. When we retweet someone’s idea, message or link we are also vouching for them, or at least the idea. Our commentary shows why we are doing so, which can be for good or bad reasons! I suspect I’m not the only one who looks at those individuals a person follows, before I follow them in turn. We are trying to get a feel for how our peers judge a person. (If you’ve no idea what I’m on about because you’re not on Twitter, check out this post, one of many suggesting why it may be useful for educators.)

Google is based on the idea that the ‘worth’ of a website can be determined by how many others link to it. Whether we agree or disagree with the result, it can hardly be denied that it has been an effective approach. The links I make in blog posts are also references, although they lack academic rigour. The same could be said of my blogroll in a general way. I don’t read everything written on these blogs, but I’m saying by including them that “these people are worth a look.”

Blurbs on a book cover are also a recommendation, tapping into the human assumption that we can trust what people say. Suggesting that ‘if you liked X you may like Y’ gives us a comparison, tells us why this is being recommended. It amuses my family that I can rarely spend time in a bookshop without giving and receiving suggestions from other customers. There’s some evidence (such as this survey from an advertising company) suggesting that this makes a bigger impact if the recommendation seems personal. From a science point of view this is the danger of anecdotes – that we give higher credence to isolated facts than a bigger picture would justify. It seems unlikely that we are actually going to change human nature, though. I remember reading about ‘Salesmen’, people who would spend a lot of time recommending products or concepts, in the book The Tipping Point. I don’t know if more recent research still shows that a small number of people do a lot of recommending. Twitter and the rest of the internet have given us all the chance to be persuaders.

Why It’s Useful (Here comes the Science Bit)

If one bird in a flock starts flapping, so do the others. Flapping is infectious. Mass panic may result from a minor cause, an example of something we call emergence. We tend to act like this more when it involves problems than benefits, probably because in an evolutionary sense we are fine-tuned to trust others when they hint at danger. The Tiger That Isn’t is a great book explaining the mathematical reasons for the human ability to spot patterns so well. If others in a group are acting scared, then maybe we need to be scared too. All kinds of superstitions, tragedies and historical events have happened because we, like so many other species, have evolved to ‘follow the crowd’. It’s not always right. But it’s right often enough that overall, on average, in the long-term, it’s a useful adaptation. This is why we trust other people’s opinions, modified by who they are. Humans rely on reputation to ‘filter’ our opinions of other people, as ideas like the Prisoner’s Dilemma show. So how could we use the idea of reputation to help when it seems that academic achievements don’t tell us everything we need to know?

Giving Online References

During the discussion, we commented that the best evidence of competence would be to show what is actually produced. It’s expected that photographers and artists will take a portfolio to interviews – why not others? This develops quite quickly into the idea that it shouldn’t be hard, in some professions, to have a permanent online portfolio of your achievements. Everyone who has an ‘About’ page on their blog does this in some way. Adding speaking engagements, videos of what they’ve done or bibliographies extends this. Of course, your online presence can be negative too; just think about advice given to job-seekers about checking what strangers can see on their Facebook page. I Google myself fairly regularly to see what my students would find, assuming they were ever that bored.

My blog could be seen as a portfolio of what I do and how I do it. If, of course, I planned to use it that way. At the moment I blog and tweet discreetly. I’m sure it’s possible to find out who I am, if you’re so inclined. I’m sure people who know me in person might be able to recognise my personality in my comments, but I’m equally sure the chances are pretty slim of anyone doing so accidentally. This has been a deliberate choice so that I can say what I like, although I try very hard to stay professional. But many other educators (and others) use their blogs as an ongoing record. Their choice and, I’m sure, a very effective one. The potential for this strikes me as so significant that my wife and I are currently considering grabbing ‘named’ domains for our sons, so they have that opportunity in the future.

I’ve given written references for kids. Not often, as I’m perfectly happy to stay as a ‘form tutor’ in our pastoral system, but a few kids from my DofE groups have asked. I gave a personal reference for a friend of mine as I’d also worked with him; it’s odd in a way that I couldn’t honestly write, “Look, he’s godless father to my son, a great cook and I’d trust him with my last penny. Just hire him!” Instead I had to comment on his personal qualities, of which he has many. I still treasure the testimonial I was given by an editor at the end of a work experience, back in my university days. I didn’t continue in journalism or use the letter she wrote me, but the idea was fantastic. (Thanks, Vanessa!) We rely on word of mouth to choose electricians and builders, mechanics and masseurs, presumably because you can’t demonstrate those skills on a webpage or a quarter-page advert. So why not give professional and character references online?

I imagine some kind of social networking site. It would need to use real names and have a contact email address for verification, but that’s not impractical these days. You would recommend or vouch for other people, and say why or what for. Categories of recommendation would be easy enough to evolve, or it could be a simple paragraph, perhaps with hashtags of some kind. Etiquette would probably include listing any potential conflicts of interest. There wouldn’t need to be a limit to the number of recommendations, but viewers would be able to see how many people you gave these recommendations to. It would need to show if these recommendations were reciprocal. And at the same time, they would be able to see who recommended you. These references could have an expiration date, or be ongoing until you cancelled them. I imagine that those taking part would also need to be able to refuse references from other members too.
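For the programmers among you, here is a minimal sketch of how the records on such a site might be stored. Everything in it is my own assumption for illustration: the names `Reference` and `ReferenceBook`, the fields, and the rule that a reference with no expiry date simply carries on until cancelled. It is one possible reading of the wish-list above, not a design for a real service.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Reference:
    """One member vouching for another (all field names are hypothetical)."""
    giver: str
    receiver: str
    reason: str                       # the 'why or what for'
    tags: List[str] = field(default_factory=list)
    conflict_of_interest: str = ""    # etiquette: declare any conflicts
    expires: Optional[date] = None    # None means ongoing until cancelled

    def is_active(self, today: date) -> bool:
        return self.expires is None or today <= self.expires


class ReferenceBook:
    """Holds every reference and answers the questions a viewer might ask."""

    def __init__(self) -> None:
        self._refs: List[Reference] = []

    def add(self, ref: Reference) -> None:
        self._refs.append(ref)

    def refuse(self, giver: str, receiver: str) -> None:
        """The receiver declines all references from one particular member."""
        self._refs = [
            r for r in self._refs
            if not (r.giver == giver and r.receiver == receiver)
        ]

    def given_by(self, person: str, today: date) -> List[Reference]:
        return [r for r in self._refs
                if r.giver == person and r.is_active(today)]

    def received_by(self, person: str, today: date) -> List[Reference]:
        return [r for r in self._refs
                if r.receiver == person and r.is_active(today)]

    def is_reciprocal(self, a: str, b: str, today: date) -> bool:
        """True if a and b currently vouch for each other."""
        a_to_b = any(r.receiver == b for r in self.given_by(a, today))
        b_to_a = any(r.receiver == a for r in self.given_by(b, today))
        return a_to_b and b_to_a
```

A viewer's page would then show `len(book.given_by(name, today))` next to the references themselves, so it is obvious if someone hands out recommendations to everybody they meet.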

As I commented at the time, this is not a million miles from the concept of ‘Whuffie’ (Wiki explanation here) as suggested by Cory Doctorow. I don’t know whether trying to define this in numbers would be useful. Maybe members would get a score for the number of actual recommendations they have, and a cumulative one where a bonus score is added on, based on how many each ‘referee’ has. The more recommendations a person had in their field, the more their opinion would be seen as significant in turn. I’m sure some people’s judgements would be seen as more ‘valuable’ than others, something we already see with advertising. Some of this will depend on the reader or viewer, of course; I am far more likely to seek out a book if it has been recommended by Neil Gaiman (author of American Gods, which I think is fantastic) than Stephenie Meyer (I found The Host only okay and couldn’t make myself read the Twilight books), while others would say the opposite. Students applying to university are limited in the number of referees they can give; imagine if instead they could show off the opinions of their current employer, fellow volunteers, youth leaders or subject teachers.
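If you did want numbers, the two scores could be worked out along these lines. This is only a sketch of one possible reading: I am assuming the ‘bonus’ is simply the sum of the recommendations each referee has received themselves, and the function name `reputation_scores` is made up for the example. It is not how Whuffie or Google actually work.

```python
from collections import defaultdict


def reputation_scores(references):
    """Return (direct, cumulative) scores from (referee, subject) pairs.

    direct:     how many different people recommend each member
    cumulative: direct score plus a bonus equal to the direct scores
                of that member's referees (an assumed bonus rule)
    """
    pairs = list(references)
    referees_of = defaultdict(set)
    for referee, subject in pairs:
        if referee != subject:          # nobody gets to vouch for themselves
            referees_of[subject].add(referee)

    people = {person for pair in pairs for person in pair}
    direct = {p: len(referees_of[p]) for p in people}
    cumulative = {
        p: direct[p] + sum(direct[r] for r in referees_of[p])
        for p in people
    }
    return direct, cumulative
```

So a member recommended by two people, one of whom has three recommendations of their own, would have a direct score of 2 and a cumulative score of 5. Whether that number would tell anyone anything useful is, of course, another matter.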

This kind of system would offer everyone the chance to offer testimonials, in a rather more constructive way than Facebook’s ‘Like’ button. I think to be useful it would have to be combined with exam results and standardised assessments rather than stand alone. (Although it’s not hard to imagine an automated system to include the exam results as part of the list of recommendations.) A good postal system allowed us to move from the written testimonial system to one where we responded to individual requests for information. Now the web would give us the chance to combine the best of both worlds, in the same way that small deli shops can now do a lot of business nationally via a website. Like all teachers I treasure the ‘thank you’ notes a few students have given me over the past few years, and the comments at parents’ evenings. Imagine how valuable it would be to be able to offer that kind of ongoing recommendation for somebody, backed up by your own long-term presence online. I’d like to feel that by sharing my good opinion of my students I’d helped them, somehow. Wouldn’t you?