Over the past fortnight, secondary schools in England have submitted centre assessment grades for their Year 11 pupils to the exam boards, following the cancellation of this year's GCSEs.

Coming up with these grades has been a huge undertaking for teachers – one done with minimal guidance and training.

The next step is moderation by exam boards, before grades are issued to pupils in August.

But an exercise carried out by FFT gives an indication of the challenges facing Ofqual and the exam boards.

The data we collected

Between 28 April and 1 June, FFT ran a statistical moderation service which allowed schools to submit preliminary centre assessment grades they were proposing for their pupils. In return they received reports which compared the spread of grades in each subject to historical attainment figures and progress data.
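
To give a sense of what such a comparison involves, here's a minimal sketch in Python. The actual contents of FFT's reports and its data formats aren't described in detail here, so the column names, the example pupils and the summary measures (average grade and the share of entries at grade 7 or above) are illustrative assumptions rather than the service's actual output.

```python
import pandas as pd

# Hypothetical pupil-level data: one row per entry, with a numeric 9-1 grade.
proposed_2020 = pd.DataFrame({
    "subject": ["Maths"] * 6 + ["English language"] * 5,
    "grade":   [9, 7, 5, 5, 4, 3] + [8, 6, 5, 4, 3],
})
awarded_2019 = pd.DataFrame({
    "subject": ["Maths"] * 6 + ["English language"] * 5,
    "grade":   [8, 6, 5, 4, 4, 3] + [7, 6, 4, 4, 2],
})

def summarise(entries: pd.DataFrame) -> pd.DataFrame:
    """Average grade and share of entries at grade 7 or above, per subject."""
    return entries.groupby("subject")["grade"].agg(
        average_grade="mean",
        share_7_plus=lambda g: (g >= 7).mean(),
    )

# Side-by-side comparison of the proposed 2020 spread with the 2019 results.
report = summarise(proposed_2020).join(
    summarise(awarded_2019), lsuffix="_2020", rsuffix="_2019"
)
print(report.round(2))
```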

In this blogpost, we’ll take a look at some of the main findings from the service, based on the data of more than 1,900 schools – over half of all state secondaries in England – which had submitted results when the service ended on 1 June.

That’s the date on which the window for secondary schools to submit their proposed grades to the exam boards opened – though it’s worth saying that we don’t know whether schools submitted the same data to the exam boards as the data we’re analysing here. They may have used the reports they were provided with to amend the mix of grades they were proposing.

Comparing 2020 and 2019 results

With that caveat in mind, it is nonetheless worth analysing the data that schools submitted.

We’re going to compare it to published, school-level results for 2019 – only including the results of schools for which we have both 2019 and 2020 data and only looking at subjects for which we have enough data to form reliable conclusions.[1]

So what does this comparison show?

Well, at the top level, this year’s teacher-assessed grades are higher than those awarded in 2019 exams. In every subject we’ve looked at, the average grade proposed for 2020 is higher than the average grade awarded last year. In most subjects, the difference is between 0.3 and 0.6 grades.

Starting with the subjects that almost all pupils sit, the average of all the teacher-assessed grades in English language comes out as 5.1 – that is, a little above a grade 5. That compares to an average grade of 4.7 last year. For English literature, a slightly smaller increase in average grade is seen, from 4.8 last year to 5.0 this year, while in maths the average proposed grade for 2020 is 5.0, compared to 4.7 for 2019.

Looked at another way, were these proposed grades to be confirmed, the share of pupils awarded a grade 4 or above would increase from 71.4% to 80.8% in English language, from 73.7% to 79.0% in English literature, and from 72.5% to 77.6% in maths.

The chart below shows how the proposed results for 2020 compare to 2019’s actual results for all subjects in terms of average grade.

The subject with the biggest difference between the average grade awarded in 2019 and the average proposed for 2020 is computer science, where the gap is nearly a whole grade: 5.4 for 2020, compared with 4.5 for 2019. Of the 24 subjects we’ve looked at, 10 show a difference of half a grade or more between the average proposed grade for 2020 and the average grade awarded in 2019.

At the other end of the scale, the smallest differences between average proposed 2020 grades and 2019 grades are around a third of a grade or less: in English literature, combined science, religious studies, maths and art and design (photography).

The next chart shows the 2019 grade distribution for the individual subjects we’ve looked at, compared to the teacher-assessed figures we have for 2020.

Looking across all subjects, if these grades were given out this summer then we would see the share of grade 9s increase from 4.8% of all grades awarded to 6.3%. The share of results receiving a grade 7 or above would increase from 23.4% of all grades to 28.2%, while the share of results receiving a grade 4 or above would increase from 72.8% to 80.7%.
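
For readers who want to see where cumulative figures like these come from, the short sketch below computes them from a grade distribution. The distribution used is made up for illustration; the real subject-by-subject figures aren't reproduced in full here.

```python
# Hypothetical percentage of entries at each grade (U shown as 0); these are
# illustrative numbers, not the actual 2019 or 2020 distributions.
grade_shares = {
    9: 5.0, 8: 7.0, 7: 12.0, 6: 15.0, 5: 19.0,
    4: 22.0, 3: 11.0, 2: 6.0, 1: 2.0, 0: 1.0,
}

def share_at_or_above(shares: dict[int, float], threshold: int) -> float:
    """Cumulative percentage of entries graded at or above the threshold."""
    return sum(pct for grade, pct in shares.items() if grade >= threshold)

print(f"Grade 9:          {share_at_or_above(grade_shares, 9):.1f}%")
print(f"Grade 7 or above: {share_at_or_above(grade_shares, 7):.1f}%")
print(f"Grade 4 or above: {share_at_or_above(grade_shares, 4):.1f}%")
```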

The upshot

What should we make of this?

Well, first of all it’s worth saying again that we don’t know that these will be the results that schools will have submitted to the exam boards.

The very reason schools submitted data to FFT’s statistical moderation service was to get some assistance in deciding what grades to propose. Many will have used the reports they received to tweak the grades they were proposing before submitting them to the exam boards.

That said, around 1,000 schools submitted data to FFT two or more times. There was some change in the grades proposed between these iterations but, in most subjects, the impact was relatively small: a reduction in the average grade of 0.1.

That suggests that the proposed grades submitted to the exam boards will still have been above those awarded last year.

Consequently, it seems likely that Ofqual and the exam boards will have to apply statistical moderation to the grades submitted by schools, bringing them down on average.

This will be a hugely complex task, the like of which has never been attempted before. As well as proposed grades, schools were required to submit rank orderings of their pupils, and it seems likely that these will be used to shift some pupils down from one grade to the next.
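
To illustrate why those rank orderings matter, here's a deliberately simplified sketch of how a rank order could be combined with a target grade distribution. This is not Ofqual's published approach – the centre, pupils and target counts below are invented – it simply shows how moderating down to a target distribution pushes the lowest-ranked pupils within a grade down to the grade below.

```python
def moderate(pupils, target_counts):
    """
    pupils: list of (pupil_id, proposed_grade) tuples, sorted from the
            highest-ranked pupil to the lowest within a subject.
    target_counts: dict mapping grade -> number of pupils who should end up
                   with that grade after moderation.
    Returns {pupil_id: moderated_grade}. Assumes the target counts sum to the
    number of pupils.
    """
    # Flatten the target distribution into a list of grades, highest first,
    # then hand them out in rank order (the proposed grade itself is unused
    # in this pure rank-based allocation).
    grades = [g for g, n in sorted(target_counts.items(), reverse=True)
              for _ in range(n)]
    return {pupil_id: grade
            for (pupil_id, _proposed), grade in zip(pupils, grades)}

# Hypothetical centre: five pupils with their proposed grades, in rank order.
pupils = [("A", 7), ("B", 7), ("C", 6), ("D", 5), ("E", 5)]
# Hypothetical moderated target: one 7, two 6s and two 5s.
print(moderate(pupils, {7: 1, 6: 2, 5: 2}))
# -> {'A': 7, 'B': 6, 'C': 6, 'D': 5, 'E': 5}: pupil B moves down from 7 to 6.
```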

Reflecting on the difficult task faced by schools

It’s worth taking a moment to consider the difficulty of the task that schools had, and think about why their proposed grades were higher than those awarded last year.

First, in terms of difficulty, teachers were asked to form an assessment of the level of attainment that each child had reached – taking into account evidence from a range of sources – at a time when schooling had been significantly disrupted.

Lest we forget, 9-1 GCSE grades haven’t been around for that many years yet. In some subjects, teachers only have one year of past results to go on. All other things being equal, you would expect the second cohort of pupils taking a qualification to do a bit better than the first, as teachers have an extra year of experience under their belts. An approach called comparable outcomes is normally applied to exam results to account for this, but it won’t have been factored into the proposed grades that schools have come up with.

It’s also much easier to distinguish between two pupils using marks from an exam. As a thought experiment, imagine two pupils thought to be on the 5/4 boundary who have produced a similar quality of work at school. It would be unfair for the teacher to give one a 5 and the other a 4 – but an exam would rule definitively on the matter, for better or worse.

All things said and done, then, schools have had an incredibly difficult task – albeit one matched in difficulty by that now facing the exam boards and Ofqual.

You can read more analysis of this data in two further posts – one looking at centre variability in results, and another looking at the severity of grading of different subjects.

Notes

1. Only schools with more than 25 entries in a given subject in both 2019 and 2020 have been included in this analysis, and we’re only looking at subjects where there were 100 or more such schools. This leaves us with a total of 1,916 schools in the analysis.
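
For anyone interested in reproducing this kind of filter on their own data, here's a sketch of the inclusion rule described in note 1. It assumes a tidy table with one row per school, subject and year and an entry count, and that the table covers only 2019 and 2020 – the column names are illustrative, not FFT's actual schema.

```python
import pandas as pd

def apply_inclusion_rule(entries: pd.DataFrame) -> pd.DataFrame:
    """
    entries: one row per school/subject/year with columns
             ["school_id", "subject", "year", "n_entries"], covering only
             2019 and 2020.
    Returns the school/subject pairs that meet the criteria in note 1.
    """
    # More than 25 entries in the subject in a given year.
    big_enough = entries[entries["n_entries"] > 25]

    # Keep school/subject pairs that clear the threshold in both years.
    both_years = (
        big_enough.groupby(["school_id", "subject"])["year"]
        .nunique()
        .loc[lambda n: n == 2]
        .reset_index()[["school_id", "subject"]]
    )

    # Keep only subjects with 100 or more such schools.
    schools_per_subject = both_years.groupby("subject")["school_id"].nunique()
    kept_subjects = schools_per_subject[schools_per_subject >= 100].index
    return both_years[both_years["subject"].isin(kept_subjects)]
```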