When algorithms define kids by postcode: UK exam results chaos reveals too much reliance on data analytics

Analysis: Updated: AI, ML, and data analytics are valuable tools -- but the human factor can be too easily lost.


"I am not my postcode." 

This slogan was displayed among a sea of students in Scotland protesting their grades over the past week. 

With schools shut since March and exams cancelled due to the COVID-19 pandemic, students received their grades for Scottish Qualifications Authority (SQA) courses last week by letter, text, and email. 

When they opened the envelopes containing the grades necessary for higher courses or university, many were left disappointed. 

It then emerged that the exam board had decided to lower tens of thousands of grades from the original awards recommended by teachers. In total, 124,564 exam results -- roughly 25% of those issued by the SQA -- were downgraded, according to The Independent.

The grading system used by the exam board's moderators leveraged data based on the past performance of schools. The pass rate for students undertaking higher courses in deprived locations across Scotland was reduced by 15.2%, in comparison to 6.9% in more affluent areas. 

While the overall aim was to clamp down on any predicted grades issued by teachers seen as too generous, the use of an algorithm to factor in historic school data, rather than a pupil's individual performance, led to accusations of bias and discrimination. 

Ian Murray, Edinburgh South MP and Shadow Secretary of State for Scotland, said in a blog post that without physical exams being a possibility, awarding grades was "always going to be difficult to get right." However, Murray believes that the algorithm "hard-bake[d] inequality into the system."

"The method adopted has decided children's future based on their postcode or the previous performance of their school, not their performance in the classroom assessed by the people who know them the best -- their teachers," Murray commented.

Students were left devastated by grades lower than anticipated. Parents were infuriated. For some teachers who worked to the bone for months after being thrust into a remote and learn-from-home setup without warning, the dismissal of their recommendations in deference to an algorithm may have been equally crushing. 

Subsequent outrage prompted Scotland's educational department to quickly U-turn on the use of the algorithm, restoring the original, predicted grades offered by teachers. Education Secretary John Swinney apologized for the sense of "unfairness," adding that it was "deeply regrettable" the government "got this wrong," as reported by the BBC.

The backtracking turned attention to England's looming A-level results. 

On Thursday, approximately 300,000 students across Northern Ireland, Wales, and England received their A-level exam grades. In England, in total, 27.7% of A-level entries were recorded as As or A*s, the BBC reported on Thursday morning. However, 35.6% of results were downgraded by one level and 3.3% of results were adjusted by at least two grades.


Before the results were made public, the National Union of Students asked for teacher-predicted grades to be used rather than the algorithm, but Education Secretary Gavin Williamson claimed the system was "fundamentally a fair one." 

When speaking to The Telegraph, Williamson said 'inflating' grades risked devaluing 2020 results, while also being unfair to the classes of 2019 and 2021. However, this suggests that, left in the hands of teachers alone, all students would receive grades beyond what they could have achieved -- a notion that I, as a former teacher, dismiss entirely. 

A teacher based in England who specializes in educational policy also spoke to us ahead of the results, commenting: 

"Historic school performance, on the whole, isn't too bad for middle to low students -- but weak students in good schools will have theirs inflated and strong students in poorer schools will be penalized."

The Department for Education announced a "triple lock" after Scotland's uproar. The changes keep the standardization algorithm in place, but appeals can be launched based on mock exam results. 

Universities have also been asked to keep places open for students involved in the appeals process. Students who feel that they have been treated unfairly also have the option to take a written exam in the fall. 

Northern Ireland has followed suit with similar measures. Wales did not intend to change its position, but has now pledged that students will not receive grades lower than their AS levels -- the first year of two-year A-level studies. 

There was jubilation on Twitter as some students celebrated securing their places at university. However, others were left bereft, and the standardization process may have caused the same issues as in Scotland, with some schools -- including Leyton Sixth Form College in East London -- experiencing high proportions of downgrades. 

Speaking to Sky, the college's headteacher said 47% of students had their results downgraded in the sixth form, which appears to be an extremely high proportion in one school. 

After reviewing the results, the teacher discussing the issue with ZDNet said the trends in their school -- notably, in a relatively affluent area -- are "broadly comparable" to last year. However, the teacher said that students' prior attainment -- such as GCSEs -- seemed to have "quite an impact." 

"I can see students who didn't do well at GCSE because of personal reason X be awarded the same as a weak candidate," the teacher added. "But, I can see that some individuals have been downgraded across all their subjects and I don't know why."

The teacher noted that the current system was originally designed to maintain trends in metadata, and as Scotland has less data to manage, their results may be more "skewed" than England's will be overall -- but, considering the importance of these grades for university hopefuls, "what they've done is probably fairer than England."

As Scotland grappled with the fallout from the use of the standardization algorithm, Scotland's Education Secretary noted that the downgrades led to "young people feeling their future had been determined by statistical modeling rather than their own ability" -- and therein lies the problem. 

Artificial intelligence (AI), machine learning (ML), and data analytics based on statistical modeling, when used in this manner, have taken the human element out of a decision that can have a real impact on a student's future -- and to make matters worse, in the context of a pandemic. 

A digital divide, of a sort, may have also formed by each country choosing a different method in assessing results. Legal challenges may follow as students on each side of the Scotland-England border could find themselves competing for the same university places with grades based on different approaches -- not to mention the UCAS clearing battle that will take place as students with low grades fight to attend university. 

The 'postcode lottery' element is part of a broader issue: how can you determine a fair reward for students finishing their education without the option to moderate exams? 

This is an extraordinary situation, and when regulators, exam boards, and educational professionals have to come up with a nationwide means of issuing grades to thousands upon thousands of students without exams, the lure of relying on computational models is a strong one. 

"When designing algorithms, organizations often have to balance complex and competing aims and harms to different groups of individuals," Camilla Winlo, Director of Consultancy at DQM GRC told ZDNet. "This is complicated, and in the case of the 'exam' results, made even more so because this is the first time in history that exam boards have had to create fair assessment grades for individuals in circumstances like this."

Input the data, generate rapid decisions, and the problem appears to be solved. However, the inputs can bias the outcomes.

Speaking to ZDNet, Alan Gibson, VP EMEA at Alteryx, said that if data introduced to an algorithm includes socio-economic bias from the start, this can "propagate into future decision-making."

"While the statistical model has been built on the characteristics of historically "high-achieving" exam results, the inputs were flawed: deprived schools are more likely to [have] bad past performances due to lack of funding, unlike their affluent counterparts," Gibson commented. "In this instance, use of bias[ed] historical data has resulted in top students in deprived schools being severely downgraded -- and in some cases as extreme as a pass to fail."
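Gibson's point about biased inputs propagating can be illustrated with a minimal sketch. This is not the model any exam board used; it is a hypothetical `standardize` function, invented here, that blends a teacher-assessed mark with a school's historical average -- exactly the kind of design choice that drags outliers toward their school's past:

```python
# Illustrative sketch only (NOT the Ofqual or SQA model): blending a
# teacher-assessed mark with a school's historical mean shows how
# biased historical inputs propagate into individual outcomes.

def standardize(teacher_mark, school_history, weight=0.7):
    """Blend a teacher-assessed mark (0-100) with the school's
    historical mean, weighted toward the history. Hypothetical."""
    historical_mean = sum(school_history) / len(school_history)
    return weight * historical_mean + (1 - weight) * teacher_mark

# A top student (assessed at 90) in a school whose past cohorts
# averaged 45 is pulled far below their assessed performance:
adjusted = standardize(90, [40, 45, 50])
print(adjusted)  # 58.5
```

The individual's own mark contributes only a minority share of the result, so the school's history -- itself shaped by funding and deprivation, as Gibson notes -- dominates the grade.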

"At the end of the day, artificial intelligence is just maths -- nothing more, nothing less," Gibson added. "Despite being accused as such, artificial intelligence doesn't make moral judgments, nor is it inherently biased. Instead, we must look at the biases being held by the historical data, and even by the creators of a model, to truly understand where we might be shaping these outcomes and how to eliminate them."

Winlo also noted that under the UK Information Commissioner's Office (ICO) draft AI auditing framework, fairness requires personal data to be handled "in ways that people would reasonably expect and not use it in ways that have unjustified adverse effects on them." The executive commented:

"Exam boards will have wanted to design a process that was as fair as possible. However, when designing an algorithm intended to smooth out and prevent such risks, it can bring in other risks. In this case, it means that individual marks are not based solely on their own work and effort, but on the work and effort put in by other people that the individual can't influence. 

That particularly disadvantages individuals who are outperforming their peers at lower-performing schools. You could argue that this cohort is particularly deserving of support and particularly undeserving of additional hurdles to succeed."

Update 11.05 am: The UK government has now released documents explaining the reasoning behind the downgrades. Ofqual said that relying on teacher-submitted assessments alone would lead to "implausibly high" national results, and this "optimism" could "undermine" faith in 2020 results. 

The Direct Centre Performance model (DCP) was chosen as the statistical model that "most accurately predicted students' grades in a way that did not systematically affect groups of students with particular protected characteristics." 

"That prediction is based on the historical performance of the school or college in that subject taking into account any changes in the prior attainment of candidates entering this year compared to previous years," the regulator says. "This was fine-tuned to take account of known issues such as centers with small cohorts of students, small-entry subjects, and tiered subjects."
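Ofqual's description -- historical centre performance setting the grade distribution, with students slotted in by rank -- can be sketched in miniature. The function below, `assign_grades`, is a hypothetical simplification loosely modeled on public descriptions of rank-based standardization; the real DCP model involved many more adjustments (prior attainment, small cohorts, tiered subjects):

```python
# Illustrative sketch of rank-based standardization (a toy version,
# NOT the actual DCP model): students keep their teacher-assigned
# rank order, but the grades handed out follow the school's
# historical grade distribution.

def assign_grades(ranked_students, historical_distribution):
    """ranked_students: names ordered best-first by teacher judgment.
    historical_distribution: {grade: fraction of past cohorts}."""
    n = len(ranked_students)
    grades = []
    for grade, fraction in historical_distribution.items():
        grades.extend([grade] * round(fraction * n))
    # Pad or trim so every student receives exactly one grade.
    lowest = list(historical_distribution)[-1]
    grades = grades[:n] + [lowest] * (n - len(grades))
    return dict(zip(ranked_students, grades))

history = {"A": 0.1, "B": 0.3, "C": 0.4, "D": 0.2}  # hypothetical school
cohort = [f"s{i}" for i in range(1, 11)]  # ten students, best first
print(assign_grades(cohort, history))
```

Under this scheme only one A is available, however strong this year's cohort is -- which is why an exceptional student at a historically low-performing centre can be downgraded regardless of individual performance.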

Ofqual denied any evidence of systematic bias. A report on 2020 exam results will be published later this year.

"We know that, just as in any year, some students will be disappointed with their results. Some students may think that, had they taken their exams, they would have achieved higher grades," Ofqual said. "We will never know. But for those students who do wish to improve their grades, there will be an autumn exam series." 

Have a tip? Get in touch securely via WhatsApp | Signal at +447713 025 499, or over at Keybase: charlie0