Ofsted is to deploy senior inspectors to “shadow” inspection teams in a bid to test the consistency of judgments under its new report cards system.

The watchdog has confirmed it will introduce extra quality assurance (QA) visits focused specifically on consistency, validity and reliability when inspections resume in November. It follows concerns that the introduction of more inspection areas and grades will make judgments less reliable. Academics have already expressed concerns about reliability.

Under existing QA checks, senior inspectors shadow less experienced team members during school visits to observe them working, provide guidance and report back to line managers. But Ofsted plans to build on this by carrying out more shadow inspections to “assess consistency” and ensure grades are “as valid and as reliable as possible”. The watchdog has not said how many more visits it will carry out each year.

Writing for Schools Week, Rory Gribbell, Ofsted’s strategy director, and Dr Verena Braehler, its research and evaluation director, said: “We also want to know how consistent our inspections are, through testing.”

Critics have welcomed the move as a “good starting point”, but have called for regular, “transparent” reporting of the results. They have also called for oversight from an independent, external body.

What is changing?

The new measures essentially mean a ramping-up of Ofsted’s QA process. “The senior inspector’s role will be specifically to ensure the consistency of inspection outcomes as part of a larger process,” said Gribbell and Braehler. “After each inspection, any initial differences between senior inspectors and inspection teams will be analysed by our research and evaluation team.”

Feedback will then be considered alongside “wider consistency activity”, which will include inspectors being given simulations of real-world inspections to evaluate their training and judgments.
Inspection toolkits or training could then be tweaked.

Will shadow visits really show inconsistencies?

Gribbell and Braehler said that during QA visits, senior inspectors will “advise and guide the inspection team to the right result” before reporting back to Ofsted about areas where they reached different conclusions.

But Dr Tim Leunig, a professor at the London School of Economics and former DfE policy adviser, is concerned junior inspectors will simply “follow” the senior inspector’s judgment. He said a fairer measure would be if junior and senior inspectors filed separate judgments “without having seen the other verdict, and both are revealed together”.

Transparency is key

University College London academics John Jerrim and Dr Sam Sims, and the University of Southampton’s Professor Christian Bokhove, have long argued for greater scrutiny of inspection reliability. They described Ofsted’s move to carry out more shadow inspections as “a real positive”. But while “a good starting point”, they warned the watchdog was setting “a low bar”.

They fear the process is “likely to only show up really quite major instances of inconsistencies”, analogous to “when two referees disagree on whether a player is standing three yards offside”. More work would be needed to prove inspections had a high degree of consistency and reliability.

Crucially, they said “results of research on consistency [must be] transparently reported”. They also called for a “close external overview” of this work, “ideally…conducted by an independent organisation”.

Dearth of reliability evaluations

But Ofsted has “to start somewhere”, they said, adding that “for 30 years we have had pretty much nothing else”. The last evaluation of Ofsted grade reliability was published in 2017, before the latest inspection framework was introduced.
It found two inspectors tended to agree on which grade to award a school, but the report only looked at “short” inspections of schools already rated ‘good’ or better.

Other studies have shown wider discrepancies. One by Bokhove, Jerrim and Sims in 2023 found that primary schools assigned a female lead inspector were around one third more likely to receive an ‘inadequate’ judgment.

‘Not a one-off’

Jerrim and Bokhove stressed Ofsted’s new work on consistency must not be “one and done” and “needs to be a much longer-term endeavour.”

Ofsted has said results of the new consistency work will be published next year. It is not clear exactly when, but the watchdog said it was keen to make any necessary updates to the inspection framework ahead of the 2026-27 academic year.

But Gribbell and Braehler insisted the work would not be a “one-off exercise”, and that regular reports will be published.