spectator.com web signal

Durham Professor Quits to Expose AI Essay Cheating Gap

openai anthropic education ai-cheating higher-education

Key insights

  • Andy Hamilton has been at Durham since 1991 and has chaired the Philosophy Board of Examiners since 2016, which gives his resignation significant institutional weight.
  • Turnitin cannot reliably detect AI-generated essays, and AI companies are actively removing the remaining detectable signals.
  • The at-home exam model introduced during COVID is the root structural cause Hamilton identifies, not student ethics alone.

Why this matters

AI practitioners and founders building LLMs or AI writing tools are now directly implicated in undermining academic credentialing systems. Hamilton names ChatGPT and Claude explicitly as the instruments of cheating, so OpenAI and Anthropic face growing pressure to address downstream misuse in education. The failure of Turnitin-style detection signals a broader market reality: post-hoc AI detection is a losing arms race as long as model providers keep removing detectable artefacts, which matters for any sector relying on AI-output detection for compliance or trust. Universities are among the first large institutional domains forced to redesign a core process (assessment architecture) rather than add AI detection layers on top, a precedent likely to repeat across legal, journalistic, and professional certification contexts.

Summary

Andy Hamilton, Professor of Philosophy at Durham University and Chair of the Board of Examiners for Philosophy since 2016, resigned that post to publicly call out what he says is an active institutional failure: students using professional-grade AI tools such as ChatGPT or Claude are earning first-class results while honest students are penalised by comparison, and Durham's leadership is doing nothing meaningful about it. The structural problem he identifies is the COVID-era switch from supervised sit-down exams to at-home computer assessments, which removed the conditions under which academic integrity could be enforced. Turnitin and similar plagiarism detection tools cannot reliably identify AI-generated essays, and Hamilton notes that detection is getting harder over time: the detectable "tells", such as hallucinated references, are being "aggressively stamped out" by the AI companies themselves. In effect, institutions such as Durham and detection vendors such as Turnitin are left facing a widening gap that they are not closing.

  • Affluent students with access to paid, professional AI versions gain a structural advantage over peers using free tools or none at all.
  • Hamilton argues it is now impossible to mark essays fairly under the at-home system in the era of ChatGPT.
  • His proposed fix is straightforward: return to sit-down examinations as the primary assessment method, with vivas and suspension of anonymous marking as interim measures.

The deeper issue is not detection technology but assessment architecture: universities adopted a model during COVID that has no viable integrity mechanism in an AI-saturated environment.

Potential risks and opportunities

Risks

  • If Durham and peer institutions do not return to supervised exams, degree classifications from 2020 onward face credibility challenges with graduate employers who may begin discounting humanities credentials.
  • Turnitin and other AI-detection vendors risk growing commercial irrelevance as AI companies deliberately close detectable signal gaps, potentially triggering churn among universities that have paid for detection licences.
  • Honest students at Durham and similar institutions face a compounding disadvantage each assessment cycle the at-home model persists, creating potential grounds for complaints to the Office for Students or legal challenges over fair assessment conditions.

Opportunities

  • EdTech vendors offering secure remote proctoring (Honorlock, Proctorio, Respondus) gain a renewed opening to pitch supervised digital exam infrastructure as the middle path between in-person halls and unmonitored at-home essays.
  • Oral examination (viva) software and scheduling platforms have an underserved market in UK higher education if Hamilton's recommendation gains traction across philosophy and humanities departments.
  • Anthropic and OpenAI face reputational pressure but also a product opportunity: building verifiable student-mode or academic-integrity variants of Claude and ChatGPT that institutions could certify as assessment-safe, differentiating from uncontrolled consumer tiers.

What we don't know yet

  • Whether Durham University's leadership has formally responded to Hamilton's resignation or committed to any policy change as of June 2026.
  • Which specific professional AI tiers (ChatGPT Plus, Claude Pro) are most associated with the grade inflation Hamilton describes, and whether pricing data supports his class-disparity claim.
  • Whether any UK higher education regulator (such as the Office for Students) is reviewing at-home exam policies sector-wide, not just at Durham.