On the Generalizability of AI-Generated Text Detection

dc.contributor.author: David, Amir
dc.date.accessioned: 2026-01-20T14:43:32Z
dc.date.available: 2026-01-20T14:43:32Z
dc.date.issued: 2026-01-20
dc.date.submitted: 2026-01-14
dc.description.abstract: As large language models (LLMs) become ubiquitous, reliably distinguishing their outputs from human writing is critical for academic integrity, content moderation, and preventing model collapse from synthetic training data. This thesis examines the generalizability of LLM-text detectors across evolving model families and domains. We compiled a comprehensive evaluation dataset from commonly used human corpora and generated corresponding samples using recent OpenAI and Anthropic models spanning multiple generations. Comparing the state-of-the-art zero-shot detector (Binoculars) against supervised RoBERTa/DeBERTa classifiers, we arrive at four main findings. First, zero-shot detection fails on newer models. Second, supervised detectors maintain high TPR in-distribution but exhibit asymmetric cross-generation transfer. Third, commonly reported metrics such as AUROC can obscure poor performance at deployment-relevant thresholds: detectors achieving high AUROC yield near-zero TPR at low FPR, and existing low-FPR evaluations often lack statistical reliability due to small sample sizes. Fourth, through tail-focused training and calibration, we reduce FPR by up to 4× (from ~1% to ~0.25%) while maintaining 90% TPR. Our results suggest that robust detection requires continually re-calibrated, model-aware pipelines rather than static universal detectors.
dc.identifier.uri: https://hdl.handle.net/10012/22854
dc.language.iso: en
dc.pending: false
dc.publisher: University of Waterloo [en]
dc.subject: artificial intelligence
dc.subject: deep learning
dc.subject: large language model
dc.subject: OpenAI
dc.subject: ChatGPT
dc.subject: Anthropic
dc.subject: Claude
dc.subject: detection
dc.subject: robustness
dc.subject: SOCIAL SCIENCES::Statistics, computer and systems science::Informatics, computer and systems science::Computer and systems science
dc.subject: TECHNOLOGY::Information technology::Computer science::Software engineering
dc.subject: TECHNOLOGY::Information technology::Computer science
dc.subject: machine learning
dc.subject: zero-shot
dc.subject: supervised learning
dc.subject: state of the art
dc.subject: llm
dc.subject: bert
dc.title: On the Generalizability of AI-Generated Text Detection
dc.type: Master Thesis
uws-etd.degree: Master of Mathematics
uws-etd.degree.department: David R. Cheriton School of Computer Science
uws-etd.degree.discipline: Computer Science
uws-etd.degree.grantor: University of Waterloo [en]
uws-etd.embargo.terms: 0
uws.contributor.advisor: Kerschbaum, Florian
uws.contributor.affiliation1: Faculty of Mathematics
uws.peerReviewStatus: Unreviewed [en]
uws.published.city: Waterloo [en]
uws.published.country: Canada [en]
uws.published.province: Ontario [en]
uws.scholarLevel: Graduate [en]
uws.typeOfResource: Text [en]

Files

Original bundle

Name: David_Amir.pdf
Size: 476.89 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon to submission