On the Detectability of LLM-Generated Text: What Exactly Is LLM-Generated Text?

2510.20810v1 cs.CL, cs.AI, cs.CY, cs.LG 2025-10-25

Authors:

Mingmeng Geng, Thierry Poibeau

Abstract

With the widespread use of large language models (LLMs), many researchers have turned their attention to detecting text generated by them. However, there is no consistent or precise definition of their target, namely "LLM-generated text". Differences in usage scenarios and the diversity of LLMs further increase the difficulty of detection. What is commonly treated as the detection target usually represents only a subset of the text that LLMs can potentially produce. Human edits to LLM outputs, together with the subtle influences that LLMs exert on their users, are blurring the line between LLM-generated and human-written text. Existing benchmarks and evaluation approaches do not adequately address the various conditions in real-world detector applications. As a result, the numerical results of detectors are often misunderstood, and their significance is diminishing. Detectors remain useful under specific conditions, but their results should be interpreted only as references rather than decisive indicators.


