LLMs can hide text in other text of the same length
2510.20075v2
cs.AI, cs.CL, cs.CR, cs.LG
2025-10-27
Авторы:
Antonio Norelli, Michael Bronstein
Abstract
A meaningful text can be hidden inside another, completely different yet
still coherent and plausible, text of the same length. For example, a tweet
containing a harsh political critique could be embedded in a tweet that
celebrates the same political leader, or an ordinary product review could
conceal a secret manuscript. This uncanny state of affairs is now possible
thanks to Large Language Models, and in this paper we present a simple and
efficient protocol to achieve it. We show that even modest 8-billion-parameter
open-source LLMs are sufficient to obtain high-quality results, and a message
as long as this abstract can be encoded and decoded locally on a laptop in
seconds. The existence of such a protocol demonstrates a radical decoupling of
text from authorial intent, further eroding trust in written communication,
already shaken by the rise of LLM chatbots. We illustrate this with a concrete
scenario: a company could covertly deploy an unfiltered LLM by encoding its
answers within the compliant responses of a safe model. This possibility raises
urgent questions for AI safety and challenges our understanding of what it
means for a Large Language Model to know something.