Tuning for TraceTarnish: Techniques, Trends, and Testing Tangible Traits

2512.03465v1 cs.CR, cs.CL, cs.IR 2025-12-05

Authors:

Robert Dilworth

Abstract

In this study, we more rigorously evaluated our attack script $\textit{TraceTarnish}$, which leverages adversarial stylometry principles to anonymize the authorship of text-based messages. To ensure the efficacy and utility of our attack, we sourced, processed, and analyzed Reddit comments--comments that were later alchemized into $\textit{TraceTarnish}$ data--to gain valuable insights. The transformed $\textit{TraceTarnish}$ data was then further augmented by $\textit{StyloMetrix}$ to manufacture stylometric features--features that were culled using the Information Gain criterion, leaving only the most informative, predictive, and discriminative ones. Our results show that function words and function word types ($L\_FUNC\_A$ and $L\_FUNC\_T$); content words and content word types ($L\_CONT\_A$ and $L\_CONT\_T$); and the Type-Token Ratio ($ST\_TYPE\_TOKEN\_RATIO\_LEMMAS$) yielded significant Information Gain readings. The identified stylometric cues--function-word frequencies, content-word distributions, and the Type-Token Ratio--serve as reliable indicators of compromise (IoCs), revealing when a text has been deliberately altered to mask its true author. These features could likewise function as forensic beacons, alerting defenders to the presence of an adversarial stylometry attack; granted, in the absence of the original message, this signal may go largely unnoticed, as it appears to depend on a pre- and post-transformation comparison. "In trying to erase a trace, you often imprint a larger one." Armed with this understanding, we framed $\textit{TraceTarnish}$'s operations and outputs around these five isolated features, using them to conceptualize and implement enhancements that further strengthen the attack.
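The abstract names two quantities without defining them: the Type-Token Ratio and the Information Gain criterion used to rank features. The sketch below illustrates both on toy data and is not the paper's pipeline. It assumes whitespace tokens instead of the lemmas that $ST\_TYPE\_TOKEN\_RATIO\_LEMMAS$ is computed on, a single hand-picked threshold for the split, and four invented sentences labeled "original" or "transformed". The function names are hypothetical.

```python
import math
from collections import Counter


def type_token_ratio(text: str) -> float:
    """Distinct tokens divided by total tokens (lowercased whitespace tokens)."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def entropy(labels) -> float:
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels, threshold: float) -> float:
    """Entropy reduction from splitting the labels on feature <= threshold."""
    left = [y for x, y in zip(feature_values, labels) if x <= threshold]
    right = [y for x, y in zip(feature_values, labels) if x > threshold]
    total = len(labels)
    remainder = sum(len(part) / total * entropy(part) for part in (left, right) if part)
    return entropy(labels) - remainder


# Toy data, invented for illustration only; not drawn from the paper.
texts = [
    "the cat sat on the mat and the dog sat on the rug",
    "we went to the park and then we went to the shop",
    "a feline rested upon one woven floor covering",
    "our group visited green spaces before buying goods",
]
labels = ["original", "original", "transformed", "transformed"]

ttr = [type_token_ratio(t) for t in texts]
gain = information_gain(ttr, labels, threshold=0.9)  # threshold is an assumption
```

A feature whose split separates the two classes cleanly has a gain equal to the full label entropy, while a feature that carries no class information has a gain of zero. Ranking features by this score and keeping the top ones is the general idea behind Information Gain feature selection.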
