PRSM: A Measure to Evaluate CLIP's Robustness Against Paraphrases

2511.11141v1 cs.CL, cs.CY, cs.LG 2025-11-18
Authors:

Udo Schlegel, Franziska Weeber, Jian Lan, Thomas Seidl

Abstract

Contrastive Language-Image Pre-training (CLIP) is a widely used multimodal model that aligns text and image representations through large-scale training. While it performs strongly on zero-shot and few-shot tasks, its robustness to linguistic variation, particularly paraphrasing, remains underexplored. Paraphrase robustness is essential for reliable deployment, especially in socially sensitive contexts where inconsistent representations can amplify demographic biases. In this paper, we introduce the Paraphrase Ranking Stability Metric (PRSM), a novel measure for quantifying CLIP's sensitivity to paraphrased queries. Using the Social Counterfactuals dataset, a benchmark designed to reveal social and demographic biases, we empirically assess CLIP's stability under paraphrastic variation, examine the interaction between paraphrase robustness and gender, and discuss implications for fairness and equitable deployment of multimodal systems. Our analysis reveals that robustness varies across paraphrasing strategies, with subtle yet consistent differences observed between male- and female-associated queries.

