Evaluating Perspectival Biases in Cross-Modal Retrieval
2510.26861v1
cs.IR, cs.CL, H.3.3; I.2.7; I.2.10
2025-11-04
Authors:
Teerapol Saengsukhiran, Peerawat Chomphooyod, Narabodee Rodjananant, Chompakorn Chaksangchaichot, Patawee Prakrankamanant, Witthawin Sripheanpol, Pak Lovichit, Sarana Nutanong, Ekapol Chuangsuwanich
Abstract
Multimodal retrieval systems are expected to operate in a semantic space,
agnostic to the language or cultural origin of the query. In practice, however,
retrieval outcomes systematically reflect perspectival biases: deviations
shaped by linguistic prevalence and cultural associations. We study two such
biases. First, prevalence bias refers to the tendency to favor entries from
prevalent languages over semantically faithful entries in image-to-text
retrieval. Second, association bias refers to the tendency to favor images
culturally associated with the query over semantically correct ones in
text-to-image retrieval. Results show that explicit alignment is a more
effective strategy for mitigating prevalence bias. However, association bias
remains a distinct and more challenging problem. These findings suggest that
achieving truly equitable multimodal systems requires targeted strategies
beyond simple data scaling and that bias arising from cultural association may
be treated as a more challenging problem than one arising from linguistic
prevalence.
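
To make the definition of prevalence bias concrete, the following is a minimal sketch of one way it could be counted in image-to-text retrieval. It is an illustration only and not the metric used in the paper: the `Candidate` record, its `is_faithful` gold label, the `score` field, and the function `prevalence_bias_rate` are all hypothetical names introduced here. The sketch counts how often the top-ranked text is an unfaithful entry in a prevalent language even though a faithful entry in a less prevalent language was available.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One retrievable text entry for an image query (hypothetical schema)."""
    lang: str          # language code of the entry, e.g. "en" or "th"
    is_faithful: bool  # assumed gold label: entry matches the image semantics
    score: float       # assumed similarity score produced by some retriever


def prevalence_bias_rate(queries, prevalent_langs):
    """Fraction of eligible queries whose top-1 result is an unfaithful
    prevalent-language entry.

    A query is eligible only if it has at least one faithful candidate in a
    non-prevalent language, so a better alternative actually existed.
    """
    eligible = 0
    biased = 0
    for candidates in queries:
        has_faithful_alternative = any(
            c.is_faithful and c.lang not in prevalent_langs for c in candidates
        )
        if not has_faithful_alternative:
            continue
        eligible += 1
        top = max(candidates, key=lambda c: c.score)
        if top.lang in prevalent_langs and not top.is_faithful:
            biased += 1
    return biased / eligible if eligible else 0.0


# Toy data (made up for illustration, not taken from the paper).
queries = [
    # Unfaithful English entry outranks the faithful Thai entry: biased.
    [Candidate("en", False, 0.9), Candidate("th", True, 0.8)],
    # Faithful Thai entry is ranked first: not biased.
    [Candidate("en", False, 0.5), Candidate("th", True, 0.7)],
]
rate = prevalence_bias_rate(queries, prevalent_langs={"en"})
print(rate)
```

An analogous count for association bias would swap the roles: the query is text, the candidates are images, and the distractor is an image culturally associated with the query rather than one from a prevalent language.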