Hire Your Anthropologist! Rethinking Culture Benchmarks Through an Anthropological Lens
2510.05931v1
cs.CL, cs.CY
2025-10-09
Авторы:
Mai AlKhamissi, Yunze Xiao, Badr AlKhamissi, Mona Diab
Abstract
Cultural evaluation of large language models has become increasingly
important, yet current benchmarks often reduce culture to static facts or
homogeneous values. This view conflicts with anthropological accounts that
emphasize culture as dynamic, historically situated, and enacted in practice.
To analyze this gap, we introduce a four-part framework that categorizes how
benchmarks frame culture, such as knowledge, preference, performance, or bias.
Using this lens, we qualitatively examine 20 cultural benchmarks and identify
six recurring methodological issues, including treating countries as cultures,
overlooking within-culture diversity, and relying on oversimplified survey
formats. Drawing on established anthropological methods, we propose concrete
improvements: incorporating real-world narratives and scenarios, involving
cultural communities in design and validation, and evaluating models in context
rather than isolation. Our aim is to guide the development of cultural
benchmarks that go beyond static recall tasks and more accurately capture the
responses of the models to complex cultural situations.
Ссылки и действия
Дополнительные ресурсы: