User Perceptions of Privacy and Helpfulness in LLM Responses to Privacy-Sensitive Scenarios
2510.20721v1
cs.CL, cs.AI, cs.HC
2025-10-25
Авторы:
Xiaoyuan Wu, Roshni Kaushik, Wenkai Li, Lujo Bauer, Koichi Onoue
Abstract
Large language models (LLMs) have seen rapid adoption for tasks such as
drafting emails, summarizing meetings, and answering health questions. In such
uses, users may need to share private information (e.g., health records,
contact details). To evaluate LLMs' ability to identify and redact such private
information, prior work developed benchmarks (e.g., ConfAIde, PrivacyLens) with
real-life scenarios. Using these benchmarks, researchers have found that LLMs
sometimes fail to keep secrets private when responding to complex tasks (e.g.,
leaking employee salaries in meeting summaries). However, these evaluations
rely on LLMs (proxy LLMs) to gauge compliance with privacy norms, overlooking
real users' perceptions. Moreover, prior work primarily focused on the
privacy-preservation quality of responses, without investigating nuanced
differences in helpfulness. To understand how users perceive the
privacy-preservation quality and helpfulness of LLM responses to
privacy-sensitive scenarios, we conducted a user study with 94 participants
using 90 scenarios from PrivacyLens. We found that, when evaluating identical
responses to the same scenario, users showed low agreement with each other on
the privacy-preservation quality and helpfulness of the LLM response. Further,
we found high agreement among five proxy LLMs, while each individual LLM had
low correlation with users' evaluations. These results indicate that the
privacy and helpfulness of LLM responses are often specific to individuals, and
proxy LLMs are poor estimates of how real users would perceive these responses
in privacy-sensitive scenarios. Our results suggest the need to conduct
user-centered studies on measuring LLMs' ability to help users while preserving
privacy. Additionally, future research could investigate ways to improve the
alignment between proxy LLMs and users for better estimation of users'
perceived privacy and utility.
Ссылки и действия
Дополнительные ресурсы: