Investigating the Association Between Text-Based Indications of Foodborne Illness from Yelp Reviews and New York City Health Inspection Outcomes (2023)
2510.16334v1
cs.IR, cs.CL
2025-10-22
Авторы:
Eden Shaveet, Crystal Su, Daniel Hsu, Luis Gravano
Abstract
Foodborne illnesses are gastrointestinal conditions caused by consuming
contaminated food. Restaurants are critical venues to investigate outbreaks
because they share sourcing, preparation, and distribution of foods. Public
reporting of illness via formal channels is limited, whereas social media
platforms host abundant user-generated content that can provide timely public
health signals. This paper analyzes signals from Yelp reviews produced by a
Hierarchical Sigmoid Attention Network (HSAN) classifier and compares them with
official restaurant inspection outcomes issued by the New York City Department
of Health and Mental Hygiene (NYC DOHMH) in 2023. We evaluate correlations at
the Census tract level, compare distributions of HSAN scores by prevalence of
C-graded restaurants, and map spatial patterns across NYC. We find minimal
correlation between HSAN signals and inspection scores at the tract level and
no significant differences by number of C-graded restaurants. We discuss
implications and outline next steps toward address-level analyses.
Ссылки и действия
Дополнительные ресурсы: