CHORD: Customizing Hybrid-precision On-device Model for Sequential Recommendation with Device-cloud Collaboration
2510.03038v1
cs.LG, cs.AI, cs.IR
2025-10-07
Authors:
Tianqi Liu, Kairui Fu, Shengyu Zhang, Wenyan Fan, Zhaocheng Du, Jieming Zhu, Fan Wu, Fei Wu
Abstract
With the advancement of mobile device capabilities, deploying reranking
models directly on devices has become feasible, enabling real-time contextual
recommendations. When migrating models from cloud to devices, resource
heterogeneity inevitably necessitates model compression. Recent quantization
methods show promise for efficient deployment, yet they overlook
device-specific user interests, resulting in compromised recommendation
accuracy. While on-device finetuning captures personalized user preferences, it
imposes an additional computational burden through local retraining. To address
these challenges, we propose a framework for Customizing Hybrid-precision
On-device model for sequential Recommendation with Device-cloud collaboration
(CHORD), leveraging
channel-wise mixed-precision quantization to simultaneously achieve
personalization and resource-adaptive deployment. CHORD distributes randomly
initialized models across heterogeneous devices and identifies user-specific
critical parameters through auxiliary hypernetwork modules on the cloud. Our
parameter sensitivity analysis operates across multiple granularities (layer,
filter, and element levels), enabling precise mapping from user profiles to
quantization strategy. Through on-device mixed-precision quantization, CHORD
delivers dynamic model adaptation and accelerated inference without
backpropagation, eliminating costly retraining cycles. We minimize
communication overhead by encoding quantization strategies using only 2 bits
per channel instead of 32-bit weights. Experiments on three real-world datasets
with two popular backbones (SASRec and Caser) demonstrate the accuracy,
efficiency, and adaptivity of CHORD.
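
The 2-bit-per-channel strategy encoding and the backpropagation-free on-device quantization described above can be illustrated with a minimal sketch. It is not the authors' implementation: the code-to-bit-width table (CODE_TO_BITS), the symmetric uniform quantizer, and all function names are assumptions chosen for illustration, and the abstract does not specify which bit-widths the four codes select.

```python
from typing import List, Tuple

# Hypothetical mapping from a 2-bit strategy code to a bit-width.
# The abstract does not state the actual bit-widths; this is an assumption.
CODE_TO_BITS = {0: 2, 1: 4, 2: 8, 3: 16}


def pack_codes(codes: List[int]) -> bytes:
    """Pack one 2-bit code per channel, four codes per byte."""
    out = bytearray((len(codes) + 3) // 4)
    for i, code in enumerate(codes):
        if not 0 <= code <= 3:
            raise ValueError("each code must fit in 2 bits")
        out[i // 4] |= code << (2 * (i % 4))
    return bytes(out)


def unpack_codes(payload: bytes, num_channels: int) -> List[int]:
    """Recover the per-channel 2-bit codes from the packed payload."""
    return [(payload[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(num_channels)]


def quantize_channel(weights: List[float], bits: int) -> List[float]:
    """Symmetric uniform quantize-dequantize of one channel (assumed scheme)."""
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    levels = 2 ** (bits - 1) - 1  # largest positive integer level
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]


def apply_strategy(channels: List[List[float]], payload: bytes) -> List[List[float]]:
    """Quantize each channel at the precision its code selects; no gradients needed."""
    codes = unpack_codes(payload, len(channels))
    return [quantize_channel(ch, CODE_TO_BITS[c]) for ch, c in zip(channels, codes)]


def communication_bits(num_channels: int, weights_per_channel: int) -> Tuple[int, int]:
    """Bits needed to send the strategy versus sending full 32-bit weights."""
    strategy = 2 * num_channels
    full_weights = 32 * num_channels * weights_per_channel
    return strategy, full_weights
```

Under these assumptions, a layer with 64 channels of 128 weights each needs 128 bits for the strategy, compared with 262,144 bits for the full-precision weights, which is the source of the communication saving the abstract describes.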