Dedelayed: Deleting remote inference delay via on-device correction
2510.13714v1
eess.IV, cs.AI, cs.CV, cs.LG
2025-10-17
Авторы:
Dan Jacobellis, Mateen Ulhaq, Fabien Racapé, Hyomin Choi, Neeraja J. Yadwadkar
Abstract
Remote inference allows lightweight devices to leverage powerful cloud
models. However, communication network latency makes predictions stale and
unsuitable for real-time tasks. To address this, we introduce Dedelayed, a
delay-corrective method that mitigates arbitrary remote inference delays,
allowing the local device to produce low-latency outputs in real time. Our
method employs a lightweight local model that processes the current frame and
fuses in features that a heavyweight remote model computes from past frames. On
video from the BDD100K driving dataset, Dedelayed improves semantic
segmentation accuracy over the stronger of the local-only and remote-only
baselines across all realistic communication network delays beyond 33 ms.
Without incurring additional delay, it improves accuracy by 6.4 mIoU compared
to fully local inference and 9.8 mIoU compared to remote inference, for a
round-trip delay of 100 ms. The advantage grows under longer delays and
higher-motion scenes, as delay-mitigated split inference sustains accuracy more
effectively, providing clear advantages for real-time tasks that must remain
aligned with the current world state.