Reliable Inference in Edge-Cloud Model Cascades via Conformal Alignment
2510.17543v1
cs.LG, eess.SP, stat.ML
2025-10-22
Авторы:
Jiayi Huang, Sangwoo Park, Nicola Paoletti, Osvaldo Simeone
Abstract
Edge intelligence enables low-latency inference via compact on-device models,
but assuring reliability remains challenging. We study edge-cloud cascades that
must preserve conditional coverage: whenever the edge returns a prediction set,
it should contain the true label with a user-specified probability, as if
produced by the cloud model. We formalize conditional coverage with respect to
the cloud predictive distribution, and introduce a conformal alignment-based
(CAb) cascading mechanism that certifies this property with user control over
the risk level. Our method casts escalation from edge to cloud models as a
multiple-hypothesis testing (MHT) problem, tailoring conformal alignment (CA)
to select which inputs can be safely handled at the edge. The proposed CAb
model cascading method yields statistical guarantees on the average fraction of
edge decisions that satisfy cloud-level conditional coverage. The procedure
applies to arbitrary edge prediction sets, including variants of conformal
prediction (CP), and exposes a tunable trade-off among coverage, deferral rate,
and set size. Experiments on CIFAR-100 image classification and the TeleQnA
question-answering (QA) benchmark show that the proposed CAb cascade maintains
the target conditional coverage for edge predictions while substantially
reducing offloading to the cloud and incurring modest increases in
prediction-set size.