Accelerating Physical Property Reasoning for Augmented Visual Cognition
2511.03126v1
cs.CV, cs.HC
2025-11-07
Авторы:
Hongbo Lan, Zhenlin An, Haoyu Li, Vaibhav Singh, Longfei Shangguan
Abstract
This paper introduces \sysname, a system that accelerates vision-guided
physical property reasoning to enable augmented visual cognition. \sysname
minimizes the run-time latency of this reasoning pipeline through a combination
of both algorithmic and systematic optimizations, including rapid geometric 3D
reconstruction, efficient semantic feature fusion, and parallel view encoding.
Through these simple yet effective optimizations, \sysname reduces the
end-to-end latency of this reasoning pipeline from 10--20 minutes to less than
6 seconds. A head-to-head comparison on the ABO dataset shows that \sysname
achieves this 62.9$\times$--287.2$\times$ speedup while not only reaching
on-par (and sometimes slightly better) object-level physical property
estimation accuracy(e.g. mass), but also demonstrating superior performance in
material segmentation and voxel-level inference than two SOTA baselines. We
further combine gaze-tracking with \sysname to localize the object of interest
in cluttered, real-world environments, streamlining the physical property
reasoning on smart glasses. The case study with Meta Aria Glasses conducted at
an IKEA furniture store demonstrates that \sysname achives consistently high
performance compared to controlled captures, providing robust property
estimations even with fewer views in real-world scenarios.
Ссылки и действия
Дополнительные ресурсы: