Quaternion Approximation Networks for Enhanced Image Classification and Oriented Object Detection

2509.05512v1 cs.CV, cs.RO 2025-09-10

Authors:

Bryce Grant, Peng Wang

Summary

## Context

Modern image classification and object detection tasks face significant challenges due to the need for rotation equivariance and efficient computation. Existing convolutional neural networks (CNNs) often struggle to maintain geometric invariance to rotations, while traditional quaternion neural networks (QNNs) are computationally intensive and lack practical implementations. This paper addresses these issues by introducing Quaternion Approximation Networks (QUAN), a framework that combines the benefits of quaternion algebra with real-valued operations for efficient, rotation-equivariant processing of image data.

## Method

QUAN approximates quaternion convolutions using Hamilton product decomposition. Instead of operating entirely in the quaternion domain, the network uses real-valued matrices to represent quaternion components. This approach preserves rotation equivariance while reducing computational overhead. Independent Quaternion Batch Normalization (IQBN) is introduced to stabilize the training of quaternion-based layers. Spatial attention mechanisms are also extended to quaternion operations, improving the model's ability to focus on relevant features. The framework is implemented with custom CUDA kernels.

## Results

QUAN is evaluated on CIFAR-10, CIFAR-100, and ImageNet for classification, and on COCO and DOTA for object detection. Compared to traditional CNNs and other quaternion-based models, QUAN achieves higher classification accuracy with fewer parameters and faster convergence. For object detection, it achieves state-of-the-art (SOTA) performance among quaternion CNNs, showing that it handles rotation-sensitive tasks efficiently. The performance is attributed to preserving geometric properties while maintaining computational efficiency.
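The summary above names Independent Quaternion Batch Normalization (IQBN) but does not spell out how it works. The following is a minimal sketch under one loudly stated assumption: that "independent" means each of the four quaternion components is normalized separately over the batch, as in ordinary batch normalization. The function name `iqbn_sketch`, the `eps` value, and the omission of learnable scale and shift parameters are all illustrative choices, not details taken from the paper.

```python
def iqbn_sketch(batch, eps=1e-5):
    """Normalize each quaternion component (r, i, j, k) on its own.

    ASSUMPTION: per-component statistics; this is an illustration of the
    idea, not the authors' implementation. `batch` is a list of 4-tuples.
    """
    # Regroup the batch into four lists, one per quaternion component.
    components = list(zip(*batch))
    normalized = []
    for values in components:
        n = len(values)
        mean = sum(values) / n
        # Biased (population) variance, as used by standard batch norm.
        var = sum((v - mean) ** 2 for v in values) / n
        scale = (var + eps) ** -0.5
        normalized.append([(v - mean) * scale for v in values])
    # Regroup back into a list of quaternions.
    return [tuple(q) for q in zip(*normalized)]


batch = [(1.0, 10.0, -2.0, 0.5),
         (3.0, 20.0, 0.0, 0.5),
         (5.0, 30.0, 2.0, 0.5)]
out = iqbn_sketch(batch)
```

In this sketch each component ends up with roughly zero mean and unit variance regardless of the scale of the other three, and a constant component (the last one here) maps to zeros.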
## Significance

QUAN has potential across multiple domains. In robotics, its rotation-aware perception suits tasks such as autonomous navigation and object recognition. In other fields, its efficient architecture and handling of geometric transformations offer an advantage over conventional models. The framework's modular design and custom CUDA kernels support scalability and a wide range of real-world problems, including those running on resource-constrained systems.

## Conclusions

QUAN advances the state of the art in quaternion neural networks with an approximation approach that combines the benefits of quaternion algebra with real-valued operations. Its performance in image classification and object detection, together with its efficient implementation, makes it a promising solution for rotation-equivariant tasks. Future work will focus on extending QUAN to multi-modal data fusion and integrating it into larger modular frameworks for broader real-world applications.

Abstract

This paper introduces Quaternion Approximate Networks (QUAN), a novel deep learning framework that leverages quaternion algebra for rotation-equivariant image classification and object detection. Unlike conventional quaternion neural networks, which attempt to operate entirely in the quaternion domain, QUAN approximates quaternion convolution through Hamilton product decomposition using real-valued operations. This approach preserves geometric properties while enabling efficient implementation with custom CUDA kernels. We introduce Independent Quaternion Batch Normalization (IQBN) for training stability and extend quaternion operations to spatial attention mechanisms. QUAN is evaluated on image classification (CIFAR-10/100, ImageNet), object detection (COCO, DOTA), and robotic perception tasks. In classification tasks, QUAN achieves higher accuracy with fewer parameters and faster convergence compared to existing convolution and quaternion-based models. For object detection, QUAN demonstrates improved parameter efficiency and rotation handling over standard Convolutional Neural Networks (CNNs) while establishing the SOTA for quaternion CNNs in this downstream task. These results highlight its potential for deployment in resource-constrained robotic systems requiring rotation-aware perception and for application in other domains.
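To illustrate what "Hamilton product decomposition using real-valued operations" can mean, here is a minimal sketch of the standard identity behind it: multiplying two quaternions is the same as multiplying a real 4x4 matrix, built from the first quaternion, by the real 4-vector of the second. The function names (`hamilton_product`, `left_matrix`, `matvec`) are illustrative. How QUAN arranges these terms inside a convolution layer or a CUDA kernel is not described in this abstract, so nothing here should be read as the authors' implementation.

```python
def hamilton_product(q, p):
    """Hamilton product of quaternions q = (a, b, c, d) and p = (e, f, g, h)."""
    a, b, c, d = q
    e, f, g, h = p
    return (
        a * e - b * f - c * g - d * h,  # real part
        a * f + b * e + c * h - d * g,  # i component
        a * g - b * h + c * e + d * f,  # j component
        a * h + b * g - c * f + d * e,  # k component
    )


def left_matrix(q):
    """Real 4x4 matrix M(q) such that M(q) @ p equals the Hamilton product q * p."""
    a, b, c, d = q
    return [
        [a, -b, -c, -d],
        [b,  a, -d,  c],
        [c,  d,  a, -b],
        [d, -c,  b,  a],
    ]


def matvec(m, v):
    """Plain real-valued matrix-vector product."""
    return tuple(sum(w * x for w, x in zip(row, v)) for row in m)


q = (1, 2, 3, 4)
p = (5, 6, 7, 8)
direct = hamilton_product(q, p)          # (-60, 12, 30, 24)
via_matrix = matvec(left_matrix(q), p)   # same result, real arithmetic only
```

Because the matrix form uses only real multiplications and additions, a quaternion-weighted layer can in principle be expressed with ordinary real-valued tensor operations, which is the property the abstract relies on.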


