Signal-Based Malware Classification Using 1D CNNs

2509.06548v1 cs.CR, cs.AI, cs.CV, cs.LG, I.2.6; K.6.5 2025-09-10
Авторы:

Jack Wilkie, Hanan Hindy, Ivan Andonovic, Christos Tachtatzis, Robert Atkinson

Резюме на русском

## Контекст Modern malware detection faces significant challenges due to the use of advanced obfuscation techniques, which can bypass traditional static analysis methods. Dynamic analysis, while effective, is resource-intensive and impractical for large-scale deployment. To address these issues, existing research transforms malware binaries into 2D images by reshaping their data into a grid format and resizing it using Lanczos resampling. These images are then analyzed using computer vision techniques, enabling detection of obfuscated malware more effectively than static analysis. However, this approach introduces significant information loss due to quantization noise and the artificial introduction of 2D dependencies, which do not exist in the original binary data. This limitation reduces the classification performance of downstream models. This study proposes a novel approach that converts malware binaries into 1D signals, eliminating the need for heuristic reshaping and avoiding quantization noise by storing data in a floating-point format. ## Метод The proposed methodology focuses on converting malware binaries into 1D signals, leveraging their inherent structure and minimizing information loss. Unlike traditional 2D image-based approaches, this method preserves the original signal's integrity by avoiding heuristic reshaping and quantization noise. The signals are processed using a bespoke 1D convolutional neural network (1D CNN) based on the ResNet architecture. The network incorporates squeeze-and-excitation layers to enhance feature representation and classification accuracy. The model was evaluated on the MalNet dataset, a comprehensive dataset for malware classification, to assess its performance across binary, type, and family-level classification tasks. This approach represents a significant departure from conventional methods, offering improved classification accuracy and robustness. ## Результаты The experiments demonstrated the efficacy of the 1D signal-based approach in malware classification. The bespoke 1D CNN achieved state-of-the-art performance on the MalNet dataset, with F1 scores of 0.874 for binary classification, 0.503 for type-level classification, and 0.507 for family-level classification. These results outperform existing 2D CNN models when applied to the same dataset, highlighting the superiority of the proposed signal-based methodology. The floating-point representation of signals eliminates quantization noise, ensuring that the models receive more accurate and complete data for analysis. This improvement in signal fidelity directly translates to better classification performance, paving the way for more effective malware detection systems. ## Значимость The proposed 1D signal-based approach offers several advantages over traditional 2D image-based methods. By avoiding heuristic reshaping and quantization noise, it preserves the integrity of the original malware data, leading to more accurate classification. The method is computationally efficient, making it suitable for large-scale deployment in real-world cybersecurity systems. Its applications extend beyond malware classification, as the signal-based modality can be applied to other domains requiring robust signal processing. The potential impact of this work includes enhanced malware detection capabilities, improved system security, and reduced resource consumption in large-scale deployment scenarios. ## Выводы The study demonstrates the effectiveness of converting malware binaries into 1D signals for classification using 1D CNNs. The bespoke 1D CNN architecture, based on ResNet and squeeze-and-excitation layers, achieves state-of-the-art performance on the MalNet dataset, outperforming existing 2D CNN models. This approach eliminates the limitations of traditional 2D image-based methods, offering superior classification accuracy and robustness. Future research directions include exploring advanced signal processing techniques to further enhance signal fidelity and investigating the applicability of the proposed methodology to other cybersecurity and signal processing tasks.

Abstract

Malware classification is a contemporary and ongoing challenge in cyber-security: modern obfuscation techniques are able to evade traditional static analysis, while dynamic analysis is too resource intensive to be deployed at a large scale. One prominent line of research addresses these limitations by converting malware binaries into 2D images by heuristically reshaping them into a 2D grid before resizing using Lanczos resampling. These images can then be classified based on their textural information using computer vision approaches. While this approach can detect obfuscated malware more effectively than static analysis, the process of converting files into 2D images results in significant information loss due to both quantisation noise, caused by rounding to integer pixel values, and the introduction of 2D dependencies which do not exist in the original data. This loss of signal limits the classification performance of the downstream model. This work addresses these weaknesses by instead resizing the files into 1D signals which avoids the need for heuristic reshaping, and additionally these signals do not suffer from quantisation noise due to being stored in a floating-point format. It is shown that existing 2D CNN architectures can be readily adapted to classify these 1D signals for improved performance. Furthermore, a bespoke 1D convolutional neural network, based on the ResNet architecture and squeeze-and-excitation layers, was developed to classify these signals and evaluated on the MalNet dataset. It was found to achieve state-of-the-art performance on binary, type, and family level classification with F1 scores of 0.874, 0.503, and 0.507, respectively, paving the way for future models to operate on the proposed signal modality.

Ссылки и действия

Связанные статьи

Signal-Based Malware Classification Using 1D CNNs

## Контекст Современные угрозы в сфере кибербезопасности, такие как малвирь, требуют эффективных методов идентификации и...

2025-09-10