Exploratory Analysis of Cyberattack Patterns on E-Commerce Platforms Using Statistical Methods
2511.03020v1
cs.CR, cs.AI, cs.LG, 68M25, 68T05, C.2.0; K.6.5; I.2.6
2025-11-08
Authors:
Fatimo Adenike Adeniya
Abstract
Cyberattacks on e-commerce platforms have grown in sophistication,
threatening consumer trust and operational continuity. This research presents a
hybrid analytical framework that integrates statistical modelling and machine
learning for detecting and forecasting cyberattack patterns in the e-commerce
domain. Using the VERIS Community Database (VCDB) dataset, the study
applies Auto ARIMA for temporal forecasting alongside significance tests, including
a Mann-Whitney U test (U = 2579981.5, p = 0.0121), which confirmed that holiday
shopping events experienced significantly more severe cyberattacks than
non-holiday periods. ANOVA was also used to examine seasonal variation in
threat severity, while ensemble machine learning models (XGBoost, LightGBM, and
CatBoost) were employed for predictive classification. Results reveal recurrent
attack spikes during high-risk periods such as Black Friday and holiday
seasons, with breaches involving Personally Identifiable Information (PII)
exhibiting elevated threat indicators. Among the models, CatBoost achieved the
highest performance (accuracy = 85.29%, F1 score = 0.2254, ROC AUC = 0.8247).
The framework uniquely combines seasonal forecasting with interpretable
ensemble learning, enabling temporal risk anticipation and breach-type
classification. Ethical considerations, including responsible use of sensitive
data and bias assessment, were incorporated. Despite class imbalance and
reliance on historical data, the study provides insights for proactive
cybersecurity resource allocation and outlines directions for future real-time
threat detection research.
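
The holiday versus non-holiday comparison reported above rests on a Mann-Whitney U test. The abstract does not say how the test was implemented, so the following is only a minimal standard-library sketch of a two-sided test using the normal approximation with tie and continuity corrections. The function name `mann_whitney_u` and the severity scores are hypothetical and are not taken from the study; in practice a library routine such as `scipy.stats.mannwhitneyu` would normally be used instead.

```python
import math
from collections import Counter


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation).

    Returns (U statistic for sample x, approximate p-value).
    Illustrative sketch only; not the study's implementation.
    """
    n1, n2 = len(x), len(y)
    combined = sorted(list(x) + list(y))

    # Assign each distinct value the average rank of its tied group.
    ranks = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        ranks[combined[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    rank_sum_x = sum(ranks[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2

    # Mean and tie-corrected variance of U under the null hypothesis.
    n = n1 + n2
    mean_u = n1 * n2 / 2
    tie_term = sum(t ** 3 - t for t in Counter(combined).values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all observations identical

    # Continuity-corrected z score, clamped at zero.
    z = max((abs(u_x - mean_u) - 0.5) / math.sqrt(var_u), 0.0)
    p_value = math.erfc(z / math.sqrt(2))  # two-sided tail probability
    return u_x, p_value


# Made-up severity scores, for illustration only.
holiday = [7, 8, 6, 9, 7, 8]
non_holiday = [5, 6, 4, 7, 5, 6]
u_stat, p = mann_whitney_u(holiday, non_holiday)
print(f"U = {u_stat}, p = {p:.4f}")
```

The normal approximation is reasonable for samples as large as those implied by the reported U value; for very small samples an exact test would be the safer choice.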