Hurdle-IMDL: An Imbalanced Learning Framework for Infrared Rainfall Retrieval
2510.20486v1
cs.LG, cs.AI, physics.ao-ph, physics.geo-ph
2025-10-25
Авторы:
Fangjian Zhang, Xiaoyong Zhuge, Wenlan Wang, Haixia Xiao, Yuying Zhu, Siyang Cheng
Abstract
Artificial intelligence has advanced quantitative remote sensing, yet its
effectiveness is constrained by imbalanced label distribution. This imbalance
leads conventionally trained models to favor common samples, which in turn
degrades retrieval performance for rare ones. Rainfall retrieval exemplifies
this issue, with performance particularly compromised for heavy rain. This
study proposes Hurdle-Inversion Model Debiasing Learning (IMDL) framework.
Following a divide-and-conquer strategy, imbalance in the rain distribution is
decomposed into two components: zero inflation, defined by the predominance of
non-rain samples; and long tail, defined by the disproportionate abundance of
light-rain samples relative to heavy-rain samples. A hurdle model is adopted to
handle the zero inflation, while IMDL is proposed to address the long tail by
transforming the learning object into an unbiased ideal inverse model.
Comprehensive evaluation via statistical metrics and case studies investigating
rainy weather in eastern China confirms Hurdle-IMDL's superiority over
conventional, cost-sensitive, generative, and multi-task learning methods. Its
key advancements include effective mitigation of systematic underestimation and
a marked improvement in the retrieval of heavy-to-extreme rain. IMDL offers a
generalizable approach for addressing imbalance in distributions of
environmental variables, enabling enhanced retrieval of rare yet high-impact
events.