WhaleVAD-BPN: Improving Baleen Whale Call Detection with Boundary Proposal Networks and Post-processing Optimisation
2510.21280v1
eess.AS, cs.AI, cs.LG, cs.SD, q-bio.QM
2025-10-28
Authors:
Christiaan M. Geldenhuys, Günther Tonitz, Thomas R. Niesler
Abstract
While recent sound event detection (SED) systems can identify baleen whale
calls in marine audio, false positives and weak detection of minority-class
calls remain open challenges. We propose the boundary proposal network (BPN), which
extends an existing lightweight SED system. The BPN is inspired by work in
image object detection and aims to reduce the number of false positive
detections. It achieves this by using intermediate latent representations
computed within the backbone classification model to gate the final output.
When added to an existing SED system, the BPN achieves a 16.8 % absolute
increase in precision, as well as 21.3 % and 9.4 % improvements in the F1-score
for minority-class d-calls and bp-calls, respectively. We further consider two
approaches to the selection of post-processing hyperparameters: a
forward-search and a backward-search. By separately optimising event-level and
frame-level hyperparameters, these two approaches lead to considerable
performance improvements over parameters selected using empirical methods. The
complete WhaleVAD-BPN system achieves a cross-validated development F1-score of
0.475, which is a 9.8 % absolute improvement over the baseline.
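
The gating idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `gate_frame_scores`, the hard threshold `gate_threshold`, and the use of a single per-frame proposal score are all assumptions made for illustration. The abstract states only that intermediate latent representations from the backbone are used to gate the final output.

```python
def gate_frame_scores(class_probs, proposal_probs, gate_threshold=0.5):
    """Suppress per-frame class probabilities where the proposal score is low.

    class_probs: list of per-frame lists, one probability per call class.
    proposal_probs: list of per-frame proposal scores in [0, 1].
    Both names and the threshold are hypothetical, chosen for this sketch.
    """
    if len(class_probs) != len(proposal_probs):
        raise ValueError("inputs must have one entry per frame")
    gated = []
    for frame_probs, proposal in zip(class_probs, proposal_probs):
        # Hard gate (an assumption): zero out frames the proposal branch rejects.
        keep = 1.0 if proposal >= gate_threshold else 0.0
        gated.append([p * keep for p in frame_probs])
    return gated


# Three frames, two call classes; the second frame has a low proposal score.
class_probs = [[0.9, 0.1], [0.7, 0.2], [0.8, 0.6]]
proposal_probs = [0.95, 0.10, 0.60]
gated = gate_frame_scores(class_probs, proposal_probs)
```

In this toy example the second frame would otherwise produce a confident detection, but it is suppressed because its proposal score falls below the threshold. This is the mechanism by which a gate of this kind could trade a small amount of recall for higher precision.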