A Standardized Benchmark for Multilabel Antimicrobial Peptide Classification
2511.04814v1
cs.LG, cs.AI, q-bio.BM, 68T07, 62H30, 62P10, I.2.6; I.2.1; I.5.1; I.5.2
2025-11-11
Авторы:
Sebastian Ojeda, Rafael Velasquez, Nicolás Aparicio, Juanita Puentes, Paula Cárdenas, Nicolás Andrade, Gabriel González, Sergio Rincón, Carolina Muñoz-Camargo, Pablo Arbeláez
Abstract
Antimicrobial peptides have emerged as promising molecules to combat
antimicrobial resistance. However, fragmented datasets, inconsistent
annotations, and the lack of standardized benchmarks hinder computational
approaches and slow down the discovery of new candidates. To address these
challenges, we present the Expanded Standardized Collection for Antimicrobial
Peptide Evaluation (ESCAPE), an experimental framework integrating over 80.000
peptides from 27 validated repositories. Our dataset separates antimicrobial
peptides from negative sequences and incorporates their functional annotations
into a biologically coherent multilabel hierarchy, capturing activities across
antibacterial, antifungal, antiviral, and antiparasitic classes. Building on
ESCAPE, we propose a transformer-based model that leverages sequence and
structural information to predict multiple functional activities of peptides.
Our method achieves up to a 2.56% relative average improvement in mean Average
Precision over the second-best method adapted for this task, establishing a new
state-of-the-art multilabel peptide classification. ESCAPE provides a
comprehensive and reproducible evaluation framework to advance AI-driven
antimicrobial peptide research.