Layer Probing Improves Kinase Functional Prediction with Protein Language Models

2512.00376v1 q-bio.QM, cs.AI, cs.LG 2025-12-02

Authors:

Ajit Kumar, IndraPrakash Jha

Abstract

Protein language models (PLMs) have transformed sequence-based protein analysis, yet most applications rely only on final-layer embeddings, which may overlook biologically meaningful information encoded in earlier layers. We systematically evaluate all 33 layers of ESM-2 for kinase functional prediction using both unsupervised clustering and supervised classification. We show that mid-to-late transformer layers (layers 20-33) outperform the final layer by 32 percent in unsupervised Adjusted Rand Index and improve homology-aware supervised accuracy to 75.7 percent. Domain-level extraction, calibrated probability estimates, and a reproducible benchmarking pipeline further strengthen reliability. Our results demonstrate that transformer depth contains functionally distinct biological signals and that principled layer selection significantly improves kinase function prediction.
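The unsupervised comparison described in the abstract scores each layer by how well its clusters agree with known kinase labels, using the Adjusted Rand Index (ARI). The following is a minimal sketch of that scoring and ranking step only, not the authors' pipeline. It assumes cluster assignments have already been computed from each layer's embeddings. The function names `adjusted_rand_index` and `rank_layers`, the group labels, and the toy cluster assignments are all hypothetical and do not reproduce any result from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two flat labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)

    # Contingency counts: how many items share each (true, predicted) pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())
    total = comb(n, 2)

    # Expected agreement under random labeling with the same cluster sizes.
    expected = sum_true * sum_pred / total if total else 0.0
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings put everything in one cluster.
        return 1.0
    return (sum_pairs - expected) / (max_index - expected)


def rank_layers(labels_true, clusters_by_layer):
    """Return (layer, ARI) pairs sorted from highest to lowest ARI."""
    scores = {
        layer: adjusted_rand_index(labels_true, labels_pred)
        for layer, labels_pred in clusters_by_layer.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): six sequences from three illustrative kinase groups.
kinase_groups = ["TK", "TK", "AGC", "AGC", "CMGC", "CMGC"]

# Hypothetical cluster assignments obtained from three different layers.
toy_clusters = {
    12: [0, 0, 0, 1, 1, 1],
    24: [0, 0, 1, 1, 2, 2],
    33: [0, 0, 1, 1, 1, 2],
}

ranking = rank_layers(kinase_groups, toy_clusters)
best_layer, best_ari = ranking[0]
print(f"best layer: {best_layer} (ARI = {best_ari:.3f})")
```

In practice the per-layer cluster assignments would come from clustering embeddings extracted at each transformer layer. ARI is corrected for chance, so it stays comparable across layers even when the cluster sizes differ.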

