APP: Accelerated Path Patching with Task-Specific Pruning
2511.05442v1
cs.LG, cs.AI, cs.CL, 68Uxx, I.2.7; I.2.6; I.2.m
2025-11-11
Authors:
Frauke Andersen, William Rudman, Ruochen Zhang, Carsten Eickhoff
Abstract
Circuit discovery is a key step in many mechanistic interpretability
pipelines. Current methods, such as Path Patching, are computationally
expensive, which has limited in-depth circuit analysis to smaller models. In
this study, we propose Accelerated Path Patching (APP), a hybrid approach
leveraging our novel contrastive attention head pruning method to drastically
reduce the search space of circuit discovery methods. Our Contrastive-FLAP
pruning algorithm uses techniques from causal mediation analysis to assign
higher pruning scores to task-specific attention heads, leading to
higher-performing sparse models than traditional pruning techniques. Although
Contrastive-FLAP is successful at preserving task-specific heads that existing
pruning algorithms remove at low sparsity ratios, the circuits found by
Contrastive-FLAP alone are too large to satisfy the minimality constraint
required in circuit analysis. APP first applies Contrastive-FLAP to reduce the
search space required for circuit discovery algorithms by 56% on average.
Next, APP applies traditional Path Patching to the remaining attention heads,
leading to a speedup of 59.63%-93.27% compared to Path Patching applied to
the dense model. Despite the substantial computational savings that APP
provides, circuits obtained from APP exhibit substantial overlap with, and
performance similar to, previously established Path Patching circuits.
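
The two-stage structure described in the abstract (prune first, then patch only what survives) can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in: `prune_scores` represents whatever per-head scores a contrastive pruning method might produce, `patch_effect` represents one expensive patching evaluation, and the single pass over heads simplifies the iterative path search that Path Patching actually performs. The sketch also assumes that a higher score means the head is retained. It is not the authors' implementation.

```python
from typing import Callable, Dict, List, Tuple

Head = Tuple[int, int]  # (layer index, head index)


def prune_then_patch(
    prune_scores: Dict[Head, float],
    patch_effect: Callable[[Head], float],
    keep_fraction: float,
    effect_threshold: float,
) -> Tuple[List[Head], int]:
    """Keep the top-scoring heads, then evaluate only those heads.

    Returns the selected heads and the number of patching calls made.
    """
    # Stage 1 (assumed): retain the highest-scoring fraction of heads.
    n_keep = max(1, round(len(prune_scores) * keep_fraction))
    ranked = sorted(prune_scores, key=lambda h: prune_scores[h], reverse=True)
    candidates = ranked[:n_keep]

    # Stage 2 (simplified): one patching evaluation per surviving head.
    circuit: List[Head] = []
    calls = 0
    for head in candidates:
        calls += 1
        if abs(patch_effect(head)) >= effect_threshold:
            circuit.append(head)
    return sorted(circuit), calls


# Toy data: 2 layers x 4 heads with made-up scores and effects.
scores = {
    (0, 0): 0.9, (0, 1): 0.1, (0, 2): 0.2, (0, 3): 0.8,
    (1, 0): 0.05, (1, 1): 0.7, (1, 2): 0.3, (1, 3): 0.6,
}
effects = {(0, 0): 0.5, (0, 3): 0.02, (1, 1): -0.4, (1, 3): 0.01}

circuit, calls = prune_then_patch(
    scores,
    patch_effect=lambda h: effects.get(h, 0.0),
    keep_fraction=0.5,
    effect_threshold=0.1,
)
print(circuit, calls)
```

In this toy setting, only half of the heads reach the patching stage, so the expensive call runs 4 times instead of 8. The speedups reported in the abstract come from the authors' experiments and are not reproduced by this sketch.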