Interpretable Model-Aware Counterfactual Explanations for Random Forest
2510.27397v1
stat.ML, cs.LG
2025-11-04
Authors:
Joshua S. Harvey, Guanchao Feng, Sai Anusha Meesala, Tina Zhao, Dhagash Mehta
Abstract
Despite their enormous predictive power, machine learning models are often
unsuitable for applications in regulated industries such as finance, due to
their limited capacity to provide explanations. While model-agnostic frameworks
such as Shapley values have proved to be convenient and popular, they rarely
align with the kinds of causal explanations that are typically sought after.
Counterfactual case-based explanations, where an individual is informed of
which circumstances would need to be different to cause a change in outcome,
may be more intuitive and actionable. However, finding appropriate
counterfactual cases is an open challenge, as is interpreting which features
are most critical for the change in outcome. Here, we pose the question of
counterfactual search and interpretation in terms of similarity learning,
exploiting the representation learned by the random forest predictive model
itself. Once a counterfactual is found, the feature importance of the
explanation is computed as a function of which random forest partitions are
crossed in order to reach it from the original instance. We demonstrate this
method on both the MNIST hand-drawn digit dataset and the German credit
dataset, finding that it generates explanations that are sparser and more
useful than Shapley values.
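
As a rough illustration of the idea described in the abstract, the following Python sketch uses a tiny hand-written forest rather than a trained model. Everything in it is an assumption made for illustration and is not taken from the paper: the tuple-based tree encoding, the names FOREST, path, proximity, find_counterfactual and crossed_features, the use of shared-leaf proximity as the similarity measure, and the choice to count only the first split at which the two instances diverge in each tree. It searches a candidate pool for the most similar instance with a different predicted class, then attributes the change to the features whose splits separate the two instances.

```python
from collections import Counter

# Toy forest (hypothetical). An internal node is (feature, threshold, left, right);
# a leaf is an integer class label. A sample goes right when x[feature] > threshold.
FOREST = [
    (0, 5.0, 0, (1, 2.0, 0, 1)),
    (1, 3.0, 0, 1),
    (0, 4.0, (2, 1.0, 0, 1), 1),
]


def path(tree, x):
    """Return the list of (feature, threshold, went_right) decisions and the leaf label."""
    decisions = []
    node = tree
    while isinstance(node, tuple):
        feature, threshold, left, right = node
        go_right = x[feature] > threshold
        decisions.append((feature, threshold, go_right))
        node = right if go_right else left
    return decisions, node


def predict(forest, x):
    """Majority vote over the leaf labels reached in each tree."""
    votes = Counter(path(tree, x)[1] for tree in forest)
    return votes.most_common(1)[0][0]


def proximity(forest, x, y):
    """Fraction of trees in which x and y reach the same leaf."""
    same = sum(path(tree, x)[0] == path(tree, y)[0] for tree in forest)
    return same / len(forest)


def find_counterfactual(forest, x, candidates):
    """Most similar candidate (by proximity) whose predicted class differs from x's."""
    original = predict(forest, x)
    pool = [c for c in candidates if predict(forest, c) != original]
    return max(pool, key=lambda c: proximity(forest, x, c)) if pool else None


def crossed_features(forest, x, y):
    """Per-feature share of trees whose first diverging split uses that feature."""
    counts = Counter()
    for tree in forest:
        for (feature, _, x_right), (_, _, y_right) in zip(path(tree, x)[0], path(tree, y)[0]):
            if x_right != y_right:  # x and y fall on opposite sides of this split
                counts[feature] += 1
                break
    total = sum(counts.values())
    return {f: n / total for f, n in counts.items()} if total else {}


x = [3.0, 1.0, 0.5]
candidates = [[3.0, 4.0, 2.0], [6.0, 4.0, 2.0], [3.0, 2.0, 0.5]]
cf = find_counterfactual(FOREST, x, candidates)
importance = crossed_features(FOREST, x, cf)
print(cf, importance)
```

In this toy setup the selected counterfactual differs from the original in features 1 and 2 only, so feature 0 receives no importance. This mirrors, in a very simplified way, the sparsity that the abstract attributes to partition-based explanations.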