Adversarial Fine-tuning in Offline-to-Online Reinforcement Learning for Robust Robot Control
2510.13358v1
cs.RO, cs.AI
2025-10-17
Authors:
Shingo Ayabe, Hiroshi Kera, Kazuhiko Kawamoto
Abstract
Offline reinforcement learning enables sample-efficient policy acquisition
without risky online interaction, yet policies trained on static datasets
remain brittle under action-space perturbations such as actuator faults. This
study introduces an offline-to-online framework that trains policies on clean
data and then performs adversarial fine-tuning, where perturbations are
injected into executed actions to induce compensatory behavior and improve
resilience. A performance-aware curriculum further adjusts the perturbation
probability during training via an exponential-moving-average signal, balancing
robustness and stability throughout the learning process. Experiments on
continuous-control locomotion tasks demonstrate that the proposed method
consistently improves robustness over offline-only baselines and converges
faster than training from scratch. Matching the fine-tuning and evaluation
conditions yields the strongest robustness to action-space perturbations, while
the adaptive curriculum strategy mitigates the degradation of nominal
performance observed with the linear curriculum strategy. Overall, the results
show that adversarial fine-tuning enables adaptive and robust control in
uncertain environments, bridging the gap between offline efficiency and online
adaptability.
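
The abstract does not give the curriculum's update rule, so the following is only a minimal sketch of how an exponential-moving-average signal might drive the perturbation probability. The class `EmaCurriculum`, the function `perturb_action`, the threshold `target_return`, the increment `step`, and the raise-or-lower rule are all assumptions made for illustration, not the authors' method. Uniform random noise stands in for the adversarial perturbation, which the abstract does not specify.

```python
import random


class EmaCurriculum:
    """Hypothetical performance-aware schedule for the perturbation probability."""

    def __init__(self, target_return, beta=0.9, step=0.05, p_max=1.0):
        self.target_return = target_return  # assumed performance threshold
        self.beta = beta                    # EMA smoothing factor
        self.step = step                    # assumed probability increment
        self.p_max = p_max                  # upper bound on the probability
        self.ema = None                     # EMA of episode returns
        self.prob = 0.0                     # current perturbation probability

    def update(self, episode_return):
        # Exponential moving average of the episode return.
        if self.ema is None:
            self.ema = episode_return
        else:
            self.ema = self.beta * self.ema + (1.0 - self.beta) * episode_return
        # Assumed rule: raise the probability while the smoothed return holds
        # above the threshold, and lower it when performance degrades.
        if self.ema >= self.target_return:
            self.prob = min(self.p_max, self.prob + self.step)
        else:
            self.prob = max(0.0, self.prob - self.step)
        return self.prob


def perturb_action(action, prob, rng, scale=0.5):
    """With probability `prob`, return a noisy copy of the action.

    Uniform noise is a stand-in for an adversarial perturbation; actions are
    assumed to be normalized to [-1, 1] and are clipped to that range.
    """
    if rng.random() >= prob:
        return list(action)
    return [max(-1.0, min(1.0, a + rng.uniform(-scale, scale))) for a in action]
```

In a fine-tuning loop, `update` would be called once per episode with that episode's return, and `perturb_action` would wrap each policy output before it is sent to the environment, so the perturbation rate grows only while the smoothed return stays acceptable.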