Dynamic Regret Bounds for Online Omniprediction with Long Term Constraints

2510.07266v1 cs.LG, cs.GT 2025-10-10

Авторы:

Yahav Bechavod, Jiuyao Lu, Aaron Roth

Abstract

We present an algorithm guaranteeing dynamic regret bounds for online omniprediction with long term constraints. The goal in this recently introduced problem is for a learner to generate a sequence of predictions which are broadcast to a collection of downstream decision makers. Each decision maker has their own utility function, as well as a vector of constraint functions, each mapping their actions and an adversarially selected state to reward or constraint violation terms. The downstream decision makers select actions "as if" the state predictions are correct, and the goal of the learner is to produce predictions such that all downstream decision makers choose actions that give them worst-case utility guarantees while minimizing worst-case constraint violation. Within this framework, we give the first algorithm that obtains simultaneous \emph{dynamic regret} guarantees for all of the agents -- where regret for each agent is measured against a potentially changing sequence of actions across rounds of interaction, while also ensuring vanishing constraint violation for each agent. Our results do not require the agents themselves to maintain any state -- they only solve one-round constrained optimization problems defined by the prediction made at that round.

Ссылки и действия

Читать на arXiv Скачать PDF

Дополнительные ресурсы:

Dynamic Regret Bounds for Online Omniprediction with Long Term Constraints

Авторы:

Abstract

Ссылки и действия

Связанные статьи

The Active and Noise-Tolerant Strategic Perceptron

Breaking Algorithmic Collusion in Human-AI Ecosystems

Nash Policy Gradient: A Policy Gradient Method with Iteratively Refined Regulari...

On the Universal Near Optimality of Hedge in Combinatorial Settings

Robust Adversarial Reinforcement Learning in Stochastic Games via Sequence Model...

Навигация