📊 Статистика дайджестов

Всего дайджестов: 34022 Добавлено сегодня: 82

Последнее обновление: сегодня

📄 Chronos-2: From Univariate to Universal Forecasting

2025-10-21

Авторы:

Abdul Fatir Ansari, Oleksandr Shchur, Jaris Küken, Andreas Auer, Boran Han, Pedro Mercado, Syama Sundar Rangapuram, Huibin Shen, Lorenzo Stella, Xiyuan Zhang, Mononito Goswami, Shubham Kapoor, Danielle C. Maddix, Pablo Guerron, Tony Hu, Junming Yin, Nick Erickson, Prateek Mutalik Desai, Hao Wang, Huzefa Rangwala, George Karypis, Yuyang Wang, Michael Bohlke-Schneider

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Pretrained time series models have enabled inference-only forecasting systems that produce accurate predictions without task-specific training. However, existing approaches largely focus on univariate forecasting, limiting their applicability in real-world scenarios where multivariate data and covariates play a crucial role. We present Chronos-2, a pretrained model capable of handling univariate, multivariate, and covariate-informed forecasting tasks in a zero-shot manner. Chronos-2 employs a gr...

ID: 2510.15821v1 cs.LG, cs.AI, stat.ML

arXiv PDF

📄 Policy Regularized Distributionally Robust Markov Decision Processes with Linear Function Approximation

2025-10-18

Авторы:

Jingwen Gu, Yiting He, Zhishuai Liu, Pan Xu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Decision-making under distribution shift is a central challenge in reinforcement learning (RL), where training and deployment environments differ. We study this problem through the lens of robust Markov decision processes (RMDPs), which optimize performance against adversarial transition dynamics. Our focus is the online setting, where the agent has only limited interaction with the environment, making sample efficiency and exploration especially critical. Policy optimization, despite its succes...

ID: 2510.14246v1 cs.LG, cs.AI, stat.ML

arXiv PDF

📄 PAC-Bayesian Reinforcement Learning Trains Generalizable Policies

2025-10-16

Авторы:

Abdelkrim Zitouni, Mehdi Hennequin, Juba Agoun, Ryan Horache, Nadia Kabachi, Omar Rivasplata

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We derive a novel PAC-Bayesian generalization bound for reinforcement learning that explicitly accounts for Markov dependencies in the data, through the chain's mixing time. This contributes to overcoming challenges in obtaining generalization guarantees for reinforcement learning, where the sequential nature of data breaks the independence assumptions underlying classical bounds. Our bound provides non-vacuous certificates for modern off-policy algorithms like Soft Actor-Critic. We demonstrate ...

ID: 2510.10544v1 cs.LG, cs.AI, stat.ML

arXiv PDF

📄 Provable Anytime Ensemble Sampling Algorithms in Nonlinear Contextual Bandits

2025-10-16

Авторы:

Jiazheng Sun, Weixin Wang, Pan Xu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We provide a unified algorithmic framework for ensemble sampling in nonlinear contextual bandits and develop corresponding regret bounds for two most common nonlinear contextual bandit settings: Generalized Linear Ensemble Sampling (\texttt{GLM-ES}) for generalized linear bandits and Neural Ensemble Sampling (\texttt{Neural-ES}) for neural contextual bandits. Both methods maintain multiple estimators for the reward model parameters via maximum likelihood estimation on randomly perturbed data. We...

ID: 2510.10730v1 cs.LG, cs.AI, stat.ML

arXiv PDF

📄 Discrete State Diffusion Models: A Sample Complexity Perspective

2025-10-16

Авторы:

Aadithya Srikanth, Mudit Gaur, Vaneet Aggarwal

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Diffusion models have demonstrated remarkable performance in generating high-dimensional samples across domains such as vision, language, and the sciences. Although continuous-state diffusion models have been extensively studied both empirically and theoretically, discrete-state diffusion models, essential for applications involving text, sequences, and combinatorial structures, remain significantly less understood from a theoretical standpoint. In particular, all existing analyses of discrete-s...

ID: 2510.10854v1 cs.LG, cs.AI, stat.ML

arXiv PDF

📄 Understanding the Generalization of Stochastic Gradient Adam in Learning Neural Networks

2025-10-15

Авторы:

Xuan Tang, Han Zhang, Yuan Cao, Difan Zou

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Adam is a popular and widely used adaptive gradient method in deep learning, which has also received tremendous focus in theoretical research. However, most existing theoretical work primarily analyzes its full-batch version, which differs fundamentally from the stochastic variant used in practice. Unlike SGD, stochastic Adam does not converge to its full-batch counterpart even with infinitesimal learning rates. We present the first theoretical characterization of how batch size affects Adam's g...

ID: 2510.11354v1 cs.LG, cs.AI, stat.ML

arXiv PDF

📄 Counterfactual Identifiability via Dynamic Optimal Transport

2025-10-11

Авторы:

Fabio De Sousa Ribeiro, Ainkaran Santhirasekaram, Ben Glocker

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We address the open question of counterfactual identification for high-dimensional multivariate outcomes from observational data. Pearl (2000) argues that counterfactuals must be identifiable (i.e., recoverable from the observed data distribution) to justify causal claims. A recent line of work on counterfactual inference shows promising results but lacks identification, undermining the causal validity of its estimates. To address this, we establish a foundation for multivariate counterfactual i...

ID: 2510.08294v1 cs.LG, cs.AI, stat.ML

arXiv PDF

📄 ClauseLens: Clause-Grounded, CVaR-Constrained Reinforcement Learning for Trustworthy Reinsurance Pricing

2025-10-11

Авторы:

Stella C. Dong, James R. Finlay

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Reinsurance treaty pricing must satisfy stringent regulatory standards, yet current quoting practices remain opaque and difficult to audit. We introduce ClauseLens, a clause-grounded reinforcement learning framework that produces transparent, regulation-compliant, and risk-aware treaty quotes. ClauseLens models the quoting task as a Risk-Aware Constrained Markov Decision Process (RA-CMDP). Statutory and policy clauses are retrieved from legal and underwriting corpora, embedded into the agent's...

ID: 2510.08429v1 cs.LG, cs.AI, stat.ML, 68T05, 91G70, I.2.6; I.2.7; J.1

arXiv PDF

📄 gLSTM: Mitigating Over-Squashing by Increasing Storage Capacity

2025-10-11

Авторы:

Hugh Blayney, Álvaro Arroyo, Xiaowen Dong, Michael M. Bronstein

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Graph Neural Networks (GNNs) leverage the graph structure to transmit information between nodes, typically through the message-passing mechanism. While these models have found a wide variety of applications, they are known to suffer from over-squashing, where information from a large receptive field of node representations is collapsed into a single fixed sized vector, resulting in an information bottleneck. In this paper, we re-examine the over-squashing phenomenon through the lens of model sto...

ID: 2510.08450v1 cs.LG, cs.AI, stat.ML

arXiv PDF

📄 Scalable Policy-Based RL Algorithms for POMDPs

2025-10-10

Авторы:

Ameya Anjarlekar, Rasoul Etesami, R Srikant

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

The continuous nature of belief states in POMDPs presents significant computational challenges in learning the optimal policy. In this paper, we consider an approach that solves a Partially Observable Reinforcement Learning (PORL) problem by approximating the corresponding POMDP model into a finite-state Markov Decision Process (MDP) (called Superstate MDP). We first derive theoretical guarantees that improve upon prior work that relate the optimal value function of the transformed Superstate MD...

ID: 2510.06540v1 cs.LG, cs.AI, stat.ML

arXiv PDF

Показано 41 - 50 из 102 записей