📊 Статистика дайджестов

Всего дайджестов: 34022 Добавлено сегодня: 82

Последнее обновление: сегодня

📄 Representative Action Selection for Large Action Space: From Bandits to MDPs

2025-12-02

Авторы:

Quan Zhou, Shie Mannor

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We study the problem of selecting a small, representative action subset from an extremely large action space shared across a family of reinforcement learning (RL) environments -- a fundamental challenge in applications like inventory management and recommendation systems, where direct learning over the entire space is intractable. Our goal is to identify a fixed subset of actions that, for every environment in the family, contains a near-optimal action, thereby enabling efficient learning withou...

ID: 2511.22104v1 cs.LG, math.OC, math.PR, stat.ML

arXiv PDF

📄 ARM-Explainer -- Explaining and improving graph neural network predictions for the maximum clique problem using node features and association rule mining

2025-12-02

Авторы:

Bharat Sharman, Elkafi Hassini

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Numerous graph neural network (GNN)-based algorithms have been proposed to solve graph-based combinatorial optimization problems (COPs), but methods to explain their predictions remain largely undeveloped. We introduce ARM-Explainer, a post-hoc, model-level explainer based on association rule mining, and demonstrate it on the predictions of the hybrid geometric scattering (HGS) GNN for the maximum clique problem (MCP), a canonical NP-hard graph-based COP. The eight most explanatory association r...

ID: 2511.22866v1 cs.LG, math.OC

arXiv PDF

📄 Independent policy gradient-based reinforcement learning for economic and reliable energy management of multi-microgrid systems

2025-11-28

Авторы:

Junkai Hu, Li Xia

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Efficiency and reliability are both crucial for energy management, especially in multi-microgrid systems (MMSs) integrating intermittent and distributed renewable energy sources. This study investigates an economic and reliable energy management problem in MMSs under a distributed scheme, where each microgrid independently updates its energy management policy in a decentralized manner to optimize the long-term system performance collaboratively. We introduce the mean and variance of the exchange...

ID: 2511.20977v1 eess.SY, cs.LG, math.OC

arXiv PDF

📄 Mean-Field Limits for Two-Layer Neural Networks Trained with Consensus-Based Optimization

2025-11-28

Авторы:

William De Deyn, Michael Herty, Giovanni Samaey

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We study two-layer neural networks and train these with a particle-based method called consensus-based optimization (CBO). We compare the performance of CBO against Adam on two test cases and demonstrate how a hybrid approach, combining CBO with Adam, provides faster convergence than CBO. In the context of multi-task learning, we recast CBO into a formulation that offers less memory overhead. The CBO method allows for a mean-field limit formulation, which we couple with the mean-field limit of t...

ID: 2511.21466v1 cs.LG, math.OC

arXiv PDF

📄 Lower Complexity Bounds for Nonconvex-Strongly-Convex Bilevel Optimization with First-Order Oracles

2025-11-27

Авторы:

Kaiyi Ji

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Although upper bound guarantees for bilevel optimization have been widely studied, progress on lower bounds has been limited due to the complexity of the bilevel structure. In this work, we focus on the smooth nonconvex-strongly-convex setting and develop new hard instances that yield nontrivial lower bounds under deterministic and stochastic first-order oracle models. In the deterministic case, we prove that any first-order zero-respecting algorithm requires at least $Ω(κ^{3/2}ε^{-2})$ oracle c...

ID: 2511.19656v2 cs.LG, math.OC, stat.ML

arXiv PDF

📄 Adaptivity and Universality: Problem-dependent Universal Regret for Online Convex Optimization

2025-11-27

Авторы:

Peng Zhao, Yu-Hu Yan, Hang Yu, Zhi-Hua Zhou

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Universal online learning aims to achieve optimal regret guarantees without requiring prior knowledge of the curvature of online functions. Existing methods have established minimax-optimal regret bounds for universal online learning, where a single algorithm can simultaneously attain $\mathcal{O}(\sqrt{T})$ regret for convex functions, $\mathcal{O}(d \log T)$ for exp-concave functions, and $\mathcal{O}(\log T)$ for strongly convex functions, where $T$ is the number of rounds and $d$ is the dime...

ID: 2511.19937v1 cs.LG, math.OC, stat.ML

arXiv PDF

📄 Sparse Polyak with optimal thresholding operators for high-dimensional M-estimation

2025-11-26

Авторы:

Tianqi Qiao, Marie Maros

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We propose and analyze a variant of Sparse Polyak for high dimensional M-estimation problems. Sparse Polyak proposes a novel adaptive step-size rule tailored to suitably estimate the problem's curvature in the high-dimensional setting, guaranteeing that the algorithm's performance does not deteriorate when the ambient dimension increases. However, convergence guarantees can only be obtained by sacrificing solution sparsity and statistical accuracy. In this work, we introduce a variant of Sparse ...

ID: 2511.18167v1 stat.ML, cs.LG, math.OC

arXiv PDF

📄 Tail Distribution of Regret in Optimistic Reinforcement Learning

2025-11-26

Авторы:

Sajad Khodadadian, Mehrdad Moharrami

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We derive instance-dependent tail bounds for the regret of optimism-based reinforcement learning in finite-horizon tabular Markov decision processes with unknown transition dynamics. Focusing on a UCBVI-type algorithm, we characterize the tail distribution of the cumulative regret $R_K$ over $K$ episodes, rather than only its expectation or a single high-probability quantile. We analyze two natural exploration-bonus schedules: (i) a $K$-dependent scheme that explicitly incorporates the total num...

ID: 2511.18247v1 cs.LG, math.OC

arXiv PDF

📄 A Fair OR-ML Framework for Resource Substitution in Large-Scale Networks

2025-11-26

Авторы:

Ved Mohan, El Mehdi Er Raqabi, Pascal Van Hentenryck

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Ensuring that the right resource is available at the right location and time remains a major challenge for organizations operating large-scale logistics networks. The challenge comes from uneven demand patterns and the resulting asymmetric flow of resources across the arcs, which create persistent imbalances at the network nodes. Resource substitution among multiple, potentially composite and interchangeable, resource types is a cost-effective way to mitigate these imbalances. This leads to the ...

ID: 2511.18269v1 cs.LG, math.OC

arXiv PDF

📄 A Framework for Adaptive Stabilisation of Nonlinear Stochastic Systems

2025-11-25

Авторы:

Seth Siriya, Jingge Zhu, Dragan Nešić, Ye Pu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We consider the adaptive control problem for discrete-time, nonlinear stochastic systems with linearly parameterised uncertainty. Assuming access to a parameterised family of controllers that can stabilise the system in a bounded set within an informative region of the state space when the parameter is well-chosen, we propose a certainty equivalence learning-based adaptive control strategy, and subsequently derive stability bounds on the closed-loop system that hold for some probabilities. We th...

ID: 2511.17436v1 eess.SY, cs.LG, math.OC

arXiv PDF

Показано 11 - 20 из 157 записей