📊 Digest statistics

Total digests: 34022 Added today: 82

Last updated: today

📄 Safeguarded Stochastic Polyak Step Sizes for Non-smooth Optimization: Robust Performance Without Small (Sub)Gradients

2025-12-04

Authors:

Dimitris Oikonomou, Nicolas Loizou

Annotation:

The stochastic Polyak step size (SPS) has proven to be a promising choice for stochastic gradient descent (SGD), delivering competitive performance relative to state-of-the-art methods on smooth convex and non-convex optimization problems, including deep neural network training. However, extensions of this approach to non-smooth settings remain in their early stages, often relying on interpolation assumptions or requiring knowledge of the optimal solution. In this work, we propose a novel SPS va...

ID: 2512.02342v1 math.OC, cs.LG, stat.ML

arXiv PDF

📄 Efficient Penalty-Based Bilevel Methods: Improved Analysis, Novel Updates, and Flatness Condition

2025-11-25

Authors:

Liuyuan Jiang, Quan Xiao, Lisha Chen, Tianyi Chen

Annotation:

Penalty-based methods have become popular for solving bilevel optimization (BLO) problems, thanks to their effective first-order nature. However, they often require inner-loop iterations to solve the lower-level (LL) problem and small outer-loop step sizes to handle the increased smoothness induced by large penalty terms, leading to suboptimal complexity. This work considers the general BLO problems with coupled constraints (CCs) and leverages a novel penalty reformulation that decouples the upp...

ID: 2511.16796v1 math.OC, cs.LG, stat.ML

arXiv PDF

📄 DIGing-SGLD: Decentralized and Scalable Langevin Sampling over Time-Varying Networks

2025-11-19

Authors:

Waheed U. Bajwa, Mert Gurbuzbalaban, Mustafa Ali Kutbay, Lingjiong Zhu, Muhammad Zulqarnain

Annotation:

Sampling from a target distribution induced by training data is central to Bayesian learning, with Stochastic Gradient Langevin Dynamics (SGLD) serving as a key tool for scalable posterior sampling and decentralized variants enabling learning when data are distributed across a network of agents. This paper introduces DIGing-SGLD, a decentralized SGLD algorithm designed for scalable Bayesian learning in multi-agent systems operating over time-varying networks. Existing decentralized SGLD methods ...

ID: 2511.12836v1 math.OC, cs.LG, stat.ML

arXiv PDF

📄 Global Convergence of Four-Layer Matrix Factorization under Random Initialization

2025-11-15

Authors:

Minrui Luo, Weihang Xu, Xiang Gao, Maryam Fazel, Simon Shaolei Du

Annotation:

Gradient descent dynamics on the deep matrix factorization problem is extensively studied as a simplified theoretical model for deep neural networks. Although the convergence theory for two-layer matrix factorization is well-established, no global convergence guarantee for general deep matrix factorization under random initialization has been established to date. To address this gap, we provide a polynomial-time global convergence guarantee for randomly initialized gradient descent on four-layer...

ID: 2511.09925v1 math.OC, cs.LG, stat.ML

arXiv PDF

📄 A Support-Set Algorithm for Optimization Problems with Nonnegative and Orthogonal Constraints

2025-11-07

Authors:

Lei Wang, Xin Liu, Xiaojun Chen

Annotation:

In this paper, we investigate optimization problems with nonnegative and orthogonal constraints, where any feasible matrix of size $n \times p$ exhibits a sparsity pattern such that each row accommodates at most one nonzero entry. Our analysis demonstrates that, by fixing the support set, the global solution of the minimization subproblem for the proximal linearization of the objective function can be computed in closed form with at most $n$ nonzero entries. Exploiting this structural property o...

ID: 2511.03443v1 math.OC, cs.LG, stat.ML

arXiv PDF

📄 Problem-Parameter-Free Decentralized Bilevel Optimization

2025-10-30

Authors:

Zhiwei Zhai, Wenjing Yan, Ying-Jun Angela Zhang

Annotation:

Decentralized bilevel optimization has garnered significant attention due to its critical role in solving large-scale machine learning problems. However, existing methods often rely on prior knowledge of problem parameters (such as smoothness, convexity, or communication network topologies) to determine appropriate stepsizes. In practice, these problem parameters are typically unavailable, leading to substantial manual effort for hyperparameter tuning. In this paper, we propose AdaSDBO, a fully pr...

ID: 2510.24288v1 math.OC, cs.LG, stat.ML

arXiv PDF

📄 Endogenous Aggregation of Multiple Data Envelopment Analysis Scores for Large Data Sets

2025-10-25

Authors:

Hashem Omrani, Raha Imanirad, Adam Diamant, Utkarsh Verma, Amol Verma, Fahad Razak

Annotation:

We propose an approach for dynamic efficiency evaluation across multiple organizational dimensions using data envelopment analysis (DEA). The method generates both dimension-specific and aggregate efficiency scores, incorporates desirable and undesirable outputs, and is suitable for large-scale problem settings. Two regularized DEA models are introduced: a slack-based measure (SBM) and a linearized version of a nonlinear goal programming model (GP-SBM). While SBM estimates an aggregate efficienc...

ID: 2510.20052v1 math.OC, cs.LG, stat.ML

arXiv PDF

📄 Progressively Sampled Equality-Constrained Optimization

2025-10-04

Authors:

Frank E. Curtis, Lingjun Guo, Daniel P. Robinson

Annotation:

An algorithm is proposed, analyzed, and tested for solving continuous nonlinear-equality-constrained optimization problems where the constraints are defined by an expectation or an average over a large (finite) number of terms. The main idea of the algorithm is to solve a sequence of equality-constrained problems, each involving a finite sample of constraint-function terms, over which the sample set grows progressively. Under assumptions about the constraint functions and their first- and second...

ID: 2510.00417v1 math.OC, cs.LG, stat.ML

arXiv PDF

📄 Non-Euclidean Broximal Point Method: A Blueprint for Geometry-Aware Optimization

2025-10-04

Authors:

Kaja Gruntkowska, Peter Richtárik

Annotation:

The recently proposed Broximal Point Method (BPM) [Gruntkowska et al., 2025] offers an idealized optimization framework based on iteratively minimizing the objective function over norm balls centered at the current iterate. It enjoys striking global convergence guarantees, converging linearly and in a finite number of steps for proper, closed and convex functions. However, its theoretical analysis has so far been confined to the Euclidean geometry. At the same time, emerging trends in deep learn...

ID: 2510.00823v1 math.OC, cs.LG, stat.ML

arXiv PDF

📄 Overfitting in Adaptive Robust Optimization

2025-09-24

Authors:

Karl Zhu, Dimitris Bertsimas

#### Context
Adaptive robust optimization (ARO) extends static robust optimization by allowing decisions to depend on the realized uncertainty. This makes it more flexible than static models, because decisions can adapt to changes in the data. That flexibility comes with a vulnerability, however: solutions can become brittle when realizations fall outside the original uncertainty set. This behavior is analogous to overfitting in machine learning, where a model is fitted too closely to the training data and fails to generalize. In ARO, this brittleness can make the problem infeasible in some cases, which motivates methods that reduce ARO's sensitivity to realizations outside the observed range of uncertainty.

#### Method
The method assigns a constraint-specific level to each constraint in the model, in proportion to that constraint's difficulty and importance. The approach can be viewed as a form of regularization that helps tune the flexibility of the solution. The goal is to balance flexibility against stability and so avoid excessive sensitivity to changes in the uncertainty. To this end, it proposes uncertainty sets of arbitrary size, chosen with regard to how quickly the uncertainty can change. This lets the model adapt to different levels of uncertainty, aiming to perform well both under load and in unconstrained conditions.

#### Results
Experiments on a data set compare ARO under different uncertainty sizes with traditional models. The results show that using constraint-specific uncertainty sizes can significantly improve solution stability, minimizing the risk of realizations falling outside the uncertainty set. The method produced more robust and efficient solutions under high uncertainty than standard models.

#### Significance
The methods can be applied in areas with high uncertainty, such as supply chain optimization, financial planning, and information systems. Their advantages include a lower risk of errors, better modeling of dynamic systems, and more accurate decisions. These advances may significantly influence the development of optimization methods for technical, economic, and technological systems.

#### Conclusions
In summary, the proposed approach improves the stability and efficiency of ARO by reducing the risk of overfitting in its solutions. Further...

Annotation:

Adaptive robust optimization (ARO) extends static robust optimization by allowing decisions to depend on the realized uncertainty - weakly dominating static solutions within the modeled uncertainty set. However, ARO makes previous constraints that were independent of uncertainty now dependent, making it vulnerable to additional infeasibilities when realizations fall outside the uncertainty set. This phenomenon of adaptive policies being brittle is analogous to overfitting in machine learning. To...

ID: 2509.16451v1 math.OC, cs.LG, stat.ML

arXiv PDF

Showing 1 - 10 of 15 entries