📊 Статистика дайджестов

Всего дайджестов: 34123 Добавлено сегодня: 101

Последнее обновление: сегодня

📄 Masked Auto-Regressive Variational Acceleration: Fast Inference Makes Practical Reinforcement Learning

2025-11-21

Авторы:

Yuxuan Gu, Weimin Bai, Yifei Wang, Weijian Luo, He Sun

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Masked auto-regressive diffusion models (MAR) benefit from the expressive modeling ability of diffusion models and the flexibility of masked auto-regressive ordering. However, vanilla MAR suffers from slow inference due to its hierarchical inference mechanism: an outer AR unmasking loop and an inner diffusion denoising chain. Such decoupled structure not only harm the generation efficiency but also hinder the practical use of MAR for reinforcement learning (RL), an increasingly critical paradigm...

ID: 2511.15190v1 cs.LG, cs.AI

arXiv PDF

📄 EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control

2025-11-21

Авторы:

Kai Yang, Xin Xu, Yangkun Chen, Weijie Liu, Jiafei Lyu, Zichuan Lin, Deheng Ye, Saiyong Yang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Long-term training of large language models (LLMs) requires maintaining stable exploration to prevent the model from collapsing into sub-optimal behaviors. Entropy is crucial in this context, as it controls exploration and helps avoid premature convergence to sub-optimal solutions. However, existing reinforcement learning methods struggle to maintain an appropriate level of entropy, as the training process involves a mix of positive and negative samples, each affecting entropy in different ways ...

ID: 2511.15248v1 cs.LG, cs.AI

arXiv PDF

📄 STREAM-VAE: Dual-Path Routing for Slow and Fast Dynamics in Vehicle Telemetry Anomaly Detection

2025-11-21

Авторы:

Kadir-Kaan Özer, René Ebeling, Markus Enzweiler

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Automotive telemetry data exhibits slow drifts and fast spikes, often within the same sequence, making reliable anomaly detection challenging. Standard reconstruction-based methods, including sequence variational autoencoders (VAEs), use a single latent process and therefore mix heterogeneous time scales, which can smooth out spikes or inflate variances and weaken anomaly separation. In this paper, we present STREAM-VAE, a variational autoencoder for anomaly detection in automotive telemetry t...

ID: 2511.15339v1 cs.LG, cs.AI

arXiv PDF

📄 Parameter Importance-Driven Continual Learning for Foundation Models

2025-11-21

Авторы:

Lingxiang Wang, Hainan Zhang, Zhiming Zheng

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Domain-specific post-training often causes catastrophic forgetting, making foundation models lose their general reasoning ability and limiting their adaptability to dynamic real-world environments. Preserving general capabilities while acquiring downstream domain knowledge is a central challenge for large language and multimodal models. Traditional continual learning methods, such as regularization, replay and architectural isolation, suffer from poor downstream performance, reliance on inaccess...

ID: 2511.15375v1 cs.LG, cs.AI

arXiv PDF

📄 Towards Understanding Layer Contributions in Tabular In-Context Learning Models

2025-11-21

Авторы:

Amir Rezaei Balef, Mykhailo Koshil, Katharina Eggensperger

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Despite the architectural similarities between tabular in-context learning (ICL) models and large language models (LLMs), little is known about how individual layers contribute to tabular prediction. In this paper, we investigate how the latent spaces evolve across layers in tabular ICL models, identify potential redundant layers, and compare these dynamics with those observed in LLMs. We analyze TabPFN and TabICL through the "layers as painters" perspective, finding that only subsets of layers ...

ID: 2511.15432v1 cs.LG, cs.AI

arXiv PDF

📄 TSFM in-context learning for time-series classification of bearing-health status

2025-11-21

Авторы:

Michel Tokic, Slobodan Djukanović, Anja von Beuningen, Cheng Feng

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

This paper introduces a classification method using in-context learning in time-series foundation models (TSFM). We show how data, which was not part of the TSFM training data corpus, can be classified without the need of finetuning the model. Examples are represented in the form of targets (class id) and covariates (data matrix) within the prompt of the model, which enables to classify an unknown covariate data pattern alongside the forecast axis through in-context learning. We apply this metho...

ID: 2511.15447v1 cs.LG, cs.AI

arXiv PDF

📄 Continual Reinforcement Learning for Cyber-Physical Systems: Lessons Learned and Open Challenges

2025-11-21

Авторы:

Kim N. Nolle, Ivana Dusparic, Rhodri Cusack, Vinny Cahill

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Continual learning (CL) is a branch of machine learning that aims to enable agents to adapt and generalise previously learned abilities so that these can be reapplied to new tasks or environments. This is particularly useful in multi-task settings or in non-stationary environments, where the dynamics can change over time. This is particularly relevant in cyber-physical systems such as autonomous driving. However, despite recent advances in CL, successfully applying it to reinforcement learning (...

ID: 2511.15652v1 cs.LG, cs.AI

arXiv PDF

📄 Walrus: A Cross-Domain Foundation Model for Continuum Dynamics

2025-11-21

Авторы:

Michael McCabe, Payel Mukhopadhyay, Tanya Marwah, Bruno Regaldo-Saint Blancard, Francois Rozet, Cristiana Diaconu, Lucas Meyer, Kaze W. K. Wong, Hadi Sotoudeh, Alberto Bietti, Irina Espejo, Rio Fear, Siavash Golkar, Tom Hehir, Keiya Hirashima, Geraud Krawezik, Francois Lanusse, Rudy Morel, Ruben Ohana, Liam Parker, Mariel Pettee, Jeff Shen, Kyunghyun Cho, Miles Cranmer, Shirley Ho

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Foundation models have transformed machine learning for language and vision, but achieving comparable impact in physical simulation remains a challenge. Data heterogeneity and unstable long-term dynamics inhibit learning from sufficiently diverse dynamics, while varying resolutions and dimensionalities challenge efficient training on modern hardware. Through empirical and theoretical analysis, we incorporate new approaches to mitigate these obstacles, including a harmonic-analysis-based stabiliz...

ID: 2511.15684v1 cs.LG, cs.AI, cs.CE

arXiv PDF

📄 FLARE: Adaptive Multi-Dimensional Reputation for Robust Client Reliability in Federated Learning

2025-11-20

Авторы:

Abolfazl Younesi, Leon Kiss, Zahra Najafabadi Samani, Juan Aznar Poveda, Thomas Fahringer

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Federated learning (FL) enables collaborative model training while preserving data privacy. However, it remains vulnerable to malicious clients who compromise model integrity through Byzantine attacks, data poisoning, or adaptive adversarial behaviors. Existing defense mechanisms rely on static thresholds and binary classification, failing to adapt to evolving client behaviors in real-world deployments. We propose FLARE, an adaptive reputation-based framework that transforms client reliability a...

ID: 2511.14715v2 cs.LG, cs.AI, cs.CR, cs.DC, cs.MA

arXiv PDF

📄 Empowering Multi-Turn Tool-Integrated Reasoning with Group Turn Policy Optimization

2025-11-20

Авторы:

Yifeng Ding, Hung Le, Songyang Han, Kangrui Ruan, Zhenghui Jin, Varun Kumar, Zijian Wang, Anoop Deoras

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Training Large Language Models (LLMs) for multi-turn Tool-Integrated Reasoning (TIR) - where models iteratively reason, generate code, and verify through execution - remains challenging for existing reinforcement learning (RL) approaches. Current RL methods, exemplified by Group Relative Policy Optimization (GRPO), suffer from coarse-grained, trajectory-level rewards that provide insufficient learning signals for complex multi-turn interactions, leading to training stagnation. To address this is...

ID: 2511.14846v1 cs.LG, cs.AI, cs.CL

arXiv PDF

Показано 371 - 380 из 2912 записей