📊 Статистика дайджестов

Всего дайджестов: 34022 Добавлено сегодня: 0

Последнее обновление: сегодня

📄 Reward Engineering for Spatial Epidemic Simulations: A Reinforcement Learning Platform for Individual Behavioral Learning

2025-11-26

Авторы:

Radman Rakhshandehroo, Daniel Coombs

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We present ContagionRL, a Gymnasium-compatible reinforcement learning platform specifically designed for systematic reward engineering in spatial epidemic simulations. Unlike traditional agent-based models that rely on fixed behavioral rules, our platform enables rigorous evaluation of how reward function design affects learned survival strategies across diverse epidemic scenarios. ContagionRL integrates a spatial SIRS+D epidemiological model with configurable environmental parameters, allowing ...

ID: 2511.18000v1 cs.LG, cs.AI, q-bio.PE

arXiv PDF

📄 The Alignment Paradox of Medical Large Language Models in Infertility Care: Decoupling Algorithmic Improvement from Clinical Decision-making Quality

2025-11-26

Авторы:

Dou Liu, Ying Long, Sophia Zuoqiu, Kaipeng Xie, Runze Yang, Di Liu, Kang Li, Yiting Lin, Hanyi Liu, Rong Yin, Tian Tang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Large language models (LLMs) are increasingly adopted in clinical decision support, yet aligning them with the multifaceted reasoning pathways of real-world medicine remains a major challenge. Using more than 8,000 infertility treatment records, we systematically evaluate four alignment strategies: Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), Group Relative Policy Optimization (GRPO), and In-Context Learning (ICL) through a dual-layer framework combining automatic benchmar...

ID: 2511.18084v1 cs.LG, cs.AI

arXiv PDF

📄 A New Error Temporal Difference Algorithm for Deep Reinforcement Learning in Microgrid Optimization

2025-11-26

Авторы:

Fulong Yao, Wanqing Zhao, Matthew Forshaw

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Predictive control approaches based on deep reinforcement learning (DRL) have gained significant attention in microgrid energy optimization. However, existing research often overlooks the issue of uncertainty stemming from imperfect prediction models, which can lead to suboptimal control strategies. This paper presents a new error temporal difference (ETD) algorithm for DRL to address the uncertainty in predictions,aiming to improve the performance of microgrid operations. First,a microgrid syst...

ID: 2511.18093v1 cs.LG, cs.AI

arXiv PDF

📄 Graph Neural Networks vs Convolutional Neural Networks for Graph Domination Number Prediction

2025-11-26

Авторы:

Randy Davila, Beyzanur Ispir

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We investigate machine learning approaches to approximating the \emph{domination number} of graphs, the minimum size of a dominating set. Exact computation of this parameter is NP-hard, restricting classical methods to small instances. We compare two neural paradigms: Convolutional Neural Networks (CNNs), which operate on adjacency matrix representations, and Graph Neural Networks (GNNs), which learn directly from graph structure through message passing. Across 2,000 random graphs with up to 64 ...

ID: 2511.18150v1 cs.LG, cs.AI, math.CO

arXiv PDF

📄 Exploiting the Experts: Unauthorized Compression in MoE-LLMs

2025-11-26

Авторы:

Pinaki Prasad Guha Neogi, Ahmad Mohammadshirazi, Dheeraj Kulshrestha, Rajiv Ramnath

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Mixture-of-Experts (MoE) architectures are increasingly adopted in large language models (LLMs) for their scalability and efficiency. However, their modular structure introduces a unique vulnerability: adversaries can attempt to compress or repurpose models by pruning experts and cheaply fine-tuning the remainder, effectively bypassing licensing and security constraints. In this paper, we systematically study the prunability of MoE-LLMs under task-specific usage. We first develop an expert attri...

ID: 2511.19480v1 cs.LG, cs.AI

arXiv PDF

📄 MOMA-AC: A preference-driven actor-critic framework for continuous multi-objective multi-agent reinforcement learning

2025-11-26

Авторы:

Adam Callaghan, Karl Mason, Patrick Mannion

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

This paper addresses a critical gap in Multi-Objective Multi-Agent Reinforcement Learning (MOMARL) by introducing the first dedicated inner-loop actor-critic framework for continuous state and action spaces: Multi-Objective Multi-Agent Actor-Critic (MOMA-AC). Building on single-objective, single-agent algorithms, we instantiate this framework with Twin Delayed Deep Deterministic Policy Gradient (TD3) and Deep Deterministic Policy Gradient (DDPG), yielding MOMA-TD3 and MOMA-DDPG. The framework co...

ID: 2511.18181v1 cs.LG, cs.AI

arXiv PDF

📄 MultiDiffNet: A Multi-Objective Diffusion Framework for Generalizable Brain Decoding

2025-11-26

Авторы:

Mengchun Zhang, Kateryna Shapovalenko, Yucheng Shao, Eddie Guo, Parusha Pradhan

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Neural decoding from electroencephalography (EEG) remains fundamentally limited by poor generalization to unseen subjects, driven by high inter-subject variability and the lack of large-scale datasets to model it effectively. Existing methods often rely on synthetic subject generation or simplistic data augmentation, but these strategies fail to scale or generalize reliably. We introduce \textit{MultiDiffNet}, a diffusion-based framework that bypasses generative augmentation entirely by learning...

ID: 2511.18294v1 cs.LG, cs.AI, cs.HC, q-bio.NC

arXiv PDF

📄 Efficient Inference Using Large Language Models with Limited Human Data: Fine-Tuning then Rectification

2025-11-26

Авторы:

Lei Wang, Zikun Ye, Jinglong Zhao

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Driven by recent advances in artificial intelligence (AI), a growing body of work demonstrates the potential of using large language models (LLMs) to generate human-like responses in market research and social science applications. Two primary approaches can be applied to improve the performance of LLMs: fine-tuning, which aligns LLM predictions more closely with human responses, and rectification, which corrects biases in LLM outputs. In this paper, we develop a framework that combines fine-tun...

ID: 2511.19486v1 cs.LG, cs.AI

arXiv PDF

📄 AnyExperts: On-Demand Expert Allocation for Multimodal Language Models with Mixture of Expert

2025-11-26

Авторы:

Yuting Gao, Wang Lan, Hengyuan Zhao, Linjiang Huang, Si Liu, Qingpei Guo

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Multimodal Mixture-of-Experts (MoE) models offer a promising path toward scalable and efficient large vision-language systems. However, existing approaches rely on rigid routing strategies (typically activating a fixed number of experts per token) ignoring the inherent heterogeneity in semantic importance across modalities. This leads to suboptimal compute allocation, where redundant tokens consume as many resources as critical ones. To address this, we propose AnyExperts, a novel on-demand, bud...

ID: 2511.18314v1 cs.LG, cs.AI

arXiv PDF

📄 Clinician-in-the-Loop Smart Home System to Detect Urinary Tract Infection Flare-Ups via Uncertainty-Aware Decision Support

2025-11-26

Авторы:

Chibuike E. Ugwu, Roschelle Fritz, Diane J. Cook, Janardhan Rao Doppa

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Urinary tract infection (UTI) flare-ups pose a significant health risk for older adults with chronic conditions. These infections often go unnoticed until they become severe, making early detection through innovative smart home technologies crucial. Traditional machine learning (ML) approaches relying on simple binary classification for UTI detection offer limited utility to nurses and practitioners as they lack insight into prediction uncertainty, hindering informed clinical decision-making. Th...

ID: 2511.18334v1 cs.LG, cs.AI

arXiv PDF

Показано 241 - 250 из 2901 записей