📊 Статистика дайджестов

Всего дайджестов: 34123 Добавлено сегодня: 101

Последнее обновление: сегодня

📄 Revisiting Multimodal KV Cache Compression: A Frequency-Domain-Guided Outlier-KV-Aware Approach

2025-11-24

Авторы:

Yaoxin Yang, Peng Ye, Xudong Tan, Chongjun Tu, Maosen Zhao, Jia Hao, Tao Chen

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Multimodal large language models suffer from substantial inference overhead since multimodal KV Cache grows proportionally with the visual input length. Existing multimodal KV Cache compression methods mostly rely on attention score to reduce cache size, which makes them are incompatible with established efficient attention kernels (e.g., FlashAttention) and ignores the contribution of value vectors to the attention output. In this work, we revisit multimodal KV Cache compression from the perspe...

ID: 2511.16786v1 cs.LG, cs.AI, cs.CV

arXiv PDF

📄 DDTime: Dataset Distillation with Spectral Alignment and Information Bottleneck for Time-Series Forecasting

2025-11-24

Авторы:

Yuqi Li, Kuiye Ding, Chuanguang Yang, Hao Wang, Haoxuan Wang, Huiran Duan, Junming Liu, Yingli Tian

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Time-series forecasting is fundamental across many domains, yet training accurate models often requires large-scale datasets and substantial computational resources. Dataset distillation offers a promising alternative by synthesizing compact datasets that preserve the learning behavior of full data. However, extending dataset distillation to time-series forecasting is non-trivial due to two fundamental challenges: 1.temporal bias from strong autocorrelation, which leads to distorted value-term a...

ID: 2511.16715v1 cs.LG, cs.AI

arXiv PDF

📄 Physics-Guided Inductive Spatiotemporal Kriging for PM2.5 with Satellite Gradient Constraints

2025-11-22

Авторы:

Shuo Wang, Mengfan Teng, Yun Cheng, Lothar Thiele, Olga Saukh, Shuangshuang He, Yuanting Zhang, Jiang Zhang, Gangfeng Zhang, Xingyuan Yuan, Jingfang Fan

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

High-resolution mapping of fine particulate matter (PM2.5) is a cornerstone of sustainable urbanism but remains critically hindered by the spatial sparsity of ground monitoring networks. While traditional data-driven methods attempt to bridge this gap using satellite Aerosol Optical Depth (AOD), they often suffer from severe, non-random data missingness (e.g., due to cloud cover or nighttime) and inversion biases. To overcome these limitations, this study proposes the Spatiotemporal Physics-Guid...

ID: 2511.16013v1 cs.LG, cs.AI

arXiv PDF

📄 Synergizing Deconfounding and Temporal Generalization For Time-series Counterfactual Outcome Estimation

2025-11-22

Авторы:

Yiling Liu, Juncheng Dong, Chen Fu, Wei Shi, Ziyang Jiang, Zhigang Hua, David Carlson

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Estimating counterfactual outcomes from time-series observations is crucial for effective decision-making, e.g. when to administer a life-saving treatment, yet remains significantly challenging because (i) the counterfactual trajectory is never observed and (ii) confounders evolve with time and distort estimation at every step. To address these challenges, we propose a novel framework that synergistically integrates two complementary approaches: Sub-treatment Group Alignment (SGA) and Random Tem...

ID: 2511.16006v1 cs.LG, cs.AI

arXiv PDF

📄 CARE: Turning LLMs Into Causal Reasoning Expert

2025-11-22

Авторы:

Juncheng Dong, Yiling Liu, Ahmed Aloui, Vahid Tarokh, David Carlson

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Large language models (LLMs) have recently demonstrated impressive capabilities across a range of reasoning and generation tasks. However, research studies have shown that LLMs lack the ability to identify causal relationships, a fundamental cornerstone of human intelligence. We first conduct an exploratory investigation of LLMs' behavior when asked to perform a causal-discovery task and find that they mostly rely on the semantic meaning of variable names, ignoring the observation data. This is ...

ID: 2511.16016v1 cs.LG, cs.AI

arXiv PDF

📄 HGCN2SP: Hierarchical Graph Convolutional Network for Two-Stage Stochastic Programming

2025-11-22

Авторы:

Yang Wu, Yifan Zhang, Zhenxing Liang, Jian Cheng

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Two-stage Stochastic Programming (2SP) is a standard framework for modeling decision-making problems under uncertainty. While numerous methods exist, solving such problems with many scenarios remains challenging. Selecting representative scenarios is a practical method for accelerating solutions. However, current approaches typically rely on clustering or Monte Carlo sampling, failing to integrate scenario information deeply and overlooking the significant impact of the scenario order on solving...

ID: 2511.16027v1 cs.LG, cs.AI

arXiv PDF

📄 A Mathematical Framework for Custom Reward Functions in Job Application Evaluation using Reinforcement Learning

2025-11-22

Авторы:

Shreyansh Jain, Madhav Singhvi, Shreya Rahul Jain, Pranav S, Dishaa Lokesh, Naren Chittibabu, Akash Anandhan

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Conventional Applicant Tracking Systems (ATS) tend to be inflexible keyword-matchers, and deny gifted candidates a role due to a few minor semantic mismatches. This article describes a new two-step process to design a more refined resume evaluation model based on a small language model (<600M parameters) that is finetuned using GRPO on a custom reward function. To begin with, Supervised Fine-Tuning (SFT) was used to build a solid baseline model. Second, this SFT model was also optimized with the...

ID: 2511.16073v1 cs.LG, cs.AI, cs.MA

arXiv PDF

📄 Mitigating Estimation Bias with Representation Learning in TD Error-Driven Regularization

2025-11-22

Авторы:

Haohui Chen, Zhiyong Chen, Aoxiang Liu, Wentuo Fang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Deterministic policy gradient algorithms for continuous control suffer from value estimation biases that degrade performance. While double critics reduce such biases, the exploration potential of double actors remains underexplored. Building on temporal-difference error-driven regularization (TDDR), a double actor-critic framework, this work introduces enhanced methods to achieve flexible bias control and stronger representation learning. We propose three convex combination strategies, symmetric...

ID: 2511.16090v1 cs.LG, cs.AI

arXiv PDF

📄 Labels Matter More Than Models: Quantifying the Benefit of Supervised Time Series Anomaly Detection

2025-11-22

Авторы:

Zhijie Zhong, Zhiwen Yu, Kaixiang Yang, C. L. Philip Chen

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Time series anomaly detection (TSAD) is a critical data mining task often constrained by label scarcity. Consequently, current research predominantly focuses on Unsupervised Time-series Anomaly Detection (UTAD), relying on complex architectures to model normal data distributions. However, this approach often overlooks the significant performance gains available from limited anomaly labels achievable in practical scenarios. This paper challenges the premise that architectural complexity is the op...

ID: 2511.16145v1 cs.LG, cs.AI

arXiv PDF

📄 Are Foundation Models Useful for Bankruptcy Prediction?

2025-11-22

Авторы:

Marcin Kostrzewa, Oleksii Furman, Roman Furman, Sebastian Tomczak, Maciej Zięba

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Foundation models have shown promise across various financial applications, yet their effectiveness for corporate bankruptcy prediction remains systematically unevaluated against established methods. We study bankruptcy forecasting using Llama-3.3-70B-Instruct and TabPFN, evaluated on large, highly imbalanced datasets of over one million company records from the Visegrád Group. We provide the first systematic comparison of foundation models against classical machine learning baselines for this t...

ID: 2511.16375v1 cs.LG, cs.AI

arXiv PDF

Показано 341 - 350 из 2912 записей