📊 Статистика дайджестов

Всего дайджестов: 34022 Добавлено сегодня: 0

Последнее обновление: сегодня

📄 UniGame: Turning a Unified Multimodal Model Into Its Own Adversary

2025-11-26

Авторы:

Zhaolong Su, Wang Lu, Hao Chen, Sharon Li, Jindong Wang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Unified Multimodal Models (UMMs) have shown impressive performance in both understanding and generation with a single architecture. However, UMMs still exhibit a fundamental inconsistency: understanding favors compact embeddings, whereas generation favors reconstruction-rich representations. This structural trade-off produces misaligned decision boundaries, degraded cross-modal coherence, and heightened vulnerability under distributional and adversarial shifts. In this paper, we present UniGame,...

ID: 2511.19413v1 cs.LG, cs.AI, cs.CV

arXiv PDF

📄 Many Ways to be Right: Rashomon Sets for Concept-Based Neural Networks

2025-11-26

Авторы:

Shihan Feng, Cheng Zhang, Michael Xi, Ethan Hsu, Lesia Semenova, Chudi Zhong

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Modern neural networks rarely have a single way to be right. For many tasks, multiple models can achieve identical performance while relying on different features or reasoning patterns, a property known as the Rashomon Effect. However, uncovering this diversity in deep architectures is challenging as their continuous parameter spaces contain countless near-optimal solutions that are numerically distinct but often behaviorally similar. We introduce Rashomon Concept Bottleneck Models, a framework ...

ID: 2511.19636v1 cs.LG, cs.AI

arXiv PDF

📄 TiCT: A Synthetically Pre-Trained Foundation Model for Time Series Classification

2025-11-26

Авторы:

Chin-Chia Michael Yeh, Uday Singh Saini, Junpeng Wang, Xin Dai, Xiran Fan, Jiarui Sun, Yujie Fan, Yan Zheng

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

The ubiquity of time series data creates a strong demand for general-purpose foundation models, yet developing them for classification remains a significant challenge, largely due to the high cost of labeled data. Foundation models capable of in-context learning (ICL) offer a powerful solution, adapting to new tasks with minimal examples and reducing the need for extensive retraining. However, prior work on large-scale time series models has predominantly focused on forecasting, leaving a critic...

ID: 2511.19694v1 cs.LG, cs.AI

arXiv PDF

📄 TREASURE: A Transformer-Based Foundation Model for High-Volume Transaction Understanding

2025-11-26

Авторы:

Chin-Chia Michael Yeh, Uday Singh Saini, Xin Dai, Xiran Fan, Shubham Jain, Yujie Fan, Jiarui Sun, Junpeng Wang, Menghai Pan, Yingtong Dou, Yuzhong Chen, Vineeth Rakesh, Liang Wang, Yan Zheng, Mahashweta Das

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Payment networks form the backbone of modern commerce, generating high volumes of transaction records from daily activities. Properly modeling this data can enable applications such as abnormal behavior detection and consumer-level insights for hyper-personalized experiences, ultimately improving people's lives. In this paper, we present TREASURE, TRansformer Engine As Scalable Universal transaction Representation Encoder, a multipurpose transformer-based foundation model specifically designed f...

ID: 2511.19693v1 cs.LG, cs.AI

arXiv PDF

📄 Terminal Velocity Matching

2025-11-26

Авторы:

Linqi Zhou, Mathias Parger, Ayaan Haque, Jiaming Song

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We propose Terminal Velocity Matching (TVM), a generalization of flow matching that enables high-fidelity one- and few-step generative modeling. TVM models the transition between any two diffusion timesteps and regularizes its behavior at its terminal time rather than at the initial time. We prove that TVM provides an upper bound on the $2$-Wasserstein distance between data and model distributions when the model is Lipschitz continuous. However, since Diffusion Transformers lack this property, w...

ID: 2511.19797v1 cs.LG, cs.AI, cs.CV, stat.ML

arXiv PDF

📄 Learning to Clean: Reinforcement Learning for Noisy Label Correction

2025-11-26

Авторы:

Marzi Heidari, Hanping Zhang, Yuhong Guo

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

The challenge of learning with noisy labels is significant in machine learning, as it can severely degrade the performance of prediction models if not addressed properly. This paper introduces a novel framework that conceptualizes noisy label correction as a reinforcement learning (RL) problem. The proposed approach, Reinforcement Learning for Noisy Label Correction (RLNLC), defines a comprehensive state space representing data and their associated labels, an action space that indicates possible...

ID: 2511.19808v1 cs.LG, cs.AI

arXiv PDF

📄 Mosaic Pruning: A Hierarchical Framework for Generalizable Pruning of Mixture-of-Experts Models

2025-11-26

Авторы:

Wentao Hu, Mingkuan Zhao, Shuangyong Song, Xiaoyan Zhu, Xin Lai, Jiayin Wang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Sparse Mixture-of-Experts (SMoE) architectures have enabled a new frontier in scaling Large Language Models (LLMs), offering superior performance by activating only a fraction of their total parameters during inference. However, their practical deployment is severely hampered by substantial static memory overhead, as all experts must be loaded into memory. Existing post-training pruning methods, while reducing model size, often derive their pruning criteria from a single, general-purpose corpus....

ID: 2511.19822v1 cs.LG, cs.AI

arXiv PDF

📄 Cisco Time Series Model Technical Report

2025-11-26

Авторы:

Liang Gou, Archit Khare, Praneet Pabolu, Prachi Patel, Joseph Ross, Hercy Shen, Yuhan, Song, Jingze Sun, Kristal Curtis, Vedant Dharnidharka, Abhinav Mathur, Hao Yang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We introduce the Cisco Time Series Model, a univariate zero-shot forecaster. This time series foundation model is the result of a general architectural innovation to a time series model enabling it to accept multiresolution input, applied to a popular decoder-only time series model (TimesFM). The resulting multiresolution decoder-only model is trained on over 300B unique data points, with more than half coming from the observability domain. Quantitative and qualitative evaluations demonstrate th...

ID: 2511.19841v1 cs.LG, cs.AI, stat.ML

arXiv PDF

📄 GED-Consistent Disentanglement of Aligned and Unaligned Substructures for Graph Similarity Learning

2025-11-26

Авторы:

Zhentao Zhan, Xiaoliang Xu, Jingjing Wang, Junmei Wang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Graph Similarity Computation (GSC) is a fundamental graph related task where Graph Edit Distance (GED) serves as a prevalent metric. GED is determined by an optimal alignment between a pair of graphs that partitions each into aligned (zero-cost) and unaligned (cost-incurring) substructures. Due to NP-hard nature of exact GED computation, GED approximations based on Graph Neural Network(GNN) have emerged. Existing GNN-based GED approaches typically learn node embeddings for each graph and then ag...

ID: 2511.19837v1 cs.LG, cs.AI, cs.DB

arXiv PDF

📄 Optimize Flip Angle Schedules In MR Fingerprinting Using Reinforcement Learning

2025-11-26

Авторы:

Shenjun Zhong, Zhifeng Chen, Zhaolin Chen

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Magnetic Resonance Fingerprinting (MRF) leverages transient-state signal dynamics generated by the tunable acquisition parameters, making the design of an optimal, robust sequence a complex, high-dimensional sequential decision problem, such as optimizing one of the key parameters, flip angle. Reinforcement learning (RL) offers a promising approach to automate parameter selection, to optimize pulse sequences that maximize the distinguishability of fingerprints across the parameter space. In this...

ID: 2511.19941v1 cs.LG, cs.AI, cs.CE

arXiv PDF

Показано 291 - 300 из 2901 записей