📊 Статистика дайджестов

Всего дайджестов: 35039 Добавлено сегодня: 432

Последнее обновление: сегодня

📄 Layer as Puzzle Pieces: Compressing Large Language Models through Layer Concatenation

2025-10-21

Авторы:

Fei Wang, Li Shen, Liang Ding, Chao Xue, Ye Liu, Changxing Ding

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Large Language Models excel at natural language processing tasks, but their massive size leads to high computational and storage demands. Recent works have sought to reduce their model size through layer-wise structured pruning. However, they tend to ignore retaining the capabilities in the pruned part. In this work, we re-examine structured pruning paradigms and uncover several key limitations: 1) notable performance degradation due to direct layer removal, 2) incompetent linear weight layer ag...

ID: 2510.15304v1 cs.CV, cs.LG

arXiv PDF

📄 RankSEG-RMA: An Efficient Segmentation Algorithm via Reciprocal Moment Approximation

2025-10-21

Авторы:

Zixun Wang, Ben Dai

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Semantic segmentation labels each pixel in an image with its corresponding class, and is typically evaluated using the Intersection over Union (IoU) and Dice metrics to quantify the overlap between predicted and ground-truth segmentation masks. In the literature, most existing methods estimate pixel-wise class probabilities, then apply argmax or thresholding to obtain the final prediction. These methods have been shown to generally lead to inconsistent or suboptimal results, as they do not direc...

ID: 2510.15362v1 stat.ML, cs.CV, cs.LG

arXiv PDF

📄 LILAC: Long-sequence Incremental Low-latency Arbitrary Motion Stylization via Streaming VAE-Diffusion with Causal Decoding

2025-10-21

Авторы:

Peng Ren, Hai Yang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Generating long and stylized human motions in real time is critical for applications that demand continuous and responsive character control. Despite its importance, existing streaming approaches often operate directly in the raw motion space, leading to substantial computational overhead and making it difficult to maintain temporal stability. In contrast, latent-space VAE-Diffusion-based frameworks alleviate these issues and achieve high-quality stylization, but they are generally confined to o...

ID: 2510.15392v1 cs.CV, cs.LG

arXiv PDF

📄 Semantic4Safety: Causal Insights from Zero-shot Street View Imagery Segmentation for Urban Road Safety

2025-10-21

Авторы:

Huan Chen, Ting Han, Siyu Chen, Zhihao Guo, Yiping Chen, Meiliu Wu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Street-view imagery (SVI) offers a fine-grained lens on traffic risk, yet two fundamental challenges persist: (1) how to construct street-level indicators that capture accident-related features, and (2) how to quantify their causal impacts across different accident types. To address these challenges, we propose Semantic4Safety, a framework that applies zero-shot semantic segmentation to SVIs to derive 11 interpretable streetscape indicators, and integrates road type as contextual information to ...

ID: 2510.15434v1 cs.CV, cs.LG

arXiv PDF

📄 Latent Feature Alignment: Discovering Biased and Interpretable Subpopulations in Face Recognition Models

2025-10-21

Авторы:

Ignacio Serna

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Modern face recognition models achieve high overall accuracy but continue to exhibit systematic biases that disproportionately affect certain subpopulations. Conventional bias evaluation frameworks rely on labeled attributes to form subpopulations, which are expensive to obtain and limited to predefined categories. We introduce Latent Feature Alignment (LFA), an attribute-label-free algorithm that uses latent directions to identify subpopulations. This yields two main benefits over standard clus...

ID: 2510.15520v1 cs.CV, cs.LG

arXiv PDF

📄 VO-DP: Semantic-Geometric Adaptive Diffusion Policy for Vision-Only Robotic Manipulation

2025-10-21

Авторы:

Zehao Ni, Yonghao He, Lingfeng Qian, Jilei Mao, Fa Fu, Wei Sui, Hu Su, Junran Peng, Zhipeng Wang, Bin He

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

In the context of imitation learning, visuomotor-based diffusion policy learning is one of the main directions in robotic manipulation. Most of these approaches rely on point clouds as observation inputs and construct scene representations through point clouds feature learning, which enables them to achieve remarkable accuracy. However, the existing literature lacks an in-depth exploration of vision-only solutions that have significant potential. In this paper, we propose a Vision-Only and singl...

ID: 2510.15530v1 cs.RO, cs.CV, cs.LG

arXiv PDF

📄 QSilk: Micrograin Stabilization and Adaptive Quantile Clipping for Detail-Friendly Latent Diffusion

2025-10-21

Авторы:

Denis Rychkovskiy

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We present QSilk, a lightweight, always-on stabilization layer for latent diffusion that improves high-frequency fidelity while suppressing rare activation spikes. QSilk combines (i) a per-sample micro clamp that gently limits extreme values without washing out texture, and (ii) Adaptive Quantile Clip (AQClip), which adapts the allowed value corridor per region. AQClip can operate in a proxy mode using local structure statistics or in an attention entropy guided mode (model confidence). Integrat...

ID: 2510.15761v1 cs.CV, cs.LG, 68T07, 68U10, I.2.10; I.4.8; I.4.9

arXiv PDF

📄 Towards more holistic interpretability: A lightweight disentangled Concept Bottleneck Model

2025-10-21

Авторы:

Gaoxiang Huang, Songning Lai, Yutao Yue

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Concept Bottleneck Models (CBMs) enhance interpretability by predicting human-understandable concepts as intermediate representations. However, existing CBMs often suffer from input-to-concept mapping bias and limited controllability, which restricts their practical value, directly damage the responsibility of strategy from concept-based methods. We propose a lightweight Disentangled Concept Bottleneck Model (LDCBM) that automatically groups visual features into semantically meaningful component...

ID: 2510.15770v1 cs.CV, cs.LG

arXiv PDF

📄 Post-surgical Endometriosis Segmentation in Laparoscopic Videos

2025-10-18

Авторы:

Andreas Leibetseder, Klaus Schoeffmann, Jörg Keckstein, Simon Keckstein

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Endometriosis is a common women's condition exhibiting a manifold visual appearance in various body-internal locations. Having such properties makes its identification very difficult and error-prone, at least for laymen and non-specialized medical practitioners. In an attempt to provide assistance to gynecologic physicians treating endometriosis, this demo paper describes a system that is trained to segment one frequently occurring visual appearance of endometriosis, namely dark endometrial impl...

ID: 2510.13899v1 cs.CV, cs.LG, cs.MM

arXiv PDF

📄 PoissonNet: A Local-Global Approach for Learning on Surfaces

2025-10-18

Авторы:

Arman Maesumi, Tanish Makadia, Thibault Groueix, Vladimir G. Kim, Daniel Ritchie, Noam Aigerman

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Many network architectures exist for learning on meshes, yet their constructions entail delicate trade-offs between difficulty learning high-frequency features, insufficient receptive field, sensitivity to discretization, and inefficient computational overhead. Drawing from classic local-global approaches in mesh processing, we introduce PoissonNet, a novel neural architecture that overcomes all of these deficiencies by formulating a local-global learning scheme, which uses Poisson's equation as...

ID: 2510.14146v1 cs.GR, cs.CV, cs.LG

arXiv PDF

Показано 331 - 340 из 863 записей