📊 Статистика дайджестов

Всего дайджестов: 34022 Добавлено сегодня: 82

Последнее обновление: сегодня

📄 OmniLens++: Blind Lens Aberration Correction via Large LensLib Pre-Training and Latent PSF Representation

2025-11-26

Авторы:

Qi Jiang, Xiaolong Qian, Yao Gao, Lei Sun, Kailun Yang, Zhonghua Yi, Wenyong Li, Ming-Hsuan Yang, Luc Van Gool, Kaiwei Wang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Emerging deep-learning-based lens library pre-training (LensLib-PT) pipeline offers a new avenue for blind lens aberration correction by training a universal neural network, demonstrating strong capability in handling diverse unknown optical degradations. This work proposes the OmniLens++ framework, which resolves two challenges that hinder the generalization ability of existing pipelines: the difficulty of scaling data and the absence of prior guidance characterizing optical degradation. To imp...

ID: 2511.17126v3 eess.IV, cs.CV, cs.LG, physics.optics

arXiv PDF

📄 AdaPerceiver: Transformers with Adaptive Width, Depth, and Tokens

2025-11-26

Авторы:

Purvish Jajal, Nick John Eliopoulos, Benjamin Shiue-Hal Chou, George K. Thiruvathukal, Yung-Hsiang Lu, James C. Davis

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Modern transformer architectures achieve remarkable performance across tasks and domains but remain rigid in how they allocate computation at inference time. Real-world deployment often requires models to adapt to diverse hardware and latency constraints, yet most approaches to dynamic computation focus on a single axis -- such as reducing the number of tokens. We present a novel capability: AdaPerceiver, the first transformer architecture with unified adaptivity across depth, width, and tokens ...

ID: 2511.18105v1 cs.CV, cs.LG

arXiv PDF

📄 A Multi-Stage Deep Learning Framework with PKCP-MixUp Augmentation for Pediatric Liver Tumor Diagnosis Using Multi-Phase Contrast-Enhanced CT

2025-11-26

Авторы:

Wanqi Wang, Chun Yang, Jianbo Shao, Yaokai Zhang, Xuehua Peng, Jin Sun, Chao Xiong, Long Lu, Lianting Hu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Pediatric liver tumors are one of the most common solid tumors in pediatrics, with differentiation of benign or malignant status and pathological classification critical for clinical treatment. While pathological examination is the gold standard, the invasive biopsy has notable limitations: the highly vascular pediatric liver and fragile tumor tissue raise complication risks such as bleeding; additionally, young children with poor compliance require anesthesia for biopsy, increasing medical cost...

ID: 2511.19478v1 eess.IV, cs.CV, cs.LG

arXiv PDF

📄 Observer Actor: Active Vision Imitation Learning with Sparse View Gaussian Splatting

2025-11-26

Авторы:

Yilong Wang, Cheng Qian, Ruomeng Fan, Edward Johns

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We propose Observer Actor (ObAct), a novel framework for active vision imitation learning in which the observer moves to optimal visual observations for the actor. We study ObAct on a dual-arm robotic system equipped with wrist-mounted cameras. At test time, ObAct dynamically assigns observer and actor roles: the observer arm constructs a 3D Gaussian Splatting (3DGS) representation from three images, virtually explores this to find an optimal camera pose, then moves to this pose; the actor arm t...

ID: 2511.18140v1 cs.RO, cs.CV, cs.LG

arXiv PDF

📄 AVERY: Adaptive VLM Split Computing through Embodied Self-Awareness for Efficient Disaster Response Systems

2025-11-26

Авторы:

Rajat Bhattacharjya, Sing-Yao Wu, Hyunwoo Oh, Chaewon Nam, Suyeon Koo, Mohsen Imani, Elaheh Bozorgzadeh, Nikil Dutt

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Unmanned Aerial Vehicles (UAVs) in disaster response require complex, queryable intelligence that on-board CNNs cannot provide. While Vision-Language Models (VLMs) offer this semantic reasoning, their high resource demands make on-device deployment infeasible, and naive cloud offloading fails under the low-bandwidth networks common in disaster zones. We present AVERY, a framework that enables VLM deployment through adaptive split computing. We advance the split computing paradigm beyond traditio...

ID: 2511.18151v1 cs.DC, cs.AR, cs.CV, cs.LG, cs.NI

arXiv PDF

📄 Using MLIR Transform to Design Sliced Convolution Algorithm

2025-11-26

Авторы:

Victor Ferrari, Marcio Pereira, Lucas Alvarenga, Gustavo Leite, Guido Araujo

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

This paper proposes SConvTransform, a Transform dialect extension that provides operations for optimizing 2D convolutions in MLIR. Its main operation, SConvOp, lowers Linalg convolutions into tiled and packed generic operations through a fully declarative transformation pipeline. The process is guided by a Convolution Slicing Analysis that determines tile sizes and data layout strategies based on input and filter shapes, as well as target architecture parameters. SConvOp handles edge cases by sp...

ID: 2511.18222v1 cs.CV, cs.LG, cs.PF

arXiv PDF

📄 Learning Visually Interpretable Oscillator Networks for Soft Continuum Robots from Video

2025-11-26

Авторы:

Henrik Krauss, Johann Licher, Naoya Takeishi, Annika Raatz, Takehisa Yairi

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Data-driven learning of soft continuum robot (SCR) dynamics from high-dimensional observations offers flexibility but often lacks physical interpretability, while model-based approaches require prior knowledge and can be computationally expensive. We bridge this gap by introducing (1) the Attention Broadcast Decoder (ABCD), a plug-and-play module for autoencoder-based latent dynamics learning that generates pixel-accurate attention maps localizing each latent dimension's contribution while filte...

ID: 2511.18322v1 cs.RO, cs.CV, cs.LG

arXiv PDF

📄 NeuroVascU-Net: A Unified Multi-Scale and Cross-Domain Adaptive Feature Fusion U-Net for Precise 3D Segmentation of Brain Vessels in Contrast-Enhanced T1 MRI

2025-11-26

Авторы:

Mohammad Jafari Vayeghan, Niloufar Delfan, Mehdi Tale Masouleh, Mansour Parvaresh Rizi, Behzad Moshiri

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Precise 3D segmentation of cerebral vasculature from T1-weighted contrast-enhanced (T1CE) MRI is crucial for safe neurosurgical planning. Manual delineation is time-consuming and prone to inter-observer variability, while current automated methods often trade accuracy for computational cost, limiting clinical use. We present NeuroVascU-Net, the first deep learning architecture specifically designed to segment cerebrovascular structures directly from clinically standard T1CE MRI in neuro-oncology...

ID: 2511.18422v1 cs.CV, cs.LG

arXiv PDF

📄 Functional Localization Enforced Deep Anomaly Detection Using Fundus Images

2025-11-26

Авторы:

Jan Benedikt Ruhland, Thorsten Papenbrock, Jan-Peter Sowa, Ali Canbay, Nicole Eter, Bernd Freisleben, Dominik Heider

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Reliable detection of retinal diseases from fundus images is challenged by the variability in imaging quality, subtle early-stage manifestations, and domain shift across datasets. In this study, we systematically evaluated a Vision Transformer (ViT) classifier under multiple augmentation and enhancement strategies across several heterogeneous public datasets, as well as the AEyeDB dataset, a high-quality fundus dataset created in-house and made available for the research community. The ViT demon...

ID: 2511.18627v1 cs.CV, cs.LG

arXiv PDF

📄 Dendritic Convolution for Noise Image Recognition

2025-11-26

Авторы:

Jiarui Xue, Dongjian Yang, Ye Sun, Gang Liu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

In real-world scenarios of image recognition, there exists substantial noise interference. Existing works primarily focus on methods such as adjusting networks or training strategies to address noisy image recognition, and the anti-noise performance has reached a bottleneck. However, little is known about the exploration of anti-interference solutions from a neuronal perspective.This paper proposes an anti-noise neuronal convolution. This convolution mimics the dendritic structure of neurons, in...

ID: 2511.18699v1 cs.CV, cs.LG

arXiv PDF

Показано 71 - 80 из 835 записей