📄 Data Efficiency and Transfer Robustness in Biomedical Image Segmentation: A Study of Redundancy and Forgetting with Cellpose

2025-11-11

Authors:

Shuo Zhao, Jianxu Chen

Abstract:

Generalist biomedical image segmentation models such as Cellpose are increasingly applied across diverse imaging modalities and cell types. However, two critical challenges remain underexplored: (1) the extent of training data redundancy and (2) the impact of cross domain transfer on model retention. In this study, we conduct a systematic empirical analysis of these challenges using Cellpose as a case study. First, to assess data redundancy, we propose a simple dataset quantization (DQ) strategy...

ID: 2511.04803v1 cs.CV, cs.AI, cs.LG, I.2.10; I.4.6

📄 An Active Learning Pipeline for Biomedical Image Instance Segmentation with Minimal Human Intervention

2025-11-11

Authors:

Shuo Zhao, Yu Zhou, Jianxu Chen

Abstract:

Biomedical image segmentation is critical for precise structure delineation and downstream analysis. Traditional methods often struggle with noisy data, while deep learning models such as U-Net have set new benchmarks in segmentation performance. nnU-Net further automates model configuration, making it adaptable across datasets without extensive tuning. However, it requires a substantial amount of annotated data for cross-validation, posing a challenge when only raw images but no labels are avai...

ID: 2511.04811v1 cs.CV, cs.AI, cs.LG, 68T07, 68U10, I.2.10; I.4.6; J.3

📄 Learning Fourier shapes to probe the geometric world of deep neural networks

2025-11-11

Authors:

Jian Wang, Yixing Yong, Haixia Bi, Lijun He, Fan Li

Abstract:

While both shape and texture are fundamental to visual recognition, research on deep neural networks (DNNs) has predominantly focused on the latter, leaving their geometric understanding poorly probed. Here, we show: first, that optimized shapes can act as potent semantic carriers, generating high-confidence classifications from inputs defined purely by their geometry; second, that they are high-fidelity interpretability tools that precisely isolate a model's salient regions; and third, that the...

ID: 2511.04970v1 cs.CV, cs.AI, cs.LG

📄 Rethinking Metrics and Diffusion Architecture for 3D Point Cloud Generation

2025-11-11

Authors:

Matteo Bastico, David Ryckelynck, Laurent Corté, Yannick Tillier, Etienne Decencière

Abstract:

As 3D point clouds become a cornerstone of modern technology, the need for sophisticated generative models and reliable evaluation metrics has grown exponentially. In this work, we first expose that some commonly used metrics for evaluating generated point clouds, particularly those based on Chamfer Distance (CD), lack robustness against defects and fail to capture geometric fidelity and local shape consistency when used as quality indicators. We further show that introducing samples alignment p...

ID: 2511.05308v1 cs.CV, cs.AI, cs.LG

📄 Investigating Robot Control Policy Learning for Autonomous X-ray-guided Spine Procedures

2025-11-08

Authors:

Florence Klitzner, Blanca Inigo, Benjamin D. Killeen, Lalithkumar Seenivasan, Michelle Song, Axel Krieger, Mathias Unberath

Abstract:

Imitation learning-based robot control policies are enjoying renewed interest in video-based robotics. However, it remains unclear whether this approach applies to X-ray-guided procedures, such as spine instrumentation. This is because interpretation of multi-view X-rays is complex. We examine opportunities and challenges for imitation policy learning in bi-plane-guided cannula insertion. We develop an in silico sandbox for scalable, automated simulation of X-ray-guided spine procedures with a h...

ID: 2511.03882v1 cs.CV, cs.AI, cs.LG, cs.RO

📄 MedSapiens: Taking a Pose to Rethink Medical Imaging Landmark Detection

2025-11-08

Authors:

Marawan Elbatel, Anbang Wang, Keyuan Liu, Kaouther Mouheb, Enrique Almar-Munoz, Lizhuo Lin, Yanqi Yang, Karim Lekadir, Xiaomeng Li

Abstract:

This paper does not introduce a novel architecture; instead, it revisits a fundamental yet overlooked baseline: adapting human-centric foundation models for anatomical landmark detection in medical imaging. While landmark detection has traditionally relied on domain-specific models, the emergence of large-scale pre-trained vision models presents new opportunities. In this study, we investigate the adaptation of Sapiens, a human-centric foundation model designed for pose estimation, to medical im...

ID: 2511.04255v1 cs.CV, cs.AI, cs.LG

📄 Occlusion-Aware Diffusion Model for Pedestrian Intention Prediction

2025-11-07

Authors:

Yu Liu, Zhijie Liu, Zedong Yang, You-Fu Li, He Kong

Abstract:

Predicting pedestrian crossing intentions is crucial for the navigation of mobile robots and intelligent vehicles. Although recent deep learning-based models have shown significant success in forecasting intentions, few consider incomplete observation under occlusion scenarios. To tackle this challenge, we propose an Occlusion-Aware Diffusion Model (ODM) that reconstructs occluded motion patterns and leverages them to guide future intention prediction. During the denoising stage, we introduce an...

ID: 2511.00858v1 cs.CV, cs.AI, cs.LG

📄 GeoToken: Hierarchical Geolocalization of Images via Next Token Prediction

2025-11-07

Authors:

Narges Ghasemi, Amir Ziashahabi, Salman Avestimehr, Cyrus Shahabi

Abstract:

Image geolocalization, the task of determining an image's geographic origin, poses significant challenges, largely due to visual similarities across disparate locations and the large search space. To address these issues, we propose a hierarchical sequence prediction approach inspired by how humans narrow down locations from broad regions to specific addresses. Analogously, our model predicts geographic tokens hierarchically, first identifying a general region and then sequentially refining pred...

ID: 2511.01082v1 cs.CV, cs.AI, cs.LG

📄 SliceVision-F2I: A Synthetic Feature-to-Image Dataset for Visual Pattern Representation on Network Slices

2025-11-07

Authors:

Md. Abid Hasan Rafi, Mst. Fatematuj Johora, Pankaj Bhowmik

Abstract:

The emergence of 5G and 6G networks has established network slicing as a significant part of future service-oriented architectures, demanding refined identification methods supported by robust datasets. The article presents SliceVision-F2I, a dataset of synthetic samples for studying feature visualization in network slicing for next-generation networking systems. The dataset transforms multivariate Key Performance Indicator (KPI) vectors into visual representations through four distinct encoding...

ID: 2511.01087v1 cs.CV, cs.AI, cs.LG

📄 EvtSlowTV -- A Large and Diverse Dataset for Event-Based Depth Estimation

2025-11-07

Authors:

Sadiq Layi Macaulay, Nimet Kaygusuz, Simon Hadfield

Abstract:

Event cameras, with their high dynamic range (HDR) and low latency, offer a promising alternative for robust depth estimation in challenging environments. However, many event-based depth estimation approaches are constrained by small-scale annotated datasets, limiting their generalizability to real-world scenarios. To bridge this gap, we introduce EvtSlowTV, a large-scale event camera dataset curated from publicly available YouTube footage, which contains more than 13B events across various envi...

ID: 2511.02953v1 cs.CV, cs.AI, cs.LG, cs.RO
