📊 Статистика дайджестов

Всего дайджестов: 35039 Добавлено сегодня: 432

Последнее обновление: сегодня

📄 Exploring the Hierarchical Reasoning Model for Small Natural-Image Classification Without Augmentation

2025-10-08

Авторы:

Alexander V. Mantzaris

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

This paper asks whether the Hierarchical Reasoning Model (HRM) with the two Transformer-style modules $(f_L,f_H)$, one step (DEQ-style) training, deep supervision, Rotary Position Embeddings, and RMSNorm can serve as a practical image classifier. It is evaluated on MNIST, CIFAR-10, and CIFAR-100 under a deliberately raw regime: no data augmentation, identical optimizer family with one-epoch warmup then cosine-floor decay, and label smoothing. HRM optimizes stably and performs well on MNIST ($\ap...

ID: 2510.03598v1 cs.CV, cs.LG

arXiv PDF

📄 Unsupervised Transformer Pre-Training for Images: Self-Distillation, Mean Teachers, and Random Crops

2025-10-08

Авторы:

Mattia Scardecchia

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Recent advances in self-supervised learning (SSL) have made it possible to learn general-purpose visual features that capture both the high-level semantics and the fine-grained spatial structure of images. Most notably, the recent DINOv2 has established a new state of the art by surpassing weakly supervised methods (WSL) like OpenCLIP on most benchmarks. In this survey, we examine the core ideas behind its approach, multi-crop view augmentation and self-distillation with a mean teacher, and trac...

ID: 2510.03606v1 cs.CV, cs.LG, eess.IV

arXiv PDF

📄 Mapping Rio de Janeiro's favelas: general-purpose vs. satellite-specific neural networks

2025-10-08

Авторы:

Thomas Hallopeau, Joris Guérin, Laurent Demagistri, Youssef Fouzai, Renata Gracie, Vanderlei Pascoal De Matos, Helen Gurgel, Nadine Dessay

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

While deep learning methods for detecting informal settlements have already been developed, they have not yet fully utilized the potential offered by recent pretrained neural networks. We compare two types of pretrained neural networks for detecting the favelas of Rio de Janeiro: 1. Generic networks pretrained on large diverse datasets of unspecific images, 2. A specialized network pretrained on satellite imagery. While the latter is more specific to the target task, the former has been pretrain...

ID: 2510.03725v1 cs.CV, cs.LG

arXiv PDF

📄 Road Damage and Manhole Detection using Deep Learning for Smart Cities: A Polygonal Annotation Approach

2025-10-08

Авторы:

Rasel Hossen, Diptajoy Mistry, Mushiur Rahman, Waki As Sami Atikur Rahman Hridoy, Sajib Saha, Muhammad Ibrahim

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Urban safety and infrastructure maintenance are critical components of smart city development. Manual monitoring of road damages is time-consuming, highly costly, and error-prone. This paper presents a deep learning approach for automated road damage and manhole detection using the YOLOv9 algorithm with polygonal annotations. Unlike traditional bounding box annotation, we employ polygonal annotations for more precise localization of road defects. We develop a novel dataset comprising more than o...

ID: 2510.03797v1 cs.CV, cs.LG

arXiv PDF

📄 Keep It on a Leash: Controllable Pseudo-label Generation Towards Realistic Long-Tailed Semi-Supervised Learning

2025-10-08

Авторы:

Yaxin Hou, Bo Han, Yuheng Jia, Hui Liu, Junhui Hou

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Current long-tailed semi-supervised learning methods assume that labeled data exhibit a long-tailed distribution, and unlabeled data adhere to a typical predefined distribution (i.e., long-tailed, uniform, or inverse long-tailed). However, the distribution of the unlabeled data is generally unknown and may follow an arbitrary distribution. To tackle this challenge, we propose a Controllable Pseudo-label Generation (CPG) framework, expanding the labeled dataset with the progressively identified r...

ID: 2510.03993v2 cs.CV, cs.LG

arXiv PDF

📄 From Segments to Concepts: Interpretable Image Classification via Concept-Guided Segmentation

2025-10-08

Авторы:

Ran Eisenberg, Amit Rozner, Ethan Fetaya, Ofir Lindenbaum

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Deep neural networks have achieved remarkable success in computer vision; however, their black-box nature in decision-making limits interpretability and trust, particularly in safety-critical applications. Interpretability is crucial in domains where errors have severe consequences. Existing models not only lack transparency but also risk exploiting unreliable or misleading features, which undermines both robustness and the validity of their explanations. Concept Bottleneck Models (CBMs) aim to ...

ID: 2510.04180v1 cs.CV, cs.LG

arXiv PDF

📄 Detection of retinal diseases using an accelerated reused convolutional network

2025-10-08

Авторы:

Amin Ahmadi Kasani, Hedieh Sajedi

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Convolutional neural networks are continually evolving, with some efforts aimed at improving accuracy, others at increasing speed, and some at enhancing accessibility. Improving accessibility broadens the application of neural networks across a wider range of tasks, including the detection of eye diseases. Early diagnosis of eye diseases and consulting an ophthalmologist can prevent many vision disorders. Given the importance of this issue, various datasets have been collected from the cornea to...

ID: 2510.04232v1 cs.CV, cs.LG

arXiv PDF

📄 Fast Witness Persistence for MRI Volumes via Hybrid Landmarking

2025-10-08

Авторы:

Jorge Leonardo Ruiz Williams

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We introduce a scalable witness-based persistent homology pipeline for full-brain MRI volumes that couples density-aware landmark selection with a GPU-ready witness filtration. Candidates are scored by a hybrid metric that balances geometric coverage against inverse kernel density, yielding landmark sets that shrink mean pairwise distances by 30-60% over random or density-only baselines while preserving topological features. Benchmarks on BrainWeb, IXI, and synthetic manifolds execute in under t...

ID: 2510.04553v1 cs.CG, cs.CV, cs.LG

arXiv PDF

📄 Beyond the Seen: Bounded Distribution Estimation for Open-Vocabulary Learning

2025-10-08

Авторы:

Xiaomeng Fan, Yuchuan Mao, Zhi Gao, Yuwei Wu, Jin Chen, Yunde Jia

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Open-vocabulary learning requires modeling the data distribution in open environments, which consists of both seen-class and unseen-class data. Existing methods estimate the distribution in open environments using seen-class data, where the absence of unseen classes makes the estimation error inherently unidentifiable. Intuitively, learning beyond the seen classes is crucial for distribution estimation to bound the estimation error. We theoretically demonstrate that the distribution can be...

ID: 2510.04770v1 cs.CV, cs.LG

arXiv PDF

📄 Federated Learning for Surgical Vision in Appendicitis Classification: Results of the FedSurg EndoVis 2024 Challenge

2025-10-08

Авторы:

Max Kirchner, Hanna Hoffmann, Alexander C. Jenke, Oliver L. Saldanha, Kevin Pfeiffer, Weam Kanjo, Julia Alekseenko, Claas de Boer, Santhi Raj Kolamuri, Lorenzo Mazza, Nicolas Padoy, Sophia Bano, Annika Reinke, Lena Maier-Hein, Danail Stoyanov, Jakob N. Kather, Fiona R. Kolbinger, Sebastian Bodenstedt, Stefanie Speidel

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Purpose: The FedSurg challenge was designed to benchmark the state of the art in federated learning for surgical video classification. Its goal was to assess how well current methods generalize to unseen clinical centers and adapt through local fine-tuning while enabling collaborative model development without sharing patient data. Methods: Participants developed strategies to classify inflammation stages in appendicitis using a preliminary version of the multi-center Appendix300 video dataset. ...

ID: 2510.04772v1 cs.CV, cs.LG

arXiv PDF

Показано 421 - 430 из 863 записей