📊 Статистика дайджестов

Всего дайджестов: 34022 Добавлено сегодня: 82

Последнее обновление: сегодня

📄 Are Detectors Fair to Indian IP-AIGC? A Cross-Generator Study

2025-12-04

Авторы:

Vishal Dubey, Pallavi Tyagi

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Modern image editors can produce identity-preserving AIGC (IP-AIGC), where the same person appears with new attire, background, or lighting. The robustness and fairness of current detectors in this regime remain unclear, especially for under-represented populations. We present what we believe is the first systematic study of IP-AIGC detection for Indian and South-Asian faces, quantifying cross-generator generalization and intra-population performance. We assemble Indian-focused training splits f...

ID: 2512.02850v1 cs.CV, cs.LG

arXiv PDF

📄 Drainage: A Unifying Framework for Addressing Class Uncertainty

2025-12-04

Авторы:

Yasser Taha, Grégoire Montavon, Nils Körber

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Modern deep learning faces significant challenges with noisy labels, class ambiguity, as well as the need to robustly reject out-of-distribution or corrupted samples. In this work, we propose a unified framework based on the concept of a "drainage node'' which we add at the output of the network. The node serves to reallocate probability mass toward uncertainty, while preserving desirable properties such as end-to-end training and differentiability. This mechanism provides a natural escape route...

ID: 2512.03182v1 cs.CV, cs.LG

arXiv PDF

📄 Flux4D: Flow-based Unsupervised 4D Reconstruction

2025-12-04

Авторы:

Jingkang Wang, Henry Che, Yun Chen, Ze Yang, Lily Goli, Sivabalan Manivasagam, Raquel Urtasun

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Reconstructing large-scale dynamic scenes from visual observations is a fundamental challenge in computer vision, with critical implications for robotics and autonomous systems. While recent differentiable rendering methods such as Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) have achieved impressive photorealistic reconstruction, they suffer from scalability limitations and require annotations to decouple actor motion. Existing self-supervised methods attempt to eliminate expl...

ID: 2512.03210v1 cs.CV, cs.LG, cs.RO

arXiv PDF

📄 Step-by-step Layered Design Generation

2025-12-04

Авторы:

Faizan Farooq Khan, K J Joseph, Koustava Goswami, Mohamed Elhoseiny, Balaji Vasan Srinivasan

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Design generation, in its essence, is a step-by-step process where designers progressively refine and enhance their work through careful modifications. Despite this fundamental characteristic, existing approaches mainly treat design synthesis as a single-step generation problem, significantly underestimating the inherent complexity of the creative process. To bridge this gap, we propose a novel problem setting called Step-by-Step Layered Design Generation, which tasks a machine learning model wi...

ID: 2512.03335v1 cs.CV, cs.LG

arXiv PDF

📄 KeyPointDiffuser: Unsupervised 3D Keypoint Learning via Latent Diffusion Models

2025-12-04

Авторы:

Rhys Newbury, Juyan Zhang, Tin Tran, Hanna Kurniawati, Dana Kulić

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Understanding and representing the structure of 3D objects in an unsupervised manner remains a core challenge in computer vision and graphics. Most existing unsupervised keypoint methods are not designed for unconditional generative settings, restricting their use in modern 3D generative pipelines; our formulation explicitly bridges this gap. We present an unsupervised framework for learning spatially structured 3D keypoints from point cloud data. These keypoints serve as a compact and interpret...

ID: 2512.03450v1 cs.CV, cs.LG

arXiv PDF

📄 Fairness-Aware Fine-Tuning of Vision-Language Models for Medical Glaucoma Diagnosis

2025-12-04

Авторы:

Zijian Gu, Yuxi Liu, Zhenhao Zhang, Song Wang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Vision-language models achieve expert-level performance on medical imaging tasks but exhibit significant diagnostic accuracy disparities across demographic groups. We introduce fairness-aware Low-Rank Adaptation for medical VLMs, combining parameter efficiency with explicit fairness optimization. Our key algorithmic contribution is a differentiable MaxAccGap loss that enables end-to-end optimization of accuracy parity across demographic groups. We propose three methods: FR-LoRA integrates MaxAcc...

ID: 2512.03477v1 cs.CV, cs.LG

arXiv PDF

📄 HieroGlyphTranslator: Automatic Recognition and Translation of Egyptian Hieroglyphs to English

2025-12-04

Авторы:

Ahmed Nasser, Marwan Mohamed, Alaa Sherif, Basmala Mahmoud, Shereen Yehia, Asmaa Saad, Mariam S. El-Rahmany, Ensaf H. Mohamed

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Egyptian hieroglyphs, the ancient Egyptian writing system, are composed entirely of drawings. Translating these glyphs into English poses various challenges, including the fact that a single glyph can have multiple meanings. Deep learning translation applications are evolving rapidly, producing remarkable results that significantly impact our lives. In this research, we propose a method for the automatic recognition and translation of ancient Egyptian hieroglyphs from images to English. This stu...

ID: 2512.03817v1 cs.CV, cs.LG

arXiv PDF

📄 Tada-DIP: Input-adaptive Deep Image Prior for One-shot 3D Image Reconstruction

2025-12-04

Авторы:

Evan Bell, Shijun Liang, Ismail Alkhouri, Saiprasad Ravishankar

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Deep Image Prior (DIP) has recently emerged as a promising one-shot neural-network based image reconstruction method. However, DIP has seen limited application to 3D image reconstruction problems. In this work, we introduce Tada-DIP, a highly effective and fully 3D DIP method for solving 3D inverse problems. By combining input-adaptation and denoising regularization, Tada-DIP produces high-quality 3D reconstructions while avoiding the overfitting phenomenon that is common in DIP. Experiments on ...

ID: 2512.03962v1 eess.IV, cs.CV, cs.LG

arXiv PDF

📄 Comparing Baseline and Day-1 Diffusion MRI Using Multimodal Deep Embeddings for Stroke Outcome Prediction

2025-12-04

Авторы:

Sina Raeisadigh, Myles Joshua Toledo Tan, Henning Müller, Abderrahmane Hedjoudje

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

This study compares baseline (J0) and 24-hour (J1) diffusion magnetic resonance imaging (MRI) for predicting three-month functional outcomes after acute ischemic stroke (AIS). Seventy-four AIS patients with paired apparent diffusion coefficient (ADC) scans and clinical data were analyzed. Three-dimensional ResNet-50 embeddings were fused with structured clinical variables, reduced via principal component analysis (<=12 components), and classified using linear support vector machines with eight-f...

ID: 2512.02088v1 eess.IV, cs.AI, cs.CV, cs.LG

arXiv PDF

📄 RoaD: Rollouts as Demonstrations for Closed-Loop Supervised Fine-Tuning of Autonomous Driving Policies

2025-12-04

Авторы:

Guillermo Garcia-Cobo, Maximilian Igl, Peter Karkus, Zhejun Zhang, Michael Watson, Yuxiao Chen, Boris Ivanovic, Marco Pavone

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Autonomous driving policies are typically trained via open-loop behavior cloning of human demonstrations. However, such policies suffer from covariate shift when deployed in closed loop, leading to compounding errors. We introduce Rollouts as Demonstrations (RoaD), a simple and efficient method to mitigate covariate shift by leveraging the policy's own closed-loop rollouts as additional training data. During rollout generation, RoaD incorporates expert guidance to bias trajectories toward high-q...

ID: 2512.01993v1 cs.RO, cs.AI, cs.CV, cs.LG

arXiv PDF

Показано 21 - 30 из 835 записей