📊 Статистика дайджестов

Всего дайджестов: 35039 Добавлено сегодня: 432

Последнее обновление: сегодня

📄 VidGuard-R1: AI-Generated Video Detection and Explanation via Reasoning MLLMs and RL

2025-10-07

Авторы:

Kyoungjun Park, Yifan Yang, Juheon Yi, Shicheng Zheng, Yifei Shen, Dongqi Han, Caihua Shan, Muhammad Muaz, Lili Qiu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

With the rapid advancement of AI-generated videos, there is an urgent need for effective detection tools to mitigate societal risks such as misinformation and reputational harm. In addition to accurate classification, it is essential that detection models provide interpretable explanations to ensure transparency for regulators and end users. To address these challenges, we introduce VidGuard-R1, the first video authenticity detector that fine-tunes a multi-modal large language model (MLLM) using...

ID: 2510.02282v2 cs.CV, cs.LG

arXiv PDF

📄 Words That Make Language Models Perceive

2025-10-07

Авторы:

Sophie L. Wang, Phillip Isola, Brian Cheung

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Large language models (LLMs) trained purely on text ostensibly lack any direct perceptual experience, yet their internal representations are implicitly shaped by multimodal regularities encoded in language. We test the hypothesis that explicit sensory prompting can surface this latent structure, bringing a text-only LLM into closer representational alignment with specialist vision and audio encoders. When a sensory prompt tells the model to 'see' or 'hear', it cues the model to resolve its next-...

ID: 2510.02425v1 cs.CL, cs.CV, cs.LG

arXiv PDF

📄 A Statistical Method for Attack-Agnostic Adversarial Attack Detection with Compressive Sensing Comparison

2025-10-07

Авторы:

Chinthana Wimalasuriya, Spyros Tragoudas

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Adversarial attacks present a significant threat to modern machine learning systems. Yet, existing detection methods often lack the ability to detect unseen attacks or detect different attack types with a high level of accuracy. In this work, we propose a statistical approach that establishes a detection baseline before a neural network's deployment, enabling effective real-time adversarial detection. We generate a metric of adversarial presence by comparing the behavior of a compressed/uncompre...

ID: 2510.02707v1 cs.CR, cs.CV, cs.LG, eess.IV

arXiv PDF

📄 ELMF4EggQ: Ensemble Learning with Multimodal Feature Fusion for Non-Destructive Egg Quality Assessment

2025-10-07

Авторы:

Md Zahim Hassan, Md. Osama, Muhammad Ashad Kabir, Md. Saiful Islam, Zannatul Naim

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Accurate, non-destructive assessment of egg quality is critical for ensuring food safety, maintaining product standards, and operational efficiency in commercial poultry production. This paper introduces ELMF4EggQ, an ensemble learning framework that employs multimodal feature fusion to classify egg grade and freshness using only external attributes - image, shape, and weight. A novel, publicly available dataset of 186 brown-shelled eggs was constructed, with egg grade and freshness levels deter...

ID: 2510.02876v1 cs.CV, cs.LG

arXiv PDF

📄 Enhancing Certifiable Semantic Robustness via Robust Pruning of Deep Neural Networks

2025-10-04

Авторы:

Hanjiang Hu, Bowei Li, Ziwei Wang, Tianhao Wei, Casidhe Hutchison, Eric Sample, Changliu Liu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Deep neural networks have been widely adopted in many vision and robotics applications with visual inputs. It is essential to verify its robustness against semantic transformation perturbations, such as brightness and contrast. However, current certified training and robustness certification methods face the challenge of over-parameterization, which hinders the tightness and scalability due to the over-complicated neural networks. To this end, we first analyze stability and variance of layers an...

ID: 2510.00083v1 cs.CV, cs.LG

arXiv PDF

📄 Looking Beyond the Known: Towards a Data Discovery Guided Open-World Object Detection

2025-10-04

Авторы:

Anay Majee, Amitesh Gangrade, Rishabh Iyer

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Open-World Object Detection (OWOD) enriches traditional object detectors by enabling continual discovery and integration of unknown objects via human guidance. However, existing OWOD approaches frequently suffer from semantic confusion between known and unknown classes, alongside catastrophic forgetting, leading to diminished unknown recall and degraded known-class accuracy. To overcome these challenges, we propose Combinatorial Open-World Detection (CROWD), a unified framework reformulating unk...

ID: 2510.00303v1 cs.CV, cs.LG

arXiv PDF

📄 A Deep Learning Pipeline for Epilepsy Genomic Analysis Using GPT-2 XL and NVIDIA H100

2025-10-04

Авторы:

Muhammad Omer Latif, Hayat Ullah, Muhammad Ali Shafique, Zhihua Dong

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Epilepsy is a chronic neurological condition characterized by recurrent seizures, with global prevalence estimated at 50 million people worldwide. While progress in high-throughput sequencing has allowed for broad-based transcriptomic profiling of brain tissues, the deciphering of these highly complex datasets remains one of the challenges. To address this issue, in this paper we propose a new analysis pipeline that integrates the power of deep learning strategies with GPU-acceleration computati...

ID: 2510.00392v1 q-bio.GN, cs.CV, cs.LG

arXiv PDF

📄 Virtual Fashion Photo-Shoots: Building a Large-Scale Garment-Lookbook Dataset

2025-10-04

Авторы:

Yannick Hauri, Luca A. Lanzendörfer, Till Aczel

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Fashion image generation has so far focused on narrow tasks such as virtual try-on, where garments appear in clean studio environments. In contrast, editorial fashion presents garments through dynamic poses, diverse locations, and carefully crafted visual narratives. We introduce the task of virtual fashion photo-shoot, which seeks to capture this richness by transforming standardized garment images into contextually grounded editorial imagery. To enable this new direction, we construct the firs...

ID: 2510.00633v1 cs.CV, cs.LG

arXiv PDF

📄 Multi-Domain Brain Vessel Segmentation Through Feature Disentanglement

2025-10-04

Авторы:

Francesco Galati, Daniele Falcetta, Rosa Cortese, Ferran Prados, Ninon Burgos, Maria A. Zuluaga

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

The intricate morphology of brain vessels poses significant challenges for automatic segmentation models, which usually focus on a single imaging modality. However, accurately treating brain-related conditions requires a comprehensive understanding of the cerebrovascular tree, regardless of the specific acquisition procedure. Our framework effectively segments brain arteries and veins in various datasets through image-to-image translation while avoiding domain-specific model design and data harm...

ID: 2510.00665v2 cs.CV, cs.LG

arXiv PDF

📄 A Geometric Unification of Generative AI with Manifold-Probabilistic Projection Models

2025-10-04

Авторы:

Leah Bar, Liron Mor Yosef, Shai Zucker, Neta Shoham, Inbar Seroussi, Nir Sochen

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

The foundational premise of generative AI for images is the assumption that images are inherently low-dimensional objects embedded within a high-dimensional space. Additionally, it is often implicitly assumed that thematic image datasets form smooth or piecewise smooth manifolds. Common approaches overlook the geometric structure and focus solely on probabilistic methods, approximating the probability distribution through universal approximation techniques such as the kernel method. In some gene...

ID: 2510.00666v1 cs.CV, cs.LG

arXiv PDF

Показано 441 - 450 из 863 записей