📊 Статистика дайджестов

Всего дайджестов: 35039 Добавлено сегодня: 432

Последнее обновление: сегодня

📄 Real Deep Research for AI, Robotics and Beyond

2025-10-25

Авторы:

Xueyan Zou, Jianglong Ye, Hao Zhang, Xiaoyu Xiang, Mingyu Ding, Zhaojing Yang, Yong Jae Lee, Zhuowen Tu, Sifei Liu, Xiaolong Wang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

With the rapid growth of research in AI and robotics now producing over 10,000 papers annually it has become increasingly difficult for researchers to stay up to date. Fast evolving trends, the rise of interdisciplinary work, and the need to explore domains beyond one's expertise all contribute to this challenge. To address these issues, we propose a generalizable pipeline capable of systematically analyzing any research area: identifying emerging trends, uncovering cross domain opportunities, a...

ID: 2510.20809v1 cs.AI, cs.CL, cs.CV, cs.LG

arXiv PDF

📄 From See to Shield: ML-Assisted Fine-Grained Access Control for Visual Data

2025-10-24

Авторы:

Mete Harun Akcay, Buse Gul Atli, Siddharth Prakash Rao, Alexandros Bakas

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

As the volume of stored data continues to grow, identifying and protecting sensitive information within large repositories becomes increasingly challenging, especially when shared with multiple users with different roles and permissions. This work presents a system architecture for trusted data sharing with policy-driven access control, enabling selective protection of sensitive regions while maintaining scalability. The proposed architecture integrates four core modules that combine automated d...

ID: 2510.19418v1 cs.CR, cs.CV, cs.LG

arXiv PDF

📄 Exploring "Many in Few" and "Few in Many" Properties in Long-Tailed, Highly-Imbalanced IC Defect Classification

2025-10-24

Авторы:

Hao-Chiang Shao, Chun-Hao Chang, Yu-Hsien Lin, Chia-Wen Lin, Shao-Yun Fang, Yan-Hsiu Liu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Despite significant advancements in deep classification techniques and in-lab automatic optical inspection models for long-tailed or highly imbalanced data, applying these approaches to real-world IC defect classification tasks remains challenging. This difficulty stems from two primary factors. First, real-world conditions, such as the high yield-rate requirements in the IC industry, result in data distributions that are far more skewed than those found in general public imbalanced datasets. Co...

ID: 2510.19463v1 cs.CV, cs.LG

arXiv PDF

📄 PCP-GAN: Property-Constrained Pore-scale image reconstruction via conditional Generative Adversarial Networks

2025-10-24

Авторы:

Ali Sadeghkhani, Brandon Bennett, Masoud Babaei, Arash Rabbani

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Obtaining truly representative pore-scale images that match bulk formation properties remains a fundamental challenge in subsurface characterization, as natural spatial heterogeneity causes extracted sub-images to deviate significantly from core-measured values. This challenge is compounded by data scarcity, where physical samples are only available at sparse well locations. This study presents a multi-conditional Generative Adversarial Network (cGAN) framework that generates representative pore...

ID: 2510.19465v1 cs.CV, cs.LG, physics.geo-ph

arXiv PDF

📄 Uncertainty evaluation of segmentation models for Earth observation

2025-10-24

Авторы:

Melanie Rey, Andriy Mnih, Maxim Neumann, Matt Overlan, Drew Purves

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

This paper investigates methods for estimating uncertainty in semantic segmentation predictions derived from satellite imagery. Estimating uncertainty for segmentation presents unique challenges compared to standard image classification, requiring scalable methods producing per-pixel estimates. While most research on this topic has focused on scene understanding or medical imaging, this work benchmarks existing methods specifically for remote sensing and Earth observation applications. Our evalu...

ID: 2510.19586v1 cs.CV, cs.LG

arXiv PDF

📄 ViBED-Net: Video Based Engagement Detection Network Using Face-Aware and Scene-Aware Spatiotemporal Cues

2025-10-23

Авторы:

Prateek Gothwal, Deeptimaan Banerjee, Ashis Kumer Biswas

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Engagement detection in online learning environments is vital for improving student outcomes and personalizing instruction. We present ViBED-Net (Video-Based Engagement Detection Network), a novel deep learning framework designed to assess student engagement from video data using a dual-stream architecture. ViBED-Net captures both facial expressions and full-scene context by processing facial crops and entire video frames through EfficientNetV2 for spatial feature extraction. These features are ...

ID: 2510.18016v1 cs.CV, cs.LG, I.2.10; I.5.2

arXiv PDF

📄 Efficient Few-shot Identity Preserving Attribute Editing for 3D-aware Deep Generative Models

2025-10-23

Авторы:

Vishal Vinod

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Identity preserving editing of faces is a generative task that enables modifying the illumination, adding/removing eyeglasses, face aging, editing hairstyles, modifying expression etc., while preserving the identity of the face. Recent progress in 2D generative models have enabled photorealistic editing of faces using simple techniques leveraging the compositionality in GANs. However, identity preserving editing for 3D faces with a given set of attributes is a challenging task as the generative ...

ID: 2510.18287v1 cs.CV, cs.LG

arXiv PDF

📄 Vision Foundation Models Can Be Good Tokenizers for Latent Diffusion Models

2025-10-23

Авторы:

Tianci Bi, Xiaoyi Zhang, Yan Lu, Nanning Zheng

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

The performance of Latent Diffusion Models (LDMs) is critically dependent on the quality of their visual tokenizer. While recent works have explored incorporating Vision Foundation Models (VFMs) via distillation, we identify a fundamental flaw in this approach: it inevitably weakens the robustness of alignment with the original VFM, causing the aligned latents to deviate semantically under distribution shifts. In this paper, we bypass distillation by proposing a more direct approach: Vision Foun...

ID: 2510.18457v1 cs.CV, cs.LG

arXiv PDF

📄 CovMatch: Cross-Covariance Guided Multimodal Dataset Distillation with Trainable Text Encoder

2025-10-23

Авторы:

Yongmin Lee, Hye Won Chung

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Multimodal dataset distillation aims to synthesize a small set of image-text pairs that enables efficient training of large-scale vision-language models. While dataset distillation has shown promise in unimodal tasks, extending it to multimodal contrastive learning presents key challenges: learning cross-modal alignment and managing the high computational cost of large encoders. Prior approaches address scalability by freezing the text encoder and update only the image encoder and text projectio...

ID: 2510.18583v1 cs.CV, cs.LG

arXiv PDF

📄 FST.ai 2.0: An Explainable AI Ecosystem for Fair, Fast, and Inclusive Decision-Making in Olympic and Paralympic Taekwondo

2025-10-23

Авторы:

Keivan Shariatmadar, Ahmad Osman, Ramin Ray, Kisam Kim

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Fair, transparent, and explainable decision-making remains a critical challenge in Olympic and Paralympic combat sports. This paper presents \emph{FST.ai 2.0}, an explainable AI ecosystem designed to support referees, coaches, and athletes in real time during Taekwondo competitions and training. The system integrates {pose-based action recognition} using graph convolutional networks (GCNs), {epistemic uncertainty modeling} through credal sets, and {explainability overlays} for visual decision su...

ID: 2510.18193v2 cs.AI, cs.CV, cs.LG, stat.ML, 68T01, I.2.8

arXiv PDF

Показано 301 - 310 из 863 записей