📊 Статистика дайджестов

Всего дайджестов: 34022 Добавлено сегодня: 0

Последнее обновление: сегодня

📄 MP-GFormer: A 3D-Geometry-Aware Dynamic Graph Transformer Approach for Machining Process Planning

2025-11-18

Авторы:

Fatemeh Elhambakhsh, Gaurav Ameta, Aditi Roy, Hyunwoong Ko

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Machining process planning (MP) is inherently complex due to structural and geometrical dependencies among part features and machining operations. A key challenge lies in capturing dynamic interdependencies that evolve with distinct part geometries as operations are performed. Machine learning has been applied to address challenges in MP, such as operation selection and machining sequence prediction. Dynamic graph learning (DGL) has been widely used to model dynamic systems, thanks to its abilit...

ID: 2511.11837v1 cs.CV, cs.LG

arXiv PDF

📄 Enhancing XR Auditory Realism via Multimodal Scene-Aware Acoustic Rendering

2025-11-18

Авторы:

Tianyu Xu, Jihan Li, Penghe Zu, Pranav Sahay, Maruchi Kim, Jack Obeng-Marnu, Farley Miller, Xun Qian, Katrina Passarella, Mahitha Rachumalla, Rajeev Nongpiur, D. Shin

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

In Extended Reality (XR), rendering sound that accurately simulates real-world acoustics is pivotal in creating lifelike and believable virtual experiences. However, existing XR spatial audio rendering methods often struggle with real-time adaptation to diverse physical scenes, causing a sensory mismatch between visual and auditory cues that disrupts user immersion. To address this, we introduce SAMOSA, a novel on-device system that renders spatially accurate sound by dynamically adapting to its...

ID: 2511.11930v1 cs.HC, cs.CV, cs.LG, cs.SD

arXiv PDF

📄 A Deep Learning Framework for Thyroid Nodule Segmentation and Malignancy Classification from Ultrasound Images

2025-11-18

Авторы:

Omar Abdelrazik, Mohamed Elsayed, Noorul Wahab, Nasir Rajpoot, Adam Shephard

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Ultrasound-based risk stratification of thyroid nodules is a critical clinical task, but it suffers from high inter-observer variability. While many deep learning (DL) models function as "black boxes," we propose a fully automated, two-stage framework for interpretable malignancy prediction. Our method achieves interpretability by forcing the model to focus only on clinically relevant regions. First, a TransUNet model automatically segments the thyroid nodule. The resulting mask is then used to ...

ID: 2511.11937v1 eess.IV, cs.AI, cs.CV, cs.LG

arXiv PDF

📄 Dynamic Parameter Optimization for Highly Transferable Transformation-Based Attacks

2025-11-18

Авторы:

Jiaming Liang, Chi-Man Pun

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Despite their wide application, the vulnerabilities of deep neural networks raise societal concerns. Among them, transformation-based attacks have demonstrated notable success in transfer attacks. However, existing attacks suffer from blind spots in parameter optimization, limiting their full potential. Specifically, (1) prior work generally considers low-iteration settings, yet attacks perform quite differently at higher iterations, so characterizing overall performance based only on low-iterat...

ID: 2511.11993v1 cs.CV, cs.LG

arXiv PDF

📄 Adaptive Diagnostic Reasoning Framework for Pathology with Multimodal Large Language Models

2025-11-18

Авторы:

Yunqi Hong, Johnson Kao, Liam Edwards, Nein-Tzu Liu, Chung-Yen Huang, Alex Oliveira-Kowaleski, Cho-Jui Hsieh, Neil Y. C. Lin

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

AI tools in pathology have improved screening throughput, standardized quantification, and revealed prognostic patterns that inform treatment. However, adoption remains limited because most systems still lack the human-readable reasoning needed to audit decisions and prevent errors. We present RECAP-PATH, an interpretable framework that establishes a self-learning paradigm, shifting off-the-shelf multimodal large language models from passive pattern recognition to evidence-linked diagnostic reas...

ID: 2511.12008v1 cs.AI, cs.CV, cs.LG

arXiv PDF

📄 Enhancing Road Safety Through Multi-Camera Image Segmentation with Post-Encroachment Time Analysis

2025-11-18

Авторы:

Shounak Ray Chaudhuri, Arash Jahangiri, Christopher Paolini

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Traffic safety analysis at signalized intersections is vital for reducing vehicle and pedestrian collisions, yet traditional crash-based studies are limited by data sparsity and latency. This paper presents a novel multi-camera computer vision framework for real-time safety assessment through Post-Encroachment Time (PET) computation, demonstrated at the intersection of H Street and Broadway in Chula Vista, California. Four synchronized cameras provide continuous visual coverage, with each frame ...

ID: 2511.12018v1 cs.CV, cs.LG, cs.SI

arXiv PDF

📄 Calibrated Multimodal Representation Learning with Missing Modalities

2025-11-18

Авторы:

Xiaohao Liu, Xiaobo Xia, Jiaheng Wei, Shuo Yang, Xiu Su, See-Kiong Ng, Tat-Seng Chua

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Multimodal representation learning harmonizes distinct modalities by aligning them into a unified latent space. Recent research generalizes traditional cross-modal alignment to produce enhanced multimodal synergy but requires all modalities to be present for a common instance, making it challenging to utilize prevalent datasets with missing modalities. We provide theoretical insights into this issue from an anchor shift perspective. Observed modalities are aligned with a local anchor that deviat...

ID: 2511.12034v1 cs.CV, cs.LG, cs.MM

arXiv PDF

📄 BackWeak: Backdooring Knowledge Distillation Simply with Weak Triggers and Fine-tuning

2025-11-18

Авторы:

Shanmin Wang, Dongdong Zhao

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Knowledge Distillation (KD) is essential for compressing large models, yet relying on pre-trained "teacher" models downloaded from third-party repositories introduces serious security risks -- most notably backdoor attacks. Existing KD backdoor methods are typically complex and computationally intensive: they employ surrogate student models and simulated distillation to guarantee transferability, and they construct triggers in a way similar to universal adversarial perturbations (UAPs), which be...

ID: 2511.12046v1 cs.CR, cs.AI, cs.CV, cs.LG

arXiv PDF

📄 TEMPO: Global Temporal Building Density and Height Estimation from Satellite Imagery

2025-11-18

Авторы:

Tammy Glazer, Gilles Q. Hacheme, Akram Zaytar, Luana Marotti, Amy Michaels, Girmaw Abebe Tadesse, Kevin White, Rahul Dodhia, Andrew Zolli, Inbal Becker-Reshef, Juan M. Lavista Ferres, Caleb Robinson

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We present TEMPO, a global, temporally resolved dataset of building density and height derived from high-resolution satellite imagery using deep learning models. We pair building footprint and height data from existing datasets with quarterly PlanetScope basemap satellite images to train a multi-task deep learning model that predicts building density and building height at a 37.6-meter per pixel resolution. We apply this model to global PlanetScope basemaps from Q1 2018 through Q2 2025 to create...

ID: 2511.12104v1 cs.CV, cs.LG

arXiv PDF

📄 Codebook-Centric Deep Hashing: End-to-End Joint Learning of Semantic Hash Centers and Neural Hash Function

2025-11-18

Авторы:

Shuo Yin, Zhiyuan Yin, Yuqing Hou, Rui Liu, Yong Chen, Dell Zhang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Hash center-based deep hashing methods improve upon pairwise or triplet-based approaches by assigning fixed hash centers to each class as learning targets, thereby avoiding the inefficiency of local similarity optimization. However, random center initialization often disregards inter-class semantic relationships. While existing two-stage methods mitigate this by first refining hash centers with semantics and then training the hash function, they introduce additional complexity, computational ove...

ID: 2511.12162v1 cs.CV, cs.LG

arXiv PDF

Показано 151 - 160 из 835 записей