📊 Digest statistics

Total digests: 34022. Added today: 82.

Last updated: today

📄 Better Together: Leveraging Unpaired Multimodal Data for Stronger Unimodal Models

2025-10-11

Authors:

Sharut Gupta, Shobhita Sundaram, Chenyu Wang, Stefanie Jegelka, Phillip Isola

Abstract:

Traditional multimodal learners find unified representations for tasks like visual question answering, but rely heavily on paired datasets. However, an overlooked yet potentially powerful question is: can one leverage auxiliary unpaired multimodal data to directly enhance representation learning in a target modality? We introduce UML: Unpaired Multimodal Learner, a modality-agnostic training paradigm in which a single model alternately processes inputs from different modalities while sharing par...

ID: 2510.08492v1 cs.LG, cs.CV

📄 On knot detection via picture recognition

2025-10-10

Authors:

Anne Dranowski, Yura Kabkov, Daniel Tubbenhauer

Abstract:

Our goal is to one day take a photo of a knot and have a phone automatically recognize it. In this expository work, we explain a strategy to approximate this goal, using a mixture of modern machine learning methods (in particular convolutional neural networks and transformers for image recognition) and traditional algorithms (to compute quantum invariants like the Jones polynomial). We present simple baselines that predict crossing number directly from images, showing that even lightweight CNN a...

ID: 2510.06284v1 cs.LG, cs.CV, math.GT, Primary: 57K10, 68T07, secondary: 57K14, 68T45

📄 StruSR: Structure-Aware Symbolic Regression with Physics-Informed Taylor Guidance

2025-10-10

Authors:

Yunpeng Gong, Sihan Lan, Can Yang, Kunpeng Xu, Min Jiang

Abstract:

Symbolic regression aims to find interpretable analytical expressions by searching over mathematical formula spaces to capture underlying system behavior, particularly in scientific modeling governed by physical laws. However, traditional methods lack mechanisms for extracting structured physical priors from time series observations, making it difficult to capture symbolic expressions that reflect the system's global behavior. In this work, we propose a structure-aware symbolic regression framew...

ID: 2510.06635v1 cs.LG, cs.CV

📄 SaFeR-VLM: Toward Safety-aware Fine-grained Reasoning in Multimodal Models

2025-10-10

Authors:

Huahui Yi, Kun Wang, Qiankun Li, Miao Yu, Liang Lin, Gongli Xi, Hao Wu, Xuming Hu, Kang Li, Yang Liu

Abstract:

Multimodal Large Reasoning Models (MLRMs) demonstrate impressive cross-modal reasoning but often amplify safety risks under adversarial or unsafe prompts, a phenomenon we call the "Reasoning Tax". Existing defenses mainly act at the output level and do not constrain the reasoning process, leaving models exposed to implicit risks. In this paper, we propose SaFeR-VLM, a safety-aligned reinforcement learning framework that embeds safety directly into multimodal reasoning. The framework integ...

ID: 2510.06871v2 cs.LG, cs.CV

📄 High-Rate Mixout: Revisiting Mixout for Robust Domain Generalization

2025-10-10

Authors:

Masih Aminbeidokhti, Heitor Rapela Medeiros, Eric Granger, Marco Pedersoli

Abstract:

Ensembling fine-tuned models initialized from powerful pre-trained weights is a common strategy to improve robustness under distribution shifts, but it comes with substantial computational costs due to the need to train and store multiple models. Dropout offers a lightweight alternative by simulating ensembles through random neuron deactivation; however, when applied to pre-trained models, it tends to over-regularize and disrupt critical representations necessary for generalization. In this work...

ID: 2510.06955v1 cs.LG, cs.CV

📄 Revisiting Mixout: An Overlooked Path to Robust Finetuning

2025-10-10

Authors:

Masih Aminbeidokhti, Heitor Rapela Medeiros, Eric Granger, Marco Pedersoli

Abstract:

Finetuning vision foundation models often improves in-domain accuracy but comes at the cost of robustness under distribution shift. We revisit Mixout, a stochastic regularizer that intermittently replaces finetuned weights with their pretrained reference, through the lens of a single-run, weight-sharing implicit ensemble. This perspective reveals three key levers that govern robustness: the masking anchor, resampling frequency, and mask sparsity. Guided by this analysis, we ...
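
The weight-replacement idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `mixout`, the flat weight lists, and the per-weight swap probability `p` are assumptions made for illustration, and the sketch omits any rescaling a full Mixout variant may apply.

```python
import random


def mixout(finetuned, pretrained, p, rng):
    """Return a copy of `finetuned` in which each weight is swapped for its
    pretrained counterpart with probability p.

    Simplified illustration: elementwise swap only, no rescaling.
    """
    if len(finetuned) != len(pretrained):
        raise ValueError("weight vectors must have the same length")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    # rng.random() is in [0, 1), so p=0 never swaps and p=1 always swaps.
    return [w0 if rng.random() < p else w
            for w, w0 in zip(finetuned, pretrained)]


# Hypothetical toy weights; a fresh mask is drawn on every call.
rng = random.Random(0)
finetuned = [0.9, -1.2, 0.4, 2.0]
pretrained = [1.0, -1.0, 0.5, 1.5]
mixed = mixout(finetuned, pretrained, p=0.5, rng=rng)
```

In this toy version, the probability `p` controls how many weights are swapped and the `pretrained` list is the reference they are swapped to; how these relate to the levers named in the abstract is not specified in the truncated text.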

ID: 2510.06982v1 cs.LG, cs.CV

📄 Sharpness-Aware Data Generation for Zero-shot Quantization

2025-10-10

Authors:

Dung Hoang-Anh, Cuong Pham Trung Le, Jianfei Cai, Thanh-Toan Do

Abstract:

Zero-shot quantization aims to learn a quantized model from a pre-trained full-precision model with no access to original real training data. The common idea in zero-shot quantization approaches is to generate synthetic data for quantizing the full-precision model. While it is well-known that deep neural networks with low sharpness have better generalization ability, none of the previous zero-shot quantization works considers the sharpness of the quantized model as a criterion for generating tra...

ID: 2510.07018v1 cs.LG, cs.CV

📄 RegMix: Adversarial Mutual and Generalization Regularization for Enhancing DNN Robustness

2025-10-09

Authors:

Zhenyu Liu, Varun Ojha

Abstract:

Adversarial training is the most effective defense against adversarial attacks. The effectiveness of the adversarial attacks has been on the design of its loss function and regularization term. The most widely used loss function in adversarial training is cross-entropy and mean squared error (MSE) as its regularization objective. However, MSE enforces overly uniform optimization between two output distributions during training, which limits its robustness in adversarial training scenarios. To ad...

ID: 2510.05317v1 cs.LG, cs.CV

📄 NEO: No-Optimization Test-Time Adaptation through Latent Re-Centering

2025-10-09

Authors:

Alexander Murphy, Michal Danilowski, Soumyajit Chatterjee, Abhirup Ghosh

Abstract:

Test-Time Adaptation (TTA) methods are often computationally expensive, require a large amount of data for effective adaptation, or are brittle to hyperparameters. Based on a theoretical foundation of the geometry of the latent space, we are able to significantly improve the alignment between source and distribution-shifted samples by re-centering target data embeddings at the origin. This insight motivates NEO -- a hyperparameter-free fully TTA method, that adds no significant compute compared ...
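
Re-centering embeddings at the origin, as this abstract describes, can be sketched as subtracting a mean vector from every embedding. The abstract is truncated, so how NEO actually estimates the center is not shown here; using the mean of the current target batch, and the helper name `recenter`, are assumptions made for illustration.

```python
def recenter(embeddings):
    """Shift a batch of embeddings so that their mean lies at the origin.

    `embeddings` is a non-empty list of equal-length lists of floats.
    Assumption: the center is estimated as the mean of this batch.
    """
    if not embeddings:
        raise ValueError("need at least one embedding")
    n = len(embeddings)
    dim = len(embeddings[0])
    if any(len(row) != dim for row in embeddings):
        raise ValueError("all embeddings must have the same dimension")
    # Per-dimension mean of the batch.
    mean = [sum(row[j] for row in embeddings) / n for j in range(dim)]
    # Subtract the mean from every embedding.
    return [[row[j] - mean[j] for j in range(dim)] for row in embeddings]


# Hypothetical target-domain embeddings with mean [2.0, 4.0].
target = [[1.0, 2.0], [3.0, 6.0]]
centered = recenter(target)  # [[-1.0, -2.0], [1.0, 2.0]]
```

No gradient step or tunable parameter appears in this sketch, which matches the "no-optimization" framing in the title, though the real method may involve more than this.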

ID: 2510.05635v1 cs.LG, cs.CV

📄 Neighborhood-Adaptive Generalized Linear Graph Embedding with Latent Pattern Mining

2025-10-09

Authors:

S. Peng, L. Hu, W. Zhang, B. Jie, Y. Luo

Abstract:

Graph embedding has been widely applied in areas such as network analysis, social network mining, recommendation systems, and bioinformatics. However, current graph construction methods often require the prior definition of neighborhood size, limiting the effective revelation of potential structural correlations in the data. Additionally, graph embedding methods using linear projection heavily rely on a singular pattern mining approach, resulting in relative weaknesses in adapting to different s...

ID: 2510.05719v1 cs.LG, cs.CV
