📊 Статистика дайджестов

Всего дайджестов: 34123 Добавлено сегодня: 101

Последнее обновление: сегодня

📄 Geometry of Decision Making in Language Models

2025-11-26

Авторы:

Abhinav Joshi, Divyanshu Bhatt, Ashutosh Modi

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Large Language Models (LLMs) show strong generalization across diverse tasks, yet the internal decision-making processes behind their predictions remain opaque. In this work, we study the geometry of hidden representations in LLMs through the lens of \textit{intrinsic dimension} (ID), focusing specifically on decision-making dynamics in a multiple-choice question answering (MCQA) setting. We perform a large-scale study, with 28 open-weight transformer models and estimate ID across layers using m...

ID: 2511.20315v1 cs.LG, cs.AI, cs.CL

arXiv PDF

📄 Soft Adaptive Policy Optimization

2025-11-26

Авторы:

Chang Gao, Chujie Zheng, Xiong-Hui Chen, Kai Dang, Shixuan Liu, Bowen Yu, An Yang, Shuai Bai, Jingren Zhou, Junyang Lin

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Reinforcement learning (RL) plays an increasingly important role in enhancing the reasoning capabilities of large language models (LLMs), yet stable and performant policy optimization remains challenging. Token-level importance ratios often exhibit high variance-a phenomenon exacerbated in Mixture-of-Experts models-leading to unstable updates. Existing group-based policy optimization methods, such as GSPO and GRPO, alleviate this problem via hard clipping, making it difficult to maintain both st...

ID: 2511.20347v1 cs.LG, cs.AI, cs.CL

arXiv PDF

📄 Short-Range Oversquashing

2025-11-26

Авторы:

Yaaqov Mishayev, Yonatan Sverdlov, Tal Amir, Nadav Dym

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Message Passing Neural Networks (MPNNs) are widely used for learning on graphs, but their ability to process long-range information is limited by the phenomenon of oversquashing. This limitation has led some researchers to advocate Graph Transformers as a better alternative, whereas others suggest that it can be mitigated within the MPNN framework, using virtual nodes or other rewiring techniques. In this work, we demonstrate that oversquashing is not limited to long-range tasks, but can also ...

ID: 2511.20406v1 cs.LG, cs.AI

arXiv PDF

📄 MamTiff-CAD: Multi-Scale Latent Diffusion with Mamba+ for Complex Parametric Sequence

2025-11-25

Авторы:

Liyuan Deng, Yunpeng Bai, Yongkang Dai, Xiaoshui Huang, Hongping Gan, Dongshuo Huang, Hao jiacheng, Yilei Shi

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Parametric Computer-Aided Design (CAD) is crucial in industrial applications, yet existing approaches often struggle to generate long sequence parametric commands due to complex CAD models' geometric and topological constraints. To address this challenge, we propose MamTiff-CAD, a novel CAD parametric command sequences generation framework that leverages a Transformer-based diffusion model for multi-scale latent representations. Specifically, we design a novel autoencoder that integrates Mamba+ ...

ID: 2511.17647v1 cs.LG, cs.AI

arXiv PDF

📄 Frugality in second-order optimization: floating-point approximations for Newton's method

2025-11-25

Авторы:

Giuseppe Carrino, Elena Loli Piccolomini, Elisa Riccietti, Theo Mary

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Minimizing loss functions is central to machine-learning training. Although first-order methods dominate practical applications, higher-order techniques such as Newton's method can deliver greater accuracy and faster convergence, yet are often avoided due to their computational cost. This work analyzes the impact of finite-precision arithmetic on Newton steps and establishes a convergence theorem for mixed-precision Newton optimizers, including "quasi" and "inexact" variants. The theorem provide...

ID: 2511.17660v1 cs.LG, cs.AI, math.OC

arXiv PDF

📄 AI-based framework to predict animal and pen feed intake in feedlot beef cattle

2025-11-25

Авторы:

Alex S. C. Maia, John B. Hall, Hugo F. M. Milan, Izabelle A. M. A. Teixeira

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Advances in technology are transforming sustainable cattle farming practices, with electronic feeding systems generating big longitudinal datasets on individual animal feed intake, offering the possibility for autonomous precision livestock systems. However, the literature still lacks a methodology that fully leverages these longitudinal big data to accurately predict feed intake accounting for environmental conditions. To fill this gap, we developed an AI-based framework to accurately predict f...

ID: 2511.17663v1 cs.LG, cs.AI

arXiv PDF

📄 Enhancing Adversarial Transferability through Block Stretch and Shrink

2025-11-25

Авторы:

Quan Liu, Feng Ye, Chenhao Lu, Shuming Zhen, Guanliang Huang, Lunzhe Chen, Xudong Ke

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Adversarial attacks introduce small, deliberately crafted perturbations that mislead neural networks, and their transferability from white-box to black-box target models remains a critical research focus. Input transformation-based attacks are a subfield of adversarial attacks that enhance input diversity through input transformations to improve the transferability of adversarial examples. However, existing input transformation-based attacks tend to exhibit limited cross-model transferability. P...

ID: 2511.17688v1 cs.LG, cs.AI

arXiv PDF

📄 APRIL: Annotations for Policy evaluation with Reliable Inference from LLMs

2025-11-25

Авторы:

Aishwarya Mandyam, Kalyani Limaye, Barbara E. Engelhardt, Emily Alsentzer

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Off-policy evaluation (OPE) estimates the value of a contextual bandit policy prior to deployment. As such, OPE plays a critical role in ensuring safety in high-stakes domains such as healthcare. However, standard OPE approaches are limited by the size and coverage of the behavior dataset. While previous work has explored using expert-labeled counterfactual annotations to enhance dataset coverage, obtaining such annotations is expensive, limiting the scalability of prior approaches. We propose l...

ID: 2511.17818v1 cs.LG, cs.AI

arXiv PDF

📄 Monte Carlo Expected Threat (MOCET) Scoring

2025-11-25

Авторы:

Joseph Kim, Saahith Potluri

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Evaluating and measuring AI Safety Level (ASL) threats are crucial for guiding stakeholders to implement safeguards that keep risks within acceptable limits. ASL-3+ models present a unique risk in their ability to uplift novice non-state actors, especially in the realm of biosecurity. Existing evaluation metrics, such as LAB-Bench, BioLP-bench, and WMDP, can reliably assess model uplift and domain knowledge. However, metrics that better contextualize "real-world risks" are needed to inform the s...

ID: 2511.16823v1 cs.LG, cs.AI, cs.HC

arXiv PDF

📄 A Robust Federated Learning Approach for Combating Attacks Against IoT Systems Under non-IID Challenges

2025-11-25

Авторы:

Eyad Gad, Zubair Md Fadlullah, Mostafa M. Fouda

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

In the context of the growing proliferation of user devices and the concurrent surge in data volumes, the complexities arising from the substantial increase in data have posed formidable challenges to conventional machine learning model training. Particularly, this is evident within resource-constrained and security-sensitive environments such as those encountered in networks associated with the Internet of Things (IoT). Federated Learning has emerged as a promising remedy to these challenges by...

ID: 2511.16822v1 cs.LG, cs.AI, cs.CR

arXiv PDF

Показано 321 - 330 из 2912 записей