📊 Digest statistics

Total digests: 34022 Added today: 0

Last updated: today

📄 Investigating Robot Control Policy Learning for Autonomous X-ray-guided Spine Procedures

2025-11-08

Authors:

Florence Klitzner, Blanca Inigo, Benjamin D. Killeen, Lalithkumar Seenivasan, Michelle Song, Axel Krieger, Mathias Unberath

Abstract:

Imitation learning-based robot control policies are enjoying renewed interest in video-based robotics. However, it remains unclear whether this approach applies to X-ray-guided procedures, such as spine instrumentation. This is because interpretation of multi-view X-rays is complex. We examine opportunities and challenges for imitation policy learning in bi-plane-guided cannula insertion. We develop an in silico sandbox for scalable, automated simulation of X-ray-guided spine procedures with a h...

ID: 2511.03882v1 cs.CV, cs.AI, cs.LG, cs.RO

arXiv PDF

📄 EvtSlowTV -- A Large and Diverse Dataset for Event-Based Depth Estimation

2025-11-07

Authors:

Sadiq Layi Macaulay, Nimet Kaygusuz, Simon Hadfield

Abstract:

Event cameras, with their high dynamic range (HDR) and low latency, offer a promising alternative for robust depth estimation in challenging environments. However, many event-based depth estimation approaches are constrained by small-scale annotated datasets, limiting their generalizability to real-world scenarios. To bridge this gap, we introduce EvtSlowTV, a large-scale event camera dataset curated from publicly available YouTube footage, which contains more than 13B events across various envi...

ID: 2511.02953v1 cs.CV, cs.AI, cs.LG, cs.RO

arXiv PDF

📄 Optimizing Earth-Moon Transfer and Cislunar Navigation: Integrating Low-Energy Trajectories, AI Techniques and GNSS-R Technologies

2025-11-07

Authors:

Arsalan Muhammad, Wasiu Akande Ahmed, Omada Friday Ojonugwa, Paul Puspendu Biswas

Abstract:

The rapid growth of cislunar activities, including lunar landings, the Lunar Gateway, and in-space refueling stations, requires advances in cost-efficient trajectory design and reliable integration of navigation and remote sensing. Traditional Earth-Moon transfers suffer from rigid launch windows and high propellant demands, while Earth-based GNSS systems provide little to no coverage beyond geostationary orbit. This limits autonomy and environmental awareness in cislunar space. This review comp...

ID: 2511.03173v1 astro-ph.EP, cs.AI, cs.LG, cs.RO

arXiv PDF

📄 A Survey on Efficient Vision-Language-Action Models

2025-11-01

Authors:

Zhaoshu Yu, Bo Wang, Pengpeng Zeng, Haonan Zhang, Ji Zhang, Lianli Gao, Jingkuan Song, Nicu Sebe, Heng Tao Shen

Abstract:

Vision-Language-Action models (VLAs) represent a significant frontier in embodied intelligence, aiming to bridge digital knowledge with physical-world interaction. While these models have demonstrated remarkable generalist capabilities, their deployment is severely hampered by the substantial computational and data requirements inherent to their underlying large-scale foundation models. Motivated by the urgent need to address these challenges, this survey presents the first comprehensive review ...

ID: 2510.24795v1 cs.CV, cs.AI, cs.LG, cs.RO

arXiv PDF

📄 C-SWAP: Explainability-Aware Structured Pruning for Efficient Neural Networks Compression

2025-10-23

Authors:

Baptiste Bauvin, Loïc Baret, Ola Ahmad

Abstract:

Neural network compression has gained increasing attention in recent years, particularly in computer vision applications, where the need for model reduction is crucial for overcoming deployment constraints. Pruning is a widely used technique that prompts sparsity in model structures, e.g. weights, neurons, and layers, reducing size and inference costs. Structured pruning is especially important as it allows for the removal of entire structures, which further accelerates inference time and reduce...

ID: 2510.18636v1 cs.CV, cs.AI, cs.LG, cs.RO

arXiv PDF

📄 SilvaScenes: Tree Segmentation and Species Classification from Under-Canopy Images in Natural Forests

2025-10-14

Authors:

David-Alexandre Duclos, William Guimont-Martin, Gabriel Jeanson, Arthur Larochelle-Tremblay, Théo Defosse, Frédéric Moore, Philippe Nolet, François Pomerleau, Philippe Giguère

Abstract:

Interest in robotics for forest management is growing, but perception in complex, natural environments remains a significant hurdle. Conditions such as heavy occlusion, variable lighting, and dense vegetation pose challenges to automated systems, which are essential for precision forestry, biodiversity monitoring, and the automation of forestry equipment. These tasks rely on advanced perceptual capabilities, such as detection and fine-grained species classification of individual trees. Yet, exis...

ID: 2510.09458v1 cs.CV, cs.AI, cs.LG, cs.RO

arXiv PDF

📄 Hybrid Quantum-Classical Policy Gradient for Adaptive Control of Cyber-Physical Systems: A Comparative Study of VQC vs. MLP

2025-10-09

Authors:

Aueaphum Aueawatthanaphisut, Nyi Wunna Tun

Abstract:

The comparative evaluation between classical and quantum reinforcement learning (QRL) paradigms was conducted to investigate their convergence behavior, robustness under observational noise, and computational efficiency in a benchmark control environment. The study employed a multilayer perceptron (MLP) agent as a classical baseline and a parameterized variational quantum circuit (VQC) as a quantum counterpart, both trained on the CartPole-v1 environment over 500 episodes. Empirical results demo...

ID: 2510.06010v1 quant-ph, cs.AI, cs.LG, cs.RO, cs.SY, eess.SY

arXiv PDF

📄 Message passing-based inference in an autoregressive active inference agent

2025-10-02

Authors:

Wouter M. Kouw, Tim N. Nisslbeck, Wouter L. N. Nuijten

Abstract:

We present the design of an autoregressive active inference agent in the form of message passing on a factor graph. Expected free energy is derived and distributed across a planning graph. The proposed agent is validated on a robot navigation task, demonstrating exploration and exploitation in a continuous-valued observation space with bounded continuous-valued actions. Compared to a classical optimal controller, the agent modulates action based on predictive uncertainty, arriving later but with...

ID: 2509.25482v1 cs.AI, cs.LG, cs.RO, cs.SY, eess.SY, stat.ML

arXiv PDF

📄 TimeRewarder: Learning Dense Reward from Passive Videos via Frame-wise Temporal Distance

2025-10-02

Authors:

Yuyang Liu, Chuan Wen, Yihang Hu, Dinesh Jayaraman, Yang Gao

Abstract:

Designing dense rewards is crucial for reinforcement learning (RL), yet in robotics it often demands extensive manual effort and lacks scalability. One promising solution is to view task progress as a dense reward signal, as it quantifies the degree to which actions advance the system toward task completion over time. We present TimeRewarder, a simple yet effective reward learning method that derives progress estimation signals from passive videos, including robot demonstrations and human videos...

ID: 2509.26627v1 cs.AI, cs.LG, cs.RO

arXiv PDF

📄 Training Agents Inside of Scalable World Models

2025-10-01

Authors:

Danijar Hafner, Wilson Yan, Timothy Lillicrap

## Context

Research on training agents inside learned world models aims to address the problem of accurately predicting the dynamics of object interactions in complex environments. Earlier world models were limited in predicting the fine details of object interactions, which reduced their effectiveness in well-controlled or simulated environments. One motivation is to build agents that can learn effectively in simulated environments, using video data to extract general knowledge and then applying that knowledge in the interactive environment. The approach could apply to a range of tasks, including robot learning, control systems, and graphics simulation. One goal of the work is an agent that can solve control tasks in complex game environments such as Minecraft using video data, without interacting with the external environment.

## Method

The proposed approach, called Dreamer 4, extends earlier models built on the idea of world models, with new architectures and algorithms. The agent is trained in a simulated environment using a transformer architecture, which allows input data to be processed in real time. Technical choices, including a shortcut forcing objective, are aimed at modeling object interactions in complex environments more accurately. Training uses data obtained in earlier stages of the model's training as well as unlabeled videos, which let the agent extract general knowledge without continual training in a dynamic environment. Behaviors are trained with reinforcement learning, which lets the agent solve control tasks in simulated environments.

## Results

In experiments in the complex game environment Minecraft, the Dreamer 4 world model predicted object interactions and game mechanics with high accuracy, substantially outperforming previous world models. The Dreamer 4 agent solved tasks such as obtaining diamonds in Minecraft using only unlabeled videos and without interacting with the external environment. It did so by learning in the simulated environment, using only a small amount of data for training and inference.

## Significance

Dreamer 4 is broadly applicable and could be used in robotics, process control, and simulation. One of its main advantages is that the agent can be trained in a simulated environment, which avoids the need for ...

Abstract:

World models learn general knowledge from videos and simulate experience for training behaviors in imagination, offering a path towards intelligent agents. However, previous world models have been unable to accurately predict object interactions in complex environments. We introduce Dreamer 4, a scalable agent that learns to solve control tasks by reinforcement learning inside of a fast and accurate world model. In the complex video game Minecraft, the world model accurately predicts object inte...

ID: 2509.24527v1 cs.AI, cs.LG, cs.RO, stat.ML

arXiv PDF

Showing 11 - 20 of 34 entries