📊 Статистика дайджестов

Всего дайджестов: 34022 Добавлено сегодня: 82

Последнее обновление: сегодня

📄 MFM-point: Multi-scale Flow Matching for Point Cloud Generation

2025-11-26

Авторы:

Petr Molodyk, Jaemoo Choi, David W. Romero, Ming-Yu Liu, Yongxin Chen

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

In recent years, point cloud generation has gained significant attention in 3D generative modeling. Among existing approaches, point-based methods directly generate point clouds without relying on other representations such as latent features, meshes, or voxels. These methods offer low training cost and algorithmic simplicity, but often underperform compared to representation-based approaches. In this paper, we propose MFM-Point, a multi-scale Flow Matching framework for point cloud generation t...

ID: 2511.20041v1 cs.CV, cs.AI, cs.LG

arXiv PDF

📄 Uplifting Table Tennis: A Robust, Real-World Application for 3D Trajectory and Spin Estimation

2025-11-26

Авторы:

Daniel Kienzle, Katja Ludwig, Julian Lorenz, Shin'ichi Satoh, Rainer Lienhart

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Obtaining the precise 3D motion of a table tennis ball from standard monocular videos is a challenging problem, as existing methods trained on synthetic data struggle to generalize to the noisy, imperfect ball and table detections of the real world. This is primarily due to the inherent lack of 3D ground truth trajectories and spin annotations for real-world video. To overcome this, we propose a novel two-stage pipeline that divides the problem into a front-end perception task and a back-end 2D-...

ID: 2511.20250v1 cs.CV, cs.AI, cs.LG

arXiv PDF

📄 Actionable and diverse counterfactual explanations incorporating domain knowledge and causal constraints

2025-11-26

Авторы:

Szymon Bobek, Łukasz Bałec, Grzegorz J. Nalepa

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Counterfactual explanations enhance the actionable interpretability of machine learning models by identifying the minimal changes required to achieve a desired outcome of the model. However, existing methods often ignore the complex dependencies in real-world datasets, leading to unrealistic or impractical modifications. Motivated by cybersecurity applications in the email marketing domain, we propose a method for generating Diverse, Actionable, and kNowledge-Constrained Explanations (DANCE), wh...

ID: 2511.20236v1 cs.AI, cs.LG

arXiv PDF

📄 Forgetting by Pruning: Data Deletion in Join Cardinality Estimation

2025-11-26

Авторы:

Chaowei He, Yuanjun Liu, Qingzhi Ma, Shenyuan Ren, Xizhao Luo, Lei Zhao, An Liu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Machine unlearning in learned cardinality estimation (CE) systems presents unique challenges due to the complex distributional dependencies in multi-table relational data. Specifically, data deletion, a core component of machine unlearning, faces three critical challenges in learned CE models: attribute-level sensitivity, inter-table propagation and domain disappearance leading to severe overestimation in multi-way joins. We propose Cardinality Estimation Pruning (CEP), the first unlearning fram...

ID: 2511.20293v1 cs.DB, cs.AI, cs.LG

arXiv PDF

📄 NNGPT: Rethinking AutoML with Large Language Models

2025-11-26

Авторы:

Roman Kochnev, Waleed Khalid, Tolgay Atinc Uzun, Xi Zhang, Yashkumar Sanjaybhai Dhameliya, Furui Qin, Chandini Vysyaraju, Raghuvir Duvvuri, Avi Goyal, Dmitry Ignatov, Radu Timofte

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Building self-improving AI systems remains a fundamental challenge in the AI domain. We present NNGPT, an open-source framework that turns a large language model (LLM) into a self-improving AutoML engine for neural network development, primarily for computer vision. Unlike previous frameworks, NNGPT extends the dataset of neural networks by generating new models, enabling continuous fine-tuning of LLMs based on closed-loop system of generation, assessment, and self-improvement. It integrates wit...

ID: 2511.20333v1 cs.AI, cs.LG, cs.NE

arXiv PDF

📄 StableTrack: Stabilizing Multi-Object Tracking on Low-Frequency Detections

2025-11-26

Авторы:

Matvei Shelukhan, Timur Mamedov, Karina Kvanchiani

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Multi-object tracking (MOT) is one of the most challenging tasks in computer vision, where it is important to correctly detect objects and associate these detections across frames. Current approaches mainly focus on tracking objects in each frame of a video stream, making it almost impossible to run the model under conditions of limited computing resources. To address this issue, we propose StableTrack, a novel approach that stabilizes the quality of tracking on low-frequency detections. Our met...

ID: 2511.20418v1 cs.CV, cs.AI, cs.LG

arXiv PDF

📄 OmniStruct: Universal Text-to-Structure Generation across Diverse Schemas

2025-11-25

Авторы:

James Y. Huang, Wenxuan Zhou, Nan Xu, Fei Wang, Qin Liu, Sheng Zhang, Hoifung Poon, Muhao Chen

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

The ability of Large Language Models (LLMs) to generate structured outputs that follow arbitrary schemas is crucial to a wide range of downstream tasks that require diverse structured representations of results such as information extraction, table generation, and function calling. While modern LLMs excel in generating unstructured responses in natural language, whether this advancement translates to a strong performance on text-to-structure tasks remains unclear. To bridge this gap, we first in...

ID: 2511.18335v1 cs.CL, cs.AI, cs.LG

arXiv PDF

📄 Episodic Memory in Agentic Frameworks: Suggesting Next Tasks

2025-11-25

Авторы:

Sandro Rama Fiorini, Leonardo G. Azevedo, Raphael M. Thiago, Valesca M. de Sousa, Anton B. Labate, Viviane Torres da Silva

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Agentic frameworks powered by Large Language Models (LLMs) can be useful tools in scientific workflows by enabling human-AI co-creation. A key challenge is recommending the next steps during workflow creation without relying solely on LLMs, which risk hallucination and require fine-tuning with scarce proprietary data. We propose an episodic memory architecture that stores and retrieves past workflows to guide agents in suggesting plausible next tasks. By matching current workflows with historica...

ID: 2511.17775v1 cs.MA, cs.AI, cs.LG

arXiv PDF

📄 REXO: Indoor Multi-View Radar Object Detection via 3D Bounding Box Diffusion

2025-11-25

Авторы:

Ryoma Yataka, Pu Perry Wang, Petros Boufounos, Ryuhei Takahashi

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Multi-view indoor radar perception has drawn attention due to its cost-effectiveness and low privacy risks. Existing methods often rely on {implicit} cross-view radar feature association, such as proposal pairing in RFMask or query-to-feature cross-attention in RETR, which can lead to ambiguous feature matches and degraded detection in complex indoor scenes. To address these limitations, we propose \textbf{REXO} (multi-view Radar object dEtection with 3D bounding boX diffusiOn), which lifts the ...

ID: 2511.17806v1 cs.CV, cs.AI, cs.LG, eess.SP

arXiv PDF

📄 Point of Order: Action-Aware LLM Persona Modeling for Realistic Civic Simulation

2025-11-25

Авторы:

Scott Merrill, Shashank Srivastava

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Large language models offer opportunities to simulate multi-party deliberation, but realistic modeling remains limited by a lack of speaker-attributed data. Transcripts produced via automatic speech recognition (ASR) assign anonymous speaker labels (e.g., Speaker_1), preventing models from capturing consistent human behavior. This work introduces a reproducible pipeline to transform public Zoom recordings into speaker-attributed transcripts with metadata like persona profiles and pragmatic actio...

ID: 2511.17813v1 cs.CL, cs.AI, cs.LG, cs.SD

arXiv PDF

Показано 181 - 190 из 1687 записей