📊 Статистика дайджестов

Всего дайджестов: 34022 Добавлено сегодня: 82

Последнее обновление: сегодня

📄 Bridging the Skills Gap: A Course Model for Modern Generative AI Education

2025-11-18

Авторы:

Anya Bardach, Hamilton Murrah

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Research on how the popularization of generative Artificial Intelligence (AI) tools impacts learning environments has led to hesitancy among educators to teach these tools in classrooms, creating two observed disconnects. Generative AI competency is increasingly valued in industry but not in higher education, and students are experimenting with generative AI without formal guidance. The authors argue students across fields must be taught to responsibly and expertly harness the potential of AI to...

ID: 2511.11757v1 cs.CY, cs.AI, cs.SE

arXiv PDF

📄 HPCAgentTester: A Multi-Agent LLM Approach for Enhanced HPC Unit Test Generation

2025-11-17

Авторы:

Rabimba Karanjai, Lei Xu, Weidong Shi

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Unit testing in High-Performance Computing (HPC) is critical but challenged by parallelism, complex algorithms, and diverse hardware. Traditional methods often fail to address non-deterministic behavior and synchronization issues in HPC applications. This paper introduces HPCAgentTester, a novel multi-agent Large Language Model (LLM) framework designed to automate and enhance unit test generation for HPC software utilizing OpenMP and MPI. HPCAgentTester employs a unique collaborative workflow wh...

ID: 2511.10860v1 cs.DC, cs.AI, cs.SE

arXiv PDF

📄 Smarter Together: Creating Agentic Communities of Practice through Shared Experiential Learning

2025-11-15

Авторы:

Valentin Tablan, Scott Taylor, Gabriel Hurtado, Kristoffer Bernhem, Anders Uhrenholt, Gabriele Farei, Karo Moilanen

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

The transition from human-centric to agent-centric software development practices is disrupting existing knowledge sharing environments for software developers. Traditional peer-to-peer repositories and developer communities for shared technical knowledge and best practice have witnessed dramatic drops in participation in a short period of time. At the same time, agentic functional equivalents are yet to emerge leaving AI agents, which already generate a significant proportion of all new softwar...

ID: 2511.08301v1 cs.AI, cs.SE

arXiv PDF

📄 UA-Code-Bench: A Competitive Programming Benchmark for Evaluating LLM Code Generation in Ukrainian

2025-11-11

Авторы:

Mykyta Syromiatnikov, Victoria Ruvinskaya

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Evaluating the real capabilities of large language models in low-resource languages still represents a challenge, as many existing benchmarks focus on widespread tasks translated from English or evaluate only simple language understanding. This paper introduces UA-Code-Bench, a new open-source benchmark established for a thorough evaluation of language models' code generation and competitive programming problem-solving abilities in Ukrainian. The benchmark comprises 500 problems from the Eolymp ...

ID: 2511.05040v1 cs.CL, cs.AI, cs.SE

arXiv PDF

📄 Opus: A Quantitative Framework for Workflow Evaluation

2025-11-08

Авторы:

Alan Seroul, Théo Fagnoni, Inès Adnani, Dana O. Mohamed, Phillip Kingston

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

This paper introduces the Opus Workflow Evaluation Framework, a probabilistic-normative formulation for quantifying Workflow quality and efficiency. It integrates notions of correctness, reliability, and cost into a coherent mathematical model that enables direct comparison, scoring, and optimization of Workflows. The framework combines the Opus Workflow Reward, a probabilistic function estimating expected performance through success likelihood, resource usage, and output gain, with the Opus Wor...

ID: 2511.04220v1 cs.AI, cs.SE

arXiv PDF

📄 AdversariaLLM: A Unified and Modular Toolbox for LLM Robustness Research

2025-11-08

Авторы:

Tim Beyer, Jonas Dornbusch, Jakob Steimle, Moritz Ladenburger, Leo Schwinn, Stephan Günnemann

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

The rapid expansion of research on Large Language Model (LLM) safety and robustness has produced a fragmented and oftentimes buggy ecosystem of implementations, datasets, and evaluation methods. This fragmentation makes reproducibility and comparability across studies challenging, hindering meaningful progress. To address these issues, we introduce AdversariaLLM, a toolbox for conducting LLM jailbreak robustness research. Its design centers on reproducibility, correctness, and extensibility. The...

ID: 2511.04316v1 cs.AI, cs.SE

arXiv PDF

📄 ROSBag MCP Server: Analyzing Robot Data with LLMs for Agentic Embodied AI Applications

2025-11-07

Авторы:

Lei Fu, Sahar Salimpour, Leonardo Militano, Harry Edelman, Jorge Peña Queralta, Giovanni Toffetti

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Agentic AI systems and Physical or Embodied AI systems have been two key research verticals at the forefront of Artificial Intelligence and Robotics, with Model Context Protocol (MCP) increasingly becoming a key component and enabler of agentic applications. However, the literature at the intersection of these verticals, i.e., Agentic Embodied AI, remains scarce. This paper introduces an MCP server for analyzing ROS and ROS 2 bags, allowing for analyzing, visualizing and processing robot data wi...

ID: 2511.03497v1 cs.RO, cs.AI, cs.SE

arXiv PDF

📄 1 PoCo: Agentic Proof-of-Concept Exploit Generation for Smart Contracts

2025-11-06

Авторы:

Vivi Andersson, Sofia Bobadilla, Harald Hobbelhagen, Martin Monperrus

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Smart contracts operate in a highly adversarial environment, where vulnerabilities can lead to substantial financial losses. Thus, smart contracts are subject to security audits. In auditing, proof-of-concept (PoC) exploits play a critical role by demonstrating to the stakeholders that the reported vulnerabilities are genuine, reproducible, and actionable. However, manually creating PoCs is time-consuming, error-prone, and often constrained by tight audit schedules. We introduce POCO, an agentic...

ID: 2511.02780v1 cs.CR, cs.AI, cs.SE

arXiv PDF

📄 From product to system network challenges in system of systems lifecycle management

2025-11-04

Авторы:

Vahid Salehi, Josef Vilsmeier, Shirui Wang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Today, products are no longer isolated artifacts, but nodes in networked systems. This means that traditional, linearly conceived life cycle models are reaching their limits: Interoperability across disciplines, variant and configuration management, traceability, and governance across organizational boundaries are becoming key factors. This collective contribution classifies the state of the art and proposes a practical frame of reference for SoS lifecycle management, model-based systems enginee...

ID: 2510.27194v1 cs.AI, cs.SE

arXiv PDF

📄 From Queries to Insights: Agentic LLM Pipelines for Spatio-Temporal Text-to-SQL

2025-11-01

Авторы:

Manu Redd, Tao Zhe, Dongjie Wang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Natural-language-to-SQL (NL-to-SQL) systems hold promise for democratizing access to structured data, allowing users to query databases without learning SQL. Yet existing systems struggle with realistic spatio-temporal queries, where success requires aligning vague user phrasing with schema-specific categories, handling temporal reasoning, and choosing appropriate outputs. We present an agentic pipeline that extends a naive text-to-SQL baseline (llama-3-sqlcoder-8b) with orchestration by a Mistr...

ID: 2510.25997v1 cs.AI, cs.SE

arXiv PDF

Показано 11 - 20 из 72 записей