📄 Software Engineering Agents for Embodied Controller Generation: A Study in Minigrid Environments

2025-10-29

Authors:

Timothé Boulet, Xavier Hinaut, Clément Moulin-Frier

Abstract:

Software Engineering Agents (SWE-Agents) have proven effective for traditional software engineering tasks with accessible codebases, but their performance for embodied tasks requiring well-designed information discovery remains unexplored. We present the first extended evaluation of SWE-Agents on controller generation for embodied tasks, adapting Mini-SWE-Agent (MSWEA) to solve 20 diverse embodied tasks from the Minigrid environment. Our experiments compare agent performance across different inf...

ID: 2510.21902v1 cs.SE, cs.AI

📄 TOM-SWE: User Mental Modeling For Software Engineering Agents

2025-10-29

Authors:

Xuhui Zhou, Valerie Chen, Zora Zhiruo Wang, Graham Neubig, Maarten Sap, Xingyao Wang

Abstract:

Recent advances in coding agents have made them capable of planning, editing, running, and testing complex code bases. Despite their growing ability in coding tasks, these systems still struggle to infer and track user intent, especially when instructions are underspecified or context-dependent. To bridge this gap, we introduce ToM-SWE, a dual-agent architecture that pairs a primary software-engineering (SWE) agent with a lightweight theory-of-mind (ToM) partner agent dedicated to modeling the u...

ID: 2510.21903v1 cs.SE, cs.AI

📄 A Comparison of Conversational Models and Humans in Answering Technical Questions: the Firefox Case

2025-10-29

Authors:

Joao Correia, Daniel Coutinho, Marco Castelluccio, Caio Barbosa, Rafael de Mello, Anita Sarma, Alessandro Garcia, Marco Gerosa, Igor Steinmacher

Abstract:

The use of Large Language Models (LLMs) to support tasks in software development has steadily increased over recent years, from assisting developers in coding activities to providing conversational agents that answer newcomers' questions. In collaboration with the Mozilla Foundation, this study evaluates the effectiveness of Retrieval-Augmented Generation (RAG) in assisting developers within the Mozilla Firefox project. We conducted an empirical analysis comparing responses from human developers...

ID: 2510.21933v1 cs.SE, cs.AI

📄 ArchISMiner: A Framework for Automatic Mining of Architectural Issue-Solution Pairs from Online Developer Communities

2025-10-29

Authors:

Musengamana Jean de Dieu, Ruiyin Li, Peng Liang, Mojtaba Shahin, Muhammad Waseem, Arif Ali Khan, Bangchao Wang, Mst Shamima Aktar

Abstract:

Stack Overflow (SO), a leading online community forum, is a rich source of software development knowledge. However, locating architectural knowledge, such as architectural solutions, remains challenging due to the overwhelming volume of unstructured content and fragmented discussions. Developers must manually sift through posts to find relevant architectural insights, which is time-consuming and error-prone. This study introduces ArchISMiner, a framework for mining architectural knowledge from SO...

ID: 2510.21966v1 cs.SE, cs.AI

📄 Impact and Implications of Generative AI for Enterprise Architects in Agile Environments: A Systematic Literature Review

2025-10-29

Authors:

Stefan Julian Kooy, Jean Paul Sebastian Piest, Rob Henk Bemthuis

Abstract:

Generative AI (GenAI) is reshaping enterprise architecture work in agile software organizations, yet evidence on its effects remains scattered. We report a systematic literature review (SLR), following established SLR protocols of Kitchenham and PRISMA, of 1,697 records, yielding 33 studies across enterprise, solution, domain, business, and IT architect roles. GenAI most consistently supports (i) design ideation and trade-off exploration; (ii) rapid creation and refinement of artifacts (e.g., co...

ID: 2510.22003v1 cs.SE, cs.AI

📄 LSPRAG: LSP-Guided RAG for Language-Agnostic Real-Time Unit Test Generation

2025-10-29

Authors:

Gwihwan Go, Quan Zhang, Chijin Zhou, Zhao Wei, Yu Jiang

Abstract:

Automated unit test generation is essential for robust software development, yet existing approaches struggle to generalize across multiple programming languages and operate within real-time development. While Large Language Models (LLMs) offer a promising solution, their ability to generate high-coverage test code depends on prompting a concise context of the focal method. Current solutions, such as Retrieval-Augmented Generation, either rely on imprecise similarity-based searches or demand the...

ID: 2510.22210v1 cs.SE, cs.AI, D.2.5

📄 Taming Silent Failures: A Framework for Verifiable AI Reliability

2025-10-29

Authors:

Guan-Yan Yang, Farn Wang

Abstract:

The integration of Artificial Intelligence (AI) into safety-critical systems introduces a new reliability paradigm: silent failures, where AI produces confident but incorrect outputs that can be dangerous. This paper introduces the Formal Assurance and Monitoring Environment (FAME), a novel framework that confronts this challenge. FAME synergizes the mathematical rigor of offline formal synthesis with the vigilance of online runtime monitoring to create a verifiable safety net around opaque AI c...

ID: 2510.22224v1 cs.SE, cs.AI, cs.LG, cs.LO, cs.SY, eess.SY

📄 Harnessing the Power of Large Language Models for Software Testing Education: A Focus on ISTQB Syllabus

2025-10-29

Authors:

Tuan-Phong Ngo, Bao-Ngoc Duong, Tuan-Anh Hoang, Joshua Dwight, Ushik Shrestha Khwakhali

Abstract:

Software testing is a critical component in the software engineering field and is important for software engineering education. Thus, it is vital for academia to continuously improve and update educational methods to reflect the current state of the field. The International Software Testing Qualifications Board (ISTQB) certification framework is globally recognized and widely adopted in industry and academia. However, ISTQB-based learning has been rarely applied with recent generative artificial...

ID: 2510.22318v1 cs.SE, cs.AI, K.3.2, D.2.5

📄 Does In-IDE Calibration of Large Language Models work at Scale?

2025-10-29

Authors:

Roham Koohestani, Agnia Sergeyuk, David Gros, Claudio Spiess, Sergey Titov, Prem Devanbu, Maliheh Izadi

Abstract:

The introduction of large language models into integrated development environments (IDEs) is revolutionizing software engineering, yet it poses challenges to the usefulness and reliability of Artificial Intelligence-generated code. Post-hoc calibration of internal model confidences aims to align probabilities with an acceptability measure. Prior work suggests calibration can improve alignment, but at-scale evidence is limited. In this work, we investigate the feasibility of applying calibration ...

ID: 2510.22614v1 cs.SE, cs.AI

📄 Collaborative LLM Agents for C4 Software Architecture Design Automation

2025-10-29

Authors:

Kamil Szczepanik, Jarosław A. Chudziak

Abstract:

Software architecture design is a fundamental part of creating every software system. Despite its importance, producing a C4 software architecture model, the preferred notation for such architecture, remains manual and time-consuming. We introduce an LLM-based multi-agent system that automates this task by simulating a dialogue between role-specific experts who analyze requirements and generate the Context, Container, and Component views of the C4 model. Quality is assessed with a hybrid evaluat...

ID: 2510.22787v1 cs.SE, cs.AI, 68T07, I.2.11; I.2.7; I.2.8
