📊 Статистика дайджестов

Всего дайджестов: 34022 Добавлено сегодня: 82

Последнее обновление: сегодня

📄 Beyond Postconditions: Can Large Language Models infer Formal Contracts for Automatic Software Verification?

2025-10-16

Авторы:

Cedric Richter, Heike Wehrheim

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Automatic software verifiers have become increasingly effective at the task of checking software against (formal) specifications. Yet, their adoption in practice has been hampered by the lack of such specifications in real world code. Large Language Models (LLMs) have shown promise in inferring formal postconditions from natural language hints embedded in code such as function names, comments or documentation. Using the generated postconditions as specifications in a subsequent verification, how...

ID: 2510.12702v1 cs.SE, cs.AI, cs.PL

arXiv PDF

📄 Operationalizing AI: Empirical Evidence on MLOps Practices, User Satisfaction, and Organizational Context

2025-10-15

Авторы:

Stefan Pasch

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Organizational efforts to utilize and operationalize artificial intelligence (AI) are often accompanied by substantial challenges, including scalability, maintenance, and coordination across teams. In response, the concept of Machine Learning Operations (MLOps) has emerged as a set of best practices that integrate software engineering principles with the unique demands of managing the ML lifecycle. Yet, empirical evidence on whether and how these practices support users in developing and operati...

ID: 2510.09968v1 cs.SE, cs.AI, cs.CL, cs.HC, cs.LG

arXiv PDF

📄 Cracking CodeWhisperer: Analyzing Developers' Interactions and Patterns During Programming Tasks

2025-10-15

Авторы:

Jeena Javahar, Tanya Budhrani, Manaal Basha, Cleidson R. B. de Souza, Ivan Beschastnikh, Gema Rodriguez-Perez

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

The use of AI code-generation tools is becoming increasingly common, making it important to understand how software developers are adopting these tools. In this study, we investigate how developers engage with Amazon's CodeWhisperer, an LLM-based code-generation tool. We conducted two user studies with two groups of 10 participants each, interacting with CodeWhisperer - the first to understand which interactions were critical to capture and the second to collect low-level interaction data using ...

ID: 2510.11516v1 cs.SE, cs.AI

arXiv PDF

📄 CodeWatcher: IDE Telemetry Data Extraction Tool for Understanding Coding Interactions with LLMs

2025-10-15

Авторы:

Manaal Basha, Aimeê M. Ribeiro, Jeena Javahar, Cleidson R. B. de Souza, Gema Rodríguez-Pérez

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Understanding how developers interact with code generation tools (CGTs) requires detailed, real-time data on programming behavior which is often difficult to collect without disrupting workflow. We present \textit{CodeWatcher}, a lightweight, unobtrusive client-server system designed to capture fine-grained interaction events from within the Visual Studio Code (VS Code) editor. \textit{CodeWatcher} logs semantically meaningful events such as insertions made by CGTs, deletions, copy-paste actions...

ID: 2510.11536v1 cs.SE, cs.AI

arXiv PDF

📄 BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

2025-10-14

Авторы:

Terry Yue Zhuo, Xiaolong Jin, Hange Liu, Juyong Jiang, Tianyang Liu, Chen Gong, Bhupesh Bishnoi, Vaisakhi Mishra, Marek Suppa, Noah Ziems, Saiteja Utpala, Ming Xu, Guangyu Song, Kaixin Li, Yuhan Cao, Bo Liu, Zheng Liu, Sabina Abdurakhmanova, Wenhao Yu, Mengzhao Jia, Jihan Yao, Kenneth Hamilton, Kumar Shridhar, Minh Chien Vu, Dingmin Wang, Jiawei Liu, Zijian Wang, Qian Liu, Binyuan Hui, Meg Risdal, Ahsen Khaliq, Atin Sood, Zhenchang Xing, Wasi Uddin Ahmad, John Grundy, David Lo, Banghua Zhu, Xiaoning Du, Torsten Scholak, Leandro von Werra

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Crowdsourced model evaluation platforms, such as Chatbot Arena, enable real-time evaluation from human perspectives to assess the quality of model responses. In the coding domain, manually examining the quality of LLM-generated content is extremely challenging, as it requires understanding long chunks of raw code and deliberately simulating code execution. To this end, we introduce BigCodeArena, an open human evaluation platform for code generation backed by a comprehensive and on-the-fly execut...

ID: 2510.08697v1 cs.SE, cs.AI, cs.CL

arXiv PDF

📄 McMining: Automated Discovery of Misconceptions in Student Code

2025-10-14

Авторы:

Erfan Al-Hossami, Razvan Bunescu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

When learning to code, students often develop misconceptions about various programming language concepts. These can not only lead to bugs or inefficient code, but also slow down the learning of related concepts. In this paper, we introduce McMining, the task of mining programming misconceptions from samples of code from a student. To enable the training and evaluation of McMining systems, we develop an extensible benchmark dataset of misconceptions together with a large set of code samples where...

ID: 2510.08827v1 cs.SE, cs.AI, cs.CL, cs.CY

arXiv PDF

📄 SEER: Sustainability Enhanced Engineering of Software Requirements

2025-10-14

Авторы:

Mandira Roy, Novarun Deb, Nabendu Chaki, Agostino Cortesi

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

The rapid expansion of software development has significant environmental, technical, social, and economic impacts. Achieving the United Nations Sustainable Development Goals by 2030 compels developers to adopt sustainable practices. Existing methods mostly offer high-level guidelines, which are time-consuming to implement and rely on team adaptability. Moreover, they focus on design or implementation, while sustainability assessment should start at the requirements engineering phase. In this pa...

ID: 2510.08981v1 cs.SE, cs.AI

arXiv PDF

📄 Saving SWE-Bench: A Benchmark Mutation Approach for Realistic Agent Evaluation

2025-10-14

Авторы:

Spandan Garg, Ben Steenhoek, Yufan Huang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Current benchmarks for evaluating software engineering agents, such as SWE-Bench Verified, are predominantly derived from GitHub issues and fail to accurately reflect how developers interact with chat-based coding assistants in integrated development environments (IDEs). We posit that this mismatch leads to a systematic overestimation of agent's capabilities in real-world scenarios, especially bug fixing. We introduce a novel benchmarking framework that transforms existing formal benchmarks into...

ID: 2510.08996v1 cs.SE, cs.AI

arXiv PDF

📄 Cost-Efficient Long Code Translation using LLMs while Leveraging Identifier Replacements

2025-10-14

Авторы:

Manojit Chakraborty, Madhusudan Ghosh, Rishabh Gupta

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

In the domain of software development, LLMs have been utilized to automate tasks such as code translation, where source code from one programming language is translated to another while preserving its functionality. However, LLMs often struggle with long source codes that don't fit into the context window, which produces inaccurate translations. To address this, we propose a novel zero-shot code translation method that incorporates identifier replacement. By substituting user-given long identifi...

ID: 2510.09045v1 cs.SE, cs.AI, cs.IR, cs.LG

arXiv PDF

📄 A Model-Driven Engineering Approach to AI-Powered Healthcare Platforms

2025-10-14

Авторы:

Mira Raheem, Amal Elgammal, Michael Papazoglou, Bernd Krämer, Neamat El-Tazi

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Artificial intelligence (AI) has the potential to transform healthcare by supporting more accurate diagnoses and personalized treatments. However, its adoption in practice remains constrained by fragmented data sources, strict privacy rules, and the technical complexity of building reliable clinical systems. To address these challenges, we introduce a model driven engineering (MDE) framework designed specifically for healthcare AI. The framework relies on formal metamodels, domain-specific langu...

ID: 2510.09308v1 cs.SE, cs.AI, cs.LG

arXiv PDF

Показано 151 - 160 из 341 записей