📊 Статистика дайджестов

Всего дайджестов: 34123 Добавлено сегодня: 101

Последнее обновление: сегодня

📄 Enhancing LLM Code Generation Capabilities through Test-Driven Development and Code Interpreter

2025-11-19

Авторы:

Sajed Jalil, Shuvo Saha, Hossain Mohammad Seym

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Over the past few years, improving LLM code generation capabilities has been a key focus in NLP research. Despite Bengali having 242 million native speakers worldwide, it receives little attention when it comes to training LLMs. More recently, various fine-tuning and augmented generation techniques have been employed to significantly enhance code generation performance. However, they require considerable expertise and resources to utilize effectively as an end user. The goal of our work is to de...

ID: 2511.12823v1 cs.SE, cs.LG, cs.PL

arXiv PDF

📄 Architecting software monitors for control-flow anomaly detection through large language models and conformance checking

2025-11-18

Авторы:

Francesco Vitale, Francesco Flammini, Mauro Caporuscio, Nicola Mazzocca

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Context: Ensuring high levels of dependability in modern computer-based systems has become increasingly challenging due to their complexity. Although systems are validated at design time, their behavior can be different at run-time, possibly showing control-flow anomalies due to "unknown unknowns". Objective: We aim to detect control-flow anomalies through software monitoring, which verifies run-time behavior by logging software execution and detecting deviations from expected control flow. ...

ID: 2511.10876v1 cs.SE, cs.LG

arXiv PDF

📄 Building Specialized Software-Assistant ChatBot with Graph-Based Retrieval-Augmented Generation

2025-11-11

Авторы:

Mohammed Hilel, Yannis Karmim, Jean De Bodinat, Reda Sarehane, Antoine Gillon

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Digital Adoption Platforms (DAPs) have become essential tools for helping employees navigate complex enterprise software such as CRM, ERP, or HRMS systems. Companies like LemonLearning have shown how digital guidance can reduce training costs and accelerate onboarding. However, building and maintaining these interactive guides still requires extensive manual effort. Leveraging Large Language Models as virtual assistants is an appealing alternative, yet without a structured understanding of the t...

ID: 2511.05297v1 cs.SE, cs.LG

arXiv PDF

📄 A Metamorphic Testing Perspective on Knowledge Distillation for Language Models of Code: Does the Student Deeply Mimic the Teacher?

2025-11-11

Авторы:

Md. Abdul Awal, Mrigank Rochan, Chanchal K. Roy

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Transformer-based language models of code have achieved state-of-the-art performance across a wide range of software analytics tasks, but their practical deployment remains limited due to high computational costs, slow inference speeds, and significant environmental impact. To address these challenges, recent research has increasingly explored knowledge distillation as a method for compressing a large language model of code (the teacher) into a smaller model (the student) while maintaining perfo...

ID: 2511.05476v1 cs.SE, cs.LG

arXiv PDF

📄 Where Do LLMs Still Struggle? An In-Depth Analysis of Code Generation Benchmarks

2025-11-08

Авторы:

Amir Molzam Sharifloo, Maedeh Heydari, Parsa Kazerooni, Daniel Maninger, Mira Mezini

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Large Language Models (LLMs) have achieved remarkable success in code generation, and the race to improve their performance has become a central focus of AI research. Benchmarks and leaderboards are increasingly popular, offering quantitative rankings of LLMs. However, they provide limited insight into the tasks that LLMs consistently fail to solve - information that is crucial for understanding current limitations and guiding the development of more capable models. To address this gap, we exami...

ID: 2511.04355v1 cs.SE, cs.LG

arXiv PDF

📄 Understanding Robustness of Model Editing in Code LLMs: An Empirical Study

2025-11-07

Авторы:

Vinaik Chhetri, A. B Siddique, Umar Farooq

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Large language models (LLMs) are increasingly used in software development. However, while LLMs remain static after pretraining, programming languages and APIs continue to evolve, leading to the generation of deprecated or incompatible code that undermines reliability. Retraining LLMs from scratch to reflect such changes is computationally expensive, making model editing a promising lightweight alternative that updates only a small subset of parameters. Despite its potential, it remains unclear ...

ID: 2511.03182v1 cs.SE, cs.LG

arXiv PDF

📄 Enhancing software product lines with machine learning components

2025-11-04

Авторы:

Luz-Viviana Cobaleda, Julián Carvajal, Paola Vallejo, Andrés López, Raúl Mazo

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Modern software systems increasingly integrate machine learning (ML) due to its advancements and ability to enhance data-driven decision-making. However, this integration introduces significant challenges for software engineering, especially in software product lines (SPLs), where managing variability and reuse becomes more complex with the inclusion of ML components. Although existing approaches have addressed variability management in SPLs and the integration of ML components in isolated syste...

ID: 2510.27640v1 cs.SE, cs.LG, D.2

arXiv PDF

📄 Automating Benchmark Design

2025-10-31

Авторы:

Amanda Dsouza, Harit Vishwakarma, Zhengyang Qi, Justin Bauer, Derek Pham, Thomas Walshe, Armin Parchami, Frederic Sala, Paroma Varma

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

The rapid progress and widespread deployment of LLMs and LLM-powered agents has outpaced our ability to evaluate them. Hand-crafted, static benchmarks are the primary tool for assessing model capabilities, but these quickly become saturated. In contrast, dynamic benchmarks evolve alongside the models they evaluate, but are expensive to create and continuously update. To address these challenges, we develop BeTaL (Benchmark Tuning with an LLM-in-the-loop), a framework that leverages environment d...

ID: 2510.25039v1 cs.SE, cs.LG

arXiv PDF

📄 A Configuration-First Framework for Reproducible, Low-Code Localization

2025-10-31

Авторы:

Tim Strnad, Blaž Bertalanič, Carolina Fortuna

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Machine learning is increasingly permeating radio-based localization services. To keep results credible and comparable, everyday workflows should make rigorous experiment specification and exact repeatability the default, without blocking advanced experimentation. However, in practice, researchers face a three-way gap that could be filled by a framework that offers (i) low coding effort for end-to-end studies, (ii) reproducibility by default including versioned code, data, and configurations, co...

ID: 2510.25692v1 cs.SE, cs.LG, D.2.6; I.2.6

arXiv PDF

📄 Signature in Code Backdoor Detection, how far are we?

2025-10-19

Авторы:

Quoc Hung Le, Thanh Le-Cong, Bach Le, Bowen Xu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

As Large Language Models (LLMs) become increasingly integrated into software development workflows, they also become prime targets for adversarial attacks. Among these, backdoor attacks are a significant threat, allowing attackers to manipulate model outputs through hidden triggers embedded in training data. Detecting such backdoors remains a challenge, and one promising approach is the use of Spectral Signature defense methods that identify poisoned data by analyzing feature representations thr...

ID: 2510.13992v1 cs.SE, cs.LG

arXiv PDF

Показано 11 - 20 из 55 записей