📄 Automated Dynamic AI Inference Scaling on HPC-Infrastructure: Integrating Kubernetes, Slurm and vLLM

2025-11-27

Authors:

Tim Trappen, Robert Keßler, Roland Pabel, Viktor Achter, Stefan Wesner

Abstract:

Due to rising demands for Artificial Intelligence (AI) inference, especially in higher education, novel solutions utilising existing infrastructure are emerging. The utilisation of High-Performance Computing (HPC) has become a prevalent approach for the implementation of such solutions. However, the classical operating model of HPC does not adapt well to the requirements of synchronous, user-facing dynamic AI application workloads. In this paper, we propose our solution that serves LLMs by integr...

ID: 2511.21413v1 cs.DC, cs.AI, cs.DB, cs.PF

📄 Flash-Fusion: Enabling Expressive, Low-Latency Queries on IoT Sensor Streams with LLMs

2025-11-19

Authors:

Kausar Patherya, Ashutosh Dhekne, Francisco Romero

Abstract:

Smart cities and pervasive IoT deployments have generated interest in IoT data analysis across transportation and urban planning. At the same time, Large Language Models offer a new interface for exploring IoT data - particularly through natural language. Users today face two key challenges when working with IoT data using LLMs: (1) data collection infrastructure is expensive, producing terabytes of low-level sensor readings that are too granular for direct use, and (2) data analysis is slow, re...

ID: 2511.11885v1 cs.DC, cs.AI, cs.DB

📄 LLM Agents for Interactive Workflow Provenance: Reference Architecture and Evaluation Methodology

2025-09-19

Authors:

Renan Souza, Timothy Poteet, Brian Etz, Daniel Rosendo, Amal Gueroudji, Woong Shin, Prasanna Balaprakash, Rafael Ferreira da Silva

## Context

In modern scientific discovery, workflows spanning the Edge, Cloud, and High Performance Computing (HPC) continuum are crucial for processing and analyzing data. These workflows enable hypothesis validation, anomaly detection, reproducibility, and impactful findings. However, as workflows scale, provenance data, which are essential for understanding and analyzing these processes, become increasingly complex. Current systems rely on custom scripts, structured queries, or static dashboards, which limit interactivity and flexibility and so hinder effective data exploration.

To address this challenge, researchers are exploring interactive approaches based on Large Language Models (LLMs), which could change how provenance data are accessed and analyzed. By integrating LLM agents into provenance systems, the goal is to give researchers a more interactive and insightful experience than existing methods allow. This work aims to define a reference architecture and an evaluation methodology for such systems.

## Method

The proposed methodology combines a reference architecture and an evaluation framework for interactive provenance analysis using LLM agents. The reference architecture is lightweight and metadata-driven, translating natural language queries into structured provenance queries. It integrates Retrieval-Augmented Generation (RAG) to enhance LLM responses with contextual metadata. Key components include:

1. **Metadata-driven design**: A structured schema translates natural language into provenance queries.
2. **LLM agent integration**: LLMs such as LLaMA, GPT, Gemini, and Claude are used for query interpretation and response generation.
3. **Prompt tuning**: Refining the prompts improves the accuracy and relevance of LLM responses.
4. **Diverse query testing**: A range of query classes, including temporal, spatial, and entity-based queries, is evaluated.
5. **Real-world evaluation**: The methodology is tested on a chemistry workflow, showcasing practical applicability.

This modular and scalable approach is intended to let the system adapt to various scientific workflows while maintaining accuracy and usability.

## Results

Evaluations were conducted using LLaMA, GPT, Gemini, and Claude LLMs across multiple query classes and a real-world chemistry workflow. The results demonstrate the following:

1. **Accuracy**: LLM agents achieved high accuracy in interpreting natural language queries and generating structured provenance queries.
2. **Query diversity**: The system performed well across temporal, spatial, and entity-based queries, showcasing its versatility.
3. **Comparison with baselines**: LLM-based approaches outperformed traditional methods, such as static dashboards and structured queries, in terms of interactivity and depth of analysis.
4. **Scalability**: The metadata-driven design proved scalable, handling large-scale provenance data efficiently.

The open-source implementation provides a blueprint for integrating LLM agents into existing provenance systems.

## Significance

The proposed approach has implications across multiple domains:

1. **Scientific research**: Enables more interactive and insightful analysis of workflow provenance, supporting hypothesis validation and reproducibility.
2. **Data-intensive applications**: Facilitates complex data exploration in fields such as chemistry, biology, and environmental science.
3. **Real-world impact**: The modular design and open-source nature allow for easy adoption and customization across different scientific and industrial workflows.

The integration of LLM agents offers a more intuitive alternative to traditional provenance analysis methods. The potential for broader adoption is high, given the growing demand for interactive and scalable data analysis tools.

## Conclusions

The research introduces a reference architecture and evaluation methodology for leveraging LLM agents in interactive workflow provenance analysis. Key achievements include:

1. Demonstration of the feasibility and effectiveness of LLM-based approaches in provenance analysis.
2. Development of a modular and scalable design that enhances interactivity and accuracy.
3. Practical evaluation across diverse query classes and a real-world workflow, showcasing the system's potential.

Future work will focus on expanding the scope of query types, improving LLM prompt tuning, and exploring additional scientific domains for broader applicability.
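
The metadata-driven translation step described in the method summary above (retrieve relevant schema metadata, add it to the prompt, have the LLM emit a structured query) might look roughly like the sketch below. It is not the authors' implementation: the schema, the field names, the JSON query format, and every function name are hypothetical, keyword overlap stands in for a real retrieval component, and `fake_llm` stands in for a call to an actual model.

```python
import json
import operator
import re
from typing import Callable

# Hypothetical provenance schema: field name -> short description.
SCHEMA = {
    "task_id": "unique identifier of a workflow task",
    "activity": "name of the function or step that ran",
    "started_at": "task start time as a Unix timestamp",
    "duration_s": "task run time in seconds",
    "host": "machine that executed the task",
}


def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens; underscores act as separators."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_fields(question: str, schema: dict[str, str], k: int = 3) -> dict[str, str]:
    """Keep the k schema fields whose name and description best overlap the question."""
    asked = tokens(question)
    ranked = sorted(
        schema.items(),
        key=lambda item: len(asked & tokens(item[0] + " " + item[1])),
        reverse=True,
    )
    return dict(ranked[:k])


def build_prompt(question: str, fields: dict[str, str]) -> str:
    """Place the retrieved metadata in front of the user's question."""
    listing = "\n".join(f"- {name}: {desc}" for name, desc in fields.items())
    return (
        f"Provenance fields:\n{listing}\n"
        'Answer with JSON of the form {"field": ..., "op": ..., "value": ...}.\n'
        f"Question: {question}"
    )


def answer(question: str, records: list[dict], llm: Callable[[str], str]) -> list[dict]:
    """Translate the question into a filter via the LLM, then apply it to the records."""
    query = json.loads(llm(build_prompt(question, retrieve_fields(question, SCHEMA))))
    if query["field"] not in SCHEMA:
        raise ValueError(f"unknown field: {query['field']}")
    compare = {"==": operator.eq, ">": operator.gt, "<": operator.lt}[query["op"]]
    return [r for r in records if compare(r[query["field"]], query["value"])]


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call; always returns the same structured query.
    return '{"field": "duration_s", "op": ">", "value": 60}'


RECORDS = [
    {"task_id": "t1", "activity": "optimize", "started_at": 0, "duration_s": 12.5, "host": "n01"},
    {"task_id": "t2", "activity": "simulate", "started_at": 20, "duration_s": 95.0, "host": "n02"},
]

slow = answer("Which tasks had a duration above 60 seconds?", RECORDS, fake_llm)
print([r["task_id"] for r in slow])  # ['t2']
```

Validating the returned field against the schema before running the filter keeps a malformed or hallucinated model response from reaching the data; a real system would need similar checks on the operator and value.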

Abstract:

Modern scientific discovery increasingly relies on workflows that process data across the Edge, Cloud, and High Performance Computing (HPC) continuum. Comprehensive and in-depth analyses of these data are critical for hypothesis validation, anomaly detection, reproducibility, and impactful findings. Although workflow provenance techniques support such analyses, at large scale, the provenance data become complex and difficult to analyze. Existing systems depend on custom scripts, structured queri...

ID: 2509.13978v1 cs.DC, cs.AI, cs.DB, 68M14, 68M20, 68T07, C.2.4; D.1.3; I.2.0
