📊 Digest statistics

Total digests: 34022; added today: 82

Last updated: today

📄 Making Evidence Actionable in Adaptive Learning Closing the Diagnostic Pedagogical Loop

2025-11-20

Authors:

Amirreza Mehrabi, Jason Wade Morphew, Breejha Quezada, N. Sanjay Rebello

Annotation:

Adaptive learning often diagnoses precisely yet intervenes weakly, producing help that is mistimed or misaligned. This study presents evidence supporting an instructor-governed feedback loop that converts concept-level assessment evidence into vetted microinterventions. The adaptive learning algorithm includes three safeguards: adequacy as a hard guarantee of gap closure, attention as a budgeted limit for time and redundancy, and diversity as protection against overfitting to a single resource. ...

ID: 2511.13542v2 cs.CE, cs.AI, cs.CY, stat.AP

📄 Context-aware, Ante-hoc Explanations of Driving Behaviour

2025-11-20

Authors:

Dominik Grundt, Ishan Saxena, Malte Petersen, Bernd Westphal, Eike Möhlmann

Annotation:

Autonomous vehicles (AVs) must be both safe and trustworthy to gain social acceptance and become a viable option for everyday public transportation. Explanations about the system behaviour can increase safety and trust in AVs. Unfortunately, explaining the system behaviour of AI-based driving functions is particularly challenging, as decision-making processes are often opaque. The field of Explainability Engineering tackles this challenge by developing explanation models at design time. These mo...

ID: 2511.14428v1 cs.LO, cs.AI, cs.CY

📄 A Multimodal Manufacturing Safety Chatbot: Knowledge Base Design, Benchmark Development, and Evaluation of Multiple RAG Approaches

2025-11-19

Authors:

Ryan Singh, Austin Hamilton, Amanda White, Michael Wise, Ibrahim Yousif, Arthur Carvalho, Zhe Shan, Reza Abrisham Baf, Mohammad Mayyas, Lora A. Cavuoto, Fadel M. Megahed

Annotation:

Ensuring worker safety remains a critical challenge in modern manufacturing environments. Industry 5.0 reorients the prevailing manufacturing paradigm toward more human-centric operations. Using a design science research methodology, we identify three essential requirements for next-generation safety training systems: high accuracy, low latency, and low cost. We introduce a multimodal chatbot powered by large language models that meets these design requirements. The chatbot uses retrieval-augmen...

ID: 2511.11847v1 cs.IR, cs.AI, cs.CY

📄 UpBench: A Dynamically Evolving Real-World Labor-Market Agentic Benchmark Framework Built for Human-Centric AI

2025-11-19

Authors:

Darvin Yi, Teng Liu, Mattie Terzolo, Lance Hasson, Ayan Sinh, Pablo Mendes, Andrew Rabinovich

Annotation:

As large language model (LLM) agents increasingly undertake digital work, reliable frameworks are needed to evaluate their real-world competence, adaptability, and capacity for human collaboration. Existing benchmarks remain largely static, synthetic, or domain-limited, providing limited insight into how agents perform in dynamic, economically meaningful environments. We introduce UpBench, a dynamically evolving benchmark grounded in real jobs drawn from the global Upwork labor marketplace. Each...

ID: 2511.12306v1 cs.AI, cs.CY

📄 Auditing Google's AI Overviews and Featured Snippets: A Case Study on Baby Care and Pregnancy

2025-11-19

Authors:

Desheng Hu, Joachim Baumann, Aleksandra Urman, Elsa Lichtenegger, Robin Forsberg, Aniko Hannak, Christo Wilson

Annotation:

Google Search increasingly surfaces AI-generated content through features like AI Overviews (AIO) and Featured Snippets (FS), which users frequently rely on despite having no control over their presentation. Through a systematic algorithm audit of 1,508 real baby care and pregnancy-related queries, we evaluate the quality and consistency of these information displays. Our robust evaluation framework assesses multiple quality dimensions, including answer consistency, relevance, presence of medica...

ID: 2511.12920v1 cs.CL, cs.AI, cs.CY, cs.HC, cs.IR

📄 Making Evidence Actionable in Adaptive Learning Closing the Diagnostic Pedagogical Loop

2025-11-19

Authors:

Amirreza Mehrabi, Jason Wade Morphew, Breejha Quezada, N. Sanjay Rebello

Annotation:

ID: 2511.13542v1 cs.CE, cs.AI, cs.CY, stat.AP

📄 Can AI Models be Jailbroken to Phish Elderly Victims? An End-to-End Evaluation

2025-11-18

Authors:

Fred Heiding, Simon Lermen

Annotation:

We present an end-to-end demonstration of how attackers can exploit AI safety failures to harm vulnerable populations: from jailbreaking LLMs to generate phishing content, to deploying those messages against real targets, to successfully compromising elderly victims. We systematically evaluated safety guardrails across six frontier LLMs spanning four attack categories, revealing critical failures where several models exhibited near-complete susceptibility to certain attack vectors. In a human va...

ID: 2511.11759v1 cs.CR, cs.AI, cs.CY

📄 Reinforcing Stereotypes of Anger: Emotion AI on African American Vernacular English

2025-11-17

Authors:

Rebecca Dorn, Christina Chance, Casandra Rusti, Charles Bickham, Kai-Wei Chang, Fred Morstatter, Kristina Lerman

Annotation:

Automated emotion detection is widely used in applications ranging from well-being monitoring to high-stakes domains like mental health and hiring. However, models often rely on annotations that reflect dominant cultural norms, limiting model ability to recognize emotional expression in dialects often excluded from training data distributions, such as African American Vernacular English (AAVE). This study examines emotion recognition model performance on AAVE compared to General American English...

ID: 2511.10846v1 cs.CL, cs.AI, cs.CY

📄 Leveraging the Power of AI and Social Interactions to Restore Trust in Public Polls

2025-11-15

Authors:

Amr Akmal Abouelmagd, Amr Hilal

Annotation:

The emergence of crowdsourced data has significantly reshaped social science, enabling extensive exploration of collective human actions, viewpoints, and societal dynamics. However, ensuring safe, fair, and reliable participation remains a persistent challenge. Traditional polling methods have seen a notable decline in engagement over recent decades, raising concerns about the credibility of collected data. Meanwhile, social and peer-to-peer networks have become increasingly widespread, but data...

ID: 2511.07593v1 cs.SI, cs.AI, cs.CY

📄 Benchmarking Educational LLMs with Analytics: A Case Study on Gender Bias in Feedback

2025-11-15

Authors:

Yishan Du, Conrad Borchers, Mutlu Cukurova

Annotation:

As teachers increasingly turn to GenAI in their educational practice, we need robust methods to benchmark large language models (LLMs) for pedagogical purposes. This article presents an embedding-based benchmarking framework to detect bias in LLMs in the context of formative feedback. Using 600 authentic student essays from the AES 2.0 corpus, we constructed controlled counterfactuals along two dimensions: (i) implicit cues via lexicon-based swaps of gendered terms within essays, and (ii) explic...

ID: 2511.08225v1 cs.CL, cs.AI, cs.CY, cs.HC

Showing 31-40 of 208 records