📊 Digest statistics

Total digests: 34022

Added today: 82

Last updated: today

📄 Can machines perform a qualitative data analysis? Reading the debate with Alan Turing

2025-12-05

Authors:

Stefano De Paoli

Abstract:

This paper reflects on the literature that rejects the use of Large Language Models (LLMs) in qualitative data analysis. It illustrates through empirical evidence as well as critical reflections why the current critical debate is focusing on the wrong problems. The paper proposes that the focus of researching the use of the LLMs for qualitative analysis is not the method per se, but rather the empirical investigation of an artificial system performing an analysis. The paper builds on the seminal...

ID: 2512.04121v1 cs.CY, cs.AI, cs.CL

arXiv PDF

📄 Culture Affordance Atlas: Reconciling Object Diversity Through Functional Mapping

2025-12-04

Authors:

Joan Nwatu, Longju Bai, Oana Ignat, Rada Mihalcea

Abstract:

Culture shapes the objects people use and for what purposes, yet mainstream Vision-Language (VL) datasets frequently exhibit cultural biases, disproportionately favoring higher-income, Western contexts. This imbalance reduces model generalizability and perpetuates performance disparities, especially impacting lower-income and non-Western communities. To address these disparities, we propose a novel function-centric framework that categorizes objects by the functions they fulfill, across diverse ...

ID: 2512.03173v1 cs.CY, cs.AI, cs.CL, cs.CV

arXiv PDF

📄 InvisibleBench: A Deployment Gate for Caregiving Relationship AI

2025-11-27

Authors:

Ali Madad

Abstract:

InvisibleBench is a deployment gate for caregiving-relationship AI, evaluating 3-20+ turn interactions across five dimensions: Safety, Compliance, Trauma-Informed Design, Belonging/Cultural Fitness, and Memory. The benchmark includes autofail conditions for missed crises, medical advice (WOPR Act), harmful information, and attachment engineering. We evaluate four frontier models across 17 scenarios (N=68) spanning three complexity tiers. All models show significant safety gaps (11.8-44.8 percent...

ID: 2511.20733v1 cs.CY, cs.AI, cs.CL, cs.HC

arXiv PDF

📄 Large Language Models' Complicit Responses to Illicit Instructions across Socio-Legal Contexts

2025-11-27

Authors:

Xing Wang, Huiyuan Xie, Yiyan Wang, Chaojun Xiao, Huimin Chen, Holli Sargeant, Felix Steffek, Jie Shao, Zhiyuan Liu, Maosong Sun

Abstract:

Large language models (LLMs) are now deployed at unprecedented scale, assisting millions of users in daily tasks. However, the risk of these models assisting unlawful activities remains underexplored. In this study, we define this high-risk behavior as complicit facilitation - the provision of guidance or support that enables illicit user instructions - and present four empirical studies that assess its prevalence in widely deployed LLMs. Using real-world legal cases and established legal framew...

ID: 2511.20736v1 cs.CY, cs.AI, cs.CL

arXiv PDF

📄 A Cross-Cultural Assessment of Human Ability to Detect LLM-Generated Fake News about South Africa

2025-11-25

Authors:

Tim Schlippe, Matthias Wölfel, Koena Ronny Mabokela

Abstract:

This study investigates how cultural proximity affects the ability to detect AI-generated fake news by comparing South African participants with those from other nationalities. As large language models increasingly enable the creation of sophisticated fake news, understanding human detection capabilities becomes crucial, particularly across different cultural contexts. We conducted a survey where 89 participants (56 South Africans, 33 from other nationalities) evaluated 10 true South African new...

ID: 2511.17682v1 cs.CY, cs.AI, cs.CL

arXiv PDF

📄 Place Matters: Comparing LLM Hallucination Rates for Place-Based Legal Queries

2025-11-15

Authors:

Damian Curran, Vanessa Sporne, Lea Frermann, Jeannie Paterson

Abstract:

How do we make a meaningful comparison of a large language model's knowledge of the law in one place compared to another? Quantifying these differences is critical to understanding if the quality of the legal information obtained by users of LLM-based chatbots varies depending on their location. However, obtaining meaningful comparative metrics is challenging because legal institutions in different places are not themselves easily comparable. In this work we propose a methodology to obtain place...

ID: 2511.06700v1 cs.CY, cs.AI, cs.CL

arXiv PDF

📄 AI-generated podcasts: Synthetic Intimacy and Cultural Translation in NotebookLM's Audio Overviews

2025-11-15

Authors:

Jill Walker Rettberg

Abstract:

This paper analyses AI-generated podcasts produced by Google's NotebookLM, which generates audio podcasts with two chatty AI hosts discussing whichever documents a user uploads. While AI-generated podcasts have been discussed as tools, for instance in medical education, they have not yet been analysed as media. By uploading different types of text and analysing the generated outputs I show how the podcasts' structure is built around a fixed template. I also find that NotebookLM not only translat...

ID: 2511.08654v1 cs.CY, cs.AI, cs.CL

arXiv PDF

📄 Does GenAI Rewrite How We Write? An Empirical Study on Two-Million Preprints

2025-10-23

Authors:

Minfeng Qi, Zhongmin Cao, Qin Wang, Ningran Li, Tianqing Zhu

Abstract:

Preprint repositories become central infrastructures for scholarly communication. Their expansion transforms how research is circulated and evaluated before journal publication. Generative large language models (LLMs) introduce a further potential disruption by altering how manuscripts are written. While speculation abounds, systematic evidence of whether and how LLMs reshape scientific publishing remains limited. This paper addresses the gap through a large-scale analysis of more than 2.1 mil...

ID: 2510.17882v1 cs.CY, cs.AI, cs.CL, cs.DL

arXiv PDF

📄 Are LLMs Court-Ready? Evaluating Frontier Models on Indian Legal Reasoning

2025-10-23

Authors:

Kush Juvekar, Arghya Bhattacharya, Sai Khadloya, Utkarsh Saxena

Abstract:

Large language models (LLMs) are entering legal workflows, yet we lack a jurisdiction-specific framework to assess their baseline competence therein. We use India's public legal examinations as a transparent proxy. Our multi-year benchmark assembles objective screens from top national and state exams and evaluates open and frontier LLMs under real-world exam conditions. To probe beyond multiple-choice questions, we also include a lawyer-graded, paired-blinded study of long-form answers from the ...

ID: 2510.17900v1 cs.CY, cs.AI, cs.CL

arXiv PDF

📄 Interpretability Framework for LLMs in Undergraduate Calculus

2025-10-23

Authors:

Sagnik Dakshit, Sushmita Sinha Roy

Abstract:

Large Language Models (LLMs) are increasingly being used in education, yet their correctness alone does not capture the quality, reliability, or pedagogical validity of their problem-solving behavior, especially in mathematics, where multistep logic, symbolic reasoning, and conceptual clarity are critical. Conventional evaluation methods largely focus on final answer accuracy and overlook the reasoning process. To address this gap, we introduce a novel interpretability framework for analyzing LL...

ID: 2510.17910v1 cs.CY, cs.AI, cs.CL

arXiv PDF

Showing 1 - 10 of 29 entries