📊 Статистика дайджестов

Всего дайджестов: 34022 Добавлено сегодня: 82

Последнее обновление: сегодня

📄 Operationalizing AI: Empirical Evidence on MLOps Practices, User Satisfaction, and Organizational Context

2025-10-15

Авторы:

Stefan Pasch

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Organizational efforts to utilize and operationalize artificial intelligence (AI) are often accompanied by substantial challenges, including scalability, maintenance, and coordination across teams. In response, the concept of Machine Learning Operations (MLOps) has emerged as a set of best practices that integrate software engineering principles with the unique demands of managing the ML lifecycle. Yet, empirical evidence on whether and how these practices support users in developing and operati...

ID: 2510.09968v1 cs.SE, cs.AI, cs.CL, cs.HC, cs.LG

arXiv PDF

📄 Everyone prefers human writers, including AI

2025-10-14

Авторы:

Wouter Haverals, Meredith Martin

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

As AI writing tools become widespread, we need to understand how both humans and machines evaluate literary style, a domain where objective standards are elusive and judgments are inherently subjective. We conducted controlled experiments using Raymond Queneau's Exercises in Style (1947) to measure attribution bias across evaluators. Study 1 compared human participants (N=556) and AI models (N=13) evaluating literary passages from Queneau versus GPT-4-generated versions under three conditions: b...

ID: 2510.08831v1 cs.AI, cs.CL, cs.HC

arXiv PDF

📄 Asking For It: Question-Answering for Predicting Rule Infractions in Online Content Moderation

2025-10-10

Авторы:

Mattia Samory, Diana Pamfile, Andrew To, Shruti Phadke

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Online communities rely on a mix of platform policies and community-authored rules to define acceptable behavior and maintain order. However, these rules vary widely across communities, evolve over time, and are enforced inconsistently, posing challenges for transparency, governance, and automation. In this paper, we model the relationship between rules and their enforcement at scale, introducing ModQ, a novel question-answering framework for rule-sensitive content moderation. Unlike prior class...

ID: 2510.06350v1 cs.CY, cs.AI, cs.CL, cs.HC, cs.LG

arXiv PDF

📄 Code2Video: A Code-centric Paradigm for Educational Video Generation

2025-10-04

Авторы:

Yanzhe Chen, Kevin Qinghong Lin, Mike Zheng Shou

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

While recent generative models advance pixel-space video synthesis, they remain limited in producing professional educational videos, which demand disciplinary knowledge, precise visual structures, and coherent transitions, limiting their applicability in educational scenarios. Intuitively, such requirements are better addressed through the manipulation of a renderable environment, which can be explicitly controlled via logical commands (e.g., code). In this work, we propose Code2Video, a code-c...

ID: 2510.01174v1 cs.CV, cs.AI, cs.CL, cs.HC, cs.MM

arXiv PDF

📄 The AI Productivity Index (APEX)

2025-10-03

Авторы:

Bertie Vidgen, Abby Fennelly, Evan Pinnix, Chirag Mahapatra, Zach Richards, Austin Bridges, Calix Huang, Ben Hunsberger, Fez Zafar, Brendan Foody, Dominic Barton, Cass R. Sunstein, Eric Topol, Osvald Nitski

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We introduce the first version of the AI Productivity Index (APEX), a benchmark for assessing whether frontier AI models can perform knowledge work with high economic value. APEX addresses one of the largest inefficiencies in AI research: outside of coding, benchmarks often fail to test economically relevant capabilities. APEX-v1.0 contains 200 test cases and covers four domains: investment banking, management consulting, law, and primary medical care. It was built in three steps. First, we sour...

ID: 2509.25721v2 econ.GN, cs.AI, cs.CL, cs.HC, q-fin.EC

arXiv PDF

📄 Believing without Seeing: Quality Scores for Contextualizing Vision-Language Model Explanations

2025-10-02

Авторы:

Keyu He, Tejas Srinivasan, Brihi Joshi, Xiang Ren, Jesse Thomason, Swabha Swayamdipta

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

When people query Vision-Language Models (VLMs) but cannot see the accompanying visual context (e.g. for blind and low-vision users), augmenting VLM predictions with natural language explanations can signal which model predictions are reliable. However, prior work has found that explanations can easily convince users that inaccurate VLM predictions are correct. To remedy undesirable overreliance on VLM predictions, we propose evaluating two complementary qualities of VLM-generated explanations v...

ID: 2509.25844v1 cs.CL, cs.HC

arXiv PDF

📄 Toxicity in Online Platforms and AI Systems: A Survey of Needs, Challenges, Mitigations, and Future Directions

2025-10-02

Авторы:

Smita Khapre, Melkamu Abay Mersha, Hassan Shakil, Jonali Baruah, Jugal Kalita

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

The evolution of digital communication systems and the designs of online platforms have inadvertently facilitated the subconscious propagation of toxic behavior. Giving rise to reactive responses to toxic behavior. Toxicity in online content and Artificial Intelligence Systems has become a serious challenge to individual and collective well-being around the world. It is more detrimental to society than we realize. Toxicity, expressed in language, image, and video, can be interpreted in various w...

ID: 2509.25539v1 cs.CY, cs.AI, cs.CL, cs.HC, cs.SI

arXiv PDF

📄 Causal Autoencoder-like Generation of Feedback Fuzzy Cognitive Maps with an LLM Agent

2025-10-02

Авторы:

Akash Kumar Panda, Olaoluwa Adigun, Bart Kosko

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

A large language model (LLM) can map a feedback causal fuzzy cognitive map (FCM) into text and then reconstruct the FCM from the text. This explainable AI system approximates an identity map from the FCM to itself and resembles the operation of an autoencoder (AE). Both the encoder and the decoder explain their decisions in contrast to black-box AEs. Humans can read and interpret the encoded text in contrast to the hidden variables and synaptic webs in AEs. The LLM agent approximates the identit...

ID: 2509.25593v1 cs.AI, cs.CL, cs.HC, cs.IR

arXiv PDF

📄 The AI Productivity Index (APEX)

2025-10-02

Авторы:

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

ID: 2509.25721v1 econ.GN, cs.AI, cs.CL, cs.HC, q-fin.EC

arXiv PDF

📄 LOGOS: LLM-driven End-to-End Grounded Theory Development and Schema Induction for Qualitative Research

2025-10-01

Авторы:

Xinyu Pi, Qisen Yang, Chuong Nguyen

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Grounded theory offers deep insights from qualitative data, but its reliance on expert-intensive manual coding presents a major scalability bottleneck. Current computational tools stop short of true automation, keeping researchers firmly in the loop. We introduce LOGOS, a novel, end-to-end framework that fully automates the grounded theory workflow, transforming raw text into a structured, hierarchical theory. LOGOS integrates LLM-driven coding, semantic clustering, graph reasoning, and a novel ...

ID: 2509.24294v1 cs.CL, cs.HC

arXiv PDF

Показано 31 - 40 из 73 записей