How Do AI Agents Do Human Work? Comparing AI and Human Workflows Across Diverse Occupations

arXiv:2510.22780v1 [cs.AI, cs.CL, cs.HC], 2025-10-29

Authors:

Zora Zhiruo Wang, Yijia Shao, Omar Shaikh, Daniel Fried, Graham Neubig, Diyi Yang

Abstract

AI agents are continually optimized for tasks related to human work, such as software engineering and professional writing, signaling a pressing trend with significant impacts on the human workforce. However, these agent developments have often not been grounded in a clear understanding of how humans execute work, to reveal what expertise agents possess and the roles they can play in diverse workflows. In this work, we study how agents do human work by presenting the first direct comparison of human and agent workers across multiple essential work-related skills: data analysis, engineering, computation, writing, and design. To better understand and compare heterogeneous computer-use activities of workers, we introduce a scalable toolkit to induce interpretable, structured workflows from either human or agent computer-use activities. Using such induced workflows, we compare how humans and agents perform the same tasks and find that: (1) While agents exhibit promise in their alignment to human workflows, they take an overwhelmingly programmatic approach across all work domains, even for open-ended, visually dependent tasks like design, creating a contrast with the UI-centric methods typically used by humans. (2) Agents produce work of inferior quality, yet often mask their deficiencies via data fabrication and misuse of advanced tools. (3) Nonetheless, agents deliver results 88.3% faster and cost 90.4-96.2% less than humans, highlighting the potential for enabling efficient collaboration by delegating easily programmable tasks to agents.

Related articles

The AI Consumer Index (ACE)

Through the Judge's Eyes: Inferred Thinking Traces Improve Reliability of LLM Ra...

Planning Ahead with RSA: Efficient Signalling in Dynamic Environments by Project...

Everyone prefers human writers, including AI

See, Think, Act: Teaching Multimodal Agents to Effectively Interact with GUI by ...
