📊 Digest statistics

Total digests: 34022

Added today: 82

Last updated: today

📄 Large Language Models Require Curated Context for Reliable Political Fact-Checking -- Even with Reasoning and Web Search

2025-11-26

Authors:

Matthew R. DeVerna, Kai-Cheng Yang, Harry Yaojun Yan, Filippo Menczer

Abstract:

Large language models (LLMs) have raised hopes for automated end-to-end fact-checking, but prior studies report mixed results. As mainstream chatbots increasingly ship with reasoning capabilities and web search tools -- and millions of users already rely on them for verification -- rigorous evaluation is urgent. We evaluate 15 recent LLMs from OpenAI, Google, Meta, and DeepSeek on more than 6,000 claims fact-checked by PolitiFact, comparing standard models with reasoning- and web-search variants...

ID: 2511.18749v1

Categories: cs.CL, cs.CY, cs.IR

📄 Evaluation of LLMs for Process Model Analysis and Optimization

2025-10-11

Authors:

Akhil Kumar, Jianliang Leon Zhao, Om Dobariya

Abstract:

In this paper, we report our experience with several LLMs for their ability to understand a process model in an interactive, conversational style, find syntactical and logical errors in it, and reason with it in depth through a natural language (NL) interface. Our findings show that a vanilla, untrained LLM like ChatGPT (model o3) in a zero-shot setting is effective in understanding BPMN process models from images and answering queries about them intelligently at syntactic, logic, and semantic l...

ID: 2510.07489v1

Categories: cs.AI, cs.CL, cs.CY, cs.IR, cs.LG
