📊 Digest statistics

Total digests: 34022
Added today: 82

Last updated: today

📄 Introduction to Automated Negotiation

2025-11-15

Authors:

Dave de Jonge

Annotation:

This book is an introductory textbook targeted towards computer science students who are completely new to the topic of automated negotiation. It does not require any prerequisite knowledge, except for elementary mathematics and basic programming skills. This book comes with a simple toy-world negotiation framework implemented in Python that can be used by the readers to implement their own negotiation algorithms and perform experiments with them. This framework is small and simple enough tha...

ID: 2511.08659v1 cs.MA, cs.AI, cs.GT

arXiv PDF

📄 Good-for-MDP State Reduction for Stochastic LTL Planning

2025-11-15

Authors:

Christoph Weinhuber, Giuseppe De Giacomo, Yong Li, Sven Schewe, Qiyi Tang

Annotation:

We study stochastic planning problems in Markov Decision Processes (MDPs) with goals specified in Linear Temporal Logic (LTL). The state-of-the-art approach transforms LTL formulas into good-for-MDP (GFM) automata, which feature a restricted form of nondeterminism. These automata are then composed with the MDP, allowing the agent to resolve the nondeterminism during policy synthesis. A major factor affecting the scalability of this approach is the size of the generated automata. In this paper, w...

ID: 2511.09073v1 cs.FL, cs.AI, cs.GT

arXiv PDF

📄 LLMs as Strategic Agents: Beliefs, Best Response Behavior, and Emergent Heuristics

2025-10-16

Authors:

Enric Junque de Fortuny, Veronica Roberta Cappelli

Annotation:

Large Language Models (LLMs) are increasingly applied to domains that require reasoning about other agents' behavior, such as negotiation, policy design, and market simulation, yet existing research has mostly evaluated their adherence to equilibrium play or their exhibited depth of reasoning. Whether they display genuine strategic thinking, understood as the coherent formation of beliefs about other agents, evaluation of possible actions, and choice based on those beliefs, remains unexplored. W...

ID: 2510.10813v1 cs.AI, cs.GT

arXiv PDF

📄 People use fast, flat goal-directed simulation to reason about novel problems

2025-10-15

Authors:

Katherine M. Collins, Cedegao E. Zhang, Lionel Wong, Mauricio Barba da Costa, Graham Todd, Adrian Weller, Samuel J. Cheyette, Thomas L. Griffiths, Joshua B. Tenenbaum

Annotation:

Games have long been a microcosm for studying planning and reasoning in both natural and artificial intelligence, especially with a focus on expert-level or even super-human play. But real life also pushes human intelligence along a different frontier, requiring people to flexibly navigate decision-making problems that they have never thought about before. Here, we use novice gameplay to study how people make decisions and form judgments in new problem settings. We show that people are systemati...

ID: 2510.11503v1 q-bio.NC, cs.AI, cs.GT

arXiv PDF

📄 GTAlign: Game-Theoretic Alignment of LLM Assistants for Mutual Welfare

2025-10-14

Authors:

Siqi Zhu, David Zhang, Pedro Cisneros-Velarde, Jiaxuan You

Annotation:

Large Language Models (LLMs) have achieved remarkable progress in reasoning, yet sometimes produce responses that are suboptimal for users in tasks such as writing, information seeking, or providing practical guidance. Conventional alignment practices typically assume that maximizing model reward also maximizes user welfare, but this assumption frequently fails in practice: models may over-clarify or generate overly verbose reasoning when users prefer concise answers. Such behaviors resemble the...

ID: 2510.08872v1 cs.AI, cs.GT, cs.HC, cs.LG, cs.MA

arXiv PDF

📄 The Hidden Game Problem

2025-10-08

Authors:

Gon Buzaglo, Noah Golowich, Elad Hazan

Annotation:

This paper investigates a class of games with large strategy spaces, motivated by challenges in AI alignment and language games. We introduce the hidden game problem, where for each player, an unknown subset of strategies consistently yields higher rewards compared to the rest. The central question is whether efficient regret minimization algorithms can be designed to discover and exploit such hidden structures, leading to equilibrium in these subgames while maintaining rationality in general. W...

ID: 2510.03845v1 cs.AI, cs.GT, cs.LG, stat.ML

arXiv PDF

📄 Look-ahead Reasoning with a Learned Model in Imperfect Information Games

2025-10-08

Authors:

Ondřej Kubíček, Viliam Lisý

Annotation:

Test-time reasoning significantly enhances pre-trained AI agents' performance. However, it requires an explicit environment model, often unavailable or overly complex in real-world scenarios. While MuZero enables effective model learning for search in perfect information games, extending this paradigm to imperfect information games presents substantial challenges due to more nuanced look-ahead reasoning techniques and the large number of states relevant for individual decisions. This paper introduce...

ID: 2510.05048v1 cs.AI, cs.GT

arXiv PDF

📄 Beyond Majority Voting: LLM Aggregation by Leveraging Higher-Order Information

2025-10-04

Authors:

Rui Ai, Yuqi Pan, David Simchi-Levi, Milind Tambe, Haifeng Xu

Annotation:

With the rapid progress of multi-agent large language model (LLM) reasoning, how to effectively aggregate answers from multiple LLMs has emerged as a fundamental challenge. Standard majority voting treats all answers equally, failing to consider latent heterogeneity and correlation across models. In this work, we design two new aggregation algorithms called Optimal Weight (OW) and Inverse Surprising Popularity (ISP), leveraging both first-order and second-order information. Our theoretical analy...

ID: 2510.01499v1 cs.LG, cs.AI, cs.GT

arXiv PDF

📄 Incentive-Aligned Multi-Source LLM Summaries

2025-10-02

Authors:

Yanchen Jiang, Zhe Feng, Aranyak Mehta

Annotation:

Large language models (LLMs) are increasingly used in modern search and answer systems to synthesize multiple, sometimes conflicting, texts into a single response, yet current pipelines offer weak incentives for sources to be accurate and are vulnerable to adversarial content. We introduce Truthful Text Summarization (TTS), an incentive-aligned framework that improves factual robustness without ground-truth labels. TTS (i) decomposes a draft synthesis into atomic claims, (ii) elicits each source...

ID: 2509.25184v1 cs.CL, cs.AI, cs.GT

arXiv PDF

📄 SpinGPT: A Large-Language-Model Approach to Playing Poker Correctly

2025-09-30

Authors:

Narada Maugin, Tristan Cazenave

## Context
The research area is artificial intelligence (AI) in games, specifically poker. Games that demand strategy and intelligent decision-making often pose hard problems for AI. Poker is one such game, particularly in the Spin & Go format, a three-player variant in which players must solve difficult equilibrium problems under imperfect information. The traditional approach, Counterfactual Regret Minimization (CFR), becomes computationally expensive as the number of players grows. Moreover, in games with three or more players, following a Nash equilibrium no longer guarantees a non-losing outcome. These limitations underline the need for new approaches, especially in popular tournament formats, and motivate the study of new methods, including large language models (LLMs).

## Method
SpinGPT is trained in two stages. The first stage is **Supervised Fine-Tuning** on 320 thousand high-stakes expert decisions. The second stage is **Reinforcement Learning** on 270 thousand solver-generated decisions. This lets SpinGPT refine its play from two directions: (1) following the best established strategies and (2) accounting for the uncertainty of game situations. The two-stage pipeline is intended to let SpinGPT handle Spin & Go, where fast adaptation and accurate decisions under imperfect information are required.

## Results
SpinGPT matches the solver's actions in 78% of decisions (tolerant accuracy), which indicates that the model follows solver-recommended strategies closely. In heads-up play it also shows a winning result of +13.4 +/- 12.9 BB/100 (30 thousand hands, 95% confidence interval). These results suggest that SpinGPT handles Spin & Go decisions effectively and may motivate further research on AI in poker.
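
As a rough sanity check on the heads-up figure reported above, the sketch below unpacks what "+13.4 +/- 12.9 BB/100 over 30 thousand hands (95% confidence interval)" implies. It assumes a normal approximation (z = 1.96) and treats the sample as 300 independent 100-hand blocks. Neither assumption is stated in the summary, and the function names are illustrative, not taken from the paper.

```python
import math

Z_95 = 1.96  # assumed normal-approximation quantile for a 95% interval


def ci_bounds(mean, half_width):
    """Return (lower, upper) bounds of a symmetric confidence interval."""
    return mean - half_width, mean + half_width


def implied_std_error(half_width, z=Z_95):
    """Standard error of the mean implied by a CI half-width."""
    return half_width / z


def implied_block_std(half_width, n_hands, block=100, z=Z_95):
    """Per-block standard deviation implied by the CI.

    Assumes the n_hands / block blocks are independent, so that
    standard error = block_std / sqrt(number of blocks).
    """
    n_blocks = n_hands / block
    return implied_std_error(half_width, z) * math.sqrt(n_blocks)


# Figures quoted in the summary: +13.4 +/- 12.9 BB/100 over 30,000 hands.
low, high = ci_bounds(13.4, 12.9)
se = implied_std_error(12.9)
sd = implied_block_std(12.9, 30_000)

print(f"95% CI: [{low:.1f}, {high:.1f}] BB/100")
print(f"implied standard error: {se:.2f} BB/100")
print(f"implied std dev per 100-hand block: {sd:.0f} BB")
```

Under these assumptions the interval runs from about 0.5 to 26.3 BB/100, so it excludes zero only narrowly. The implied spread of roughly 114 BB per 100-hand block shows why tens of thousands of hands are needed to separate a real edge from variance.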
## Significance
SpinGPT could potentially be applied to many actively played formats, including tournament poker. Avoiding some of the limitations of CFR makes it a promising tool for advancing AI in poker. In particular, SpinGPT could be used in tournaments, where fast decisions under imperfect information are required. This may open new possibilities for strategic play and for methods that search for optimal decisions in other games with uncertainty.

## Conclusions
The SpinGPT study showed that large language models can...

Annotation:

The Counterfactual Regret Minimization (CFR) algorithm and its variants have enabled the development of pokerbots capable of beating the best human players in heads-up (1v1) cash games and competing with them in six-player formats. However, CFR's computational complexity rises exponentially with the number of players. Furthermore, in games with three or more players, following Nash equilibrium no longer guarantees a non-losing outcome. These limitations, along with others, significantly restrict...

ID: 2509.22387v1 cs.LG, cs.AI, cs.GT

arXiv PDF

Showing 11-20 of 30 entries