📊 Digest statistics

Total digests: 34022. Added today: 82.

Last updated: today

📄 LORE: A Large Generative Model for Search Relevance

2025-12-05

Authors:

Chenji Lu, Zhuo Chen, Hui Zhao, Zhiyuan Zeng, Gang Zhao, Junjie Ren, Ruicong Xu, Haoran Li, Songyan Liu, Pengjie Wang, Jian Xu, Bo Zheng

Annotation:

Achievement. We introduce LORE, a systematic framework for Large Generative Model-based relevance in e-commerce search. Deployed and iterated over three years, LORE achieves a cumulative +27% improvement in online GoodRate metrics. This report shares the valuable experience gained throughout its development lifecycle, spanning data, features, training, evaluation, and deployment. Insight. While existing works apply Chain-of-Thought (CoT) to enhance relevance, they often hit a performance ceilin...

ID: 2512.03025v2 cs.IR, cs.AI, cs.CL, cs.LG

📄 M3DR: Towards Universal Multilingual Multimodal Document Retrieval

2025-12-04

Authors:

Adithya S Kolavi, Vyoman Jain

Annotation:

Multimodal document retrieval systems have shown strong progress in aligning visual and textual content for semantic search. However, most existing approaches remain heavily English-centric, limiting their effectiveness in multilingual contexts. In this work, we present M3DR (Multilingual Multimodal Document Retrieval), a framework designed to bridge this gap across languages, enabling applicability across diverse linguistic and cultural contexts. M3DR leverages synthetic multilingual document d...

ID: 2512.03514v1 cs.IR, cs.AI, cs.CL, cs.CV

📄 LORE: A Large Generative Model for Search Relevance

2025-12-04

Authors:

Chenji Lu, Zhuo Chen, Hui Zhao, Zhiyuan Zeng, Gang Zhao, Junjie Ren, Ruicong Xu, Haoran Li, Songyan Liu, Pengjie Wang, Jian Xu, Bo Zheng

Annotation:

ID: 2512.03025v1 cs.IR, cs.AI, cs.CL, cs.LG

📄 What Drives Cross-lingual Ranking? Retrieval Approaches with Multilingual Language Models

2025-11-26

Authors:

Roksana Goworek, Olivia Macmillan-Scott, Eda B. Özyiğit

Annotation:

Cross-lingual information retrieval (CLIR) enables access to multilingual knowledge but remains challenging due to disparities in resources, scripts, and weak cross-lingual semantic alignment in embedding models. Existing pipelines often rely on translation and monolingual retrieval heuristics, which add computational overhead and noise, degrading performance. This work systematically evaluates four intervention types, namely document translation, multilingual dense retrieval with pretrained enc...

ID: 2511.19324v1 cs.IR, cs.AI, cs.CL

📄 Generative Query Expansion with Multilingual LLMs for Cross-Lingual Information Retrieval

2025-11-26

Authors:

Olivia Macmillan-Scott, Roksana Goworek, Eda B. Özyiğit

Annotation:

Query expansion is the reformulation of a user query by adding semantically related information, and is an essential component of monolingual and cross-lingual information retrieval used to ensure that relevant documents are not missed. Recently, multilingual large language models (mLLMs) have shifted query expansion from semantic augmentation with synonyms and related words to pseudo-document generation. Pseudo-documents both introduce additional relevant terms and bridge the gap between short ...

ID: 2511.19325v1 cs.IR, cs.AI, cs.CL

📄 The Oracle and The Prism: A Decoupled and Efficient Framework for Generative Recommendation Explanation

2025-11-21

Authors:

Jiaheng Zhang, Daqiang Zhang

Annotation:

The integration of Large Language Models (LLMs) into explainable recommendation systems often leads to a performance-efficiency trade-off in end-to-end architectures, where joint optimization of ranking and explanation can result in suboptimal compromises. To resolve this, we propose Prism, a novel decoupled framework that rigorously separates the recommendation process into a dedicated ranking stage and an explanation generation stage. Inspired by knowledge distillation, Prism leverages a pow...

ID: 2511.16543v1 cs.IR, cs.AI, cs.CL, cs.LG

📄 Exploring Multi-Table Retrieval Through Iterative Search

2025-11-19

Authors:

Allaa Boutaleb, Bernd Amann, Rafael Angarita, Hubert Naacke

Annotation:

Open-domain question answering over datalakes requires retrieving and composing information from multiple tables, a challenging subtask that demands semantic relevance and structural coherence (e.g., joinability). While exact optimization methods like Mixed-Integer Programming (MIP) can ensure coherence, their computational complexity is often prohibitive. Conversely, simpler greedy heuristics that optimize for query coverage alone often fail to find these coherent, joinable sets. This paper fra...

ID: 2511.13418v1 cs.IR, cs.AI, cs.CL, cs.DB, cs.LG

📄 PaperAsk: A Benchmark for Reliability Evaluation of LLMs in Paper Search and Reading

2025-10-29

Authors:

Yutao Wu, Xiao Liu, Yunhao Feng, Jiale Ding, Xingjun Ma

Annotation:

Large Language Models (LLMs) increasingly serve as research assistants, yet their reliability in scholarly tasks remains under-evaluated. In this work, we introduce PaperAsk, a benchmark that systematically evaluates LLMs across four key research tasks: citation retrieval, content extraction, paper discovery, and claim verification. We evaluate GPT-4o, GPT-5, and Gemini-2.5-Flash under realistic usage conditions, via web interfaces where search operations are opaque to the user. Through controlle...

ID: 2510.22242v1 cs.IR, cs.AI, cs.CL

📄 REVISION: Reflective Intent Mining and Online Reasoning Auxiliary for E-commerce Visual Search System Optimization

2025-10-29

Authors:

Yiwen Tang, Qiuyu Zhao, Zenghui Sun, Jinsong Lan, Xiaoyong Zhu, Bo Zheng, Kaifu Zhang

Annotation:

In Taobao e-commerce visual search, user behavior analysis reveals a large proportion of no-click requests, suggesting diverse and implicit user intents. These intents are expressed in various forms and are difficult to mine and discover, thereby leading to the limited adaptability and lag in platform strategies. This greatly restricts users' ability to express diverse intents and hinders the scalability of the visual search system. This mismatch between user implicit intent expression and syste...

ID: 2510.22739v1 cs.IR, cs.AI, cs.CL

📄 Pctx: Tokenizing Personalized Context for Generative Recommendation

2025-10-28

Authors:

Qiyong Zhong, Jiajie Su, Yunshan Ma, Julian McAuley, Yupeng Hou

Annotation:

Generative recommendation (GR) models tokenize each action into a few discrete tokens (called semantic IDs) and autoregressively generate the next tokens as predictions, showing advantages such as memory efficiency, scalability, and the potential to unify retrieval and ranking. Despite these benefits, existing tokenization methods are static and non-personalized. They typically derive semantic IDs solely from item features, assuming a universal item similarity that overlooks user-specific perspe...

ID: 2510.21276v1 cs.IR, cs.AI, cs.CL
