Non-myopic Matching and Rebalancing in Large-Scale On-Demand Ride-Pooling Systems Using Simulation-Informed Reinforcement Learning
2510.25796v1
cs.LG, cs.AI, cs.CY
2025-11-01
Авторы:
Farnoosh Namdarpour, Joseph Y. J. Chow
Abstract
Ride-pooling, also known as ride-sharing, shared ride-hailing, or
microtransit, is a service wherein passengers share rides. This service can
reduce costs for both passengers and operators and reduce congestion and
environmental impacts. A key limitation, however, is its myopic
decision-making, which overlooks long-term effects of dispatch decisions. To
address this, we propose a simulation-informed reinforcement learning (RL)
approach. While RL has been widely studied in the context of ride-hailing
systems, its application in ride-pooling systems has been less explored. In
this study, we extend the learning and planning framework of Xu et al. (2018)
from ride-hailing to ride-pooling by embedding a ride-pooling simulation within
the learning mechanism to enable non-myopic decision-making. In addition, we
propose a complementary policy for rebalancing idle vehicles. By employing
n-step temporal difference learning on simulated experiences, we derive
spatiotemporal state values and subsequently evaluate the effectiveness of the
non-myopic policy using NYC taxi request data. Results demonstrate that the
non-myopic policy for matching can increase the service rate by up to 8.4%
versus a myopic policy while reducing both in-vehicle and wait times for
passengers. Furthermore, the proposed non-myopic policy can decrease fleet size
by over 25% compared to a myopic policy, while maintaining the same level of
performance, thereby offering significant cost savings for operators.
Incorporating rebalancing operations into the proposed framework cuts wait time
by up to 27.3%, in-vehicle time by 12.5%, and raises service rate by 15.1%
compared to using the framework for matching decisions alone at the cost of
increased vehicle minutes traveled per passenger.
Ссылки и действия
Дополнительные ресурсы: