TY - JOUR
T1 - Q-Cogni: An Integrated Causal Reinforcement Learning Framework
T2 - IEEE Transactions on Artificial Intelligence
AU - Da Costa Cunha, Cristiano
AU - Liu, Wei
AU - French, Tim
AU - Mian, Ajmal
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024/12
Y1 - 2024/12
N2 - We present Q-Cogni, an algorithmically integrated causal reinforcement learning framework that redesigns Q-Learning to improve the learning process with causal inference. Q-Cogni achieves improved policy quality and learning efficiency through a pre-learned structural causal model of the environment, which is queried to guide the policy-learning process with an understanding of cause-and-effect relationships in the state-action space. By doing so, we not only leverage the sample-efficient techniques of reinforcement learning but also enable reasoning about a broader set of policies and bring a higher degree of interpretability to the decisions made by the reinforcement learning agent. We apply Q-Cogni to Vehicle Routing Problem (VRP) environments, including a real-world New York City taxi dataset drawn from the Taxi & Limousine Commission trip record data. We show Q-Cogni's capability to achieve a guaranteed-optimal policy (total trip distance) in 76% of cases when compared with shortest-path-search methods, and to outperform (with shorter distances) state-of-the-art reinforcement learning algorithms in 66% of cases. Additionally, since Q-Cogni does not require a complete global map, we show that it can begin routing efficiently with partial information and improve as more data, such as traffic disruptions and changes in destination, is collected, making it ideal for deployment in real-world, dynamic settings.
AB - We present Q-Cogni, an algorithmically integrated causal reinforcement learning framework that redesigns Q-Learning to improve the learning process with causal inference. Q-Cogni achieves improved policy quality and learning efficiency through a pre-learned structural causal model of the environment, which is queried to guide the policy-learning process with an understanding of cause-and-effect relationships in the state-action space. By doing so, we not only leverage the sample-efficient techniques of reinforcement learning but also enable reasoning about a broader set of policies and bring a higher degree of interpretability to the decisions made by the reinforcement learning agent. We apply Q-Cogni to Vehicle Routing Problem (VRP) environments, including a real-world New York City taxi dataset drawn from the Taxi & Limousine Commission trip record data. We show Q-Cogni's capability to achieve a guaranteed-optimal policy (total trip distance) in 76% of cases when compared with shortest-path-search methods, and to outperform (with shorter distances) state-of-the-art reinforcement learning algorithms in 66% of cases. Additionally, since Q-Cogni does not require a complete global map, we show that it can begin routing efficiently with partial information and improve as more data, such as traffic disruptions and changes in destination, is collected, making it ideal for deployment in real-world, dynamic settings.
KW - Artificial intelligence
KW - Reinforcement learning
KW - Vehicle routing
UR - http://www.scopus.com/inward/record.url?scp=85203449122&partnerID=8YFLogxK
U2 - 10.1109/TAI.2024.3453230
DO - 10.1109/TAI.2024.3453230
M3 - Article
AN - SCOPUS:85203449122
SN - 2691-4581
VL - 5
SP - 6186
EP - 6195
JO - IEEE Transactions on Artificial Intelligence
JF - IEEE Transactions on Artificial Intelligence
IS - 12
ER -