This paper introduces a multi-objective, multi-agent framework for traffic light control, in which each agent is formulated as a multi-objective Markov decision process. For intelligent control, a reinforcement learning (RL) algorithm is enhanced with multiple-step backups and function approximation to build the agent's knowledge. A thresholded lexicographic ordering (TLO) action policy is integrated with the enhanced RL algorithm to solve the multi-objective control problem, which is reformulated as a constrained Markov decision process. A case study of three intersections, carried out in traffic simulation, demonstrates the approach with a conventional stage-based phasing strategy. The simulation experiments show the benefits of the MAMOD-TL system compared with optimized fixed-time controllers. More importantly, Pareto optimality is approximated by varying the control parameters of the TLO action policy, which can serve as a performance metric for decision makers.
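To make the TLO action policy concrete, the following is a minimal sketch of thresholded lexicographic action selection in its general form. It is not the paper's implementation: the function name `tlo_select`, the list-of-lists Q-value layout, and the numeric values are assumptions chosen for illustration. Objectives are taken to be ordered by priority, each thresholded objective is clamped at its threshold before comparison, and the last objective is left unthresholded.

```python
from typing import List, Sequence


def tlo_select(q_values: Sequence[Sequence[float]],
               thresholds: Sequence[float]) -> int:
    """Pick an action by thresholded lexicographic ordering (illustrative sketch).

    q_values[a][i] is the estimated value of action a on objective i,
    with objectives ordered from highest to lowest priority (assumed layout).
    thresholds[i] caps objective i; objectives without a threshold
    (typically the last one) are compared on their raw values.
    """
    n_objectives = len(q_values[0])
    candidates: List[int] = list(range(len(q_values)))

    for i in range(n_objectives):
        # Clamp the value at the threshold so that any action meeting the
        # threshold is treated as equally good on this objective.
        if i < len(thresholds):
            scores = {a: min(q_values[a][i], thresholds[i]) for a in candidates}
        else:
            scores = {a: q_values[a][i] for a in candidates}

        best = max(scores.values())
        # Keep only the actions that tie for the best clamped score,
        # then let the next objective break the tie.
        candidates = [a for a in candidates if scores[a] == best]
        if len(candidates) == 1:
            break

    # Remaining ties are broken by the lowest action index.
    return candidates[0]


# Hypothetical Q-values for three actions and two objectives.
q = [[5.0, 1.0], [3.0, 4.0], [2.0, 9.0]]
choices = [tlo_select(q, [t]) for t in (10.0, 3.0, 2.0)]
```

With these hypothetical values, lowering the threshold on the first objective shifts the choice toward actions that do better on the second objective. This is the sense in which varying the TLO control parameters can trace out different trade-offs between objectives.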