CS234: Reinforcement Learning Winter 2023 Course Project
In this study, reinforcement learning (RL) is applied to develop a cooperative adaptive cruise controller for a truck platooning application. Two actor-critic, model-free, online, off-policy methods, Deep Deterministic Policy Gradient (DDPG) and Twin-Delayed Deep Deterministic Policy Gradient (TD3), are applied and compared in developing the control agent. The results indicate that TD3 learns better and performs better than DDPG. The trained agent met the performance requirements robustly, even on a high-fidelity model and in a real-world test scenario. Engineering the reward function, defining the states, and setting the neural network architectures for the actor and critic were key steps in obtaining an agent that meets the requirements. This study was done as a course project for Stanford University CS234: Reinforcement Learning, Winter 2023. MATLAB and Simulink (R2020a) were used for simulation.
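To make the DDPG/TD3 comparison concrete, the following is a minimal Python sketch of how the two methods form the critic's bootstrap target. It is an illustration of the general algorithms, not the project's MATLAB/Simulink implementation. The function names and the default values for `gamma`, `noise_std`, `noise_clip`, and the action bounds are assumptions chosen for illustration.

```python
import random


def ddpg_target(reward, done, q_next, gamma=0.99):
    """DDPG target: r + gamma * Q'(s', mu'(s')), using a single target critic."""
    return reward + gamma * (1.0 - done) * q_next


def smoothed_target_action(action, noise_std=0.2, noise_clip=0.5,
                           act_low=-1.0, act_high=1.0, rng=random):
    """TD3 target policy smoothing: add clipped Gaussian noise to the
    target actor's action, then clip to the (assumed) action bounds."""
    noise = rng.gauss(0.0, noise_std)
    noise = max(-noise_clip, min(noise_clip, noise))
    return max(act_low, min(act_high, action + noise))


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """TD3 target: take the minimum of two target critics (clipped double
    Q-learning) to reduce the overestimation bias of a single critic."""
    return reward + gamma * (1.0 - done) * min(q1_next, q2_next)


if __name__ == "__main__":
    # Hypothetical numbers: two target critics disagree about the next state.
    print(ddpg_target(1.0, 0.0, q_next=10.0, gamma=0.9))
    print(td3_target(1.0, 0.0, q1_next=10.0, q2_next=8.0, gamma=0.9))
```

For the same transition, the TD3 target is never larger than a DDPG target built from either critic alone. TD3 additionally updates the actor less often than the critics (the "delayed" part of its name), which this sketch does not show.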