Reinforcement Learning with Reasoning for Long-horizon Robotic Tasks
Abstract
Recent developments in Unmanned Aerial Vehicles (UAVs) and Unmanned Ground Vehicles (UGVs) have attracted intensive research interest from both academia and industry. Although UAVs are usually deployed outdoors, there is increasing interest in applying them to indoor applications. Precisely localizing a UAV in an indoor environment, where Global Positioning System (GPS) service is absent, is an attractive and challenging task. To achieve high accuracy at low cost, a Radio-frequency Identification (RFID)-enhanced UAV system is proposed that provides a precise 6 degrees of freedom (6-DoF) pose for UAVs. Moreover, UGVs are good complements to UAVs, which has led to their wide use in various tasks. The communication of control commands between UAVs and UGVs is therefore crucial as well. Furthermore, both platforms are constrained by inherent limitations that make them incapable of completing complicated tasks in many scenarios. For example, the UGV cannot reach high altitudes, while the UAV is limited by its power supply and smaller payload capacity. In this dissertation, I present a deep reinforcement learning (DRL)-based network that generates an optimal strategy for a UGV and a UAV to form a complementary and cooperative coalition, so that together they can complete tasks that neither can achieve alone. I also discuss the challenges of, and solutions for, using DRL methods to solve such long-horizon robotic tasks. DRL methods usually suffer when the state and action spaces are very large, so the way observations are handled during training is essential. In the last section, a reasoning scheme that enables robots to better understand their tasks in the environment is investigated to improve the intelligence and robustness of the cooperation system.