This paper proposes a low-level quadrotor control algorithm using neural networks with model-free reinforcement learning, then explores the algorithm's capabilities on quadrotor hover and tracking tasks. We provide a new point of view by examining the well-known policy gradient algorithm from reinforcement learning, then relaxing its requirements to improve training efficiency. Without requiring expert demonstrations, the improved algorithm is then applied to train a quadrotor controller with its output directly mapped to four actuators in a simulator, which is a technique used to control any linear or nonlinear system under unknown dynamic parameters and disturbances. We show two experimental tasks both in simulation and real-world quadrotors to verify our method and demonstrate performance: 1) hovering at a fixed position, and 2) tracking along a specific trajectory. The video of our experiments can be found at https://youtu.be/oEVcdiFPnMo.
- Policy gradient
- Reinforcement learning