Owing to their flexibility and low operational cost, unmanned aerial vehicles (UAVs) are a promising means of collecting measurements for spectrum cartography (SC). The main goal is to optimize the trajectory of a UAV so that it seeks the most informative measurements in an environment with dynamic emitters. In this letter, we formulate a Markov Decision Process to find the optimal flight trajectory of a UAV that maximizes the accuracy of SC while minimizing energy consumption. However, because instantaneous feedback on the accuracy of SC is unavailable, existing methods cannot work efficiently with the resulting sparse feedback. To tackle this issue, a Proximal Policy Optimization (PPO)-based algorithm is proposed to approach the optimal navigation policy for the UAV by training at the base station with delayed interaction information. Moreover, a backtracking advantage function is constructed to cope with sparse feedback in real-world scenarios, which helps avoid convergence to locally optimal solutions. Extensive simulation results demonstrate that the proposed algorithm can significantly increase the accuracy of SC while reducing energy consumption.
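As a rough illustration of the sparse-feedback setting described above, the following minimal sketch shows how a standard PPO pipeline propagates a single delayed reward back over earlier steps. It uses generalized advantage estimation (GAE) and the standard clipped surrogate loss, not the backtracking advantage function of the letter, whose definition is not given in the abstract. The function names, the per-step energy penalty of -0.1, and the final accuracy reward of 1.0 are all hypothetical values chosen for illustration.

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Standard GAE for one episode (an assumption, not the letter's method).

    `values` must hold len(rewards) + 1 entries; the last one is the
    bootstrap value of the state reached after the final step.
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    # Walk backwards so that a late reward is credited to earlier steps.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        gae = delta + gamma * lam * gae
        advantages[t] = gae
    return advantages


def ppo_clip_loss(ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate loss for a single sample."""
    clipped_ratio = max(min(ratio, 1.0 + eps), 1.0 - eps)
    return -min(ratio * advantage, clipped_ratio * advantage)


# Hypothetical episode: a small energy cost at every step, and a single
# accuracy reward that only arrives at the end of the flight.
rewards = [-0.1, -0.1, -0.1, 1.0]
values = [0.0, 0.0, 0.0, 0.0, 0.0]  # untrained critic, assumed all zeros
advantages = gae_advantages(rewards, values, gamma=1.0, lam=1.0)
```

With no discounting, each step's advantage here is simply the sum of the rewards still to come, so the early steps receive credit for the delayed accuracy reward despite their immediate energy cost.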
               