
Robustness Analysis and Enhancement of Deep Reinforcement Learning-Based Schedulers

Dependency-aware jobs, such as big data analytics workflows, are commonly executed in the cloud. They are compiled into directed acyclic graphs (DAGs), with tasks linked according to their dependencies. The cloud scheduler, which maintains a large pool of resources, is responsible for executing tasks in parallel. To resolve the complex dependencies, Deep Reinforcement Learning (DRL)-based schedulers are widely applied. However, we find that DRL-based schedulers are vulnerable to perturbations in the input jobs and may produce falsified decisions that benefit a particular job while delaying the others. By a perturbation, we mean a slight adjustment to a job's node features or dependencies that does not change its functionality. In this paper, we first explore the vulnerability of DRL-based schedulers to job perturbations without accessing any information about the DRL models used in the scheduler. We devise a black-box perturbation system in which a proxy model is trained to mimic the DRL-based scheduling policy. We show that a high-fidelity proxy model can help craft effective perturbations: DRL-based schedulers can be up to 60% likely to be adversely affected. We then investigate how to improve the robustness of DRL-based schedulers to such perturbations. We propose an adversarial training framework that forces the neural model to adapt to the perturbation patterns during training, eliminating the potential damage at deployment time. Experiments show that the adversarially trained scheduler is more robust: the chance of being affected is reduced threefold and the potential damage is halved.
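
The abstract describes two mechanisms: a black-box attack that trains a proxy policy to mimic the scheduler and then crafts bounded perturbations against it, and an adversarial training defense that mixes perturbed jobs into training. The sketch below illustrates both ideas in PyTorch under heavy assumptions: the paper's code is not reproduced here, so every name (ProxyPolicy, fit_proxy, perturb_features, adversarial_train_step) is hypothetical, and the graph-structured DAG state is flattened into a plain feature vector for brevity, whereas a faithful implementation would encode the job DAG, likely with a graph neural network.

```python
# Hypothetical illustration of the abstract's ideas; none of these names
# come from the paper. The real system operates on job DAGs; here the
# scheduler state is flattened to a feature vector for brevity.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ProxyPolicy(nn.Module):
    """Surrogate that imitates the black-box scheduler's action choices."""

    def __init__(self, feat_dim: int, num_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 64),
            nn.ReLU(),
            nn.Linear(64, num_actions),
        )

    def forward(self, node_feats: torch.Tensor) -> torch.Tensor:
        return self.net(node_feats)  # logits over scheduling actions


def fit_proxy(proxy, states, scheduler_actions, epochs=10, lr=1e-3):
    """Behavior cloning: fit the proxy on (state, action) pairs collected
    by observing the target scheduler, so no access to its internals is
    needed -- this is what makes the attack black-box."""
    opt = torch.optim.Adam(proxy.parameters(), lr=lr)
    for _ in range(epochs):
        loss = F.cross_entropy(proxy(states), scheduler_actions)
        opt.zero_grad()
        loss.backward()
        opt.step()


def perturb_features(model, node_feats, action, toward=True, eps=0.05, steps=10):
    """PGD-style loop. With toward=True (attack), descend the loss so the
    model prefers the attacker-chosen action; with toward=False (defense),
    ascend the loss on the true action to get a worst-case training input.
    The eps-ball stands in for the paper's 'slight, functionality-
    preserving adjustment' constraint."""
    direction = -1.0 if toward else 1.0
    x = node_feats.clone().detach().requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x), action)
        (grad,) = torch.autograd.grad(loss, x)
        with torch.no_grad():
            x += direction * (eps / steps) * grad.sign()
            # project back into the eps-ball around the clean features
            x.copy_(torch.max(torch.min(x, node_feats + eps), node_feats - eps))
    return x.detach()


def adversarial_train_step(scheduler_net, states, actions, opt):
    """Defense sketch: train on clean and worst-case perturbed inputs so
    the scheduler learns to make the same decision under perturbation."""
    adv_states = perturb_features(scheduler_net, states, actions, toward=False)
    for batch in (states, adv_states):
        loss = F.cross_entropy(scheduler_net(batch), actions)
        opt.zero_grad()
        loss.backward()
        opt.step()
```

In this sketch, an attacker would query the scheduler to collect (state, action) pairs, call fit_proxy, and then run perturb_features against the proxy with an attacker-preferred action; the crafted perturbation transfers to the real scheduler. The defense perturbs against the scheduler itself (white-box access is available at training time) and trains on both clean and perturbed batches, mirroring standard PGD-based adversarial training.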

Keywords: DRL-based schedulers; deep reinforcement learning

Journal Title: IEEE Transactions on Parallel and Distributed Systems
Year Published: 2023
