Reinforcement Learning based Fragment-Aware Scheduling for High Utilization HPC Platforms

Lung Pin Chen, I. Chen Wu, Yen Ling Chang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

Because of its high capacity and complex scheduling activity, an HPC platform often produces resource fragments with low usability. This paper develops a novel fragment-aware scheduling approach that improves system utilization by dynamically fitting elastic, lightweight tasks into these resource fragments. The approach uses a threshold to balance the length of tasks against the granularity of the resource fragments. We use Proximal Policy Optimization (PPO), a reinforcement learning method, to train a neural network that computes this threshold precisely. Because the threshold adapts to changing system states, the PPO-based scheduler can make use of idle resources while maximizing the execution success rate of the tasks.

Original language: English
Title of host publication: Proceedings - 2019 International Conference on Technologies and Applications of Artificial Intelligence, TAAI 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728146669
State: Published - Nov 2019
Event: 24th International Conference on Technologies and Applications of Artificial Intelligence, TAAI 2019 - Kaohsiung, Taiwan
Duration: 21 Nov 2019 to 23 Nov 2019

Publication series

Name: Proceedings - 2019 International Conference on Technologies and Applications of Artificial Intelligence, TAAI 2019

Conference

Conference: 24th International Conference on Technologies and Applications of Artificial Intelligence, TAAI 2019
Country: Taiwan
City: Kaohsiung
Period: 21/11/19 to 23/11/19

Keywords

  • High-performance computing
  • malleable task
  • reinforcement learning
  • scheduling
