Using deep reinforcement learning networks to provide dynamic scheduling in workplace environments

Document Type: Original Article

Authors
1 Department of Computer Engineering, Faculty of Computer, Lian Institute of Higher Education, Bushehr, Iran
2 Department of Computer Engineering, Bu.C., Islamic Azad University, Bushehr, Iran
3 Department of Electrical Engineering, Faculty of Electrical Engineering, Lian Institute of Higher Education, Bushehr, Iran
Abstract
Dynamic scheduling in modern work environments, such as data centers and cloud computing systems, is a complex system-optimization challenge because incoming tasks are heterogeneous and time-varying. Traditional static scheduling methods such as FIFO and SJF cannot adapt to the real-time, dynamic conditions of the environment and therefore fail to optimize key performance metrics. This paper presents an asynchronous conditional policy factoring algorithm for dynamic scheduling. By factoring the policy, the algorithm can learn complex, coordinated scheduling policies, and asynchronous updates improve the convergence speed and efficiency of training. This approach lets the system cope effectively with environmental uncertainty and dynamics and allocate resources optimally. The experimental results show a clear advantage for the proposed algorithm on all evaluation criteria, including total task completion time (makespan) and average task waiting time: it reduced makespan by 5% and average waiting time by 13% compared to the best reference algorithm, QMIX.
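The paper's full algorithm is not reproduced in this abstract, but the core idea it describes — a scheduling policy learned by reinforcement learning rather than a fixed rule like FIFO — can be illustrated with a minimal, hedged sketch. The code below is an assumption-laden toy, not the authors' method: it trains a single scalar parameter of a load-aware softmax policy with a REINFORCE-style update to reduce makespan, and a real asynchronous implementation would run several such workers updating shared parameters concurrently (A3C-style). All names, the simulator, and the hyperparameters are illustrative.

```python
import numpy as np

N_MACHINES = 3  # illustrative cluster size

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def rollout(theta, durations, rng):
    """Assign each task to a machine with a load-aware softmax policy.

    Returns the resulting makespan and the gradient of the summed
    log-probabilities with respect to the scalar parameter theta.
    """
    loads = np.zeros(N_MACHINES)
    grad = 0.0
    for d in durations:
        feats = -loads                       # prefer lightly loaded machines
        feats = feats - feats.mean()         # normalize for stability
        feats = feats / (np.abs(feats).max() + 1e-8)
        probs = softmax(theta * feats)
        a = rng.choice(N_MACHINES, p=probs)
        # d/dtheta log probs[a] = feats[a] - E_probs[feats]
        grad += feats[a] - probs @ feats
        loads[a] += d
    return loads.max(), grad

def train(n_episodes=300, n_tasks=40, lr=0.005, seed=0):
    """REINFORCE on reward = -makespan with a moving-average baseline."""
    rng = np.random.default_rng(seed)
    theta, baseline = 0.0, None
    for _ in range(n_episodes):
        durations = rng.uniform(1.0, 10.0, size=n_tasks)
        makespan, grad = rollout(theta, durations, rng)
        baseline = makespan if baseline is None else 0.9 * baseline + 0.1 * makespan
        theta += lr * (baseline - makespan) * grad  # lower makespan is better
    return theta

def average_makespan(theta, n_episodes=50, n_tasks=40, seed=1):
    """Evaluate a fixed policy parameter on fresh random workloads."""
    rng = np.random.default_rng(seed)
    totals = []
    for _ in range(n_episodes):
        durations = rng.uniform(1.0, 10.0, size=n_tasks)
        makespan, _ = rollout(theta, durations, rng)
        totals.append(makespan)
    return float(np.mean(totals))
```

With theta = 0 the policy assigns tasks uniformly at random; as theta grows the policy approaches least-loaded scheduling, so a trained theta should yield a lower average makespan than the random baseline. The paper's actual method additionally factors the policy across coordinated components and updates them asynchronously, which this single-parameter sketch does not capture.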

Keywords