research-article

A two-stage RNN-based deep reinforcement learning approach for solving the parallel machine scheduling problem with due dates and family setups

Authors:

Sebastian Lang,

Tobias Reggelin

Journal of Intelligent Manufacturing, Volume 35, Issue 3

Pages 1107 - 1140

https://doi.org/10.1007/s10845-023-02094-4

Published: 09 March 2023

Abstract

As an essential scheduling problem with several practical applications, the parallel machine scheduling problem (PMSP) with family setup constraints is difficult to solve and has been proven NP-hard. We therefore present a deep reinforcement learning (DRL) approach for a PMSP with family setups that aims to minimize total tardiness. The PMSP is first modeled as a Markov decision process, for which we design a novel variable-length representation of states and actions, so that the DRL agent can calculate a comprehensive priority for each job at each decision time point and then select the next job directly according to these priorities. The variable-length state matrix and action vector also enable the trained agent to solve instances of any scale. To handle the variable-length sequence while ensuring that each calculated priority is a global priority among all jobs, we employ a recurrent neural network, specifically a gated recurrent unit (GRU), to approximate the agent's policy. The agent is trained with the Proximal Policy Optimization (PPO) algorithm. Moreover, we develop a two-stage training strategy to improve training efficiency. In the numerical experiments, we first train the agent on a given instance and then use it to solve much larger instances. The experimental results demonstrate the strong generalization capability of the trained agent, and a comparison with three dispatching rules and two metaheuristics further validates its superiority.
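The decision loop outlined in the abstract (build a variable-length state with one row per waiting job, score every job, dispatch the highest-priority one, accumulate tardiness) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Job` fields, the three state features, the rule that a machine's first job always incurs a setup, the assumption that all jobs are available at time 0, and the hand-written `least_slack_priority` function (a stand-in for the trained GRU policy) are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Job:
    family: int             # setup family of the job
    processing_time: float
    due_date: float


# Maps a variable-length state (one feature row per waiting job) to one
# priority per job. In the paper a trained GRU policy plays this role;
# here it is only a hand-written placeholder.
PriorityFn = Callable[[List[List[float]]], List[float]]


def build_state(jobs: List[Job], now: float,
                machine_family: Optional[int],
                setup_time: float) -> List[List[float]]:
    """One row per waiting job, so the state grows and shrinks with the queue."""
    rows = []
    for job in jobs:
        # Assumption: a setup is needed whenever the family changes,
        # including for the first job on an idle machine.
        setup = 0.0 if job.family == machine_family else setup_time
        slack = job.due_date - now - setup - job.processing_time
        rows.append([job.processing_time, setup, slack])
    return rows


def least_slack_priority(state: List[List[float]]) -> List[float]:
    """Hypothetical stand-in policy: the smaller the slack, the higher the priority."""
    return [-row[2] for row in state]


def simulate(jobs: List[Job], n_machines: int, setup_time: float,
             priority_fn: PriorityFn) -> float:
    """Dispatch jobs on identical parallel machines and return total tardiness."""
    waiting = list(jobs)
    free_at = [0.0] * n_machines
    family_on: List[Optional[int]] = [None] * n_machines
    total_tardiness = 0.0
    while waiting:
        # Decision point: the machine that becomes idle first asks for a job.
        m = min(range(n_machines), key=lambda i: free_at[i])
        now = free_at[m]
        scores = priority_fn(build_state(waiting, now, family_on[m], setup_time))
        k = max(range(len(waiting)), key=lambda i: scores[i])
        job = waiting.pop(k)
        setup = 0.0 if job.family == family_on[m] else setup_time
        finish = now + setup + job.processing_time
        total_tardiness += max(0.0, finish - job.due_date)
        free_at[m] = finish
        family_on[m] = job.family
    return total_tardiness


toy_jobs = [Job(0, 3.0, 4.0), Job(1, 2.0, 3.0), Job(0, 1.0, 10.0)]
print(simulate(toy_jobs, n_machines=1, setup_time=1.0,
               priority_fn=least_slack_priority))  # prints 4.0
```

Because `priority_fn` only ever sees a list of rows, the same function works for three jobs or three thousand, which is the property the abstract attributes to the variable-length state matrix and action vector. Swapping in a learned sequence model would only require replacing `least_slack_priority`.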


Published In

Journal of Intelligent Manufacturing Volume 35, Issue 3

Mar 2024

458 pages

© The Author(s) 2023.

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 09 March 2023

Accepted: 09 February 2023

Received: 11 April 2022
