Dynamic scheduling in large-scale manufacturing processing systems using multi-agent reinforcement learning

Abstract/Contents

Abstract
Scheduling plays an essential role in building smart manufacturing from multiple points of view, including social, economic, and environmental. Optimal scheduling, i.e., the allocation of jobs with different requirements to a manufacturing processing system so as to meet various objectives, has been studied for several decades. Despite this extensive research, however, scheduling methods for modern processing systems have not advanced significantly, nor have they been widely adopted by staff working on manufacturing production lines. Most traditional scheduling methods rely on statistical assumptions that cannot support operations in a dynamic, stochastic modern processing system. In addition, most proposed scheduling methods do not scale to real-world, large-scale processing systems. To address these limitations, we take a data-driven approach to dynamic scheduling, that is, scheduling in response to real-time events in large-scale modern manufacturing systems. We apply reinforcement learning (RL) to learn adaptive, scalable, and optimal dynamic scheduling policies, since RL can learn the underlying processing system's patterns and adaptively make allocation decisions based on real-time job and server measurements. Directly applying existing RL methods to the scheduling problem in such large-scale processing systems is impractical, however, because of the extremely high computational complexity of learning a good scheduling policy. This thesis presents a practical and systematic computational framework that integrates RL with existing expert knowledge at three levels: (1) System-level planning. The planning procedure characterizes the processing system by the nominal feasible region of the scheduling problem. (2) Algorithm-level design. The RL algorithm is designed as an index-policy-based, multi-agent RL method, which significantly reduces the complexity of the control-policy search. (3) Learning-level demonstration. During RL training, existing expert knowledge serves as demonstration to increase search efficiency and stabilize the learning process. We conduct experiments in both real factory scenarios and simulated environments to evaluate the framework's performance on processing system scheduling problems. The effectiveness of the proposed index-policy-based multi-agent reinforcement learning (MARL) method is evidenced by its superior performance over traditional dynamic scheduling methods, achieved with computational time complexity that is linear in the number of machines and job classes.
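To make the index-policy idea concrete, the following is a minimal, illustrative Python sketch, not the thesis's actual implementation: each machine acts as an independent agent that scores every waiting job class with a priority index and serves the highest-scoring class. A hand-crafted c-mu-style index stands in here for the learned index, and all names (JobClass, priority_index, dispatch) and numbers are hypothetical. Because each agent evaluates only one index per job class per decision epoch, the decision cost is linear in the number of machines and job classes, mirroring the complexity claim in the abstract.

    from dataclasses import dataclass

    @dataclass
    class JobClass:
        name: str
        holding_cost: float   # cost per unit time a job of this class waits
        queue_length: int     # jobs of this class currently waiting

    def priority_index(machine_rate: float, job: JobClass) -> float:
        """A c-mu-style index: holding cost times the machine's service rate.
        A learned index policy would replace this hand-crafted formula with
        the output of a trained index/value network."""
        return job.holding_cost * machine_rate * (job.queue_length > 0)

    def dispatch(machine_rates: dict[str, float], jobs: list[JobClass]) -> dict[str, str]:
        """Each idle machine independently picks the job class with the highest
        index -- O(machines x job classes) work per decision epoch."""
        assignment = {}
        for machine, rate in machine_rates.items():
            best = max(jobs, key=lambda j: priority_index(rate, j))
            if best.queue_length > 0:
                assignment[machine] = best.name
                best.queue_length -= 1   # job leaves the queue for service
        return assignment

    jobs = [JobClass("urgent", 5.0, 3), JobClass("standard", 1.0, 10)]
    print(dispatch({"M1": 0.8, "M2": 1.2}, jobs))

Ranking by a per-agent index replaces an exhaustive search over joint machine-job assignments, which is what makes the approach tractable at factory scale; in the thesis the index itself is learned by the MARL method rather than fixed as above.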

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2019
Publication date 2019
Issuance monographic
Language English

Creators/Contributors

Author Qu, Shuhui
Degree supervisor Leckie, James M.
Thesis advisor Leckie, James M.
Thesis advisor Law, K. H. (Kincho H.)
Thesis advisor Wang, Jie
Degree committee member Law, K. H. (Kincho H.)
Degree committee member Wang, Jie
Associated with Stanford University, Civil & Environmental Engineering Department.

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Shuhui Qu.
Note Submitted to the Department of Civil and Environmental Engineering.
Thesis Thesis (Ph.D.)--Stanford University, 2019.
Location electronic resource

Access conditions

Copyright
© 2019 by Shuhui Qu
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
