The development of data-driven methods for modelling and optimisation of chemical process systems

  • Max Mowbray

Student thesis: PhD

Abstract

In this thesis, data-driven approaches to sequential decision-making problems within process systems engineering (PSE) are developed. Specifically, the use of model-free Reinforcement Learning (RL) is considered for process control, online optimisation, online production scheduling and supply chain management problems. Model-free RL methods are purely data-driven approaches to identifying an optimal control policy for an uncertain decision process. These policies are identified independently of assumptions on system dynamics and associated uncertainties.
Incentives for the use of RL, and the challenges in applying it to PSE, are presented in Chapter 2. These challenges include: improving the sample efficiency of policy identification; ensuring safety through satisfaction of operational constraints; and achieving robustness via risk-sensitive decision making. A framework for the use of RL is also proposed, to which all of the work in this thesis adheres. This framework relies on initial policy identification via simulation of an approximate, uncertain system model, followed by transfer of the policy to the real system for online decision making and policy improvement.
In Chapter 3, we explore how best to leverage existing process knowledge, expressed by process operators and control schemes in the form of process data, to aid offline policy learning. We propose a methodology that first extracts a parameterisation of this knowledge, in the form of a control policy, within an offline simulation model. This removes the requirement to tune the policy manually and improves learning efficiency. The policy parameterisation is then transferred to the real process to provide online control and to allow subsequent policy improvement. A case study considers a tracking problem in a linear, uncertain dynamical system, with the existing data provided by a proportional-integral-derivative (PID) controller.
In Chapter 4, an entirely data-driven methodology is proposed to ensure the probabilistic satisfaction of state constraints in online optimisation of uncertain, nonlinear process systems. The approach is benchmarked against a nonlinear model predictive control scheme on a lutein photo-production process, demonstrating improved constraint satisfaction and a 30% improvement in expected performance.
In Chapter 5, a zero-order optimisation approach to RL policy identification is proposed for online scheduling of an uncertain sequential production environment. The approach is able to robustly handle common restrictions on these problems. Additionally, the framework inherits the benefits of posing risk-sensitive formulations, such as optimising for the conditional value-at-risk (CVaR). The method is benchmarked against an online mixed integer linear programming formulation and is shown to be competitive, with a performance gap of at most 5%, while identifying online decisions orders of magnitude more efficiently.
Finally, in Chapter 6 we explore the application of a zero-order RL framework to an uncertain multi-echelon supply chain inventory management problem. We benchmark the approach against a popular first-order RL method, highlighting the relative sample efficiency of our method and demonstrating improved performance in the objective. A benchmark against mathematical programming is also provided, with the method demonstrating competitive performance in the objective while gaining the ability to incentivise worst-case performance by constraining the CVaR.
As described, the work explores the development of RL methodologies and tailors them specifically to PSE problems. Additionally, many of the open challenges associated with the use of RL are addressed. A concluding summary is provided in Chapter 7.
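As an illustration of the risk measure named in the abstract, the following is a minimal sketch of an empirical CVaR estimate from sampled outcomes. It is not taken from the thesis: the function name `empirical_cvar`, the convention that outcomes are costs (larger is worse), and the tail level `alpha` are all assumptions made for this example.

```python
import math


def empirical_cvar(costs, alpha=0.9):
    """Estimate CVaR at level alpha as the mean of the worst (1 - alpha)
    fraction of sampled costs. Assumes larger cost means worse outcome."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    ordered = sorted(costs)
    n = len(ordered)
    if n == 0:
        raise ValueError("at least one sample is required")
    # Number of tail samples; rounding guards against floating-point
    # noise such as (1 - 0.7) * 10 == 3.0000000000000004.
    k = max(1, math.ceil(round((1.0 - alpha) * n, 9)))
    tail = ordered[-k:]  # the k largest (worst) costs
    return sum(tail) / k


# Hypothetical sampled episode costs from a simulated policy rollout.
samples = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(empirical_cvar(samples, alpha=0.8))
```

With `alpha = 0` the estimate reduces to the ordinary sample mean, so the same routine covers both the risk-neutral and the risk-sensitive objective under these assumptions.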
Date of Award: 1 Aug 2023
Original language: English
Awarding Institution
  • The University of Manchester
Supervisors: Robin Smith & Dongda Zhang

Keywords

  • Fed-batch process control
  • Reinforcement Learning
  • Process systems engineering
  • Optimal control
  • Production scheduling
  • Stochastic optimisation
  • Supply chain management
