
Deep deterministic policy gradient: applications in process control and integrated process design and control

Date

2022-06-20

Authors

Mendiola Rodriguez, Tannia Argelia

Publisher

University of Waterloo

Abstract

In recent years, the urgent need to develop sustainable processes to counteract the negative effects of climate change has gained global attention and has driven the transition to renewable energy. Since renewable sources present complex dynamic behavior, this has motivated the search for new ways to simulate and optimize processes more efficiently. One emerging area that has recently been explored is reinforcement learning (RL), which has shown promising results in different chemical engineering applications. Although recent studies on RL in chemical engineering have covered areas such as process design, scheduling, and dynamic optimization, these applications need to be explored further to determine their technical feasibility and potential for implementation in the chemical and manufacturing sectors. An emerging area of opportunity is biological systems, such as anaerobic digestion (AD) systems. These systems are not only able to reduce the waste content of wastewater, but they can also produce biogas, an attractive source of renewable energy. The aim of this work is to test the feasibility of applying an RL algorithm referred to as Deep Deterministic Policy Gradient (DDPG) to two typical areas of process operations in chemical engineering, i.e., process control, and integrated process design and control. Parametric uncertainty and disturbances are considered in both approaches. The motivation for using this algorithm is its ability to account for stochastic features, which can be interpreted as the plant-model mismatch needed to represent realistic process operations.

In the first part of this work, the DDPG algorithm is used to search for open-loop control actions that optimize an AD system treating Tequila vinasses under the effects of parametric uncertainty and disturbances. To provide further insight, two different AD configurations (i.e., a single-stage and a two-stage system) are considered and compared under different scenarios. The results showed that the proposed methodology was able to learn an optimal policy, i.e., the control actions that minimize the organic content of the Tequila vinasses in the effluent while producing biogas. However, further improvements, e.g., reducing the computational costs, are necessary before this DDPG-based methodology can be implemented in online, large-scale applications.
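For readers unfamiliar with the algorithm, the following is a minimal, generic sketch of the DDPG actor-critic update used in this kind of application, written in Python with PyTorch. The network sizes, hyperparameters, and the interpretation of the state and action vectors are assumptions made purely for illustration; the thesis does not publish its implementation.

```python
# Minimal, generic DDPG update sketch (PyTorch assumed). Network sizes, the
# soft-update rate tau, and the meaning of the state/action vectors are
# illustrative assumptions only, not the thesis's actual implementation.
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Deterministic policy: maps the digester state to a bounded control action."""
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, action_dim), nn.Tanh(),  # actions scaled to [-1, 1]
        )

    def forward(self, state):
        return self.net(state)

class Critic(nn.Module):
    """Action-value function Q(s, a): expected return of taking action a in state s."""
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

def ddpg_update(actor, critic, target_actor, target_critic,
                actor_opt, critic_opt, batch, gamma=0.99, tau=0.005):
    """One DDPG step on a replay-buffer batch of (s, a, r, s', done) tensors,
    each with a leading batch dimension (rewards and done flags shaped (B, 1))."""
    s, a, r, s2, done = batch

    # Critic: regress Q(s, a) toward the bootstrapped target built from the
    # slowly-updated target networks.
    with torch.no_grad():
        q_target = r + gamma * (1.0 - done) * target_critic(s2, target_actor(s2))
    critic_loss = nn.functional.mse_loss(critic(s, a), q_target)
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Actor: deterministic policy gradient pushes actions toward higher Q-values.
    actor_loss = -critic(s, actor(s)).mean()
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()

    # Polyak (soft) update of the target networks.
    for net, target in ((actor, target_actor), (critic, target_critic)):
        for p, p_t in zip(net.parameters(), target.parameters()):
            p_t.data.mul_(1.0 - tau).add_(tau * p.data)
```

In practice the target networks start as copies of the online networks, and exploration noise (e.g., Ornstein-Uhlenbeck or Gaussian) is added to the actor's output when interacting with the process model, which is how stochastic disturbances and parametric uncertainty enter the training loop.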
The second part of this study focuses on the development of a methodology to address the integration of process design and control for AD systems. The objective is to optimize an economic function with the aim of finding an optimal design while taking into account the controllability of the process. Key aspects of this methodology are the consideration of stochastic disturbances and the ability to combine time-dependent and time-independent actions in the DDPG. The same two reactor configurations considered in the optimal control study were explored and compared with this approach. To account for constraints, a penalty function was included in the formulation of the economic function. The results showed that each AD system has its own advantages and limitations. The two-stage system required a larger investment in capital costs in exchange for the higher amounts of biogas produced by this design. On the other hand, the single-stage AD system required less investment in capital costs in exchange for producing less biogas and, therefore, lower profits than the two-stage system. Overall, the DDPG was able to learn new control paths and optimal designs simultaneously, making it an attractive method to address the integrated design and control of chemical systems subject to stochastic disturbances and parametric uncertainty.
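To make the penalty-function idea concrete, the following is a minimal sketch of a penalty-augmented economic reward that combines a time-independent design decision (reactor volume) with quantities produced by the time-dependent control actions. The function name, cost coefficients, and constraint limit are hypothetical placeholders, not values from the thesis.

```python
# Sketch of a penalty-augmented economic reward for the integrated
# design-and-control case. All names, cost coefficients, and limits below are
# hypothetical placeholders chosen for illustration, not values from the thesis.

def economic_reward(biogas_flow, effluent_cod, reactor_volume,
                    biogas_price=0.5, capital_cost_per_m3=100.0,
                    cod_limit=2.0, penalty_weight=50.0):
    """Reward = operating revenue - allocated capital cost - constraint penalty.

    reactor_volume is a time-independent (design) action chosen once per episode,
    while biogas_flow and effluent_cod follow from the time-dependent control
    actions applied along the trajectory.
    """
    revenue = biogas_price * biogas_flow                     # income from biogas
    capital = capital_cost_per_m3 * reactor_volume / 8760.0  # capital cost allocated per hour
    violation = max(0.0, effluent_cod - cod_limit)           # effluent-quality constraint
    penalty = penalty_weight * violation ** 2                # quadratic penalty on violations
    return revenue - capital - penalty
```

With a reward of this form, a single DDPG agent can treat the design variable as an action taken at the start of each episode and the control inputs as actions taken at every subsequent time step; a quadratic penalty is only one common way of discouraging constraint violations.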

Keywords

reinforcement learning, anaerobic digestion, machine learning, deep deterministic policy gradient, Tequila vinasses, robust control, process design and control
