Application of Deep Learning in Chemical Processes: Explainability, Monitoring and Observability

Agarwal, Piyush

Application of Deep Learning in Chemical Processes: Explainability, Monitoring and Observability

Files

Agarwal_Piyush.pdf (6.63 MB)

Date

2022-01-04

Authors

Agarwal, Piyush

Advisor

Budman, Hector

Publisher

University of Waterloo

Abstract

The last decade has seen remarkable advances in speech, image, and language recognition tools that have been made available to the public through computer and mobile devices’ applications. Most of these significant improvements were achieved by Artificial Intelligence (AI)/ deep learning (DL) algorithms (Hinton et al., 2006) that generally refers to a set of novel neural network architectures and algorithms such as long-short term memory (LSTM) units, convolutional networks (CNN), autoencoders (AE), t-distributed stochastic embedding (TSNE), etc. Although neural networks are not new, due to a combination of relatively novel improvements in methods for training the networks and the availability of increasingly powerful computers, one can now model much more complex nonlinear dynamic behaviour by using complex structures of neurons, i.e. more layers of neurons, than ever before (Goodfellow et al., 2016). However, it is recognized that the training of neural nets of such complex structures requires a vast amount of data. In this sense manufacturing processes are good candidates for deep learning applications since they utilize computers and information systems for monitoring and control thus generating a massive amount of data. This is especially true in pharmaceutical companies such as Sanofi Pasteur, the industrial collaborator for the current study, where large data sets are routinely stored for monitoring and regulatory purposes. Although novel DL algorithms have been applied with great success in image analysis, speech recognition, and language translation, their applications to chemical processes and pharmaceutical processes, in particular, are scarce. The current work deals with the investigation of deep learning in process systems engineering for three main areas of application: (i) Developing a deep learning classification model for profit-based operating regions. (ii) Developing both supervised and unsupervised process monitoring algorithms. (iii) Observability Analysis It is recognized that most empirical or black-box models, including DL models, have good generalization capabilities but are difficult to interpret. For example, using these methods it is difficult to understand how a particular decision is made, which input variable/feature is greatly influencing the decision made by the DL models etc. This understanding is expected to shed light on why biased results can be obtained or why a wrong class is predicted with a higher probability in classification problems. Hence, a key goal of the current work is on deriving process insights from DL models. To this end, the work proposes both supervised and unsupervised learning approaches to identify regions of process inputs that result in corresponding regions, i.e. ranges of values, of process profit. Furthermore, it will be shown that the ability to better interpret the model by identifying inputs that are most informative can be used to reduce over-fitting. To this end, a neural network (NN) pruning algorithm is developed that provides important physical insights on the system regarding the inputs that have positive and negative effect on profit function and to detect significant changes in process phenomenon. It is shown that pruning of input variables significantly reduces the number of parameters to be estimated and improves the classification test accuracy for both case studies: the Tennessee Eastman Process (TEP) and an industrial vaccine manufacturing process. The ability to store a large amount of data has permitted the use of deep learning (DL) and optimization algorithms for the process industries. In order to meet high levels of product quality, efficiency, and reliability, a process monitoring system is needed. The two aspects of Statistical Process Control (SPC) are fault detection and diagnosis (FDD). Many multivariate statistical methods like PCA and PLS and their dynamic variants have been extensively used for FD. However, the inherent non-linearities in the process pose challenges while using these linear models. Numerous deep learning FDD approaches have also been developed in the literature. However, the contribution plots for identifying the root cause of the fault have not been derived from Deep Neural Networks (DNNs). To this end, the supervised fault detection problem in the current work is formulated as a binary classification problem while the supervised fault diagnosis problem is formulated as a multi-class classification problem to identify the type of fault. Then, the application of the concept of explainability of DNNs is explored with its particular application in FDD problem. The developed methodology is demonstrated on TEP with non-incipient faults. Incipient faults are faulty conditions where signal to noise ratio is small and have not been widely studied in the literature. To address the same, a hierarchical dynamic deep learning algorithm is developed specifically to address the issue of fault detection and diagnosis of incipient faults. One of the major drawbacks of both the methods described above is the availability of labeled data i.e. normal operation and faulty operation data. From an industrial point of view, most data in an industrial setting, especially for biochemical processes, is obtained during normal operation and faulty data may not be available or may be insufficient. Hence, we also develop an unsupervised DL approach for process monitoring. It involves a novel objective function and a NN architecture that is tailored to detect the faults effectively. The idea is to learn the distribution of normal operation data to differentiate among the fault conditions. In order to demonstrate the advantages of the proposed methodology for fault detection, systematic comparisons are conducted with Multiway Principal Component Analysis (MPCA) and Multiway Partial Least Squares (MPLS) on an industrial scale Penicillin Simulator. Past investigations reported that the variability in productivity in the Sanofi's Pertussis Vaccine Manufacturing process may be highly correlated to biological phenomena, i.e. oxidative stresses, that are not routinely monitored by the company. While the company monitors and stores a large amount of fermentation data it may not be sufficiently informative about the underlying phenomena affecting the level of productivity. Furthermore, since the addition of new sensors in pharmaceutical processes requires extensive and expensive validation and certification procedures, it is very important to assess the potential ability of a sensor to observe relevant phenomena before its actual adoption in the manufacturing environment. This motivates the study of the observability of the phenomena from available data. An algorithm is proposed to check the observability for the classification task from the observed data (measurements). The proposed methodology makes use of a Supervised AE to reduce the dimensionality of the inputs. Thereafter, a criterion on the distance between the samples is used to calculate the percentage of overlap between the defined classes. The proposed algorithm is tested on the benchmark Tennessee Eastman process and then applied to the industrial vaccine manufacturing process.