Application of Deep Learning in Pharmaceutical Processes: Monitoring, Diagnosis and Modeling
Date
2024-09-24
Authors
Advisor
Budman, Hector
Publisher
University of Waterloo
Abstract
In recent years, artificial intelligence (AI) has emerged as a transformative technology in various fields, including pharmaceutical processes. The integration of AI in pharmaceutical manufacturing offers significant potential for enhancing process monitoring, fault detection, and predictive modeling. By leveraging advanced machine learning and deep learning algorithms, AI enables the development of detailed models that can handle the complexities and nonlinearities of biopharmaceutical processes. These models facilitate real-time monitoring, allowing for the early detection of process deviations and faults, which is crucial for maintaining product quality and efficiency. Additionally, AI-driven modeling approaches provide deeper insights into the dynamic behavior of biochemical pathways, offering valuable tools for optimizing production and ensuring regulatory compliance.
The current work investigates the application of machine learning and deep learning algorithms in chemical engineering processes, with a particular focus on biopharmaceutical processes. Due to tight regulations, pharmaceutical processes are operated within a narrow range of conditions, resulting in an insufficient amount of faulty data for training a supervised process monitoring model. Furthermore, critical information about faults, such as the type of fault, the time of fault occurrence, and the magnitude of the fault, is often unavailable in bio-processes due to the limited availability of informative online sensors. This lack of detailed fault data makes it impractical to train supervised models for fault detection and diagnosis, as these models require comprehensive and accurately labeled datasets to function effectively. Consequently, developing robust and reliable unsupervised models becomes essential for effective process monitoring and fault diagnosis in these scenarios. To this end, this work focuses on unsupervised learning algorithms that do not require labeled data and instead identify deviations from normal operating conditions.
Within the framework of unsupervised learning, the following aspects are tackled:
i- An unsupervised PLS-AE (Partial Least Squares-Autoencoder) model is proposed for online batch process monitoring. The model is trained with a novel loss function that focuses on maximizing the fault detection rate. This results in more accurate process monitoring models compared to those trained with the classical loss function, which minimizes the mean squared reconstruction error. It is shown that the PLS-AE model trained with the novel loss function and utilizing dynamic upper control limits can surpass other models, including PCA, PLS, and PLS-AE models trained with the classical loss function.
ii- Following the development of the PLS-AE approach, this thesis further develops unsupervised deep LSTM-AE (Long Short-Term Memory Autoencoder) models, which are more suitable for capturing dynamic nonlinear patterns. These models utilize the novel loss function and dynamic upper control limits, enhancing their fault detection capabilities. The effectiveness of these LSTM-AE models is demonstrated through rigorous testing on three case studies: the penicillin batch fermenter simulator, the real-world whooping cough vaccine manufacturing process at Sanofi Toronto, and the lab-scale diphtheria process at Sanofi Toronto.
iii- To address the interpretability of deep learning models, this work proposes two novel algorithms for generating contribution plots for $H^2$ and $SPE$ statistics, enabling the identification of the root causes of faults. This advancement significantly improves the interpretability and diagnostic capabilities of autoencoder-based process monitoring models, making them more practical and effective for real-world applications. Furthermore, because fault information is unavailable for the real-world Sanofi processes, this work proposes a new metric, based on the amount by which the process monitoring statistics violate their upper control limits, to assess the effectiveness of process monitoring models. The newly proposed metric can be utilized to detect low-productivity batches (faulty batches) before the end of the fermentation.
iv- To address overfitting, hybrid models are developed for process monitoring and modeling. For the process monitoring task, this work proposes a novel hybrid framework that combines deep LSTM-AE models with the exact equations of the process controllers. In this framework, manipulated variables are excluded from the input and output layers of the autoencoder. To preserve their information in the process monitoring model, they are reconstructed through the controller equations, which are fed by the reconstructed controlled variables. Removing manipulated variables from the input and output layers of the autoencoder results in a neural network model with fewer parameters to tune compared to non-hybrid autoencoders that include manipulated variables in the input and output layers. The proposed hybrid LSTM-AE models are tested on two case studies: the penicillin batch fermenter simulator and the Tennessee Eastman process under a decentralized control strategy. The proposed hybrid LSTM-AE models not only reduce the size of the model and the risk of overfitting but also achieve higher process monitoring efficacy by utilizing exact models of the controllers.
v- To further tackle the problem of overfitting, this work proposes a novel modeling approach named Metabolic Graph Neural Network (MGNN). This approach imposes a priori knowledge about the metabolic pathways of a microorganism onto the architecture of multi-layer neural network models to represent the dynamic behavior of the cell culture. To capture the nonlinear dynamic behavior of each metabolite, a sub-neural network is considered. The input layer of each sub-neural network model is designed based on the reactions in which the metabolite is involved. This results in a smaller neural network where not all layers and units are connected, thereby reducing the risk of overfitting. The proposed hybrid modeling framework is tested on the metabolic pathway of oxidative stress in Bordetella pertussis bacterial cells. By using the a priori known metabolic network, the proposed MGNN model effectively reduces overfitting compared to a fully connected network that does not use metabolic network knowledge. The MGNN exhibits a superior fit for both training and testing datasets. Additionally, the proposed MGNN is highly interpretable, as it efficiently computes the relevance of each metabolite to any other metabolite by applying gradient computation and back-propagation operations to the neural network. The model is also shown to be useful for fault detection.
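The idea in item v of restricting connections according to a known metabolic network can be illustrated with a masked layer. The following is only a minimal sketch under stated assumptions, not the MGNN architecture from the thesis: the three-metabolite network `S` is a hypothetical toy example, a single masked layer stands in for the per-metabolite sub-networks, and the rule "metabolite j may feed metabolite i if they share a reaction" is one possible reading of how the input layers are chosen.

```python
import numpy as np

# Hypothetical toy network: rows are metabolites, columns are reactions.
# A nonzero entry means the metabolite takes part in that reaction.
S = np.array([
    [-1,  0],   # A: consumed by reaction 0
    [ 1, -1],   # B: produced by reaction 0, consumed by reaction 1
    [ 0,  1],   # C: produced by reaction 1
])

# mask[i, j] is True when metabolites i and j share at least one reaction,
# so metabolite j is allowed as an input when predicting metabolite i.
participates = (S != 0).astype(int)
mask = (participates @ participates.T) > 0

def masked_layer(x, W, b, mask):
    """One layer in which each output only sees its permitted inputs."""
    return np.tanh((W * mask) @ x + b)

rng = np.random.default_rng(0)
W = rng.normal(size=mask.shape)   # illustrative random weights, not trained
b = np.zeros(mask.shape[0])

x = np.array([1.0, 0.5, 0.2])     # concentrations of A, B, C (made up)
y = masked_layer(x, W, b, mask)

# Changing C cannot affect the prediction for A, because A and C
# share no reaction in this toy network.
x_changed = x.copy()
x_changed[2] = 5.0
y_changed = masked_layer(x_changed, W, b, mask)

print("active weights:", int(mask.sum()), "of", mask.size)
print("prediction for A unchanged:", bool(np.isclose(y[0], y_changed[0])))
```

In this toy case the mask removes two of the nine weights; for a larger and sparser metabolic network the reduction in trainable parameters would be correspondingly larger, which is the mechanism the abstract credits for the reduced risk of overfitting.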
In summary, by tackling typical challenges of bio-process modelling, including lack of measurements, nonlinear process dynamics and overfitting, this work advances the application of deep learning algorithms for monitoring of industrial bio-processes.
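The monitoring scheme that recurs in items i to iii (a reconstruction-based $SPE$ statistic compared against a time-varying upper control limit, followed by a per-variable contribution analysis) can be sketched roughly as follows. This is an illustrative sketch on synthetic data, not the thesis's method: the per-time-step empirical quantile used for the control limit and the per-variable squared-error contributions are common textbook choices assumed here, and the novel loss function and contribution algorithms proposed in the thesis are not reproduced. All names and numbers are hypothetical.

```python
import numpy as np

def spe(x, x_hat):
    """Squared prediction error at each time step (sum over variables)."""
    return np.sum((x - x_hat) ** 2, axis=-1)

def dynamic_ucl(spe_normal, quantile=0.99):
    """Time-varying limit: a quantile of SPE over normal batches per time step."""
    return np.quantile(spe_normal, quantile, axis=0)

rng = np.random.default_rng(0)
n_batches, n_steps, n_vars = 200, 50, 4

# Synthetic stand-in for normal batches and a model's reconstruction of them.
x_normal = rng.normal(size=(n_batches, n_steps, n_vars))
x_hat_normal = x_normal + 0.1 * rng.normal(size=x_normal.shape)
ucl = dynamic_ucl(spe(x_normal, x_hat_normal))

# Test batch: the reconstruction is formed first, then variable 2 is shifted
# from step 30 onward, mimicking a fault the model cannot reproduce.
x_test = rng.normal(size=(n_steps, n_vars))
x_hat_test = x_test + 0.1 * rng.normal(size=x_test.shape)
x_test[30:, 2] += 3.0

spe_test = spe(x_test, x_hat_test)
alarms = spe_test > ucl

# Per-variable share of the SPE over the faulty window.
contributions = (x_test - x_hat_test) ** 2
suspect = int(np.argmax(contributions[30:].sum(axis=0)))

print("alarm raised at every step after the fault:", bool(alarms[30:].all()))
print("variable with the largest contribution:", suspect)
```

Because the limit is computed separately for each time step, it can follow the changing variability of a batch trajectory instead of applying one fixed threshold to the whole run.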
Keywords
deep learning, pharmaceutical processes, fault detection and diagnosis, batch process monitoring, hybrid models, metabolic networks, machine learning