Versatile Deep Learning Forecasting Application with Metamorphic Quality Assurance
Accurate estimates of fresh produce (FP) yields and prices are crucial for fair bidding prices by retailers and informed asking prices by farmers, which in turn lead to the best prices for customers. To obtain accurate estimates, this thesis improves on the state-of-the-art deep learning (DL) models for forecasting FP yields and prices, both station-based and satellite-based, by proposing a new DL model structure. The scope of this work is forecasting FP yields and prices over a horizon of 5 weeks ahead. The proposed structure is an ensemble (DFNNGRU-ADGRU ENS) of an Attention Deep Feedforward Neural Network with Gated Recurrent Units (ADGRU) and a Deep Feedforward Neural Network with embedded GRU units (DFNNGRU). The station-based version of the ensemble is trained and tested using soil moisture and temperature parameters retrieved from land stations as input. This station-based ensemble outperforms the literature model, improving the AGM score by 24% for yield forecasting and by 37.5% for price forecasting. For the satellite-based model, an image preprocessing technique is needed that represents the satellite images with less data for efficiency. Therefore, a preprocessing approach based on averaging is proposed, implemented, and compared with the literature approach, which is based on histograms; the proposed approach improves performance by 20%. The proposed DFNNGRU-ADGRU ensemble is then tested against a well-performing ensemble of a Stacked AutoEncoder (SAE) and Convolutional Neural Networks with Long Short-Term Memory (CNNLSTM), and is found to outperform that literature model by 12.5%. In addition, interpolation techniques are used to estimate the vegetation index (VI) values that are missing because Landsat captures satellite images at a low frequency.
A comparative analysis is conducted to choose the most effective technique, which is found to be cubic spline interpolation. The effect of adding the VIs as input parameters on the forecasting performance of the DL model is assessed, and the most effective VIs are selected. The Normalized Difference Vegetation Index (NDVI) proves to be the most effective index for forecasting yield, improving the AGM score by 12.5%. After the best DL forecasting model is found, a novel transfer learning (TL) framework is proposed to improve that model's generalization to other FPs by using FP similarity, clustering, and TL techniques customized to the problem at hand. The similarity algorithms found in the literature are improved by considering the features of the time series rather than the absolute values of their points. The FPs are then clustered using hierarchical clustering with complete linkage, and the resulting dendrogram is used to find the similarity thresholds automatically rather than setting them arbitrarily. Transfer learning is then applied by freezing some layers of the proposed ensemble model and fine-tuning the rest, leading to a significant improvement in AGM compared to the best literature model. Finally, a forecasting application is implemented to make the proposed models accessible to end users through a friendly interface. To test the quality of the application's deployed code and models, metamorphic testing is applied to assess the effectiveness of the machine learning models, while machine learning is used to automatically detect the main metamorphic relations in the software code. The interplay between metamorphic testing and machine learning is investigated through the quality assurance of the forecasting application.
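The clustering step described above can be illustrated with a minimal, standard-library sketch of complete-linkage agglomerative clustering. It is not the thesis implementation. The distance matrix `DIST`, the function names, and in particular the "cut at the largest gap between merge heights" rule are assumptions chosen for illustration; the abstract does not state how the threshold is derived from the dendrogram.

```python
from itertools import combinations

# Hypothetical pairwise distances between five products (symmetric, zero diagonal).
DIST = [
    [0.0, 1.0, 1.5, 6.0, 6.5],
    [1.0, 0.0, 1.2, 5.8, 6.2],
    [1.5, 1.2, 0.0, 5.5, 6.1],
    [6.0, 5.8, 5.5, 0.0, 0.8],
    [6.5, 6.2, 6.1, 0.8, 0.0],
]


def complete_linkage(dist):
    """Return the dendrogram as a list of merges (height, cluster_a, cluster_b)."""
    clusters = [frozenset([i]) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        # Complete linkage: the distance between two clusters is the
        # largest distance between any pair of their members.
        height, a, b = min(
            (
                (max(dist[i][j] for i in a for j in b), a, b)
                for a, b in combinations(clusters, 2)
            ),
            key=lambda merge: merge[0],
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        merges.append((height, a, b))
    return merges


def cut_at_largest_gap(n_items, merges):
    """Assumed heuristic: cut the dendrogram inside the widest gap between
    consecutive merge heights, so no threshold is set by hand."""
    heights = [h for h, _, _ in merges]
    if len(heights) < 2:
        raise ValueError("need at least three items to find a gap")
    gaps = [heights[k + 1] - heights[k] for k in range(len(heights) - 1)]
    k = gaps.index(max(gaps))
    threshold = (heights[k] + heights[k + 1]) / 2
    clusters = {frozenset([i]) for i in range(n_items)}
    for height, a, b in merges:
        if height > threshold:
            break
        clusters -= {a, b}
        clusters.add(a | b)
    return threshold, sorted(sorted(c) for c in clusters)


if __name__ == "__main__":
    threshold, groups = cut_at_largest_gap(len(DIST), complete_linkage(DIST))
    print(threshold, groups)
```

On this toy matrix the widest gap separates the merges at heights 1.5 and 6.5, so the cut yields the groups `[0, 1, 2]` and `[3, 4]`. In practice the distances would come from the time-series similarity measure, and a library routine would replace the quadratic search used here.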
The datasets used to train and test the deep learning forecasting models, as well as the forecasting models themselves, are verified using metamorphic tests, and the metamorphic relations in the generalization code are automatically detected using Support Vector Machine (SVM) models. Testing revealed unmet requirements, which were then fixed to deliver a valid application with sound data, effective models, and valid generalization code.
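As a rough illustration of what a metamorphic test of a forecasting model can look like, the sketch below checks two relations on a toy forecaster: scaling every input should scale the forecast, and shifting every input should shift the forecast. The models and relations are hypothetical stand-ins, not the relations used in the thesis; a real DL forecaster would typically satisfy such relations only approximately, so the tolerance would need tuning.

```python
from typing import Callable, Sequence

Forecaster = Callable[[Sequence[float]], float]


def moving_average_forecast(series: Sequence[float], window: int = 3) -> float:
    """Hypothetical stand-in model: predict the mean of the last `window` points."""
    tail = list(series)[-window:]
    return sum(tail) / len(tail)


def capped_forecast(series: Sequence[float]) -> float:
    """Deliberately faulty variant: a hard-coded cap breaks the scaling relation."""
    return min(moving_average_forecast(series), 100.0)


def holds_scaling_relation(model: Forecaster, series: Sequence[float],
                           factor: float, tol: float = 1e-9) -> bool:
    """Relation: model(factor * x) should equal factor * model(x)."""
    expected = factor * model(series)
    follow_up = model([factor * x for x in series])
    return abs(follow_up - expected) <= tol * max(1.0, abs(expected))


def holds_shift_relation(model: Forecaster, series: Sequence[float],
                         offset: float, tol: float = 1e-9) -> bool:
    """Relation: model(x + offset) should equal model(x) + offset."""
    expected = model(series) + offset
    follow_up = model([x + offset for x in series])
    return abs(follow_up - expected) <= tol * max(1.0, abs(expected))


if __name__ == "__main__":
    history = [10.0, 20.0, 30.0]
    for name, model in [("moving average", moving_average_forecast),
                        ("capped", capped_forecast)]:
        print(name,
              holds_scaling_relation(model, history, factor=10.0),
              holds_shift_relation(model, history, offset=5.0))
```

No ground-truth forecast is needed here: each check compares the model's output on a source input with its output on a transformed input, which is what makes the approach useful when the correct answer is unknown. The capped model passes the shift check but fails the scaling check, exposing the hidden limit.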
Cite this version of the work
Islam Mahmoud (2023). Versatile Deep Learning Forecasting Application with Metamorphic Quality Assurance. UWSpace. http://hdl.handle.net/10012/19746