Evaluation of Machine Learning Algorithms for Lake Ice Classification from Optical Remote Sensing Data

Wu, Yuhao

Files

Wu_Yuhao.pdf (3.31 MB)

Date

2020-06-02

Authors

Wu, Yuhao

Advisor

Duguay, Claude

Publisher

University of Waterloo

Abstract

The topic of lake ice cover mapping from satellite remote sensing data has gained interest in recent years because it allows the extent of lake ice and the dynamics of ice phenology to be monitored over large areas. Mapping lake ice extent can record the loss of perennial ice cover on lakes in the High Arctic. Moreover, ice phenology dates retrieved from lake ice maps are useful for assessing long-term trends and variability in climate, particularly because of their sensitivity to changes in near-surface air temperature. However, existing knowledge-driven (threshold-based) retrieval algorithms for lake ice-water classification that use top-of-atmosphere (TOA) reflectance products do not perform well at large solar zenith angles, which result in low TOA reflectance. Machine learning (ML) techniques have received considerable attention in the remote sensing field over the past several decades, but they have not yet been applied to lake ice classification from optical remote sensing imagery. Therefore, this research evaluated the capability of ML classifiers to enhance lake ice mapping using multispectral optical remote sensing data (MODIS L1B (TOA) product). Chapter 3, the main manuscript of this thesis, presents an investigation of four ML classifiers (i.e., multinomial logistic regression, MLR; support vector machine, SVM; random forest, RF; gradient boosting trees, GBT) for lake ice classification. Results are reported for 17 lakes in the Northern Hemisphere that differ in area, altitude, freezing frequency, and ice cover duration. According to the overall accuracy assessment using random k-fold cross-validation (k = 100), all ML classifiers produced classification accuracies above 94%, and RF and GBT exceeded 98%.
Moreover, the RF and GBT algorithms provided a more visually accurate depiction of lake ice cover under challenging conditions (i.e., high solar zenith angles, black ice, and thin cloud cover). The two tree-based classifiers showed the most robust spatial transferability over the 17 lakes and performed consistently well across three ice seasons, better than the other classifiers. In addition, RF was less sensitive to the choice of hyperparameters than the other three classifiers. The results demonstrate that RF and GBT have great potential to accurately map lake ice cover globally over long time series. Additionally, a case study applying a convolutional neural network (CNN) model to ice classification on Great Slave Lake, Canada, is presented in Appendix A. Eighteen images acquired during the 2009-2010 ice season were used in this study. The proposed CNN achieved 98.03% accuracy on the testing dataset; however, the accuracy dropped to 90.13% on an independent (out-of-sample) validation dataset. The testing accuracy shows the strong learning capacity of the proposed CNN. At the same time, the drop in accuracy on the validation dataset indicates that the proposed model overfits. A follow-up investigation would be needed to improve its performance. This thesis investigated the capability of ML algorithms (both pixel-based and spatial-based) for lake ice classification from the MODIS L1B product. Overall, ML techniques showed promising performance for lake ice cover mapping from optical remote sensing data. The tree-based classifiers (pixel-based) exhibited the potential to produce accurate lake ice classification at large scale over long time series. In addition, more work would help improve the application of CNNs to lake ice cover mapping from optical remote sensing imagery.
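
As an illustration of the random k-fold cross-validation mentioned in the abstract, the following is a minimal sketch in plain Python and is not the thesis code. Everything in it is an assumption made for illustration: the two-band reflectance values are synthetic rather than MODIS data, a toy nearest-centroid classifier stands in for the four classifiers evaluated in the thesis, k is set to 10 rather than 100 to keep the run short, and overall accuracy is pooled over all held-out pixels, which may differ from how the thesis aggregates its folds.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_centroids(X, y):
    """Toy stand-in classifier: store the mean feature vector of each class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the class whose centroid is closest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
    )


def cross_val_accuracy(X, y, k=10, seed=0):
    """Overall accuracy pooled over the k held-out folds."""
    correct = 0
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        model = fit_centroids(train_X, train_y)
        correct += sum(predict(model, X[i]) == y[i] for i in test_idx)
    return correct / len(X)


def make_synthetic_pixels(n_per_class=200, seed=0):
    """Hypothetical two-band reflectances; the class means are made up."""
    rng = random.Random(seed)
    class_means = {"ice": (0.60, 0.50), "water": (0.05, 0.03)}
    X, y = [], []
    for label, (band_1, band_2) in class_means.items():
        for _ in range(n_per_class):
            X.append([rng.gauss(band_1, 0.05), rng.gauss(band_2, 0.05)])
            y.append(label)
    return X, y


X, y = make_synthetic_pixels()
accuracy = cross_val_accuracy(X, y, k=10)
print(f"Overall accuracy: {accuracy:.3f}")
```

Because every pixel is held out exactly once, the pooled figure counts each sample a single time; swapping `fit_centroids` and `predict` for another classifier would leave the fold logic unchanged.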