SMaT-HSI: Structure-aware Mamba-Transformer Hybrid Model for Hyperspectral Image Classification

dc.contributor.author: Liu, Yaxuan
dc.date.accessioned: 2025-09-11T17:13:29Z
dc.date.available: 2025-09-11T17:13:29Z
dc.date.issued: 2025-09-11
dc.date.submitted: 2025-09-09
dc.description.abstract: Hyperspectral image (HSI) classification is a crucial task in remote sensing, playing a fundamental role in environmental monitoring, precision agriculture, urban planning, and mineral exploration. By leveraging the rich spectral information across hundreds of contiguous bands, HSI classification enables precise identification of materials and land cover types, facilitating accurate mapping of vegetation, soil, water bodies, and built environments. Traditional convolutional neural network (CNN)-based methods effectively extract local spatial features, while transformer-based models excel at capturing global contextual dependencies. However, both approaches face challenges in fully exploiting the spectral and spatial dependencies inherent in hyperspectral data. Recently, Mamba, a state-space model (SSM)-based architecture, has shown promise in sequence modeling by efficiently capturing long-range dependencies with linear computational complexity. A comprehensive comparison of CNN-based, transformer-based, and Mamba-based models for HSI classification reveals that Mamba-based models achieve performance comparable to transformer-based models, highlighting their potential in this domain. Current Mamba-based methods often convert images into one-dimensional sequences and rely on scanning strategies to capture local spatial and spectral dependencies. However, these approaches struggle to fully represent the intricate spectral-spatial structures in HSIs and introduce computational redundancy. To address this, a structure-aware state fusion mechanism is proposed to explicitly model the spatial and spectral relationships of neighboring features in the latent state space, enabling more efficient and accurate representation learning. To further improve the capture of global context and long-range spatial dependencies, a hybrid Mamba-transformer architecture is explored. Different integration strategies are investigated, including inserting transformer blocks in the early, middle, and final layers, as well as at regular intervals. Analysis indicates that incorporating a self-attention block in the final layer achieves the highest average overall accuracy, 97.58%, across five benchmark datasets. The proposed approach is evaluated on five publicly available benchmark datasets: Indian Pines, Pavia University, Houston 2013, WHU-Hi-HanChuan, and WHU-Hi-HongHu. It demonstrates an average overall accuracy improvement of 0.87% over the baseline model and competitive results against existing transformer-based and Mamba-based models. These findings underscore the potential of combining Mamba and transformer architectures for efficient and accurate hyperspectral image classification, offering new insights into advanced sequence modeling for remote sensing applications.
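The two architectural ideas in the abstract, structure-aware state fusion and a Mamba stack with a single self-attention block in the final layer, can be sketched in a few lines of PyTorch. The sketch below is a hypothetical illustration rather than the thesis implementation: the fusion is approximated here by a depthwise 3x3 convolution over the patch grid (the thesis fuses neighboring features in the SSM's latent state space instead), the Mamba block comes from the open-source mamba-ssm package, and all class names and hyperparameters (NeighborFusion, HybridBackbone, dim, depth) are assumptions.

```python
import torch
import torch.nn as nn
from mamba_ssm import Mamba  # pip install mamba-ssm (kernels require CUDA)


class NeighborFusion(nn.Module):
    """Hypothetical stand-in for structure-aware state fusion: mixes each
    token with its 3x3 spatial neighborhood on the patch grid via a
    depthwise convolution, followed by a residual connection and norm."""

    def __init__(self, dim: int):
        super().__init__()
        self.dw = nn.Conv2d(dim, dim, kernel_size=3, padding=1, groups=dim)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor, hw: tuple) -> torch.Tensor:
        b, l, c = x.shape                            # (batch, tokens, channels)
        grid = x.transpose(1, 2).reshape(b, c, *hw)  # back onto the 2-D patch grid
        fused = self.dw(grid).flatten(2).transpose(1, 2)
        return self.norm(x + fused)


class HybridBackbone(nn.Module):
    """Mamba blocks throughout the stack, with one self-attention block in
    the final layer -- the integration strategy the abstract reports as
    giving the best average overall accuracy."""

    def __init__(self, dim: int, depth: int, heads: int = 4):
        super().__init__()
        self.mamba_blocks = nn.ModuleList(Mamba(d_model=dim) for _ in range(depth))
        self.fusions = nn.ModuleList(NeighborFusion(dim) for _ in range(depth))
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor, hw: tuple) -> torch.Tensor:
        for mamba, fuse in zip(self.mamba_blocks, self.fusions):
            x = x + mamba(x)   # linear-time sequence mixing
            x = fuse(x, hw)    # neighbor-aware fusion on the patch grid
        q = self.norm(x)
        attended, _ = self.attn(q, q, q)  # global context in the final layer
        return x + attended


# Toy usage: 9x9 spatial patches embedded into 64-dim tokens (all sizes
# illustrative; mamba-ssm needs a CUDA device).
tokens = torch.randn(2, 81, 64, device="cuda")
model = HybridBackbone(dim=64, depth=4).to("cuda")
out = model(tokens, hw=(9, 9))  # -> shape (2, 81, 64)
```

Confining attention to the final layer keeps the quadratic-cost attention to a single block while the Mamba layers handle most of the sequence mixing in linear time, which matches the efficiency argument made in the abstract.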
dc.identifier.uri: https://hdl.handle.net/10012/22383
dc.language.iso: en
dc.pending: false
dc.publisher: University of Waterloo
dc.subject: hyperspectral image classification
dc.subject: deep learning
dc.subject: mamba
dc.subject: spectral-spatial learning
dc.title: SMaT-HSI: Structure-aware Mamba-Transformer Hybrid Model for Hyperspectral Image Classification
dc.type: Master Thesis
uws-etd.degree: Master of Applied Science
uws-etd.degree.department: Systems Design Engineering
uws-etd.degree.discipline: Systems Design Engineering
uws-etd.degree.grantor: University of Waterloo
uws-etd.embargo.terms: 0
uws.contributor.advisor: Clausi, David
uws.contributor.advisor: Xu, Linlin
uws.contributor.affiliation1: Faculty of Engineering
uws.peerReviewStatus: Unreviewed
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.scholarLevel: Graduate
uws.typeOfResource: Text

Files

Original bundle

Name: Liu_Yaxuan.pdf
Size: 13.12 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.4 KB
Description: Item-specific license agreed upon to submission