Zero-Shot Monocular Motion Segmentation: A Fusion of Deep Learning and Geometric Approaches

dc.contributor.author: Huang, Yuxiang
dc.date.accessioned: 2024-04-29T17:16:53Z
dc.date.available: 2024-04-29T17:16:53Z
dc.date.issued: 2024-04-29
dc.date.submitted: 2024-04-18
dc.description.abstract: Identifying and segmenting moving objects from a moving monocular camera is difficult when the camera motion is unknown and the scene contains multiple types of object motion and complex structures. Deep learning methods achieve impressive results on generic motion segmentation, but they require massive amounts of training data and do not generalize well to novel scenes and objects. Conversely, recent geometric methods show promising results by fusing different geometric models together, but they require manually corrected point trajectories and cannot generate a coherent segmentation mask. This work proposes a zero-shot motion segmentation approach that combines the strengths of deep learning and geometric methods. The proposed method first generates object proposals for every video frame using state-of-the-art foundation models, and then extracts different object-specific motion cues. Finally, the method uses multi-view spectral clustering to fuse the different motion cues and cluster objects into distinct motion groups, resulting in a coherent segmentation. The key contributions of this work are as follows: 1) the first zero-shot motion segmentation pipeline that performs dense motion segmentation on different scenes and object classes without any training; 2) the first combination of epipolar geometry and optical flow-based motion models for motion segmentation; 3) the use of multi-view spectral clustering to effectively combine different motion models and achieve good motion segmentation results in complex scenes. Through extensive experimentation and comparative analysis, we validate the efficacy of the proposed method. Despite not being trained on any data, the method achieves competitive results on real-world datasets, in some cases surpassing state-of-the-art motion segmentation methods trained in a supervised manner. This work not only contributes to the advancement of monocular motion segmentation, but also shows that combining different geometric motion models and motion cues is crucial for analyzing the motions of objects.
dc.identifier.uri: http://hdl.handle.net/10012/20512
dc.language.iso: en
dc.pending: false
dc.publisher: University of Waterloo
dc.relation.uri: DAVIS-Moving
dc.relation.uri: YTVOS-Moving
dc.relation.uri: KT3DMoSeg
dc.relation.uri: KT3DInsMoSeg
dc.subject: computer vision
dc.subject: motion segmentation
dc.subject: monocular motion segmentation
dc.subject: video segmentation
dc.title: Zero-Shot Monocular Motion Segmentation: A Fusion of Deep Learning and Geometric Approaches
dc.type: Master Thesis
uws-etd.degree: Master of Applied Science
uws-etd.degree.department: Systems Design Engineering
uws-etd.degree.discipline: Systems Design Engineering
uws-etd.degree.grantor: University of Waterloo
uws-etd.embargo.terms: 0
uws.contributor.advisor: Zelek, John
uws.contributor.affiliation1: Faculty of Engineering
uws.peerReviewStatus: Unreviewed
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.scholarLevel: Graduate
uws.typeOfResource: Text
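The fusion step described in the abstract can be sketched as follows. Assuming each motion cue (e.g. an epipolar-geometry model and an optical flow-based model) yields a pairwise affinity matrix over the object proposals, a minimal multi-view spectral clustering fuses the views and groups objects into motion clusters. The averaging fusion, the farthest-point k-means initialization, and all function names below are illustrative assumptions, not the thesis's actual implementation:

```python
import numpy as np

def kmeans(points, k, iters=50):
    """Simple k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        # Pick the point farthest from all chosen centers as the next center.
        dists = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[int(np.argmax(dists))])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(((points[:, None] - centers[None]) ** 2).sum(axis=2), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

def spectral_clustering(affinity, k):
    """Normalized spectral clustering on a single (fused) affinity matrix."""
    d = affinity.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    laplacian = np.eye(len(affinity)) - d_inv_sqrt @ affinity @ d_inv_sqrt
    _, vecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    embedding = vecs[:, :k]              # k smallest eigenvectors span the clusters
    embedding /= np.maximum(np.linalg.norm(embedding, axis=1, keepdims=True), 1e-12)
    return kmeans(embedding, k)

def fuse_and_cluster(affinities, k):
    """Fuse per-cue affinities (here: a plain average) and cluster objects."""
    fused = np.mean(np.stack(affinities), axis=0)
    return spectral_clustering(fused, k)
```

For example, with two affinity "views" over six object proposals in which proposals 0-2 move together and 3-5 move together, `fuse_and_cluster([A, B], 2)` recovers the two motion groups. Real systems typically weight the views rather than averaging them uniformly.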

Files

Original bundle

Name: Huang_Yuxiang.pdf
Size: 11.45 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon to submission