Zero-Shot Monocular Motion Segmentation: A Fusion of Deep Learning and Geometric Approaches

dc.contributor.author: Huang, Yuxiang
dc.date.accessioned: 2024-04-29T17:16:53Z
dc.date.available: 2024-04-29T17:16:53Z
dc.date.issued: 2024-04-29
dc.date.submitted: 2024-04-18
dc.description.abstract: Identifying and segmenting moving objects from a moving monocular camera is difficult when the camera motion is unknown and the scene contains multiple types of object motion and complex structures. Deep learning methods achieve impressive results on generic motion segmentation, but they require massive amounts of training data and do not generalize well to novel scenes and objects. Conversely, recent geometric methods show promising results by fusing different geometric models together, but they require manually corrected point trajectories and cannot generate a coherent segmentation mask. This work proposes a zero-shot motion segmentation approach that combines the strengths of deep learning and geometric methods. The proposed method first generates object proposals for every video frame using state-of-the-art foundation models, and then extracts different object-specific motion cues. Finally, the method uses multi-view spectral clustering to fuse the different motion cues and cluster objects into distinct motion groups, resulting in a coherent segmentation. The key contributions of this work are as follows: 1) the first zero-shot motion segmentation pipeline that performs dense motion segmentation on different scenes and object classes without any training; 2) the first combination of epipolar geometry and optical flow-based motion models for motion segmentation; 3) the use of multi-view spectral clustering to effectively combine different motion models and achieve good motion segmentation results in complex scenes. Through extensive experimentation and comparative analysis, we validate the efficacy of the proposed method. Despite not being trained on any data, the method achieves competitive results on real-world datasets, in some cases surpassing state-of-the-art motion segmentation methods trained in a supervised manner. This work not only contributes to the advancement of monocular motion segmentation, but also shows that combining different geometric motion models and motion cues is crucial for analyzing the motions of objects.
dc.identifier.uri: http://hdl.handle.net/10012/20512
dc.language.iso: en
dc.pending: false
dc.publisher: University of Waterloo
dc.relation.uri: DAVIS-Moving
dc.relation.uri: YTVOS-Moving
dc.relation.uri: KT3DMoSeg
dc.relation.uri: KT3DInsMoSeg
dc.subject: computer vision
dc.subject: motion segmentation
dc.subject: monocular motion segmentation
dc.subject: video segmentation
dc.title: Zero-Shot Monocular Motion Segmentation: A Fusion of Deep Learning and Geometric Approaches
dc.type: Master Thesis
uws-etd.degree: Master of Applied Science
uws-etd.degree.department: Systems Design Engineering
uws-etd.degree.discipline: Systems Design Engineering
uws-etd.degree.grantor: University of Waterloo
uws-etd.embargo.terms: 0
uws.contributor.advisor: Zelek, John
uws.contributor.affiliation1: Faculty of Engineering
uws.peerReviewStatus: Unreviewed
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.scholarLevel: Graduate
uws.typeOfResource: Text
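The fusion step described in the abstract can be sketched as follows. Assuming each motion cue (e.g. an epipolar-geometry model and an optical flow-based model) yields a pairwise affinity matrix over the object proposals, a minimal multi-view spectral clustering fuses the views and groups objects into motion clusters. The averaging fusion, the farthest-point k-means initialization, and all function names below are illustrative assumptions, not the thesis's actual implementation:

```python
import numpy as np

def kmeans(points, k, iters=50):
    """Simple k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        # Pick the point farthest from all chosen centers as the next center.
        dists = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[int(np.argmax(dists))])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(((points[:, None] - centers[None]) ** 2).sum(axis=2), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

def spectral_clustering(affinity, k):
    """Normalized spectral clustering on a single (fused) affinity matrix."""
    d = affinity.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    laplacian = np.eye(len(affinity)) - d_inv_sqrt @ affinity @ d_inv_sqrt
    _, vecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    embedding = vecs[:, :k]              # k smallest eigenvectors span the clusters
    embedding /= np.maximum(np.linalg.norm(embedding, axis=1, keepdims=True), 1e-12)
    return kmeans(embedding, k)

def fuse_and_cluster(affinities, k):
    """Fuse per-cue affinities (here: a plain average) and cluster objects."""
    fused = np.mean(np.stack(affinities), axis=0)
    return spectral_clustering(fused, k)
```

For example, with two affinity "views" over six object proposals in which proposals 0-2 move together and 3-5 move together, `fuse_and_cluster([A, B], 2)` recovers the two motion groups. Real systems typically weight the views rather than averaging them uniformly.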

Files

Original bundle

Name: Huang_Yuxiang.pdf
Size: 11.45 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon to submission