Show simple item record

dc.contributor.authorMozifian, Melissa Farinaz 13:12:14 (GMT) 13:12:14 (GMT)
dc.description.abstractThis thesis focuses on advancing the state-of-the-art 3D object detection and localization in autonomous driving. An autonomous vehicle requires operating within a very unpredictable and dynamic environment. Hence a robust perception system is essential. This work proposes a novel architecture, AVOD, an 𝐀ggregate 𝐕iew 𝐎bject 𝐃etection architecture for autonomous driving capable of generating accurate 3D bounding boxes on road scenes. AVOD uses LIDAR point clouds and RGB images to generate features that are shared by two subnetworks: a region proposal network (RPN) and a second stage detector network. The proposed RPN uses a novel architecture capable of performing multimodal feature fusion on high resolution feature maps to generate reliable 3D object proposals for multiple object classes in road scenes. Using these proposals, the second stage detection network performs accurate oriented 3D bounding box regression and category classification to predict the extents, orientation, and classification of objects in 3D space. AVOD is differentiated from the state-of-the-art by using a high resolution feature extractor coupled with a multimodal fusion RPN architecture, and is therefore able to produce accurate region proposals for small classes in road scenes. AVOD also employs explicit orientation vector regression to resolve the ambiguous orientation estimate inferred from a bounding box. Experiments on the challenging KITTI dataset show the superiority of AVOD over the state-of-the-art detectors on the 3D localization, orientation estimation, and category classification tasks. Finally, AVOD is shown to run in real time and with a low memory overhead. The robustness of AVOD is also visually demonstrated when deployed on our autonomous vehicle operating under low lighting conditions such as night time as well as in snowy scenes. Furthermore, AVOD-SSD is proposed as a 3D Single Stage Detector. This work demonstrates how a single stage detector can achieve similar accuracy as that of a two-stage detector. An analysis of speed and accuracy trade-offs between AVOD and AVOD-SSD are presented.en
dc.publisherUniversity of Waterlooen
dc.subjectComputer Visionen
dc.subjectObject Detectionen
dc.subjectDeep Learningen
dc.titleReal-time 3D Object Detection for Autonomous Drivingen
dc.typeMaster Thesisen
dc.pendingfalse and Mechatronics Engineeringen Engineeringen of Waterlooen
uws-etd.degreeMaster of Applied Scienceen
uws.contributor.advisorWaslander, Steven
uws.contributor.affiliation1Faculty of Engineeringen

Files in this item


This item appears in the following Collection(s)

Show simple item record


University of Waterloo Library
200 University Avenue West
Waterloo, Ontario, Canada N2L 3G1
519 888 4883

All items in UWSpace are protected by copyright, with all rights reserved.

DSpace software

Service outages