3D Ground Truth Generation Using Pre-Trained Deep Neural Networks
dc.contributor.author | Lee, Jungwook | |
dc.date.accessioned | 2019-05-24T19:59:20Z | |
dc.date.available | 2019-05-24T19:59:20Z | |
dc.date.issued | 2019-05-24 | |
dc.date.submitted | 2019-05-15 | |
dc.description.abstract | Training 3D object detectors on publicly available data has been limited to small datasets due to the large amount of effort required to generate annotations. The difficulty of labeling in 3D using 2.5D sensors, such as LIDAR, is attributed to the high spatial reasoning skills required to deal with occlusion and partial viewpoints. Additionally, current methods for labeling 3D objects are cognitively demanding due to frequent task switching. Reducing both task complexity and the amount of task switching done by annotators is key to reducing the effort and time required to generate 3D bounding box annotations. We therefore seek to reduce the burden on the annotators by leveraging existing 3D object detectors based on deep neural networks. This work introduces a novel ground truth generation method that combines human supervision with pre-trained neural networks to generate per-instance 3D point cloud segmentation, 3D bounding boxes, and class annotations. Annotators provide object anchor clicks, which act as seeds for generating instance segmentation results in 3D. The points belonging to each instance are then used to regress object centroids, bounding box dimensions, and object orientations (a schematic sketch of this click-to-box pipeline follows this record). The deep neural network model used to generate the segmentation masks and bounding box parameters is based on the PointNet architecture. We develop our approach on the KITTI dataset to analyze the quality of the generated ground truth. The neural network model is trained on the KITTI training split, and the 3D bounding box outputs are generated using annotation clicks collected from the validation split. The validation split of the KITTI detection dataset contains 3712 frames of point cloud and image scenes, which took 16.35 hours to label with the proposed method. Based on these results, our approach is 19 times faster than the most recently published 3D object annotation scheme. Additionally, we find that annotators spend less time per object as the number of objects in a scene increases, making the method very efficient for multi-object labeling. Furthermore, the quality of the 3D bounding boxes generated with the labeling method is compared against the KITTI ground truth. We show that the model performs on par with current state-of-the-art 3D detectors and that the labeling procedure does not negatively impact the quality of the output bounding boxes. Lastly, the proposed scheme is applied to previously unseen data from the Autonomoose self-driving vehicle to demonstrate the generalization capabilities of the network. | en |
dc.identifier.uri | http://hdl.handle.net/10012/14720 | |
dc.language.iso | en | en |
dc.pending | false | |
dc.publisher | University of Waterloo | en |
dc.subject | Machine Learning | en |
dc.subject | Computer Vision | en |
dc.subject | Autonomous Driving | en |
dc.subject | Deep Learning | en |
dc.subject | Object Detection | en |
dc.subject | Data Mining | en |
dc.title | 3D Ground Truth Generation Using Pre-Trained Deep Neural Networks | en |
dc.type | Master Thesis | en |
uws-etd.degree | Master of Applied Science | en |
uws-etd.degree.department | Mechanical and Mechatronics Engineering | en |
uws-etd.degree.discipline | Mechanical Engineering | en |
uws-etd.degree.grantor | University of Waterloo | en |
uws.contributor.advisor | Waslander, Steven | |
uws.contributor.affiliation1 | Faculty of Engineering | en |
uws.peerReviewStatus | Unreviewed | en |
uws.published.city | Waterloo | en |
uws.published.country | Canada | en |
uws.published.province | Ontario | en |
uws.scholarLevel | Graduate | en |
uws.typeOfResource | Text | en |
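
Below is a minimal, hedged Python sketch of the click-to-box pipeline the abstract describes: an annotator's anchor click seeds a local crop of the LIDAR point cloud, a pre-trained network segments the object instance, and 3D box parameters are derived from the segmented points. Every name here (crop_around_seed, box_from_segment, annotate_instance, seg_net, yaw_net) is a hypothetical placeholder rather than the thesis's actual API, and the simple min/max extent fit stands in for the PointNet-based regression of centroid, dimensions, and orientation.

import numpy as np

def crop_around_seed(points, seed_xyz, radius=4.0):
    # points: (N, 3+) array of LIDAR returns; seed_xyz: the annotator's
    # 3D anchor click. Keep only points within `radius` metres of the seed.
    dist = np.linalg.norm(points[:, :3] - seed_xyz, axis=1)
    return points[dist < radius]

def box_from_segment(seg_points, yaw):
    # Rotate the instance points into the object frame, then take min/max
    # extents -- a simple geometric stand-in for the learned box regression.
    c, s = np.cos(yaw), np.sin(yaw)
    world_to_obj = np.array([[c, s], [-s, c]])  # 2D rotation by -yaw
    xy = seg_points[:, :2] @ world_to_obj.T
    lo = np.array([xy[:, 0].min(), xy[:, 1].min(), seg_points[:, 2].min()])
    hi = np.array([xy[:, 0].max(), xy[:, 1].max(), seg_points[:, 2].max()])
    centre_obj = (lo + hi) / 2.0
    centroid = np.empty(3)
    centroid[:2] = world_to_obj.T @ centre_obj[:2]  # back to the LIDAR frame
    centroid[2] = centre_obj[2]
    dims = hi - lo  # extents along the object-frame x, y, z axes
    return centroid, dims, yaw

def annotate_instance(points, seed_xyz, seg_net, yaw_net):
    # One annotator click -> instance mask -> 3D box parameters.
    crop = crop_around_seed(points, seed_xyz)
    scores = seg_net(crop)           # hypothetical: per-point foreground scores
    segment = crop[scores > 0.5]
    yaw = yaw_net(segment)           # hypothetical: scalar orientation estimate
    return box_from_segment(segment, yaw)

With dummy callables (e.g. seg_net = lambda pts: np.ones(len(pts)) and yaw_net = lambda pts: 0.0), annotate_instance runs end to end on any (N, 3) array; in the thesis, both roles are filled by the single pre-trained PointNet-based model.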