|dc.description.abstract||Autonomous driving holds the potential to increase human productivity, reduce accidents caused by human errors, allow better utilization of roads, reduce traffic accidents and congestion, free up parking space and provide many other advantages. Perception of Autonomous Vehicles (AV) refers to the use of sensors to perceive the world, e.g. using cameras to detect and classify objects. Traffic scene understanding is a key research problem in perception in autonomous driving, and semantic segmentation is a useful method to address this problem.
Adverse weather conditions are a reality that AV must contend with. Conditions like rain, snow, haze, etc. can drastically reduce visibility and thus affect computer vision models. Models for perception for AVs are currently designed for and tested on predominantly ideal weather conditions under good illumination. The most complete solution may be to have the segmentation networks be trained on all possible adverse conditions. Thus a dataset to train a segmentation network to make it robust to rain would need to have adequate data that cover these conditions well. Moreover, labeling is an expensive task. It is particularly expensive for semantic segmentation, as each object in a scene needs to be identified and each pixel annotated in the right class. Thus, the adverse weather is a challenging problem for perception models in AVs. This thesis explores the use of Generative Adversarial Networks (GAN) in order to improve semantic segmentation. We design a framework and a methodology to evaluate the proposed approach. The framework consists of an Adversity Detector, and a series of denoising filters. The Adversity Detector is an image classifier that takes as input clear weather or adverse weather scenes, and attempts to predict whether the given image contains rain, or puddles, or other conditions that can adversely affect semantic segmentation. The filters are denoising generative adversarial networks that are trained to remove the adverse conditions from images in order to translate the image to a domain the segmentation network has been trained on, i.e. clear weather images. We use the prediction from the Adversity Detector to choose which GAN filter to use. The methodology we devise for evaluating our approach uses the trained filters to output sets of images that we can then run segmentation tasks on. This, we argue, is a better metric for evaluating the GANs than similarity measures such as SSIM. We also use synthetic data so we can perform systematic evaluation of our technique.
We train two kinds of GANs, one that uses paired data (CycleGAN), and one that does not (Pix2Pix). We have concluded that GAN architectures that use unpaired data are not sufficiently good models for denoising. We train the denoising filters using the other architecture and we found them easy to train, and they show good results. While these filters do not show better performance than when we train our segmentation network with adverse weather data, we refer back to the point that training the segmentation network requires labelled data which is expensive to collect and annotate, particularly for adverse weather and lighting conditions. We implement our proposed framework and report a 17\% increase in performance in segmentation over the baseline results obtained when we do not use our framework.||en