
Closing the Modelling Gap: Transfer Learning from a Low-Fidelity Simulator for Autonomous Driving

dc.contributor.advisor: Czarnecki, Krzysztof
dc.contributor.author: Balakrishnan, Aravind
dc.date.accessioned: 2020-01-24T16:51:41Z
dc.date.available: 2020-01-24T16:51:41Z
dc.date.issued: 2020-01-24
dc.date.submitted: 2020-01-21
dc.description.abstract: The behaviour planning subsystem, which is responsible for high-level decision making and planning, is an important aspect of an autonomous driving system. There are advantages to using a learned behaviour planning system instead of traditional rule-based approaches. However, high-quality labelled data for training behaviour planning models is hard to acquire. Thus, reinforcement learning (RL), which can learn a policy from simulations, is a viable option for this problem. However, modelling inaccuracies between the simulator and the target environment, called the ‘transfer gap’, hinder its deployment in a real autonomous vehicle. High-fidelity simulators, which have a smaller transfer gap, come with large computational costs that are not favourable for RL training. Therefore, we often have to settle for a fast but lower-fidelity simulator that exacerbates the transfer learning problem. In this thesis, we study how a low-fidelity 2D simulator can be used in place of a slower 3D simulator for training RL behaviour planning models, and analyze the resulting policies in comparison with a rule-based approach. We develop WiseMove, an RL framework for autonomous driving research that supports hierarchical RL, to serve as the low-fidelity source simulator. A transfer learning scenario is set up from WiseMove to an Unreal-based simulator for the Autonomoose system to study and close the transfer gap. We find that perception errors in the target simulator contribute the most to the transfer gap. These errors, when naively modelled in WiseMove, provide a policy that performs better in the target simulator than a carefully constructed rule-based policy. Applying domain randomization on the environment yields an even better policy. The final RL policy reduces the failures due to perception errors from 10% to 2.75%.
We also observe that the RL policy has less reliance on the velocity compared to the rule-based algorithm, as its measurement is unreliable in the target simulator. To understand the exact learned behaviour, we also distill the RL policy using a decision tree to obtain an interpretable rule-based policy. We show that constructing a rule-based policy manually to efficiently handle perception errors is not trivial. Future work can explore more driving scenarios under fewer constraints to further validate this result. [en]
dc.identifier.uri: http://hdl.handle.net/10012/15570
dc.language.iso: en [en]
dc.pending: false
dc.publisher: University of Waterloo [en]
dc.subject: reinforcement learning [en]
dc.subject: autonomous driving [en]
dc.subject: transfer learning [en]
dc.subject.lcsh: Automated vehicles [en]
dc.subject.lcsh: Machine learning [en]
dc.title: Closing the Modelling Gap: Transfer Learning from a Low-Fidelity Simulator for Autonomous Driving [en]
dc.type: Master Thesis [en]
uws-etd.degree: Master of Mathematics [en]
uws-etd.degree.department: David R. Cheriton School of Computer Science [en]
uws-etd.degree.discipline: Computer Science [en]
uws-etd.degree.grantor: University of Waterloo [en]
uws.comment.hidden: I used the UW thesis template in Overleaf to create the thesis. [en]
uws.contributor.advisor: Czarnecki, Krzysztof
uws.contributor.affiliation1: Faculty of Mathematics [en]
uws.peerReviewStatus: Unreviewed [en]
uws.published.city: Waterloo [en]
uws.published.country: Canada [en]
uws.published.province: Ontario [en]
uws.scholarLevel: Graduate [en]
uws.typeOfResource: Text [en]

Files

Original bundle

Name: Balakrishnan_Aravind.pdf
Size: 3.04 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon to submission