Simple item record

dc.contributor.author: Sucholutsky, Ilia
dc.date.accessioned: 2021-06-15 18:16:46 (GMT)
dc.date.available: 2021-06-15 18:16:46 (GMT)
dc.date.issued: 2021-06-15
dc.date.submitted: 2021-06-09
dc.identifier.uri: http://hdl.handle.net/10012/17103
dc.description.abstract: The tremendous recent growth in the fields of artificial intelligence and machine learning has largely been tied to the availability of big data and massive amounts of compute. The increasingly popular approach of training large neural networks on large datasets has provided great returns, but it leaves behind the multitude of researchers, companies, and practitioners who do not have access to sufficient funding, compute power, or volume of data. This thesis aims to rectify this growing imbalance by probing the limits of what machine learning and deep learning methods can achieve with small data. What knowledge does a dataset contain? At the highest level, a dataset is just a collection of samples: images, text, etc. Yet somehow, when we train models on these datasets, they are able to find patterns, make inferences, detect similarities, and otherwise generalize to samples that they have previously never seen. This suggests that datasets may contain some kind of intrinsic knowledge about the systems or distributions from which they are sampled. Moreover, it appears that this knowledge is somehow distributed and duplicated across the samples; we intuitively expect that removing an image from a large training set will have virtually no impact on the final model performance. We develop a framework to explain efficient generalization around three principles: information sharing, information repackaging, and information injection. We use this framework to propose 'less than one'-shot learning, an extreme form of few-shot learning where a learner must recognize N classes from M < N training examples. To achieve this extreme level of efficiency, we develop new framework-consistent methods and theory for lost data restoration, for dataset size reduction, and for few-shot learning with deep neural networks and other popular machine learning models. [en]
dc.language.iso: en [en]
dc.publisher: University of Waterloo [en]
dc.relation.uri: https://github.com/ilia10000/dataset-distillation [en]
dc.relation.uri: https://github.com/ilia10000/LO-Shot [en]
dc.subject: deep learning [en]
dc.subject: machine learning [en]
dc.subject: few-shot learning [en]
dc.subject: one-shot learning [en]
dc.subject: dataset reduction [en]
dc.subject: dataset distillation [en]
dc.subject: small data [en]
dc.subject: ML [en]
dc.subject: AI [en]
dc.subject: artificial intelligence [en]
dc.subject: neural networks [en]
dc.subject: NLP [en]
dc.subject: computer vision [en]
dc.subject: optimization [en]
dc.subject: LO-shot learning [en]
dc.title: Learning From Almost No Data [en]
dc.type: Doctoral Thesis [en]
dc.pending: false
uws-etd.degree.department: Statistics and Actuarial Science [en]
uws-etd.degree.discipline: Statistics [en]
uws-etd.degree.grantor: University of Waterloo [en]
uws-etd.degree: Doctor of Philosophy [en]
uws-etd.embargo.terms: 0 [en]
uws.contributor.advisor: Schonlau, Matthias
uws.contributor.affiliation1: Faculty of Mathematics [en]
uws.published.city: Waterloo [en]
uws.published.country: Canada [en]
uws.published.province: Ontario [en]
uws.typeOfResource: Text [en]
uws.peerReviewStatus: Unreviewed [en]
uws.scholarLevel: Graduate [en]
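The abstract's 'less than one'-shot setting, recognizing N classes from only M < N training examples, can be made concrete with soft labels: each stored example carries a distribution over classes rather than a single hard label. The sketch below is a minimal, hypothetical illustration, not code from the thesis or its linked repositories; the prototype positions, the soft-label values, the inverse-distance weighting, and the helper name predict are all assumptions made for this example. It shows two soft-labeled prototypes inducing three decision regions on a one-dimensional feature axis.

import numpy as np

# Two soft-labeled prototypes on a 1-D feature axis (M = 2 stored examples).
# Positions and label distributions are hypothetical, chosen for illustration.
proto_x = np.array([0.0, 1.0])                  # prototype locations
proto_y = np.array([[0.6, 0.4, 0.0],            # label mass over classes A, B, C
                    [0.0, 0.4, 0.6]])           # (N = 3 classes from M = 2 points)

def predict(x, eps=1e-9):
    """Distance-weighted soft-label vote over the prototypes."""
    d = np.abs(proto_x - x) + eps               # distances to each prototype
    w = 1.0 / d                                 # inverse-distance weights
    scores = w @ proto_y                        # blend the prototypes' soft labels
    return "ABC"[int(np.argmax(scores))]

for x in [0.05, 0.30, 0.45, 0.55, 0.70, 0.95]:
    print(f"x = {x:.2f} -> class {predict(x)}")
# Prints A, A, B, B, C, C: class B is recovered in the middle of the axis
# even though no stored example has B as its most likely label.

The point of the sketch is only that shared label mass lets fewer examples define more classes; the repositories listed under dc.relation.uri above are the authoritative source for the thesis's actual methods.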

