Learning From Almost No Data

dc.contributor.author: Sucholutsky, Ilia
dc.date.accessioned: 2021-06-15T18:16:46Z
dc.date.available: 2021-06-15T18:16:46Z
dc.date.issued: 2021-06-15
dc.date.submitted: 2021-06-09
dc.description.abstract: The tremendous recent growth in the fields of artificial intelligence and machine learning has largely been tied to the availability of big data and massive amounts of compute. The increasingly popular approach of training large neural networks on large datasets has provided great returns, but it leaves behind the multitude of researchers, companies, and practitioners who do not have access to sufficient funding, compute power, or volume of data. This thesis aims to rectify this growing imbalance by probing the limits of what machine learning and deep learning methods can achieve with small data. What knowledge does a dataset contain? At the highest level, a dataset is just a collection of samples: images, text, etc. Yet somehow, when we train models on these datasets, they are able to find patterns, make inferences, detect similarities, and otherwise generalize to samples that they have never seen before. This suggests that datasets may contain some kind of intrinsic knowledge about the systems or distributions from which they are sampled. Moreover, this knowledge appears to be distributed and duplicated across the samples; we intuitively expect that removing an image from a large training set will have virtually no impact on the final model's performance. We develop a framework that explains efficient generalization in terms of three principles: information sharing, information repackaging, and information injection. We use this framework to propose 'less than one'-shot (LO-shot) learning, an extreme form of few-shot learning in which a learner must recognize N classes from M < N training examples. To achieve this extreme level of efficiency, we develop new framework-consistent methods and theory for lost data restoration, for dataset size reduction, and for few-shot learning with deep neural networks and other popular machine learning models. (A minimal soft-label sketch of the LO-shot idea is given after this record.)
dc.identifier.uri: http://hdl.handle.net/10012/17103
dc.language.iso: en
dc.pending: false
dc.publisher: University of Waterloo
dc.relation.uri: https://github.com/ilia10000/dataset-distillation
dc.relation.uri: https://github.com/ilia10000/LO-Shot
dc.subject: deep learning
dc.subject: machine learning
dc.subject: few-shot learning
dc.subject: one-shot learning
dc.subject: dataset reduction
dc.subject: dataset distillation
dc.subject: small data
dc.subject: ML
dc.subject: AI
dc.subject: artificial intelligence
dc.subject: neural networks
dc.subject: NLP
dc.subject: computer vision
dc.subject: optimization
dc.subject: LO-shot learning
dc.title: Learning From Almost No Data
dc.type: Doctoral Thesis
uws-etd.degree: Doctor of Philosophy
uws-etd.degree.department: Statistics and Actuarial Science
uws-etd.degree.discipline: Statistics
uws-etd.degree.grantor: University of Waterloo
uws-etd.embargo.terms: 0
uws.contributor.advisor: Schonlau, Matthias
uws.contributor.affiliation1: Faculty of Mathematics
uws.peerReviewStatus: Unreviewed
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.scholarLevel: Graduate
uws.typeOfResource: Text
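
The abstract above defines 'less than one'-shot (LO-shot) learning: recognizing N classes from only M < N training examples. The sketch below is a toy illustration of how this can work in principle, using soft (probabilistic) labels in the spirit of the LO-Shot repository linked in this record. The prototype positions, soft-label values, and the distance-weighted classifier are assumptions invented for the example; they are not the thesis's actual method or code.

# Toy sketch (not the thesis's code): M = 2 soft-labelled points separating N = 3 classes.
# All numbers below are illustrative assumptions.
import numpy as np

prototypes = np.array([[0.0], [1.0]])   # M = 2 training points on a line
soft_labels = np.array([
    [0.6, 0.4, 0.0],                    # point at 0: mostly class 0, partly class 1
    [0.0, 0.4, 0.6],                    # point at 1: mostly class 2, partly class 1
])

def predict(x, sharpness=5.0):
    # Blend the prototypes' soft labels, weighting closer prototypes more heavily.
    dist = np.linalg.norm(prototypes - x, axis=1)
    weights = np.exp(-sharpness * dist)
    probs = (weights[:, None] * soft_labels).sum(axis=0) / weights.sum()
    return int(probs.argmax()), probs

for x in (0.0, 0.5, 1.0):
    label, probs = predict(np.array([x]))
    print(f"x={x:.1f} -> class {label}, probs={np.round(probs, 2)}")

Run as written, this prints class 0 near the left point, class 1 in the middle region (a class with no dedicated training point of its own), and class 2 near the right point: three classes recovered from two labelled examples.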

Files

Original bundle
Name: Sucholutsky_Ilia.pdf
Size: 42.44 MB
Format: Adobe Portable Document Format
Description: PhD thesis

License bundle
Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon to submission
Description: