Learning From Almost No Data
MetadataShow full item record
The tremendous recent growth in the fields of artificial intelligence and machine learning has largely been tied to the availability of big data and massive amounts of compute. The increasingly popular approach of training large neural networks on large datasets has provided great returns, but it leaves behind the multitude of researchers, companies, and practitioners who do not have access to sufficient funding, compute power, or volume of data. This thesis aims to rectify this growing imbalance by probing the limits of what machine learning and deep learning methods can achieve with small data. What knowledge does a dataset contain? At the highest level, a dataset is just a collection of samples: images, text, etc. Yet somehow, when we train models on these datasets, they are able to find patterns, make inferences, detect similarities, and otherwise generalize to samples that they have previously never seen. This suggests that datasets may contain some kind of intrinsic knowledge about the systems or distributions from which they are sampled. Moreover, it appears that this knowledge is somehow distributed and duplicated across the samples; we intuitively expect that removing an image from a large training set will have virtually no impact on the final model performance. We develop a framework to explain efficient generalization around three principles: information sharing, information repackaging, and information injection. We use this framework to propose `less than one'-shot learning, an extreme form of few-shot learning where a learner must recognize N classes from M < N training examples. To achieve this extreme level of efficiency, we develop new framework-consistent methods and theory for lost data restoration, for dataset size reduction, and for few-shot learning with deep neural networks and other popular machine learning models.
Cite this version of the work
Ilia Sucholutsky (2021). Learning From Almost No Data. UWSpace. http://hdl.handle.net/10012/17103
Showing items related by title, author, creator and subject.
Vandenhof, Colin (University of Waterloo, 2020-05-15)Reinforcement learning (RL) is a powerful tool for developing intelligent agents, and the use of neural networks makes RL techniques more scalable to challenging real-world applications, from task-oriented dialogue systems ...
Song, Haobei (University of Waterloo, 2019-09-12)The exploration/exploitation dilemma is a fundamental but often computationally intractable problem in reinforcement learning. The dilemma also impacts data efficiency which can be pivotal when the interactions between the ...
Ganapathi Subramanian, Sriram (University of Waterloo, 2022-07-15)Multi-agent reinforcement learning (MARL) has seen much success in the past decade. However, these methods are yet to find wide application in large-scale real world problems due to two important reasons. First, MARL ...