Issues in Computer Vision Data Collection: Bias, Consent, and Label Taxonomy
The recent success of convolutional neural networks in image classification has pushed the computer vision community towards data-rich methods of deep learning. As a consequence of this shift, the data collection process has had to adapt, becoming increasingly automated and efficient to satisfy algorithms that require massive amounts of data. In the push for more data, however, careful consideration of the decisions and assumptions made during data collection has been neglected. Likewise, users accept datasets and their embedded assumptions at face value, employing them in theory and application papers without scrutiny. As a result, undesirable biases, non-consensual data collection, and inappropriate label taxonomies are rife in computer vision datasets. This work explores issues of bias, consent, and label taxonomy in computer vision through novel investigations into widely used datasets in image classification, face recognition, and facial expression recognition. Through this work, I aim to challenge researchers to reconsider normative data collection and use practices so that computer vision systems can be developed in a more thoughtful and responsible manner.
Cite this version of the work
Chris Dulhanty (2020). Issues in Computer Vision Data Collection: Bias, Consent, and Label Taxonomy. UWSpace. http://hdl.handle.net/10012/16414