Population-level Indicators of Physical Activity, Sedentary Behaviour and Sleep in Canada based on Twitter
Date
2018-08-17
Authors
Nguyen, Olivier
Advisor
Crowley, Mark
Lee, Joon
Publisher
University of Waterloo
Abstract
Social media platforms contain large amounts of freely and publicly available data that
could be used to measure population characteristics across different geographical regions.
Analyzing public data sources such as social media data has shown promising results for
public health measures and monitoring. This thesis addresses challenges in building systems
that collect high volumes of data from social media platforms. More specifically, we
look at Twitter data processing, filtering, and aggregation to provide population-level
indicators of physical activity, sedentary behaviour, and sleep (PASS). In the first part of the
thesis, we describe the full machine learning pipeline we built: (i) Twitter data collection
from November 2017 to May 2018; (ii) data preparation through manual annotation, keyword
filtering, and an active learning technique for the labelling of 10,283 tweets; and (iii)
training a classifier to identify PASS-related tweets. Training the model involves building
an initial classifier to efficiently find relevant tweets in subsequent annotation iterations.
Our classifiers include an ensemble model consisting of several shallow machine learning
algorithms, along with deep learning algorithms. In the second part of the thesis, we look
at the performance of different solutions. We provide benchmark results for the task of
classifying PASS-related tweets for the various algorithms considered. We also derive health
indicators by aggregating and computing the proportion of classified tweets by province
and compare our metrics with the prevalence of obesity, diabetes and mood disorders from
the Canadian Community Health Survey. Our work shows how machine learning can be
used to complement public health data and better inform health policy makers to improve
the lives of Canadians.
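
The aggregation step mentioned in the abstract (the proportion of classified tweets by province) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `pass_proportion_by_province`, the `(province, is_pass_related)` record layout, and the sample data are all hypothetical, and the sketch assumes each tweet has already been geolocated to a province and labelled by a classifier.

```python
from collections import Counter


def pass_proportion_by_province(records):
    """Return the share of PASS-related tweets per province.

    `records` is an iterable of (province, is_pass_related) pairs;
    this layout is an assumption made for illustration only.
    """
    totals = Counter()
    positives = Counter()
    for province, is_pass_related in records:
        totals[province] += 1
        if is_pass_related:
            positives[province] += 1
    # Every province in `totals` has at least one tweet, so no division by zero.
    return {province: positives[province] / totals[province] for province in totals}


# Made-up sample: classifier outputs for a handful of geolocated tweets.
sample = [
    ("ON", True), ("ON", False), ("ON", False), ("ON", True),
    ("BC", True), ("BC", False),
    ("QC", False),
]
proportions = pass_proportion_by_province(sample)
print(proportions)
```

Normalizing by each province's total tweet count, rather than reporting raw counts, keeps provinces with very different tweet volumes comparable.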