Population-level Indicators of Physical Activity, Sedentary Behaviour and Sleep in Canada based on Twitter

dc.contributor.author: Nguyen, Olivier
dc.date.accessioned: 2018-08-17T15:40:27Z
dc.date.available: 2018-08-17T15:40:27Z
dc.date.issued: 2018-08-17
dc.date.submitted: 2018-08-15
dc.description.abstract: Social media platforms contain large amounts of freely and publicly available data that could be used to measure population characteristics across different geographical regions. Analyzing public data sources such as social media data has shown promising results for public health measures and monitoring. This thesis addresses challenges in building systems that collect high volumes of data from social media platforms. More specifically, we look at Twitter data processing, filtering, and aggregation to provide population-level indicators of physical activity, sedentary behavior, and sleep (PASS). In the first part of the thesis, we go over the whole machine learning pipeline built: (i) Twitter data collection from November 2017 to May 2018; (ii) data preparation through manual annotation, keyword filtering, and an active learning technique for the labelling of 10,283 tweets; and (iii) training a classifier to identify PASS-related tweets. Training the model involves building an initial classifier to efficiently find relevant tweets in subsequent annotation iterations. Our classifiers include an ensemble model consisting of several shallow machine learning algorithms, along with deep learning algorithms. In the second part of the thesis, we look at the performance of different solutions. We provide benchmark results for the task of classifying PASS-related tweets for the various algorithms considered. We also derive health indicators by aggregating and computing the proportion of classified tweets by province and compare our metrics with the prevalence of obesity, diabetes and mood disorders from the Canadian Community Health Survey. Our work shows how machine learning can be used to complement public health data and better inform health policy makers to improve the lives of Canadians. [en]
dc.identifier.uri: http://hdl.handle.net/10012/13603
dc.language.iso: en [en]
dc.pending: false
dc.publisher: University of Waterloo [en]
dc.title: Population-level Indicators of Physical Activity, Sedentary Behaviour and Sleep in Canada based on Twitter [en]
dc.type: Master Thesis [en]
uws-etd.degree: Master of Applied Science [en]
uws-etd.degree.department: Electrical and Computer Engineering [en]
uws-etd.degree.discipline: Electrical and Computer Engineering [en]
uws-etd.degree.grantor: University of Waterloo [en]
uws.contributor.advisor: Crowley, Mark
uws.contributor.advisor: Lee, Joon
uws.contributor.affiliation1: Faculty of Engineering [en]
uws.peerReviewStatus: Unreviewed [en]
uws.published.city: Waterloo [en]
uws.published.country: Canada [en]
uws.published.province: Ontario [en]
uws.scholarLevel: Graduate [en]
uws.typeOfResource: Text [en]

Files

Original bundle
Name: Nguyen_Olivier.pdf
Size: 805.78 KB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 6.08 KB
Format: Item-specific license agreed upon to submission