Machine Learning Approach and Adolescent Health: Implementation of Machine Learning Algorithms to Explore Adolescent Health, BMI and Weight Perception Using COMPASS Study

Date

2024-09-24

Advisor

Leatherdale, Scott
Chen, Helen

Publisher

University of Waterloo

Abstract

Introduction: Adolescence is a critical period for behaviour formation and for physical, social, and psychological change. Risks from unhealthy behaviours adopted during this stage can emerge in early adulthood and lead to adverse outcomes. The COVID-19 pandemic significantly affected both youth characteristics and the school learning environment. Profiling Canadian youth before and after COVID-19 helps identify the factors that influence student engagement across multiple domains, including school activities, mental health, and healthy eating. Among the many aspects of a student profile, childhood obesity and weight perception are two important topics for the health and well-being of Canadian youth. Childhood obesity is a complex public health concern in Canada: almost 1 in 7 children are considered obese, and 75% of obese children remain obese in adulthood. A youth’s perception of their weight is also often connected to their BMI status. Studies have shown that adolescents who are overweight or obese tend to underestimate their weight, whereas those with normal weight may overestimate it. Overweight, obesity, and misperception of body weight are all risk factors for developing non-communicable diseases in adulthood and can worsen mental health, for example through lower self-esteem. Observing these two aspects and the factors that influence them is therefore necessary to inform interventions that prevent youth obesity. Compared with traditional statistics, machine learning has proven effective in handling complex relationships among multi-domain variables and in detecting nonlinear relationships between predictors and the target outcome. This thesis explores behavioural patterns, specifically BMI and weight perception, among Canadian youth participating in the COMPASS study.
Objective and Methods: The dissertation used Wave 7 (2018-19) and Wave 9 (2020-21) of the COMPASS study, along with linked data from Waves 9 to 11 (2020-23). COMPASS is an ongoing survey-based cohort study of grade 9-12 students attending secondary schools in Alberta, British Columbia, Ontario, and Quebec, Canada. The survey questions cover multiple aspects of student health behaviour, including eating, sedentary behaviour, alcohol, tobacco, and marijuana use, bullying, academic performance, physical activity, BMI, and school connectedness. Study 1 applied k-means clustering to establish student behaviour profiles, followed by a Random Forest (RF) model to identify the factors associated with those profiles before and after COVID-19. Among the student-level behaviour factors, Study 2 focused on self-reported BMI: it predicted youth BMI status and identified associated factors using six supervised machine learning classifiers (K-Nearest Neighbour, Logistic Regression, Support Vector Machine, Random Forest, Multinomial Naïve Bayes, and Extreme Gradient Boosting (XGBoost)), comparing Complete Case Analysis (CCA) with multiple imputation (MI). Because weight perception was identified as the most important factor in BMI prediction in Study 2, Study 3 used weight perception as the outcome variable. It explored the transition patterns of student weight perception and the associated factors using a Markov chain, Multinomial Logistic Regression, and time-series deep learning models, including Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM). Shapley Additive Explanations (SHAP) analysis was applied in all three studies to support model interpretability.
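As an illustration of the clustering step named for Study 1, the following is a minimal sketch of k-means (Lloyd's algorithm) using only the Python standard library. It is not the thesis code: the two features (classes skipped, days physically active), the toy rows, and the function names `sq_dist` and `kmeans` are all assumptions made up for the example, and the Random Forest step that follows clustering is not shown.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k rows of the data
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical toy rows: (classes skipped, days physically active per week).
# These values are invented for illustration and are not COMPASS data.
profiles = [(0, 6), (1, 5), (0, 7), (1, 6), (20, 1), (22, 0), (19, 2), (21, 1)]
labels, centroids = kmeans(profiles, k=2)
print(labels)
```

In a real analysis the features would typically be scaled first and the number of clusters chosen with a diagnostic such as the elbow or silhouette method; the abstract does not say which, if any, was used.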
Results: Study 1 found that, in both the pre-COVID (2018-19) and post-COVID (2020-21) cohorts, some clusters were distinguished by close involvement with substance use and lower psychological well-being, reflected in students' personal relationships with their families or an unwillingness to discuss their mental health with an adult at school. However, these clusters were small relative to the others, comprising only 0.5% of the pre-COVID cohort and 0.2% of the post-COVID cohort. In sub-clusters of the pre-COVID cohort that excluded substance use factors, students who skipped more than 20 classes, did not complete homework, and were involved in bullying were more likely to be grouped into one sub-cluster than students without these behaviours. Additionally, those who intended to lose weight, perceived themselves as overweight, and were less active had higher odds of being in one sub-cluster than those who had no weight change intention and perceived their weight as about right. In the sub-clusters of the post-COVID cohort, students with poorer mental well-being, higher anxiety scores, self-reported overweight or obese BMI, and those not meeting the guideline of at least 60 minutes of physical activity per day were more likely to be grouped into one sub-cluster than students with the opposite behaviours. Moreover, students with increased cannabis use, less physical activity, lower flourishing scores, and higher anxiety scores were more likely to be in one group than students who reported no change in cannabis use. Study 2 found that, under both the CCA and MI approaches, XGBoost achieved the best performance in predicting BMI, ahead of Support Vector Machine (SVM) and Random Forest (RF). Under CCA, it achieved an overall accuracy of 0.64 and a ROC-AUC of 0.78. Performance under MI was similar, with an overall accuracy of 0.64 and a ROC-AUC of 0.79.
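To make the two metrics reported for Study 2 concrete, here is a minimal sketch of how overall accuracy and a multi-class ROC-AUC can be computed in plain Python. The abstract does not say how the ROC-AUC was averaged across BMI classes, so the macro-averaged one-vs-rest scheme below is an assumption; the class labels, true labels, and predicted probabilities are invented toy values, not study data.

```python
def accuracy(y_true, y_pred):
    """Share of rows whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_auc(is_positive, scores):
    """AUC as the chance a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, flag in zip(scores, is_positive) if flag]
    neg = [s for s, flag in zip(scores, is_positive) if not flag]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true, proba, classes):
    """Unweighted mean of one-vs-rest AUCs; proba[i][j] scores classes[j] for row i."""
    aucs = [
        binary_auc([t == c for t in y_true], [row[j] for row in proba])
        for j, c in enumerate(classes)
    ]
    return sum(aucs) / len(aucs)


# Hypothetical class names and toy predictions (not COMPASS data).
classes = ["healthy", "overweight_obese", "unknown"]
y_true = ["healthy", "healthy", "overweight_obese", "unknown", "overweight_obese", "unknown"]
proba = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.2, 0.6, 0.2],
    [0.3, 0.2, 0.5],
    [0.5, 0.3, 0.2],
    [0.1, 0.3, 0.6],
]
# Predicted class = the class with the highest score in each row.
y_pred = [classes[max(range(len(classes)), key=row.__getitem__)] for row in proba]

acc = accuracy(y_true, y_pred)
auc = macro_ovr_auc(y_true, proba, classes)
print(round(acc, 3), round(auc, 3))
```

Accuracy depends only on the top-scoring class per row, while ROC-AUC uses the full ranking of scores, which is one reason the two numbers can differ noticeably for the same model. With scikit-learn, `roc_auc_score` with `multi_class="ovr"` computes the same kind of one-vs-rest average.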
According to the SHAP summary plot, the most important predictors in all three BMI classes were weight perception, gender, and weight change intention, and this result was consistent across CCA and MI. Students who perceived themselves as slightly or very overweight and were trying to lose weight were classified into the overweight and obese class, while predictions of healthy-weight BMI were associated with perceiving one's weight as about right and not trying to change it. Additionally, gender emerged as the most important variable for the unknown BMI class, with the model predicting that females and gender minorities were more likely than male students to be categorized in the unknown class. Study 3 applied the MI approach to the linked dataset. It found that, although there was a moderate probability of transitioning from an underweight or overweight perception to about the right weight and a small probability of changing from underweight to overweight, most youths were likely to keep the same weight perception as in the previous year. Study 3 also identified predictors associated with weight perception transitions over time. Students with underweight BMI, an intention to gain weight, and disagreement with discussing their problems with their families in Year 1 had a higher likelihood of perceiving themselves as underweight in Year 2, while students with overweight and obese or unknown BMI who intended to lose weight and had decreased their time on social media after COVID-19 in Year 1 were more likely to have an overweight perception in Year 2. Compared to peers who continued to perceive themselves as about the right weight in Year 2, students with underweight BMI had higher odds of retaining an underweight perception.
In contrast, students with overweight BMI who intended to lose weight were likely to retain an overweight perception or to transition from about the right weight to an overweight perception. Students who wanted to gain weight were more likely to remain in the underweight perception or to change from underweight to about the right weight. Notably, students who participated more in muscle-strengthening exercises had a lower likelihood of retaining an overweight perception. Both deep learning models showed that the most important Year 1 and Year 2 variables for predicting third-year weight perception were weight change intention, BMI, skipping breakfast to lose weight, and days of muscle training. An intention to lose weight, a higher BMI status (overweight and obese, or unknown), and more days of skipping breakfast to lose weight moved predictions towards an overweight perception. In contrast, students in the about-the-right-weight perception group tended to intend to stay the same weight or make no weight change, had healthy BMI, skipped breakfast on fewer days, and did muscle training on more days. These results were consistent with the transition-associated predictors from the multinomial logistic regression.
Conclusion: As the first study to focus on Canadian youth behaviour profiles and the prediction of BMI and weight perception using multiple machine learning techniques, this dissertation offers key messages to stakeholders who want to understand student behaviour profiles and focus on preventive interventions for youth obesity. The research emphasized the importance of the school’s environment for youth’s behaviour and the tri-directional relationships among BMI, weight change intention, and weight perception, and suggests that future analyses emphasize these variables when developing strategies for healthy behaviour. Overall, it illustrated the need for school-based educational programs on BMI and weight perception that raise awareness of self-esteem and body image acceptance.
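The year-to-year weight perception transitions described for Study 3 are the kind of quantity a first-order Markov chain estimates. The sketch below shows one common way to do this: count observed (this year, next year) pairs and normalise each row. The state names, the function `transition_matrix`, and every count are assumptions invented for illustration; none of the numbers are results from the thesis.

```python
from collections import Counter

# Hypothetical state labels for weight perception.
STATES = ["underweight", "about right", "overweight"]


def transition_matrix(pairs, states=STATES):
    """Row-normalised counts: P[a][b] estimates Pr(next year = b | this year = a)."""
    counts = Counter(pairs)
    matrix = {}
    for a in states:
        row_total = sum(counts[(a, b)] for b in states)
        matrix[a] = {
            b: (counts[(a, b)] / row_total if row_total else 0.0) for b in states
        }
    return matrix


# Made-up (Year 1, Year 2) observations, one pair per student.
pairs = (
    [("underweight", "underweight")] * 6
    + [("underweight", "about right")] * 3
    + [("underweight", "overweight")] * 1
    + [("about right", "about right")] * 8
    + [("about right", "underweight")] * 1
    + [("about right", "overweight")] * 1
    + [("overweight", "overweight")] * 7
    + [("overweight", "about right")] * 3
)

P = transition_matrix(pairs)
for state in STATES:
    print(state, P[state])
```

Large diagonal entries in such a matrix correspond to students keeping the same perception from one year to the next, and off-diagonal entries to transitions between perceptions.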

Keywords

adolescent, weight perception, BMI, machine learning
