dc.contributor.author: Heydari, Shahram
dc.description.abstract: In transportation safety studies, it is often necessary to account for unobserved heterogeneity and multimodality in data. The commonly used standard generalized linear models (e.g., Poisson-gamma models) do not fully address unobserved heterogeneity, assuming unimodal exponential families of distributions. This thesis illustrates how restrictive assumptions (e.g., unimodality) common to most road safety studies can be relaxed by employing Bayesian nonparametric Dirichlet process mixture models. We use a truncated Dirichlet process, so that our models reduce to the form of finite mixture (latent class) models, which can be estimated using standard Markov chain Monte Carlo methods, emphasizing computational simplicity. Interestingly, our approach estimates the number of latent subpopulations as part of its analysis algorithm using an elegant mathematical framework. We use pseudo Bayes factors for model selection, showing how the predictive capability of models can be affected by different assumptions. In univariate settings, we extend standard generalized linear models to a Dirichlet process mixture generalized linear model in which the random intercepts density is modeled nonparametrically, thereby adding flexibility to the model. We examine the performance of the proposed approach using both simulated and real data. We also examine the performance of the proposed model in terms of replicating datasets with high proportions of zero crashes. In terms of engineering insights, we provide a policy example related to the identification of high-crash locations, a critical component of the transportation safety management process. With respect to multilevel settings, this thesis introduces a flexible latent class multilevel model for analyzing crash data that are hierarchical in nature. We extend the standard multilevel model by accounting for unobserved cross-group heterogeneity through multimodal intercepts (group effects).
The proposed method allows identifying latent subpopulations (and consequently outliers) at the highest level of the hierarchy (e.g., geographic areas). We evaluate our method on two recent railway grade crossing crash datasets from Canada. This research confirms the need for a multilevel approach for both datasets due to the presence of spatial dependencies among crossings nested within the same region. We provide a novel approach to benchmark different regions based on their safety performance measures. To this end, we identify latent clusters among different regions that share similar unidentified features, stimulating further investigations to explore reasons behind such similarities and dissimilarities. This could have important policy implications for various safety management programs. This thesis also investigates inference for multivariate crash data by introducing two flexible Bayesian multivariate models: a multivariate mixture of points and a mixture of multivariate normal densities. We use a Dirichlet process mixture to keep the dependence structure unconstrained, relaxing the usual homogeneity assumptions. We allow for interdependence between outcomes through a Dirichlet process prior on the random intercepts density. The resulting models collapse into a form of latent class multivariate model, an appealing way to address unobserved heterogeneity in multivariate settings. Therefore, the multivariate models that we derive in this thesis account for correlation among crash types through a heterogeneous correlation structure, which better captures the complex structure of correlated data. To our knowledge, this is the first study to propose and apply such a model in the transportation literature. 
Using a highway injury-severity dataset, we illustrate how the robustness to homogeneous correlation structures can be examined using a multivariate mixture of points model that relaxes the homogeneity assumption with respect to the location of the dependence structure. We then use the mixture of multivariate normal densities model, which relaxes the homogeneity assumption with respect to both the location and the covariance matrix, to investigate the effects of various factors on pedestrian and cyclist safety in an urban setting, modeling both outcomes simultaneously. To our knowledge, this is the first study to conduct a joint safety analysis of active modes at an intersection level, a micro-level, which is expected to provide more detailed insights. We show how spurious assumptions affect the predictive performance of the multivariate model and the interpretation of the explanatory variables using marginal effects. The results show that our flexible model specification better captures the underlying structure of pedestrian/cyclist crash data, resulting in a more accurate model that contributes to a better understanding of safety correlates of non-motorist road users. This in turn helps decision-makers select more appropriate countermeasures targeting vulnerable road users, promoting the mobility and safety of active modes of transportation. [en]
dc.publisher: University of Waterloo [en]
dc.subject: Transportation safety [en]
dc.subject: Bayesian nonparametrics [en]
dc.subject: Dirichlet process [en]
dc.subject: Latent class models [en]
dc.subject: Multilevel settings [en]
dc.subject: Multivariate settings [en]
dc.subject: Railway grade crossing safety [en]
dc.subject: Pedestrian/cyclist safety [en]
dc.title: Bayesian Nonparametric Dirichlet Process Mixture Modeling in Transportation Safety Studies [en]
dc.type: Doctoral Thesis [en]
dc.pending: false
and Environmental Engineering [en]
Engineering [en]
of Waterloo [en]
uws-etd.degree: Doctor of Philosophy [en]
uws.contributor.advisor: Fu, Liping
uws.contributor.affiliation1: Faculty of Engineering [en]
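
The abstract states that a truncated Dirichlet process reduces the model to a finite mixture (latent class) form. The record does not say how the truncation is constructed, so the following is only a minimal sketch assuming the standard stick-breaking construction. The function name, the concentration parameter `alpha`, and the truncation level `K` are hypothetical choices for illustration, not values taken from the thesis.

```python
import random


def truncated_stick_breaking(alpha, K, rng):
    """Draw K mixture weights from a stick-breaking prior truncated at K.

    Assumption: standard stick-breaking with Beta(1, alpha) breaks; the
    thesis record does not specify its construction.
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to a component
    for k in range(K):
        # The final break takes the whole remainder so the weights sum to one.
        v = 1.0 if k == K - 1 else rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights


# Hypothetical settings: alpha = 1.0 and at most K = 10 latent classes.
rng = random.Random(0)
weights = truncated_stick_breaking(alpha=1.0, K=10, rng=rng)
print([round(w, 3) for w in weights])
```

Under this construction, components that receive negligible weight are effectively unused, which is one way a truncated model can let the data suggest the number of latent subpopulations.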