Lessan, Javad; Fu, Liping; Wen, Chao
2020-02-05
2020-02-05
2019-01
https://doi.org/10.1016/j.cie.2018.03.017
http://hdl.handle.net/10012/15617
The final publication is available at Elsevier via https://doi.org/10.1016/j.jedc.2018.11.005. © 2018. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/

We present a Bayesian network (BN)-based train delay prediction model to tackle the complex and interdependent nature of train operations. Three different BN schemes, namely heuristic hill-climbing, primitive linear and hybrid structure, are investigated using real-world train operation data from a high-speed railway line. We first use historical data to rationalize the dependency graph of the developed structures. Each BN structure is then trained using k-fold cross-validation, the gold-standard approach, to avoid over-fitting and to evaluate its performance against the others. Overall, the validation results indicate that a BN-based model can be an efficient tool for capturing the superposition and interaction effects of train delays. However, a well-designed hybrid BN structure, developed from domain knowledge and the judgment of experts and local authorities, can outperform the other models. We present a performance comparison of the predictions obtained from the hybrid BN structure against the real-world benchmark data. The results show that the proposed model can, on average, achieve over 80% accuracy in predictions within a 60-min horizon, with low prediction errors in terms of the mean absolute error (MAE), mean error (ME) and root mean square error (RMSE).

en
Attribution-NonCommercial-NoDerivatives 4.0 International
http://creativecommons.org/licenses/by-nc-nd/4.0/
high-speed rail; train operation; punctuality; Bayesian networks; delay prediction; performance evaluation
A hybrid Bayesian network model for predicting delays in train operations
Article
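
The three error measures named in the abstract (ME, MAE and RMSE) can be illustrated with a minimal sketch. This is not the authors' code. The function name `error_measures`, the sign convention (predicted minus observed) and the sample delay values are all assumptions made for illustration, and the numbers are not results from the paper.

```python
import math


def error_measures(observed, predicted):
    """Return (ME, MAE, RMSE) for paired delay values, e.g. in minutes."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    # Assumed sign convention: error = predicted - observed.
    errors = [p - o for o, p in zip(observed, predicted)]
    n = len(errors)
    me = sum(errors) / n                               # mean error (signed bias)
    mae = sum(abs(e) for e in errors) / n              # mean absolute error
    rmse = math.sqrt(sum(e * e for e in errors) / n)   # root mean square error
    return me, mae, rmse


# Hypothetical delays in minutes, invented for this example only.
observed = [5.0, 0.0, 12.0, 3.0]
predicted = [4.0, 1.0, 10.0, 3.0]
me, mae, rmse = error_measures(observed, predicted)
print(f"ME={me:.3f}  MAE={mae:.3f}  RMSE={rmse:.3f}")
```

ME keeps the sign of each error, so it shows systematic over- or under-prediction. MAE and RMSE measure error magnitude, with RMSE weighting large misses more heavily.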