Data-driven Train Delay Prediction
Author
Summary, in English
Advances in technologies for collecting and storing extensive train operation data have made it possible to address train delays from a data-driven perspective, and train delay prediction has become the predominant focus of this research. To develop theoretical and practical knowledge for the continuous advancement of decision support tools, this thesis aims to explore and understand data-driven train delay prediction. The thesis is grounded in the findings of six papers. Paper 1 systematically reviews existing literature on data-driven approaches for predicting train delays, captures commonly adopted technical solutions, and identifies weaknesses in current models. It suggests promising directions for future research in this area while highlighting under-researched prediction issues. To ascertain useful input variables, Papers 2 and 3 employ statistical regression to quantify the relationship between various explanatory variables and train delays. Papers 4 and 5 address the development of robust data-driven train delay prediction models, introducing dynamic multi-output models capable of continuously predicting train arrival delays for multiple downstream stations at arbitrary prediction times. To enhance performance, the studies further introduce error adjustment strategies that continuously correct predictions based on observed train traffic information. To ensure real-world effectiveness, Paper 6 seeks to construct an evaluation framework for a thorough assessment of train delay prediction models.
The main contribution of the thesis is twofold. Firstly, it sheds light on the current practices in data-driven train delay prediction studies, synthesising progress in various aspects of model development and highlighting the limitations of existing modelling techniques. Secondly, the thesis introduces innovative approaches to enhance model performance. For example, it identifies limitations in current evaluation processes and introduces an evaluation framework to address these gaps. Recognising that the current focus on one-step-ahead prediction limits practical application, the thesis also introduces a dynamic multi-output modelling framework that generates predictions for all downstream stations at arbitrary times. Overall, the thesis helps to bring greater transparency to this growing field of research, with the ultimate goal of accelerating the adoption of data-driven approaches in the railway research community.
Department/s
Publishing year
2024-05-08
Language
English
Document type
Dissertation
Publisher
Lund University Faculty of Engineering, Technology and Society, Transport and Roads, Lund, Sweden
Topic
- Transport Systems and Logistics
Keywords
- Machine learning
- Predictive models
- Data-driven
- Railways
- Train Delays
Status
Published
Project
- Utvärdering av ankomstprognoser för tåg (Evaluation of train arrival forecasts)
Research group
- Railway Operation
ISBN/ISSN/Other
- ISBN: 978-91-8039-970-8
- ISBN: 978-91-8039-971-5
Defence date
8 May 2024
Defence time
14:00
Defence place
Lecture Hall V:A, building V, John Ericssons väg 1, Faculty of Engineering LTH, Lund University, Lund.
Opponent
- Ronghui Liu (Prof.)