The increase in the amount of data collected in the transport domain can greatly benefit mobility studies and create high value-added mobility information for passengers, data analysts, and transport operators.… Click to show full abstract
The increase in the amount of data collected in the transport domain can greatly benefit mobility studies and create high value-added mobility information for passengers, data analysts, and transport operators. This work concerns the detection of the impact of disturbances on a transport network. It aims, from smart card data analysis, to finely quantify the impacts of known disturbances on the transportation network usage and to reveal unexplained statistical anomalies that may be related to unknown disturbances. The mobility data studied take the form of a multivariate time series evolving in a dynamic environment with additional contextual attributes. The research mainly focuses on contextual anomaly detection using machine learning models. Our main goal is to build a robust anomaly score to highlight statistical anomalies (contextual extremums), considering the variability within the time series induced by the dynamic context. The robust anomaly score is built from normalized forecasting residuals. The normalization of the residuals is carried out using the estimated contextual variance. Indeed, there are complex dynamics on both the mean and the variance in the ridership time series induced by the flexible transportation schedule, the variability in transport demand, and contextual factors such as the station location and the calendar information. Therefore, they should be considered by the anomaly detection approach to obtain a reliable anomaly score. We investigate several prediction models (including an LSTM encoder–decoder of the recurrent neural network deep learning family) and several variance estimators obtained through dedicated models or extracted from prediction models. The proposed approaches are evaluated on synthetic data and real data from the smart card riderships of the Quebec Metro network. It includes a basis of events and disturbances that have impacted the transport network. The experiments show the relevance of variance normalization on prediction residuals to build a robust anomaly score under a dynamic context.
               
Click one of the above tabs to view related content.