Abstract
This study presents attempts to impute travel modes from data collected via smartphones in Hanoi (Vietnam), where the dominant mode of travel is the motorcycle. Because of the inclusion of the motorcycle mode and an imbalance in the modal share of the Hanoi data, supervised-learning models were ineffective at detecting all modes simultaneously. To achieve a high level of accuracy and reasonable interpretability, a hierarchical process was developed. First, walk, bicycle, and motorized modes were identified by a fuzzy logic-based algorithm. Next, based on the distribution of bus stops and the way buses operate in practice, rules using the average distance between the bus stops that a vehicle passed slowly or stopped at were introduced to detect bus segments. Finally, a random forest model was built to distinguish motorcycle from car. The proposed hierarchical process achieved an accuracy of 89.1%. The bus detection, which required only the coordinates of the bus stops, achieved a recall of 87.2%. The motorcycle mode was the main source of misclassification. Including this mode adds diversity to the mode detection field, which has previously focused only on walking, bicycles, cars, buses/trams, and trains. The hierarchy was developed and validated using a dataset that did not include travel by metro or train and that is likely to be biased toward persons working and studying at a university. These limitations emphasize the need to test the process on a more diverse sample with more travel options.
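
The three-stage hierarchy can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 30 m matching radius, the 700 m average-gap threshold, the minimum of three matched stops, and the direction of the comparison are all hypothetical choices made for illustration. The coarse label from the fuzzy-logic stage is assumed to be given, and the motorcycle/car stage is represented by any callable (for example, a wrapper around a trained random forest).

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_m(a: Point, b: Point) -> float:
    """Great-circle distance in metres between two (lat, lon) points."""
    r = 6371000.0  # mean Earth radius in metres
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(h))


def matched_stops(slow_points: Sequence[Point], bus_stops: Sequence[Point],
                  radius_m: float = 30.0) -> List[Point]:
    """Bus stops, in visiting order, near which the vehicle slowed or stopped.

    radius_m is a hypothetical matching radius, not a value from the study.
    """
    visited: List[Point] = []
    if not bus_stops:
        return visited
    for p in slow_points:
        nearest = min(bus_stops, key=lambda s: haversine_m(p, s))
        close_enough = haversine_m(p, nearest) <= radius_m
        if close_enough and (not visited or visited[-1] != nearest):
            visited.append(nearest)
    return visited


def looks_like_bus(slow_points: Sequence[Point], bus_stops: Sequence[Point],
                   max_avg_gap_m: float = 700.0, min_stops: int = 3) -> bool:
    """Assumed rule: enough matched stops, with a small average gap between them."""
    visited = matched_stops(slow_points, bus_stops)
    if len(visited) < min_stops:
        return False
    gaps = [haversine_m(a, b) for a, b in zip(visited, visited[1:])]
    return sum(gaps) / len(gaps) <= max_avg_gap_m


def classify_segment(coarse_mode: str, slow_points: Sequence[Point],
                     bus_stops: Sequence[Point], features: Sequence[float],
                     motor_classifier: Callable[[Sequence[float]], str]) -> str:
    """Apply the hierarchy: coarse label, then bus rule, then motorcycle/car."""
    if coarse_mode in ("walk", "bicycle"):
        return coarse_mode          # stage 1 output is final for these modes
    if looks_like_bus(slow_points, bus_stops):
        return "bus"                # stage 2: rule-based bus detection
    return motor_classifier(features)  # stage 3: e.g. a random forest
```

A design point worth noting is that the bus rule in this sketch needs only stop coordinates, which matches the abstract's statement that bus detection required only the coordinates of the bus stops; route or timetable data are not used.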