In this work, a novel hybrid neural network with temporal attention (HNNTA) is proposed for inertial pedestrian localization. The HNNTA model uses a convolutional neural network (CNN) to extract sectional features from the inertial measurement unit (IMU) data, followed by a long short-term memory (LSTM) network to capture the global temporal information. A temporal attention mechanism is designed to weigh the hidden states produced by the LSTM network and generate the final features for velocity prediction. Specifically, the proposed temporal attention mechanism is composed of a CNN feature refinement module and a sigmoid score normalization function. We use different 1-D filters to refine the temporal hidden states from previous refined time indexes and form the value matrix, with each row containing different features along the entire window of time indexes and each column representing individual features from the same time span. We then apply the sigmoid function to normalize the dot-product alignment between the features from different time spans and those of the last refined time index. We evaluate the HNNTA model on the RoNIN dataset, which contains the largest and most natural IMU measurements, and conduct extensive ablation experiments to show the effectiveness of the HNNTA model design. Compared with the state-of-the-art method, the HNNTA model provides 10.39% higher 50th percentile accuracy for phone carriers that were seen in the training set and 8.69% higher for those that were not. Real-world experiments with IMU measurements collected on the CUHK campus further demonstrate the better generalization capability of the HNNTA model.
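The temporal attention step described above can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the function names (`refine`, `temporal_attention`), the single shared causal filter, and the layout of one row per time index are all assumptions made for illustration, and the CNN and LSTM stages that would produce the hidden states are omitted. It only shows the general idea of refining hidden states with a 1-D filter, scoring each time index by a dot product against the last refined index, and normalizing the scores with a sigmoid rather than a softmax.

```python
import math


def sigmoid(x):
    """Logistic function used to normalize the alignment scores."""
    return 1.0 / (1.0 + math.exp(-x))


def refine(hidden, kernel):
    """Causal 1-D convolution along the time axis, per feature channel.

    hidden: list of T rows, each a list of D floats (stand-in for LSTM
            hidden states; hypothetical input for this sketch).
    kernel: list of K filter taps; kernel[-1] weights the current step.
    Steps before the start of the window are treated as zeros.
    """
    num_steps, num_feats, taps = len(hidden), len(hidden[0]), len(kernel)
    refined = []
    for t in range(num_steps):
        row = []
        for d in range(num_feats):
            acc = 0.0
            for k in range(taps):
                src = t - (taps - 1 - k)  # index of the step this tap reads
                if src >= 0:
                    acc += kernel[k] * hidden[src][d]
            row.append(acc)
        refined.append(row)
    return refined


def temporal_attention(hidden, kernel):
    """Sigmoid-normalized dot-product attention over refined time steps.

    Returns (scores, context): one score per time index, and the
    score-weighted sum of the refined features.
    """
    values = refine(hidden, kernel)   # assumed value matrix, one row per step
    query = values[-1]                # last refined time index
    scores = [
        sigmoid(sum(v * q for v, q in zip(row, query)))
        for row in values
    ]
    context = [
        sum(s * row[d] for s, row in zip(scores, values))
        for d in range(len(query))
    ]
    return scores, context


if __name__ == "__main__":
    # Toy hidden states: 3 time steps, 2 features (made-up numbers).
    toy_hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    toy_scores, toy_context = temporal_attention(toy_hidden, [0.5, 0.5])
    print(toy_scores, toy_context)
```

Because each score passes through a sigmoid independently, the scores lie in (0, 1) but do not have to sum to one, so several time indexes can receive high weight at once. In a full model the resulting context vector would feed a velocity regression layer, which is left out here.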
               