Automatically gathering statistics on passenger distribution and movement in public transportation scenes from online camera video is important for public security, passenger flow control, and guidance. However, accurate crowd counting in carriages or on platforms is difficult because of the narrow space, complex background, severe perspective distortion, and scale variation. This article proposes a new crowd counting method called coordinate deformable net (CDNet). The method comprises four parts: the front end, a coordinate attention module (CAM), a deformable scale-aware module (DSM), and the back end. The front end is a preliminary feature extractor with the structure of VGG-16 but without the fully connected layers. The back end predicts the density map and consists of six stacked dilated convolution layers. The newly added CAM makes the model more robust to complex background noise and helps it focus on the regions of interest that are crucial for generating high-quality density maps. In the new DSM, multicolumn deformable convolutions extract multiscale features effectively and fuse them according to attention weights. The DSM significantly mitigates the scale variation problem, adapting to passengers at different scales and improving the accuracy of head localization. The experimental results demonstrate that CDNet is more accurate than other classical models. In addition, to protect personal privacy, the faces in our Tram dataset are blurred.
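As a rough illustration of two ideas mentioned above, the following sketch shows how stacking dilated convolutions enlarges the receptive field, and how a head count is commonly read off a density map by summing its values. It is not the CDNet implementation. The 3x3 kernel size, the dilation rate of 2, the function names, and the toy density map are all assumptions made for illustration; the abstract states only that the back end has six stacked dilated convolution layers.

```python
from typing import List, Sequence


def receptive_field(kernel_sizes: Sequence[int], dilations: Sequence[int]) -> int:
    """Receptive field along one axis of stacked stride-1 convolutions."""
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        # A dilated kernel spans d * (k - 1) + 1 input positions, so each
        # stride-1 layer widens the receptive field by d * (k - 1).
        rf += d * (k - 1)
    return rf


def count_from_density_map(density: List[List[float]]) -> float:
    """Estimated crowd count, taken as the sum of all density-map cells."""
    return sum(sum(row) for row in density)


# Hypothetical back end: six 3x3 layers with dilation 2 (assumed values,
# not taken from the abstract), compared with six ordinary 3x3 layers.
dilated_rf = receptive_field([3] * 6, [2] * 6)
plain_rf = receptive_field([3] * 6, [1] * 6)

# Toy density map (made up): two heads, each spread over cells summing to 1.
toy_density = [
    [0.0, 0.25, 0.25, 0.0],
    [0.0, 0.25, 0.25, 0.0],
    [0.5, 0.5, 0.0, 0.0],
]
estimated_count = count_from_density_map(toy_density)

print(dilated_rf, plain_rf, estimated_count)
```

Under these assumed settings, the dilated stack covers a wider context than the plain stack with the same number of parameters and no extra downsampling, which is the usual motivation for dilated back ends in density-map regression.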
               