This study proposes a lightweight convolutional neural network with an octave-like convolution mixed block, called OLCMNet, for detecting driver distraction under a limited computational budget. The OLCM block uses point-wise convolution (PC) to expand feature maps into two sets of branches. In the low-frequency branches, we perform average pooling, depth-wise convolution (DC), and upsampling to obtain a low-resolution low-frequency feature map, reducing spatial redundancy and connection density. In the high-frequency branches, the expanded feature map at the original resolution is fed to the DC operator, yielding a suitable receptive field for capturing fine details. The concatenated low-frequency and high-frequency features are then encoded sequentially by a squeeze-and-excitation (SE) module and a PC operator, realizing global feature fusion. By introducing another SE module at the last stage, OLCMNet facilitates further exchange of sensitive information between layers. In addition, with an augmented reality head-up display (ARHUD) platform, we create a Lilong Distracted Driving Behavior (LDDB) Dataset through a series of on-road experiments. The dataset contains 14808 videos collected from an infrared camera, covering six driving behaviors of 2468 participants. We manually annotate these videos at five frames per second, obtaining a total of 267378 images. Experiments on an embedded hardware platform indicate that, compared with existing methods, OLCMNet achieves a favorable accuracy-latency trade-off: 89.53% accuracy on the StateFarm Dataset and 95.98% accuracy on the LDDB Dataset at a latency of 32.8 ± 4.6 ms.
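To make the OLCM block's structure concrete, below is a minimal PyTorch sketch of the dataflow the abstract describes: PC expansion, a split into low-frequency (avg-pool, DC, upsample) and high-frequency (full-resolution DC) branches, concatenation, then SE and a projecting PC. The expansion width, 50/50 branch split, 3x3 kernel, stride-2 pooling, nearest-neighbor upsampling, and SE reduction ratio are all assumptions for illustration; the paper's actual hyperparameters are not given in the abstract.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SEModule(nn.Module):
    """Squeeze-and-excitation: global average pooling followed by a
    two-layer channel-gating bottleneck (reduction ratio is assumed)."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc1 = nn.Conv2d(channels, channels // reduction, 1)
        self.fc2 = nn.Conv2d(channels // reduction, channels, 1)

    def forward(self, x):
        s = F.adaptive_avg_pool2d(x, 1)
        s = torch.sigmoid(self.fc2(F.relu(self.fc1(s))))
        return x * s


class OLCMBlock(nn.Module):
    """Octave-like convolution mixed block (sketch, not the official code).

    Point-wise conv expands the input; the expanded map is split into a
    low-frequency branch (avg-pool -> depth-wise conv -> upsample) and a
    high-frequency branch (depth-wise conv at original resolution); the
    concatenated branches pass through SE and a projecting point-wise conv.
    """
    def __init__(self, in_ch, out_ch, expand_ch, low_ratio=0.5, kernel=3):
        super().__init__()
        self.low_ch = int(expand_ch * low_ratio)   # assumed split ratio
        self.high_ch = expand_ch - self.low_ch
        self.expand = nn.Conv2d(in_ch, expand_ch, 1, bias=False)  # PC expansion
        pad = kernel // 2
        # Depth-wise convolutions (groups == channels)
        self.dw_low = nn.Conv2d(self.low_ch, self.low_ch, kernel, padding=pad,
                                groups=self.low_ch, bias=False)
        self.dw_high = nn.Conv2d(self.high_ch, self.high_ch, kernel, padding=pad,
                                 groups=self.high_ch, bias=False)
        self.se = SEModule(expand_ch)
        self.project = nn.Conv2d(expand_ch, out_ch, 1, bias=False)  # PC fusion

    def forward(self, x):
        x = F.relu(self.expand(x))
        low, high = torch.split(x, [self.low_ch, self.high_ch], dim=1)
        # Low-frequency branch: halve resolution, depth-wise conv, upsample back
        low = F.avg_pool2d(low, 2)
        low = self.dw_low(low)
        low = F.interpolate(low, size=high.shape[-2:], mode='nearest')
        # High-frequency branch: depth-wise conv at the original resolution
        high = self.dw_high(high)
        y = torch.cat([low, high], dim=1)
        return self.project(self.se(y))


if __name__ == "__main__":
    block = OLCMBlock(in_ch=16, out_ch=24, expand_ch=64)
    print(block(torch.randn(1, 16, 32, 32)).shape)  # torch.Size([1, 24, 32, 32])
```

The split into a pooled branch and a full-resolution branch is what keeps the block cheap: the expensive spatial filtering on the low-frequency half runs at a quarter of the pixel count, while depth-wise convolutions keep per-channel cost linear in both branches.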