In recent years, object recognition for persistent marine surveillance has attracted much attention from researchers. Surveillance videos are captured using optical sensors in shallow-to-deep water where marine vehicles operate, e.g., ships, submarines, and unmanned underwater vehicles (UUVs). Access to high-quality video for maritime control and marine transportation systems (MTS), particularly underwater optical video, remains a critical need, not least for the underwater tourism industry. Because underwater optical imaging suffers from serious problems such as light scattering and absorption in water, low contrast, and fog-like haziness, detecting underwater objects is a difficult task. Moreover, a reliable and safe MTS capable of detecting and recognizing underwater objects is required to reduce the cost of vessel movements. The current study addresses these challenges and evaluates a real dataset of many low-quality video frames of various aquatic species. Intelligent processing of the extracted visual features is performed with a modified convolutional neural network (CNN) architecture for underwater IoT platforms supporting MTS. The decision-making stage of the proposed method is based on a modified residual neural network (mResNet) that recognizes trained underwater objects in video frames. The mResNet employs dilated convolutions to extract as much information as possible from the frames. Features extracted by the ResNet structure are then processed with traditional learning techniques, including feature selection and classification, to identify undersea objects. In addition, an optimization process is employed to find the best kernel parameters of the support vector machine (SVM) classifier used in the proposed structure. Our experimental results indicate that the proposed model achieves a classification accuracy of 93.51% on real videos.
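To make the described pipeline concrete, the sketch below illustrates its three stages: a ResNet-style block with dilated convolutions for feature extraction, followed by feature selection and an SVM whose kernel parameters are tuned by search. This is a minimal illustration only; the block name, layer sizes, feature-selection method, and grid-search tuner are assumptions, since the abstract does not specify the exact mResNet architecture or the optimization process used for the SVM.

    # Hypothetical sketch of the abstract's pipeline (PyTorch + scikit-learn).
    # Names and hyperparameters are illustrative, not the paper's actual values.
    import torch
    import torch.nn as nn
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline
    from sklearn.svm import SVC

    class DilatedResidualBlock(nn.Module):
        """Residual block with dilated 3x3 convolutions: enlarges the
        receptive field without reducing spatial resolution (a stand-in
        for the paper's mResNet block, whose details are not given)."""
        def __init__(self, channels, dilation=2):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, kernel_size=3,
                                   padding=dilation, dilation=dilation, bias=False)
            self.bn1 = nn.BatchNorm2d(channels)
            self.conv2 = nn.Conv2d(channels, channels, kernel_size=3,
                                   padding=dilation, dilation=dilation, bias=False)
            self.bn2 = nn.BatchNorm2d(channels)
            self.relu = nn.ReLU(inplace=True)

        def forward(self, x):
            out = self.relu(self.bn1(self.conv1(x)))
            out = self.bn2(self.conv2(out))
            return self.relu(out + x)  # identity shortcut, as in ResNet

    def extract_features(frames, backbone):
        """Run video frames through the CNN backbone and pool the output
        into per-frame feature vectors for the classical classifier."""
        with torch.no_grad():
            feats = backbone(frames)                      # (N, C, H, W)
            return feats.mean(dim=(2, 3)).cpu().numpy()   # global average pool

    def build_classifier():
        """Traditional learning stage: feature selection, then an SVM with
        kernel parameters tuned via grid search (the paper's optimization
        method is unspecified; grid search is one common choice)."""
        pipe = Pipeline([
            ("select", SelectKBest(f_classif, k=128)),
            ("svm", SVC()),
        ])
        param_grid = {"svm__C": [0.1, 1, 10, 100],
                      "svm__gamma": [1e-3, 1e-2, 1e-1],
                      "svm__kernel": ["rbf"]}
        return GridSearchCV(pipe, param_grid, cv=5)

In use, CNN features from labeled frames would be passed to build_classifier().fit(X, y), after which the fitted search object exposes the best kernel parameters and the tuned classifier for recognizing objects in new frames.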
               