Facial landmarks are crucial information for numerous facial analysis applications and can help to resolve difficult computer vision problems. Landmark localization, which involves facial keypoints such as eye centers, eyebrows, and the nose center, provides the information needed to analyze expressions, emotions, health conditions, and similar facial attributes. For applications constrained by model size and computational load, models are often scaled up to gain accuracy and efficiency. In this paper, we propose a deep learning-based approach to facial landmark localization with compound dimension scaling. We modify the EfficientNet baseline network with multi-scale fully connected layers to predict facial landmarks, which are mapped onto the detected face in real time. The compound scaling method yields a scalable model by uniformly scaling the width, depth, and resolution dimensions. Both larger and smaller models are evaluated with an adaptive wing loss function. We also assess the robustness of the model under various head poses and occlusion conditions. Trained on a large dataset, the proposed approach achieves 90% accuracy for a larger model with a size of 24.6 MB, and approximately 88% to 89% accuracy for smaller models. Hence, the smaller models, which have fewer parameters, still achieve acceptable accuracy compared to the larger model.
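To make the compound scaling idea concrete, the following is a minimal sketch of how a single coefficient can scale depth, width, and resolution together. The abstract does not state the coefficients used in this work, so the values below are those reported in the original EfficientNet paper, and the function names `compound_scale` and `flops_multiplier` as well as the base resolution of 224 are illustrative assumptions.

```python
# Base coefficients from the original EfficientNet paper (assumed here;
# the abstract does not say which values the proposed model uses).
ALPHA = 1.2   # depth multiplier base
BETA = 1.1    # width multiplier base
GAMMA = 1.15  # resolution multiplier base


def compound_scale(phi, base_depth=1.0, base_width=1.0, base_resolution=224):
    """Scale depth, width and input resolution with one coefficient phi.

    phi = 0 returns the baseline; larger phi gives a larger model.
    """
    depth = base_depth * ALPHA ** phi
    width = base_width * BETA ** phi
    resolution = int(round(base_resolution * GAMMA ** phi))
    return depth, width, resolution


def flops_multiplier(phi):
    """Approximate growth in FLOPs relative to the baseline.

    FLOPs grow roughly linearly with depth and quadratically with
    width and resolution, hence alpha * beta^2 * gamma^2 per step.
    """
    return (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi


if __name__ == "__main__":
    for phi in range(4):
        d, w, r = compound_scale(phi)
        print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, "
              f"resolution {r}, FLOPs x{flops_multiplier(phi):.2f}")
```

With these coefficients each increment of `phi` roughly doubles the computational cost, which is how a family of smaller and larger models can be derived from one baseline.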
               