Physiological studies have shown that healthy and depressed individuals exhibit different facial changes. Many researchers have therefore used Convolutional Neural Networks (CNNs) to extract high-level facial dynamic representations for predicting depression severity. However, the max-pooling (or average-pooling) layers in a CNN discard subtle depression cues. Without pooling layers, on the other hand, the CNN cannot extract multi-scale information and has difficulty vectorizing tensors. To address this, we propose a Selective Element and Two Orders Vectorization (SE-TOV) network. In the SE-TOV network, an SE block adaptively selects the effective elements from the tensors produced by receptive fields of different sizes. Moreover, we propose a TOV block for vectorizing a high-dimensional tensor. On the one hand, the TOV block feeds the tensor into a Global Average Pooling layer to obtain the first-order vectorization result. On the other hand, it takes the principal components of the correlation matrix of the tensor's channels as the second-order vectorization result. Experimental results on AVEC 2013 (RMSE
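The two-order vectorization described for the TOV block can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `tov_vectorize`, the parameter `num_components`, the (C, H, W) input layout, and the choice to concatenate the leading eigenvectors of the channel correlation matrix are all assumptions made for illustration, since the abstract does not say how the principal components are arranged into a vector.

```python
import numpy as np


def tov_vectorize(feature_map, num_components=2):
    """Hypothetical two-order vectorization of a (C, H, W) feature tensor.

    Assumption: the second-order part is formed by concatenating the
    leading eigenvectors of the channel correlation matrix.
    """
    c, h, w = feature_map.shape
    flat = feature_map.reshape(c, h * w)  # one row per channel

    # First order: global average pooling, one mean per channel.
    first_order = flat.mean(axis=1)

    # Second order: correlation matrix between channels.
    centered = flat - flat.mean(axis=1, keepdims=True)
    std = centered.std(axis=1, keepdims=True)
    normalized = centered / (std + 1e-8)  # epsilon guards constant channels
    corr = normalized @ normalized.T / (h * w)  # shape (C, C)

    # eigh returns eigenvalues in ascending order, so reverse the columns
    # to put the eigenvectors with the largest eigenvalues first.
    _, eigvecs = np.linalg.eigh(corr)
    top = eigvecs[:, ::-1][:, :num_components]  # shape (C, k)

    # Eigenvectors are defined up to sign; make the largest-magnitude
    # entry of each one positive so the output is deterministic.
    idx = np.abs(top).argmax(axis=0)
    signs = np.sign(top[idx, np.arange(top.shape[1])])
    signs[signs == 0] = 1.0
    top = top * signs

    second_order = top.T.reshape(-1)  # k eigenvectors laid end to end
    return np.concatenate([first_order, second_order])
```

For a tensor with C channels, this sketch returns a vector of length C + C * `num_components`: the first C entries are the per-channel means, and the remainder are the unit-norm leading eigenvectors of the channel correlation matrix.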