Deep Spatiotemporal Representation of the Face for Automatic Pain Intensity Estimation
Automatic pain intensity assessment is highly valuable in disease diagnosis applications. Motivated by the fact that many diseases and brain disorders can disrupt normal facial expression formation, we aim to develop a computational model for automatic pain intensity assessment from spontaneous and micro facial variations. For this purpose, we propose a 3D deep architecture for dynamic facial video representation. The proposed model is built by stacking several convolutional modules, where each module comprises a 3D convolution kernel with a fixed temporal depth, several parallel 3D convolutional kernels with different temporal depths, and an average pooling layer. Using variable temporal depths allows the model to effectively capture a wide range of spatiotemporal variations in the face. Extensive experiments on the UNBC-McMaster Shoulder Pain Expression Archive database show that the proposed model achieves promising performance compared with the state of the art in automatic pain intensity estimation.
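To make the module structure described above concrete, the following minimal sketch traces tensor shapes and parameter counts through one module: a fixed-depth 3D convolution, then parallel 3D convolutions with different temporal depths, then average pooling. It is an illustration under stated assumptions, not the authors' implementation. The names `Conv3dSpec` and `module_shape`, the 3x3 spatial kernel size, stride 1 with "same" padding, the channel-wise concatenation of the parallel branches, the pooling factor of 2, and all channel counts and temporal depths are hypothetical choices that the abstract does not specify.

```python
from dataclasses import dataclass
from typing import List, Tuple

Shape = Tuple[int, int, int, int]  # (channels, frames, height, width)


@dataclass
class Conv3dSpec:
    """One 3D convolution: `depth` frames by `size` x `size` pixels."""
    out_channels: int
    depth: int     # temporal extent of the kernel, in frames
    size: int = 3  # spatial extent (assumed, not stated in the abstract)

    def params(self, in_channels: int) -> int:
        # Kernel weights plus one bias per output channel.
        weights = in_channels * self.out_channels * self.depth * self.size ** 2
        return weights + self.out_channels


def module_shape(x: Shape, fixed: Conv3dSpec, branches: List[Conv3dSpec],
                 pool: int = 2) -> Tuple[Shape, int]:
    """Trace one module: fixed-depth conv, parallel convs, average pooling.

    Assumes stride 1 with 'same' padding for every convolution, so only the
    channel count changes before pooling, and assumes the branch outputs are
    concatenated along the channel axis.
    """
    c, t, h, w = x
    n_params = fixed.params(c)
    c = fixed.out_channels
    # Every parallel branch reads the output of the fixed-depth convolution.
    n_params += sum(b.params(c) for b in branches)
    c = sum(b.out_channels for b in branches)  # channel-wise concatenation
    # Average pooling has no weights; it shrinks time and space by `pool`.
    return (c, t // pool, h // pool, w // pool), n_params


clip = (3, 16, 112, 112)  # hypothetical RGB clip: 16 frames of 112x112
fixed = Conv3dSpec(out_channels=32, depth=3)
branches = [Conv3dSpec(out_channels=16, depth=d) for d in (1, 3, 5)]
shape, n_params = module_shape(clip, fixed, branches)
print(shape, n_params)  # (48, 8, 56, 56) 44144
```

In this sketch the branch with temporal depth 5 holds five times as many weights as the branch with depth 1, which shows the cost side of covering both short and long facial motions within one module. Odd temporal depths are used so that "same" padding stays symmetric.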