Recognizing Spontaneous Micro-Expression Using a Three-Stream Convolutional Neural Network

Micro-expression recognition (MER) has attracted considerable attention because of its practical applications, particularly in clinical diagnosis and interrogation. In this paper, we propose a three-stream convolutional neural network (TSCNN) that recognizes micro-expressions (MEs) by learning ME-discriminative features from three key frames of ME videos. The TSCNN comprises a dynamic-temporal stream, a static-spatial stream, and a local-spatial stream, which learn temporal cues, cues from the entire facial region, and cues from local facial regions, respectively; these cues are then integrated to recognize MEs. In addition, so that the TSCNN can recognize MEs without relying on annotated apex frame indices, we design a reliable apex frame detection algorithm. Extensive experiments are conducted on five public ME databases: CASME II, SMIC-HS, SAMM, CAS(ME)², and CASME. The proposed TSCNN achieves promising recognition results that compare favorably with those of many other methods.
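To make the role of apex frame detection concrete, the following is a minimal illustrative sketch and not the detection algorithm proposed in this paper. It substitutes a simple frame-difference heuristic: the apex is taken to be the frame that differs most, in mean absolute pixel intensity, from the first frame. The function names (`mean_abs_diff`, `estimate_apex_index`, `select_key_frames`) are hypothetical, and the sketch assumes that the clip begins at the onset and ends at the offset and that the three key frames are the onset, apex, and offset frames, which the abstract does not state explicitly.

```python
from typing import Sequence, Tuple

# A grayscale frame represented as rows of pixel intensities (assumption for this sketch).
Frame = Sequence[Sequence[float]]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute per-pixel difference between two equally sized frames."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    if count == 0:
        raise ValueError("frames must contain at least one pixel")
    return total / count


def estimate_apex_index(frames: Sequence[Frame]) -> int:
    """Index of the frame that differs most from the first frame.

    Heuristic stand-in only: assumes frames[0] is the onset frame and that
    facial motion peaks where the difference from the onset is largest.
    Ties are resolved in favor of the earliest frame.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    onset = frames[0]
    diffs = [mean_abs_diff(onset, frame) for frame in frames]
    return max(range(len(frames)), key=diffs.__getitem__)


def select_key_frames(frames: Sequence[Frame]) -> Tuple[int, int, int]:
    """Return (onset, apex, offset) indices, assuming the clip spans onset to offset."""
    return 0, estimate_apex_index(frames), len(frames) - 1


if __name__ == "__main__":
    # Tiny synthetic clip: intensity rises, peaks at index 2, then falls back.
    clip = [
        [[0, 0], [0, 0]],
        [[1, 1], [1, 1]],
        [[3, 3], [3, 3]],
        [[1, 0], [0, 0]],
    ]
    print(select_key_frames(clip))
```

In a complete pipeline, the three selected frames would then be passed to the respective streams of the network. A practical detector would also need face alignment and robustness to head motion and illumination changes, which this heuristic ignores.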