Video Classification Using Deep Autoencoder Network
We present a deep learning framework for video classification, applicable to face recognition and dynamic texture recognition. We design a Deep Autoencoder Network Template (DANT) whose weights are initialized by layer-wise unsupervised pre-training with Gaussian Restricted Boltzmann Machines. To obtain class-specific networks, the pre-initialized DANT is then fine-tuned separately on the video sequences of each class. Classification is performed by majority voting based on the reconstruction errors of the class-specific networks. Extensive evaluation and comparison with state-of-the-art approaches on the Honda/UCSD, DynTex, and YUPENN databases demonstrate that the proposed method significantly improves the performance of dynamic texture classification.
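
The classification step can be illustrated with a minimal sketch. It assumes that each frame of a test video casts one vote for the class whose network reconstructs it with the smallest squared error, and that the video takes the label with the most votes; the abstract does not specify the voting unit or the error measure, so both are assumptions. The names `classify_video` and `make_projector` are hypothetical, and the linear projections below are toy stand-ins for trained class-specific networks, not the DANT itself.

```python
import numpy as np

def make_projector(basis):
    """Toy stand-in for a trained class-specific autoencoder (assumption):
    reconstructs inputs by projecting them onto the rows of an orthonormal basis."""
    basis = np.asarray(basis, dtype=float)
    return lambda x: x @ basis.T @ basis

def classify_video(frames, reconstructors):
    """frames: array of shape (n_frames, n_features).
    reconstructors: dict mapping class label -> callable returning reconstructions."""
    frames = np.asarray(frames, dtype=float)
    labels = list(reconstructors)
    # errors[i, j] = squared reconstruction error of frame i under class model j
    errors = np.stack(
        [np.sum((frames - reconstructors[c](frames)) ** 2, axis=1) for c in labels],
        axis=1,
    )
    # Each frame votes for the class that reconstructs it best (assumed voting rule)
    votes = np.argmin(errors, axis=1)
    counts = np.bincount(votes, minlength=len(labels))
    return labels[int(np.argmax(counts))], dict(zip(labels, counts.tolist()))

# Two toy class models: "a" keeps the first coordinate, "b" keeps the second
models = {"a": make_projector([[1.0, 0.0]]), "b": make_projector([[0.0, 1.0]])}
video = [[1.0, 0.1], [2.0, -0.1], [0.1, 1.0]]
label, counts = classify_video(video, models)
print(label, counts)
```

In this toy example two of the three frames lie close to the first axis, so the model for class "a" reconstructs them best and wins the vote. With real data, the callables would wrap the forward pass of each fine-tuned class-specific network.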