Video-Based Facial Expression Recognition using a Blend of 3D CNN and ConvLSTM

Saurav, S and Kumar, T and Saini, R and Singh, S (2020) Video-Based Facial Expression Recognition using a Blend of 3D CNN and ConvLSTM. In: 17th IEEE India Council International Conference (INDICON-2020), December 11-13, 2020, NSUT, New Delhi, India.

The 3-Dimensional Convolutional Neural Network (3D CNN) and the Long Short-Term Memory network (LSTM) have consistently outperformed many approaches in video-based Facial Expression Recognition (VFER). The vanilla fully-connected LSTM (FC-LSTM) unrolls each image into a one-dimensional vector, which results in the loss of vital spatial information. Convolutional LSTM (ConvLSTM) overcomes this limitation by performing the LSTM operations as convolutions, without the unrolling that FC-LSTM requires. Motivated by this, in this paper, we propose a neural network architecture that consists of a blend of 3D CNN and ConvLSTM. The proposed hybrid architecture captures spatio-temporal information to produce competitive accuracy on three publicly available FER databases, namely CK+, SAVEE, and AFEW. The experimental results demonstrate excellent performance without using any external emotion data, with the added advantage of a simple model that has comparatively few parameters and a small model size. Our designed FER pipeline is a suitable candidate for automatic recognition of facial expressions in real time on a resource-constrained embedded platform.
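As a rough illustration of why replacing the unrolled FC-LSTM input with convolutions can shrink a model, the sketch below counts the gate parameters of one layer of each kind. It is not taken from the paper: the frame size (64x64 grayscale), hidden size (256), kernel size (3x3), and filter count (32) are assumed values chosen only for illustration, and both formulas assume four gates, one bias per gate output, and no peephole connections.

```python
def fc_lstm_params(input_dim: int, hidden_dim: int) -> int:
    """Gate parameters of one fully-connected LSTM layer.

    Each of the 4 gates has a weight matrix over the concatenated
    [input, previous hidden state] plus one bias per hidden unit.
    """
    return 4 * hidden_dim * (input_dim + hidden_dim + 1)


def convlstm_params(kernel: int, in_channels: int, filters: int) -> int:
    """Gate parameters of one ConvLSTM layer (no peephole terms).

    Each of the 4 gates convolves the concatenated
    [input, previous hidden state] channels with a kernel x kernel
    filter bank and adds one bias per output filter.
    """
    return 4 * filters * (kernel * kernel * (in_channels + filters) + 1)


if __name__ == "__main__":
    # Assumed sizes, for illustration only (not the paper's configuration).
    height, width, channels = 64, 64, 1

    # FC-LSTM sees the frame unrolled into a vector of H * W * C values.
    flat = height * width * channels
    print("FC-LSTM :", fc_lstm_params(flat, hidden_dim=256))

    # ConvLSTM keeps the H x W layout; its parameter count depends only
    # on the kernel size and channel counts, not on the frame size.
    print("ConvLSTM:", convlstm_params(kernel=3, in_channels=channels, filters=32))
```

Under these assumptions the ConvLSTM count is independent of the frame resolution, whereas the FC-LSTM count grows with the number of pixels. The actual savings in the proposed architecture depend on its layer configuration, which this record does not give.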

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: Video facial expression recognition (VFER); 3D convolutional neural networks (3D CNN); Long short-term memory (LSTM); Convolutional LSTM (ConvLSTM).
Subjects: Electronic Systems > Digital Systems
Divisions: Electronic Systems
Depositing User: Mr. Jitendra Nath Bajpai
Date Deposited: 14 Sep 2021 07:14
Last Modified: 14 Sep 2021 07:14