Paper Information

Speech Emotion Recognition Using Deep Neural Networks

Paper ID: 2442

Salam Ullah

Qayum Ahmad Sahib

Faiz Ullah

Sajad Ullah

Izaz Ul Haq

Ibad Ullah


Abstract

Emotion recognition in speech has long received attention in psychology and cognitive science, and in recent years data science has driven substantial growth in Speech Emotion Recognition (SER), a particularly active and motivating aspect of human-machine interaction in speech communication. SER can be applied in automatic remote call centers, in-car (on-board) systems, e-learning, and gauging students' emotions during lectures. To extract emotions from audio signals, researchers have developed a number of well-known approaches for voice computation and classification. This paper presents a broad overview of SER using deep learning techniques, covering audio signal preprocessing, feature extraction and selection, and finally the accuracy of the chosen classifier. The emotional datasets RAVDESS, CREMA-D, TESS, and SAVEE were concatenated and used to train a one-dimensional Convolutional Neural Network (CNN). On the concatenated datasets, the feature combination of ZCR + energy + entropy of energy + RMS + MFCC achieved an accuracy of 92.62%.
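To make the feature combination concrete, the sketch below computes four of the five listed features (ZCR, energy, entropy of energy, RMS) per analysis frame with plain NumPy; in practice MFCCs would come from an audio library such as librosa, and the frame length, hop size, and sub-block count used here are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def frame_features(signal, frame_len=1024, hop=512, eps=1e-10):
    """Per-frame ZCR, energy, entropy of energy, and RMS for a 1-D
    audio signal (illustrative sketch; MFCCs omitted here)."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        # Zero-crossing rate: fraction of adjacent samples whose sign changes
        zcr = np.mean(np.abs(np.diff(np.sign(frame))) > 0)
        # Short-term energy: mean squared amplitude of the frame
        energy = np.mean(frame ** 2)
        # Entropy of energy: Shannon entropy of normalized sub-block energies
        sub = frame.reshape(8, -1)              # 8 sub-blocks (assumed)
        sub_e = np.sum(sub ** 2, axis=1)
        p = sub_e / (np.sum(sub_e) + eps)
        entropy = -np.sum(p * np.log2(p + eps))
        # Root-mean-square amplitude (square root of the energy)
        rms = np.sqrt(energy)
        feats.append([zcr, energy, entropy, rms])
    return np.array(feats)

# Example on a synthetic 440 Hz tone sampled at 16 kHz
sr = 16000
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 440 * t)
feats = frame_features(tone)
print(feats.shape)  # (30, 4): 30 frames, 4 features each
```

The resulting per-frame feature matrix (with MFCC columns appended) is the kind of input a one-dimensional CNN, as used in the paper, can consume directly.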