Enhanced Speech Emotion Recognition Using the Cognitive Emotion Fusion Network for PTSD Detection with a Novel Hybrid Approach

Chappidi Suneetha et al.

doi:10.52783/jes.644


Published: Jan 25, 2024


Keywords:

Speech Emotion Recognition, Cognitive Emotion Fusion Network, PTSD Detection, Hybrid Neural Networks, Emotional State Analysis.

Chappidi Suneetha, Raju Anitha

Abstract

Speech Emotion Recognition (SER) is important for understanding and addressing mental health issues, yet conventional models often falter in interpreting complex emotional states, particularly those associated with conditions such as PTSD. This study introduces the Cognitive Emotion Fusion Network (CEFNet), a novel hybrid SER model integrating Improved and Faster Region-based Convolutional Neural Networks (IFR-CNN), Deep Convolutional Neural Networks (DCNNs), Deep Belief Networks (DBNs), and the Bird's Nest Learning Analogy (BNLA). Designed to overcome the limitations of traditional models, CEFNet focuses on accurately interpreting nuanced emotional expressions through advanced machine learning techniques and comprehensive feature extraction. Evaluated on the EMODB and RAVDESS datasets, CEFNet achieved accuracies of 98.11% and 91.17%, respectively, and outperformed existing models in precision and F1 score. This research contributes to SER, particularly in mental health applications, by offering a robust framework for emotion recognition in speech. It also opens avenues for future work, including broader applicability across languages and cultural contexts, optimization for resource-limited environments, and integration with other modalities for more holistic emotion recognition.
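The abstract reports accuracy, precision, and F1 score but does not say how the per-class values are averaged. As a minimal sketch only, the following shows one common way to compute these metrics for a multi-class emotion classifier. Macro averaging, the function name `classification_metrics`, and the example labels are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return (accuracy, macro precision, macro F1) for multi-class labels.

    Macro averaging is an assumption here; the paper does not state
    which averaging scheme it uses.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    # Count true positives, false positives, and false negatives per class.
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    accuracy = sum(tp.values()) / len(y_true)

    precisions, f1_scores = [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        f1_scores.append(f1)

    macro_precision = sum(precisions) / len(labels)
    macro_f1 = sum(f1_scores) / len(labels)
    return accuracy, macro_precision, macro_f1


# Hypothetical labels for illustration only (not data from the study).
y_true = ["angry", "angry", "sad", "sad", "neutral", "neutral"]
y_pred = ["angry", "sad", "sad", "sad", "neutral", "angry"]
accuracy, macro_precision, macro_f1 = classification_metrics(y_true, y_pred)
```

Macro averaging weights every emotion class equally, which matters when classes are imbalanced; a weighted or micro average would give different numbers on the same predictions.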

Issue

Vol. 19 No. 4 (2023)

Section

Articles

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.

Author Biography

¹ Chappidi Suneetha: Research Scholar, Department of Computer Science & Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Andhra Pradesh, India. Email: maanash11@gmail.com

²* Raju Anitha (corresponding author): Associate Professor, Department of Computer Science & Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Andhra Pradesh, India. Email: rajuanitha46885@gmail.com

