Application of Transfer Learning Based on Deep Learning in Emotion Recognition and Analysis
Abstract
In human-computer interaction, speech emotion recognition continues to advance, and deep learning techniques are increasingly applied to emotion analysis. To improve recognition accuracy, this study constructs a speech emotion recognition model based on deep learning and transfer learning that integrates convolutional neural networks (CNNs) with the random forest (RF) algorithm. Experiments were conducted on multiple publicly available datasets. The results show that the recall and F1 score of the CNN-RF model rose continuously between 20 and 400 iterations. In the early iterations, the F1 score of CNN-RF reached 0.81, significantly outperforming reference algorithms such as support vector machines and decision trees. After 400 iterations, its recall and F1 score had increased to 0.93 and 0.94, respectively, further validating the model's effectiveness in extracting key emotional features. The other algorithms improved more slowly, suggesting that CNN-RF remains stable and robust over long training runs. The experiments also showed that incorporating random forests allows the model to handle more complex and diverse data features. In particular, on the USC-FaMocap dataset the model achieved nearly perfect accuracy, highlighting its ability to process high-dimensional data. Overall, the study demonstrates the effectiveness of deep learning and transfer learning in emotion recognition and the advantages of the CNN-RF model in complex data processing.
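For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how recall and F1 score can be computed for a multi-class emotion classifier. It is not the authors' evaluation code. The abstract does not state how scores were averaged across emotion classes, so the macro averaging used here is an assumption, and the function names and the toy labels ("happy", "sad") are hypothetical.

```python
from collections import Counter


def recall_f1_per_class(y_true, y_pred):
    """Return {label: (recall, f1)} for every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1      # correct prediction for this class
        else:
            fp[pred] += 1       # predicted class gains a false positive
            fn[truth] += 1      # true class gains a false negative

    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of per-class recall and F1 (an assumed averaging scheme)."""
    n = len(scores)
    mean_recall = sum(r for r, _ in scores.values()) / n
    mean_f1 = sum(f for _, f in scores.values()) / n
    return mean_recall, mean_f1


if __name__ == "__main__":
    # Hypothetical toy labels, not data from the study.
    y_true = ["happy", "happy", "sad", "sad"]
    y_pred = ["happy", "sad", "sad", "sad"]
    print(macro_average(recall_f1_per_class(y_true, y_pred)))
```

Because F1 is the harmonic mean of precision and recall, a model can only reach values such as the reported 0.94 when both precision and recall are high.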
This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.