A Methodology for Speaker Diarization System Based on LSTM and MFCC Coefficients

Indu D., Y. Srinivas

Abstract

Speaker identification remains a difficult research problem. A speaker can be identified automatically by comparing a voice sample with a previously recorded sample from the same speaker, and machine learning approaches to this task have gained favor in recent years. Convolutional neural networks (CNN) and deep neural networks (DNN) are among the machine learning techniques that have been employed recently. This article builds on a successful d-vector-based speaker verification system to construct a new approach to speaker diarization. In particular, we use a long short-term memory (LSTM) network to cluster speech segments represented by mel-frequency cepstral coefficients (MFCC) and to identify the speakers in the diarization system. The proposed system will be evaluated using benchmark performance metrics, and a comparative study will be made with other models. We also consider the need for the LSTM network to account for both acoustic data and linguistic dialect. LSTM networks could produce reliable speaker segmentation outputs.
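The clustering step mentioned above can be illustrated with a minimal sketch. The abstract does not state which clustering algorithm is used, so the greedy cosine-similarity assignment below is an assumption chosen for brevity, not the authors' method. The function names `cosine` and `diarize`, the `threshold` value, and the toy vectors are all hypothetical; in a real system each vector would be a d-vector produced by an LSTM from the MFCC frames of one speech segment.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def diarize(segment_vectors, threshold=0.8):
    """Assign a speaker index to each segment embedding, in order.

    Assumption: a segment joins the most similar existing speaker if the
    cosine similarity reaches `threshold`; otherwise it starts a new speaker.
    """
    centroids = []  # one running sum of embeddings per speaker
    labels = []
    for vec in segment_vectors:
        scores = [cosine(vec, c) for c in centroids]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is not None and scores[best] >= threshold:
            # Update the running sum; its direction tracks the speaker mean.
            centroids[best] = [c + v for c, v in zip(centroids[best], vec)]
            labels.append(best)
        else:
            centroids.append(list(vec))
            labels.append(len(centroids) - 1)
    return labels


# Toy 3-dimensional stand-ins for d-vectors (made-up values, two speakers).
toy_vectors = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 0.1, 1.0],
    [1.0, 0.0, 0.1],
    [0.1, 0.0, 0.9],
]
toy_labels = diarize(toy_vectors)
print(toy_labels)  # [0, 0, 1, 0, 1]
```

The threshold controls how readily a new speaker is created: a higher value splits speakers more aggressively, a lower value merges them. An offline method such as spectral or agglomerative clustering could be substituted without changing the interface.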
