A Comparative Study of Khasi Speech Recognition Systems with Recurrent Neural Network-Based Language Model

S. Deepajothi, Vuda Sreenivasa Rao, C Ambhika, Vishwanadham Mandala, R V V N Bheema Rao, Shailendra Kumar, Venkateswara Rao Gera, D Nagaraju

Abstract

This paper presents a comparative analysis of Khasi speech recognition systems that use a recurrent neural network-based language model (RNN-LM). Several acoustic models (AMs) were developed and evaluated to identify the best-performing configuration. The RNN-LM was observed to outperform the traditional language models. The continuous speech database was collected with a recorder, and the data were then processed with WaveSurfer. A reduction in word error rate (WER) in the range of 2.8-3.8% was obtained for the major speech data and 2.4-3.5% for the minor speech data. Additionally, two acoustic features were compared, and the experimental results show that Mel frequency cepstral coefficients (MFCC) yielded better performance than perceptual linear prediction (PLP).
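
For readers unfamiliar with the metric, WER is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the scoring tool used in the paper; the function name `word_error_rate` and the placeholder sentences are assumptions introduced only for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Hypothetical helper for illustration; whitespace tokenization is assumed.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens, not data from the paper:
# one substitution (b -> x) and one deletion (d) over four reference words.
print(word_error_rate("a b c d", "a x c"))
```

Because the denominator is the reference length, WER can exceed 100% when the hypothesis contains many insertions.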
