Acoustic Aware LLM based Chatbot for Speech-to-Speech Conversations

Vivek Behl, Vimal Bibhu

Abstract

This work addresses the challenges of speech-to-speech conversation in noisy environments, a critical issue for effective conversational AI systems. Because of inadequate noise handling and echo cancellation, existing models usually struggle to achieve robust speech recognition and response generation under adverse acoustic conditions. This research introduces a refined acoustic model that integrates a deep learning framework to improve speech recognition in demanding environments. Our conversational AI agent uses a transformer architecture with 2.13 billion parameters to interpret and analyze the complex nuances of speech after it has been transcribed into text. Our aim is to identify the optimizer that yields the highest responsiveness and conversational accuracy, and to that end we compare the Adam, SGD, and RMSProp optimizers during training. In our evaluation, the proposed model substantially outperforms state-of-the-art acoustic echo cancellation models in terms of echo return loss enhancement (ERLE) and perceptual evaluation of speech quality (PESQ). Integrating an acoustic-aware framework not only improves the clarity and naturalness of the AI-generated responses but also enhances speech recognition accuracy across diverse noise conditions. It also makes conversational AI systems more efficient and reliable.
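For readers unfamiliar with the ERLE metric named in the abstract, the following is a minimal sketch of its standard definition: the ratio, in decibels, of the microphone signal power to the power of the residual signal left after echo cancellation. It is not the authors' evaluation code; the function name `erle_db`, the `eps` guard, and the toy signals are assumptions made purely for illustration.

```python
import math


def erle_db(mic_signal, residual_signal, eps=1e-12):
    """Echo Return Loss Enhancement in dB: 10 * log10(P_mic / P_residual).

    mic_signal:      samples captured by the microphone (contains echo)
    residual_signal: samples after echo cancellation
    eps:             hypothetical small constant to avoid division by zero
    """
    if len(mic_signal) != len(residual_signal):
        raise ValueError("signals must have the same length")
    if not mic_signal:
        raise ValueError("signals must be non-empty")

    # Mean power of each signal over the whole segment.
    mic_power = sum(x * x for x in mic_signal) / len(mic_signal)
    residual_power = sum(x * x for x in residual_signal) / len(residual_signal)

    return 10.0 * math.log10((mic_power + eps) / (residual_power + eps))


# Toy data (illustrative only): the canceller is assumed to reduce the
# echo amplitude by a factor of 10, i.e. a 100x reduction in power.
mic = [math.sin(0.1 * n) for n in range(1000)]
residual = [0.1 * x for x in mic]

print(round(erle_db(mic, residual), 2))  # 20.0
```

A higher ERLE means more echo has been removed. PESQ, by contrast, is a standardized perceptual model (ITU-T P.862) and is normally computed with a dedicated implementation rather than a short formula.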
