A Comprehensive Multimodal Approach to Assessing Sentiment Intensity and Subjectivity Using a Unified MSE Model

Mohd Usman Khan, Faiyaz Ahamad

Abstract

In the dynamic realm of multimodal learning, where representation learning plays a pivotal role, our research introduces a novel approach to understanding sentiment and subjectivity in audio and text. Drawing inspiration from self-supervised learning, we combine multimodal and unimodal tasks, emphasizing the crucial aspects of consistency and distinctiveness across modalities. Our training strategy balances the learning progress of these subtasks, prioritizing samples with the most distinctive unimodal supervisions. Addressing the pressing need for robust datasets and methodologies in combined text and audio sentiment analysis, we present the Multimodal Opinion-level Sentiment Intensity (MOSI) dataset, a meticulously annotated corpus that offers insights into subjectivity, sentiment intensity, textual features, and audio nuances, setting a benchmark for future research. Our method not only excels at generating unimodal supervisions but also performs strongly on benchmarks such as MOSI and MOSEI, with generated supervisions rivaling human-curated annotations on these challenging datasets. This work paves the way for deeper exploration and application in the burgeoning field of sentiment analysis.
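
As a rough illustration of the multi-task setup the abstract describes, the PyTorch sketch below pairs a multimodal regression head with per-modality heads trained on unimodal supervisions, and weights each unimodal loss term by how much its label diverges from the multimodal one. This is a minimal sketch under stated assumptions: the architecture, feature dimensions (e.g. 768-d text, 74-d audio), and the tanh-based weighting rule are illustrative choices, not the paper's exact formulation.

```python
# Minimal multi-task sentiment sketch (illustrative, not the authors' code):
# a fused multimodal head plus text/audio heads trained on unimodal
# supervisions, with per-sample weights favoring "distinctive" samples.
import torch
import torch.nn as nn

class MultiTaskSentimentModel(nn.Module):
    def __init__(self, text_dim=768, audio_dim=74, hidden=128):
        super().__init__()
        self.text_enc = nn.Sequential(nn.Linear(text_dim, hidden), nn.ReLU())
        self.audio_enc = nn.Sequential(nn.Linear(audio_dim, hidden), nn.ReLU())
        self.fusion_head = nn.Linear(2 * hidden, 1)  # multimodal sentiment score
        self.text_head = nn.Linear(hidden, 1)        # unimodal (text) score
        self.audio_head = nn.Linear(hidden, 1)       # unimodal (audio) score

    def forward(self, text, audio):
        ht, ha = self.text_enc(text), self.audio_enc(audio)
        fused = torch.cat([ht, ha], dim=-1)
        return (self.fusion_head(fused).squeeze(-1),
                self.text_head(ht).squeeze(-1),
                self.audio_head(ha).squeeze(-1))

def multitask_loss(pred_m, pred_t, pred_a, y_m, y_t, y_a):
    """Multimodal MSE plus weighted unimodal MSEs.

    y_t / y_a stand in for generated unimodal supervisions; samples whose
    unimodal label differs more from the multimodal label y_m get larger
    weight (an assumed stand-in for the weight-adjustment strategy).
    """
    loss_m = ((pred_m - y_m) ** 2).mean()
    w_t = torch.tanh((y_t - y_m).abs())  # distinctiveness-based weights in [0, 1)
    w_a = torch.tanh((y_a - y_m).abs())
    loss_t = (w_t * (pred_t - y_t) ** 2).mean()
    loss_a = (w_a * (pred_a - y_a) ** 2).mean()
    return loss_m + loss_t + loss_a

# Example usage with random stand-in features and labels:
model = MultiTaskSentimentModel()
text, audio = torch.randn(8, 768), torch.randn(8, 74)
y_m = torch.randn(8)
y_t, y_a = y_m + 0.3 * torch.randn(8), y_m + 0.3 * torch.randn(8)
pred_m, pred_t, pred_a = model(text, audio)
multitask_loss(pred_m, pred_t, pred_a, y_m, y_t, y_a).backward()
```

The key design point the sketch tries to convey is that the unimodal heads share encoders with the multimodal head, so emphasizing samples whose unimodal labels disagree with the multimodal annotation pushes the shared representations to capture modality-specific (distinctive) information without sacrificing cross-modal consistency.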
