Natural Language Processing and Text Mining Algorithms for Financial Accounting Information Disclosure

Huanhuan Shi


This study explores the application of Natural Language Processing (NLP) and text mining techniques to the analysis of financial accounting information disclosure. Drawing on a diverse corpus of annual reports, regulatory filings, earnings call transcripts, news articles, and social media posts, the study applies NLP algorithms to extract insights from unstructured text. Key tasks include sentiment analysis, named entity recognition (NER), topic modeling, and machine learning classification. Results indicate that sentiment in the corpus is slightly positive overall, with variation across document types and industries. NER achieves high precision, recall, and F1-score, demonstrating that NLP techniques can accurately identify entities such as companies, executives, and financial indicators. Topic modeling reveals coherent themes in the text, including financial performance, risk management, and corporate governance. Machine learning models also perform strongly on the sentiment analysis and entity recognition tasks, with high accuracy and area under the ROC curve (AUC) scores. The implications for financial decision-making are substantial: NLP techniques enable stakeholders to gain deeper insight into market trends, company performance, and regulatory developments. Challenges remain, however, including the refinement of NLP models, the integration of multimodal data sources, and the exploration of ethical and regulatory considerations.
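
The abstract reports precision, recall, and F1-score for NER without stating how they were computed. As an illustration only, the sketch below shows one common convention, entity-level scoring under exact span-and-label match. The function name ner_scores, the label names, and the sample annotations are hypothetical and are not taken from the study.

```python
from typing import Set, Tuple

# An entity is (start offset, end offset, label). This representation is an
# assumption made for the example, not the study's own data format.
Entity = Tuple[int, int, str]


def ner_scores(gold: Set[Entity], predicted: Set[Entity]) -> Tuple[float, float, float]:
    """Return entity-level (precision, recall, F1) under exact span-and-label match."""
    # A prediction counts as correct only if span and label both match a gold entity.
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical annotations for a single sentence from an annual report.
gold = {(0, 9, "COMPANY"), (24, 34, "EXECUTIVE"), (50, 57, "INDICATOR")}
predicted = {
    (0, 9, "COMPANY"),       # correct
    (24, 34, "COMPANY"),     # right span, wrong label
    (50, 57, "INDICATOR"),   # correct
    (60, 66, "INDICATOR"),   # spurious entity
}

precision, recall, f1 = ner_scores(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Under this strict convention a prediction with the right span but the wrong label counts as both a false positive and a false negative. Token-level or partial-match scoring would give different numbers for the same output.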
