Classification and the Prediction of Covid-19 by Applying Naive Bayes and Random Forest Algorithms


K. S. Padmashree, P. Velmani, S. Loghambal

Abstract

Classification is a supervised learning task in machine learning that assigns input data to specific labels according to its features. The primary objective of classification is to develop a model that can reliably predict the appropriate label or category for previously unseen data. COVID-19, a highly contagious and severe disease caused by the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), is believed to have originated in bats and to have been transmitted to humans via an unidentified intermediary in Wuhan, China, in late December 2019. The disease can lead to significant organ dysfunction, affecting essential organs such as the heart, liver, and kidneys, and disrupting the normal functioning of organ systems, including the cardiovascular and immune systems. This study aims to classify and predict COVID-19 outcomes using two machine learning algorithms: Naïve Bayes (NB) and Random Forest (RF). The research uses the COVID_Data.CSV dataset, which contains 316,800 data points. Of these, 70% are used for training the models, while the remaining 30% are allocated for testing. The Naïve Bayes classifier achieves an accuracy of 87.39%, while the Random Forest classifier achieves a slightly higher accuracy of 87.47%. A comparative analysis shows that the Random Forest classifier outperforms Naïve Bayes by 0.08 percentage points, making it the slightly more effective model for the classification and prediction of COVID-19 using Machine Learning (ML) techniques.
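To put the reported figures in concrete terms, the short Python sketch below works through the arithmetic of the 70/30 split and the accuracy gap. It rests on two assumptions that the abstract does not state explicitly: that the 316,800 "data points" are individual records, and that both accuracies were measured on the 30% test partition. The function and constant names are illustrative only and do not come from the study.

```python
# Figures quoted in the abstract.
TOTAL_RECORDS = 316_800
TRAIN_PERCENT = 70
NB_ACCURACY_BP = 8739  # 87.39%, written in hundredths of a percent
RF_ACCURACY_BP = 8747  # 87.47%, written in hundredths of a percent


def split_sizes(total, train_percent):
    """Return (train, test) sizes for a percentage split using integer arithmetic."""
    train = total * train_percent // 100
    return train, total - train


def correct_predictions(test_size, accuracy_bp):
    """Approximate number of correct predictions implied by an accuracy figure."""
    return round(test_size * accuracy_bp / 10_000)


train_size, test_size = split_sizes(TOTAL_RECORDS, TRAIN_PERCENT)

# Assumption: both accuracies refer to the held-out test partition.
nb_correct = correct_predictions(test_size, NB_ACCURACY_BP)
rf_correct = correct_predictions(test_size, RF_ACCURACY_BP)
extra_correct = rf_correct - nb_correct

print(f"training records: {train_size}")
print(f"test records:     {test_size}")
print(f"NB correct (approx.): {nb_correct}")
print(f"RF correct (approx.): {rf_correct}")
print(f"additional records RF classifies correctly (approx.): {extra_correct}")
```

Under these assumptions, the split gives 221,760 training records and 95,040 test records, and the 0.08 percentage point gap corresponds to roughly 76 additional test records classified correctly by Random Forest.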
