Utilizing Machine Learning Techniques for Categorizing Cancer Based on Gene Expression Data: A Review

Begum S., Sandipan Dey, Chakraborty D., Hembrom T., Hazra S., Barman D.

Abstract

Cancer is a group of diseases that share one common feature, the uncontrolled growth of abnormal cells. According to the WHO, it is the second leading cause of death globally, after cardiovascular diseases. Gene expression analysis is central to early cancer detection because it reflects the underlying molecular and genetic processes. Using DNA microarray and RNA-sequencing techniques, researchers in computational genomics can quantify gene expression levels, which provides accurate input data for computational analysis. This paper reviews machine learning techniques that classify cancer subtypes from gene expression patterns. It covers both traditional machine learning and deep learning approaches, with a focus on cancer-related genes. The deep neural network architectures discussed include multilayer perceptrons (MLPs), convolutional neural networks (CNNs), recurrent neural networks (RNNs), graph neural networks (GNNs), and the more recent Transformer networks. The review describes common data collection methods used in this field and some of the essential datasets for supervised machine learning. It also presents techniques developed specifically to handle the high dimensionality of gene expression data. The article closes by exploring prospective directions for advancing machine learning-based gene expression analysis in cancer classification.
