A Machine Learning Approach to Improving the Accuracy of Similarity Evaluation of DNA Genetic Codes using Proposed Longest Common Subsequence Algorithm
Abstract
DNA sequence analysis is central to biological and computational applications such as studying evolution, discovering genes, and diagnosing genetic diseases. In these fields it is crucial to identify similarities between genetic codes. Traditional methods such as the longest common subsequence (LCS) algorithm have been widely used to compare DNA sequences. This paper introduces an approach that uses machine learning techniques to address the challenges of evaluating DNA sequence similarity. The proposed algorithm combines the effectiveness of sequence alignment with the power of data-driven models: a trained machine learning model predicts alignment scores, reducing the computational burden while maintaining high accuracy. The proposed LCS algorithm is implemented and deployed with Support Vector Machines (SVM) on the NCBI GenBank nucleotide sequence dataset, and is tested on 10,000 randomly selected samples from that dataset. The SVM classifier computes the sequence similarity between human and animal DNA sequences; among the pairs compared, human and chimpanzee show the best result, with a prediction of 98%.
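For readers unfamiliar with the traditional baseline mentioned in the abstract, the following is a minimal sketch of the textbook LCS dynamic program applied to two nucleotide strings. It is not the paper's proposed algorithm, and it does not include the SVM stage. The function names `lcs_length` and `lcs_similarity`, and the choice to normalize by the longer sequence's length, are assumptions made purely for illustration.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (textbook DP)."""
    # Keep the shorter string on the inner loop so the row buffer stays small.
    if len(a) < len(b):
        a, b = b, a
    prev = [0] * (len(b) + 1)  # DP row for the previous character of a
    for ch_a in a:
        curr = [0]  # column 0: empty prefix of b
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Matching characters extend the diagonal subproblem.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of "skip ch_a" / "skip ch_b".
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a: str, b: str) -> float:
    """Illustrative similarity in [0, 1]: LCS length over the longer length.

    The normalization is an assumption for this sketch, not the paper's metric.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty sequences are treated as identical
    return lcs_length(a, b) / longest


if __name__ == "__main__":
    # Toy sequences, not GenBank data.
    print(lcs_length("ACGTTGCA", "ACTTGA"))
    print(lcs_similarity("ACGTTGCA", "ACTTGA"))
```

This baseline takes time proportional to the product of the two sequence lengths, which is the kind of cost the abstract's learned score predictor is meant to reduce.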
Article Details

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.