Accelerating Drug Safety Assessment using Bidirectional-LSTM for SMILES Data
Abstract
Computational methods are instrumental in accelerating drug discovery, which involves many steps, including target identification and validation, lead discovery, and lead optimisation. During lead optimisation, the absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties of lead compounds are assessed. To address the problem of predicting toxicity in lead compounds, this work proposes a model based on Bidirectional Long Short-Term Memory (BiLSTM), a type of Recurrent Neural Network (RNN) that processes input sequences in both forward and backward directions. The BiLSTM is applied to molecular structures represented in Simplified Molecular Input Line Entry System (SMILES) notation, so that the structural features of each molecule are examined from both directions. The model learns the sequential patterns encoded in the SMILES strings and uses them to predict the toxicity of the molecules. On the ClinTox dataset, the proposed model surpasses previous approaches such as TrimNet and pre-trained graph neural networks (GNNs), achieving an ROC-AUC of 0.96.
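To make the idea of reading a SMILES string in both directions concrete, the following is a minimal sketch of how SMILES strings might be turned into padded integer sequences for a bidirectional sequence model. It is not the authors' pipeline: the simplified token pattern, the reserved ids, the function names (`tokenize`, `build_vocab`, `encode`), the example molecules, and the padding length of 24 are all assumptions chosen for illustration.

```python
import re

# Simplified SMILES token pattern (an assumption, not the paper's tokenizer):
# bracket atoms, two-letter halogens, two-digit ring closures, then any single character.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")

PAD, UNK = 0, 1  # reserved ids for padding and unknown tokens (hypothetical choice)


def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    return SMILES_TOKEN.findall(smiles)


def build_vocab(smiles_list):
    """Assign an integer id to every token seen in the given SMILES strings."""
    vocab = {"<pad>": PAD, "<unk>": UNK}
    for smi in smiles_list:
        for tok in tokenize(smi):
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(smiles, vocab, max_len, reverse=False):
    """Map tokens to ids, optionally in reverse order, then truncate and right-pad."""
    tokens = tokenize(smiles)
    if reverse:
        tokens = tokens[::-1]
    ids = [vocab.get(tok, UNK) for tok in tokens][:max_len]
    return ids + [PAD] * (max_len - len(ids))


# Example molecules (illustrative only): ethanol, aspirin, a charged amine.
molecules = ["CCO", "CC(=O)Oc1ccccc1C(=O)O", "C[NH+](C)CCl"]
vocab = build_vocab(molecules)

# The forward view is what the forward LSTM reads; the backward view is the
# same token sequence reversed, which is what the backward LSTM reads.
forward = [encode(s, vocab, 24) for s in molecules]
backward = [encode(s, vocab, 24, reverse=True) for s in molecules]
```

In practice, deep learning frameworks construct the backward pass internally from the forward sequence, so only the forward encoding would normally be supplied; the reversed view is shown here purely to illustrate what the second direction of a BiLSTM sees.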
Article Details
This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.