In Silico Model for Lung Cancer Prediction Based on TP53 Mutations Using Neural Network

Abstract

In silico models have become well known in the current decade because they help researchers and specialists organize and analyze big data. These models depend on powerful techniques and algorithms, the most important of which are machine learning algorithms. This work uses the Relief F algorithm for feature selection and trains a back-propagation neural network (BPNN) on the UMD TP53 all-2012-R1-US database to predict lung cancer. Lung cancer is the most commonly diagnosed cancer among both women and men, and it can be predicted from mutations in the TP53 tumor suppressor gene. Five measures are used to estimate performance: sensitivity and specificity, which are the two dimensions of the receiver operating characteristic (ROC) curve; accuracy and F-measure, which indicate how precisely the algorithm classifies; and the Matthews correlation coefficient (MCC), the most important of the five, which provides a sound criterion for evaluating classification algorithms. Together, the Relief F and BPNN algorithms achieve satisfactory results: 99.41% sensitivity, 95.39% specificity, 99.04% accuracy, a 99.47% F-measure, and an MCC of 0.93.