A new approach to software defect prediction based on convolutional neural network and bidirectional long short-term memory


  • Károly Nehéz University of Miskolc
  • Nasraldeen Alnor Adam Khleel University of Miskolc




Software defect prediction, Software metrics, Deep learning, Convolutional neural network, Bidirectional long short-term memory


Software defect prediction (SDP) plays an important role in improving software quality and reliability while reducing software maintenance cost. The central problem in SDP is how to identify defective source code with high accuracy. To build more accurate prediction models, many kinds of features have been proposed, e.g., static code features, social network features, and process features. Several machine learning (ML) and deep learning (DL) algorithms have been developed and adopted to help identify and remove defects from source code, and previous studies have shown that DL algorithms are promising techniques for predicting software defects. The aim of this study is to investigate the prediction performance of two DL algorithms, namely Convolutional Neural Network (CNN) and Bidirectional Long Short-Term Memory (BI-LSTM), in the domain of SDP. To establish the effectiveness of the proposed approach, experiments were conducted on available benchmark datasets obtained from open-source Java projects in a GitHub repository, and the models were evaluated using seven evaluation metrics: accuracy, precision, recall, F-measure, Matthews correlation coefficient (MCC), area under the ROC curve (AUC), and mean square error (MSE). The best accuracy obtained on the training dataset is 81%, achieved by the CNN model, while the best accuracy obtained on the validation dataset is 80%, achieved by the BI-LSTM model. The best AUC obtained on the training dataset is 88%, achieved by the CNN model, while the best AUC obtained on the validation dataset is 83%, achieved by both models. Because neither model is better on every metric and dataset, it is nearly impossible to declare one superior to the other; each model should be analyzed separately, and the one that best fits the problem at hand should be used. For the problem considered in this study, the evaluation results show the effectiveness of the proposed models based on standard performance evaluation criteria.
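For readers unfamiliar with the seven evaluation metrics named above, the following is a minimal sketch of how they can be computed for a binary defect classifier, using their standard textbook definitions. It is not the authors' evaluation code: the function names (`confusion_counts`, `roc_auc`, `classification_metrics`), the 0.5 decision threshold, the choice to compute MSE on predicted probabilities, and the toy labels and scores are all assumptions made for illustration.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 means defective."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def safe_div(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def roc_auc(y_true, y_score):
    """AUC as the fraction of (defective, clean) pairs ranked correctly.

    Ties count as half a correct pair. This pairwise form is quadratic in
    the number of samples, which is acceptable for a small illustration.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return safe_div(wins, len(pos) * len(neg))


def classification_metrics(y_true, y_score, threshold=0.5):
    """Compute the seven metrics from true labels and predicted probabilities.

    The threshold of 0.5 is an assumed default, not taken from the paper.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": safe_div(tp + tn, len(y_true)),
        "precision": precision,
        "recall": recall,
        "f_measure": safe_div(2 * precision * recall, precision + recall),
        "mcc": safe_div(tp * tn - fp * fn, mcc_denominator),
        "auc": roc_auc(y_true, y_score),
        # MSE is computed here on probabilities (an assumption).
        "mse": safe_div(
            sum((s - t) ** 2 for t, s in zip(y_true, y_score)), len(y_true)
        ),
    }


if __name__ == "__main__":
    # Toy data for illustration only; not results from the study.
    labels = [1, 0, 1, 1, 0, 0]
    scores = [0.9, 0.2, 0.6, 0.4, 0.7, 0.1]
    for name, value in classification_metrics(labels, scores).items():
        print(f"{name}: {value:.4f}")
```

Accuracy, precision, recall, F-measure, and MCC depend on the chosen threshold, whereas AUC depends only on how the scores rank defective against clean modules. This is one reason two models can trade places across metrics, as the CNN and BI-LSTM results above do.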