International Journal of Scientific & Technology Research



IJSTR >> Volume 9 - Issue 1, January 2020 Edition


Website: http://www.ijstr.org

ISSN 2277-8616

Model Improvement Through Comprehensive Preprocessing For Loan Default Prediction




Ahmad Alqerem, Ghazi Alnaymat, Mays Alhasan



Classification, Pre-processing, Prediction, Feature selection, Genetic algorithm, PSO algorithm, Naïve Bayes, Decision Tree, SVM, Random Forest.



For financial institutions and the banking industry, predictive models of financial activity are crucial because they play a major role in risk management. Predicting loan default is one of the critical problems these institutions focus on, since large revenue losses can be prevented by predicting a customer's ability to repay on time. In this paper, three classification methods (Naïve Bayes, Decision Tree, and Random Forest) are used for prediction, a comprehensive set of pre-processing techniques is applied to the data set, and three feature extraction algorithms are used to enhance accuracy and performance. Results are compared using the F1 measure, and the improvement was over 3%.
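The F1 measure used for the comparison is the harmonic mean of precision and recall on the positive class. The following is a minimal sketch in plain Python, not the authors' code: the function name `f1_score`, the label encoding (1 = default, 0 = repaid), and the toy predictions are all assumptions made for illustration and are not data from the paper.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1 score of the positive class (harmonic mean of precision and recall)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        # No correctly predicted positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Return the fraction of labels predicted correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical labels for ten loans: 1 = default, 0 = repaid (illustrative only).
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
pred_a = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]  # finds one of the three defaults
pred_b = [1, 0, 0, 1, 0, 1, 0, 0, 0, 0]  # finds two defaults, one false alarm

for name, pred in (("A", pred_a), ("B", pred_b)):
    print(name, "accuracy:", accuracy(y_true, pred), "F1:", round(f1_score(y_true, pred), 3))
```

In this toy case both sets of predictions have the same accuracy (0.8), yet their F1 scores differ (0.5 versus about 0.667), which illustrates why F1 can be more informative than plain accuracy when defaults are the minority class.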


