Performance of Three Classification Techniques in Classifying Credit Applications Into Good Loans and Bad Loans: A Comparison
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Social Sciences, Department of Statistics.
2015 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Abstract [en]

The use of statistical classification techniques to classify loan applications into good loans and bad loans has gained importance with the exponential increase in the demand for credit. It is paramount to use a classification technique with high predictive capacity to ensure the profitability of the business venture.


In this study we aim to compare the predictive capability of three classification techniques: 1) logistic regression, 2) CART, and 3) random forests. We apply these techniques to German credit data using an 80:20 learning:test split, and compare the performance of the models fitted using the three classification techniques. The probability of default p_i for each observation in the test set is calculated using the models fitted on the training dataset. Each test set sample x_i is then classified as a good loan or a bad loan based on a probability threshold, such that x_i is assigned to the bad loan class if p_i exceeds the threshold. We chose several thresholds in order to compare the performance of each of the three classification techniques on five model suitability statistics: accuracy, precision, negative predictive value, recall, and specificity.
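The thresholding step and the five statistics can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the toy labels and probabilities, the strict inequality at the threshold, and the choice of bad loan as the positive class are all assumptions made for illustration.

```python
def classify(probs, threshold):
    """Label an observation 1 (bad loan) when its predicted default
    probability exceeds the threshold, else 0 (good loan).
    The strict inequality is an assumption."""
    return [1 if p > threshold else 0 for p in probs]


def suitability_statistics(y_true, y_pred):
    """Compute the five statistics from the confusion matrix,
    treating bad loan (1) as the positive class (an assumption)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # A statistic is undefined when its denominator is zero.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "negative_predictive_value": ratio(tn, tn + fn),
        "recall": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Hypothetical test-set labels (1 = bad loan) and predicted default
# probabilities; these values are made up and are not from the thesis.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
probs = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]

for threshold in (0.25, 0.5, 0.75):
    stats = suitability_statistics(y_true, classify(probs, threshold))
    print(threshold, stats)
```

Running the same loop over the predicted probabilities from each fitted model would give one set of statistics per technique and threshold, which is the kind of comparison the abstract describes.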


None of the classifiers turned out to be best on all five cross-validation statistics. However, logistic regression performs best at low probability-of-default thresholds. At higher thresholds, CART performs best on accuracy, precision, and specificity, while random forest performs best on negative predictive value and recall.

National Category
Probability Theory and Statistics
URN: urn:nbn:se:uu:diva-256089
OAI: diva2:824593
Educational program
Master Programme in Statistics
Available from: 2015-06-24 Created: 2015-06-22 Last updated: 2015-06-24
Bibliographically approved
