Utveckling av beslutsstöd för kreditvärdighet (Development of decision support for creditworthiness)
Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE credits
Student thesis
The aim is to develop a new decision-making model for credit loans. The model is specific to credit applicants of the OKQ8 bank, because it is based on data on earlier credit applicants from the client (the bank). The final model takes information about a new applicant as input and predicts, based on the applicant’s properties, whether the applicant belongs to the good risk group or the bad risk group. The prediction may then lay the foundation for the decision to grant or deny credit.
Because the distribution of the response variable is skewed, different sampling techniques are evaluated. These include oversampling with SMOTE, random undersampling, and pure oversampling in the form of scalar weighting of the minority class. It is shown that the predictive quality of a classifier is affected by the distribution of the response, and that the oversampled information is not too redundant.
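The three sampling ideas named above can be illustrated with a minimal sketch. It is not the thesis's implementation: the toy two-feature data, the function names, the neighbour count `k=3` and the weighting factor are assumptions made for illustration, and `smote_like` reproduces only the core interpolation step of SMOTE.

```python
import random

def random_undersample(majority, minority, rng):
    """Drop majority rows at random until both classes have equal size."""
    return rng.sample(majority, len(minority)), list(minority)

def weight_minority(minority, factor):
    """Pure oversampling: repeat every minority row `factor` times."""
    return [row for row in minority for _ in range(factor)]

def smote_like(minority, n_new, k, rng):
    """Create n_new synthetic rows, each interpolated between a minority row
    and one of its k nearest minority neighbours (the core idea of SMOTE)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: sq_dist(base, row),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position on the segment between base and other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

# Hypothetical toy data: 90 "good risk" rows and 10 "bad risk" rows.
rng = random.Random(0)
good = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(90)]
bad = [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(10)]

under_good, under_bad = random_undersample(good, bad, rng)  # 10 vs 10
weighted_bad = weight_minority(bad, 9)                      # 90 vs 90, duplicated rows
smote_bad = bad + smote_like(bad, 80, k=3, rng=rng)         # 90 vs 90, synthetic rows
```

Undersampling discards information from the majority class, scalar weighting repeats identical rows, and the SMOTE-style step adds new points that lie between existing minority rows.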
Three classification techniques are evaluated. Our results suggest that a multi-layer neural network with 18 neurons in a hidden layer, combined with an ensemble technique called boosting, gives the best predictive power. The most successful model is based on a feed-forward structure and is trained with a variant of back-propagation using conjugate-gradient optimization.
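As a rough sketch of what such a model computes, a feed-forward network with one hidden layer of 18 neurons, combined by boosting, can be written as follows. The abstract does not state the activation functions or the boosting variant, so the sigmoid $\sigma$, the AdaBoost-style weighted vote, and all symbols ($d$, $W_1$, $b_1$, $w_2$, $b_2$, $\alpha_m$, $M$) are assumptions made for illustration.

```latex
% One network: input x in R^d, 18 hidden neurons, one output in (0, 1)
h = \sigma(W_1 x + b_1), \qquad W_1 \in \mathbb{R}^{18 \times d},\; b_1 \in \mathbb{R}^{18}
\hat{p}(x) = \sigma\!\left(w_2^{\top} h + b_2\right), \qquad w_2 \in \mathbb{R}^{18},\; b_2 \in \mathbb{R}

% Boosted ensemble of M such networks f_1, \dots, f_M with labels in \{-1, +1\}
F(x) = \operatorname{sign}\!\left(\sum_{m=1}^{M} \alpha_m f_m(x)\right), \qquad \alpha_m \ge 0
```

Here $\hat{p}(x)$ is the output of a single network, each $f_m(x)$ is the class label obtained by thresholding one network's output, and $\alpha_m$ is the weight the ensemble gives to network $m$.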
Two other models with good predictive quality are developed using logistic regression and a decision tree classifier, but they do not reach the level of the network. However, the results of these models are used to answer the question of which customer properties are important when determining credit risk. Two examples of important customer properties are income and the number of earlier credit reports of the applicant.
Finally, we use the best classification model to predict the outcome for a set of applicants declined by the existing filter. The results show that the network model accepts over 60 % of the applicants who had previously been denied credit. This may indicate that the client’s suspicion that the existing model is too restrictive is in fact justified.
Place, publisher, year, edition, pages
2013. 80 p.
Credit Scoring, Data mining, Imbalanced data sets, Sampling techniques, SMOTE, Classification techniques, Predictive modeling
Other Computer and Information Science
Identifiers
URN: urn:nbn:se:liu:diva-97223
ISRN: LIU-IDA/STAT-G--13/005--SE
OAI: oai:DiVA.org:liu-97223
DiVA: diva2:645691
OKQ8, KnowIT Decision Linköping
Subject / course
Program in Statistics and Data Analysis
2013-06-07, Visionen, 581 83, Linköping, 14:00 (Swedish)
Sysoev, Oleg, PhD in Statistics
Wahlin, Karl, PhD in Statistics