Digitala Vetenskapliga Arkivet

Rebuilding Trust in Black-Box Models: Using Explainable Machine Learning (SHAP) to Analyze Feature Impact Across Models for Bankruptcy Prediction
Dalarna University, School of Information and Engineering.
2025 (English). Independent thesis, Advanced level (degree of Master (One Year)), 10 credits / 15 HE credits. Student thesis.
Abstract [en]

This thesis aims to enhance the interpretability of ensemble machine learning (ML) models for bankruptcy prediction using SHAP (SHapley Additive exPlanations). Traditional statistical models, such as Logistic Regression (LR), cannot capture non-linear relationships in financial data, while ensemble models such as Random Forest (RF) and XGBoost (XGB) excel in predictive accuracy but are difficult to interpret. The research bridges this gap by applying SHAP to make these black-box models transparent, and therefore more actionable and trustworthy for financial institutions. Using a dataset of 10,696 companies from the Swedish hospitality sector (1998–2021), the study addresses class imbalance with SMOTE-ENN and evaluates LR, RF, and XGB. Results show that XGB achieves the highest predictive accuracy, outperforming RF and LR by capturing complex, non-linear feature interactions. SHAP analysis identifies key financial ratios, such as retained earnings to total assets and working capital to total assets, as the most influential predictors. It also identifies significant contributors among features with weaker or negative correlations, particularly in RF and XGB. In contrast, LR exhibits simpler, linear relationships that align more closely with traditional correlation metrics. This research underscores the value of explainable ML in enhancing decision-making, ensuring regulatory compliance, and fostering trust in ML-based bankruptcy prediction. By combining accuracy with interpretability, it provides a robust framework for analyzing high-dimensional, imbalanced datasets in financial analytics.
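
To make concrete what a SHAP attribution is, the sketch below computes exact Shapley values for a single prediction by enumerating feature subsets. It is an illustration only and not the thesis's implementation: the scoring function `toy_risk_score`, its coefficients, the third feature (`leverage`), and the input and baseline values are all hypothetical, and features missing from a subset are simply replaced by a single baseline point. The SHAP library uses far more efficient estimators (for example, a tree-specific explainer for models such as RF and XGB) rather than brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Assumption: a feature that is 'absent' from a coalition takes its
    value from the single baseline point.
    """
    n = len(x)

    def value(subset):
        # Features in `subset` come from x, all others from the baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = set(combo)
                # Marginal contribution of feature i to coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk_score(features):
    """Hypothetical non-linear risk score; not a fitted model."""
    re_ta, wc_ta, leverage = features
    # The last term is an interaction: leverage only matters when
    # working capital to total assets is negative.
    return 0.5 - 0.8 * re_ta - 0.4 * wc_ta + 0.6 * leverage * max(0.0, -wc_ta)


# Hypothetical firm and hypothetical reference firm.
x = [-0.2, -0.1, 0.9]
baseline = [0.1, 0.2, 0.5]

phi = shapley_values(toy_risk_score, x, baseline)
for name, contribution in zip(["re_ta", "wc_ta", "leverage"], phi):
    print(f"{name}: {contribution:+.4f}")
```

The contributions always sum to the difference between the score for `x` and the score for the baseline, which is the additive property that lets per-feature impacts be compared across models. In this toy score, `leverage` receives a non-zero contribution only through its interaction with working capital, the kind of effect a purely linear model would not show.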

Place, publisher, year, edition, pages
2025.
Keywords [en]
Bankruptcy Prediction, Machine Learning, XGBoost, Random Forest, Logistic Regression, SHAP, SMOTE-ENN, Altman’s Z-Score, Financial Ratios, Explainable ML, Class Imbalance, Theoretical Ranking, Data driven distribution, Feature importance
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:du-50136
OAI: oai:DiVA.org:du-50136
DiVA, id: diva2:1935353
Subject / course
Microdata Analysis
Available from: 2025-02-06 Created: 2025-02-06

Open Access in DiVA

Full text (6018 kB), 66 downloads
File information
File name: FULLTEXT01.pdf
File size: 6018 kB
Checksum (SHA-512): c5ebc5ec1e869b332c8eb2aa38a2cf398c8a943323d8c7acdcb9d4717adaba6c9e1dfd00a0c871d038058f55fed11348b1be82064f648c5d25881cd781286c7c
Type: fulltext
Mimetype: application/pdf
