Logistic Regression Model Application in Research
Logistic Regression Model for Bank Loan Default Selection
A logit model (or logistic regression model) is a statistical method used to
predict the probability of a binary outcome (like yes/no, success/failure)
based on one or more independent variables.
It transforms a linear combination of predictors into probabilities between 0
and 1 using the logistic function.
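As a minimal sketch of that transformation, the logistic function can be written in a few lines of Python. The function name `logistic` is chosen here for illustration and does not come from any particular library.

```python
import math

def logistic(z: float) -> float:
    """Map a linear predictor z (the logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# A logit of 0 corresponds to a 50% probability; negative logits give
# probabilities below 0.5 and positive logits give probabilities above 0.5.
for z in (-4.0, 0.0, 4.0):
    print(z, round(logistic(z), 4))
```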
Example
Imagine a bank wants to predict whether a loan applicant will default:
- Dependent variable: Default (1 = yes, 0 = no)
- Independent variables: Income, credit score, age
- The logit model estimates how these factors influence the probability of default.
Let’s build a logit model example for predicting bank loan default using the
three independent variables we mentioned: Income, Credit Score, and Age.
Example: Bank Default Prediction
Model Setup
- Dependent variable (Y): Default (1 = default, 0 = no default)
- Independent variables (X):
  - \(X_1\): Income (in $1000s per month)
  - \(X_2\): Credit Score (scaled 300–850)
  - \(X_3\): Age (in years)
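In symbols, this setup corresponds to the standard logit form. The coefficient symbols \(\beta_0, \dots, \beta_3\) are conventional notation assumed here, not names taken from the original text:

\[ \text{logit}(p) = \ln\frac{p}{1 - p} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3, \qquad p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3)}} \]

Here \(p\) is the probability that Default = 1, and each \(\beta_j\) is the change in the log-odds of default for a one-unit increase in \(X_j\), holding the other variables fixed.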
Example Calculation
For a borrower with:
- Income = 5 ($5000/month)
- Credit Score = 650
- Age = 40
Using the example coefficients (intercept 2.0, Income -0.3, Credit Score -0.01, Age 0.05), the logit is:
\[ \text{logit}(p) = 2.0 + (-0.3)(5) + (-0.01)(650) + (0.05)(40) = 2.0 - 1.5 - 6.5 + 2.0 = -4.0 \]
Convert to probability:
\[ p = \frac{1}{1 + e^{-(-4.0)}} = \frac{1}{1 + e^{4.0}} \approx 0.018 \]
Probability of default ≈ 1.8%
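The same arithmetic can be checked with a short Python sketch. The constant and function names are hypothetical, and the coefficients are simply the illustrative values used in the calculation above, not estimates from real data.

```python
import math

# Illustrative coefficients taken from the worked example (not fitted to real data).
INTERCEPT = 2.0
B_INCOME = -0.3         # per $1000 of monthly income
B_CREDIT_SCORE = -0.01  # per credit score point
B_AGE = 0.05            # per year of age

def default_probability(income: float, credit_score: float, age: float) -> float:
    """Probability of default for one borrower under the example model."""
    z = INTERCEPT + B_INCOME * income + B_CREDIT_SCORE * credit_score + B_AGE * age
    return 1.0 / (1.0 + math.exp(-z))

# Borrower from the example: income 5 ($5000/month), credit score 650, age 40.
logit = INTERCEPT + B_INCOME * 5 + B_CREDIT_SCORE * 650 + B_AGE * 40
p = default_probability(5, 650, 40)
print(f"logit = {logit:.1f}")  # logit = -4.0
print(f"p = {p:.3f}")          # p = 0.018
```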
Interpretation
- Income effect: Each extra $1000/month multiplies the odds of default by e^(-0.3) ≈ 0.74, a reduction of about 26%.
- Credit score effect: Each one-point increase reduces the odds by ~1%.
- Age effect: Each additional year of age increases the odds by ~5%.
So, this borrower has a low risk of default given their decent income and moderate credit score.
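These percentages come from exponentiating each coefficient to get an odds ratio. The sketch below, using the example's illustrative coefficients and hypothetical variable names, shows the conversion. For small coefficients the percent change in odds is close to the coefficient times 100, but the approximation loosens as the coefficient grows: for -0.3 the exact reduction is nearer 26% than 30%.

```python
import math

# Illustrative coefficients from the worked example.
coefficients = {"Income": -0.3, "Credit Score": -0.01, "Age": 0.05}

pct_changes = {}
for name, beta in coefficients.items():
    odds_ratio = math.exp(beta)                       # multiplicative change in odds per unit
    pct_changes[name] = (odds_ratio - 1.0) * 100.0    # same change expressed in percent
    print(f"{name}: odds ratio = {odds_ratio:.3f}, change in odds = {pct_changes[name]:+.1f}%")

# Income: odds ratio = 0.741, change in odds = -25.9%
# Credit Score: odds ratio = 0.990, change in odds = -1.0%
# Age: odds ratio = 1.051, change in odds = +5.1%
```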
To see how the logit model assigns different default probabilities across
applicants, consider a small simulated dataset of 10 borrowers.
Simulated Bank Default Probabilities
| Borrower ID | Income ($1000s) | Credit Score | Age | Probability of Default |
|---|---|---|---|---|
| 1 | 5.00 | 608 | 41 | 0.0285 |
| 2 | 9.61 | 643 | 63 | 0.0153 |
| 3 | 7.86 | 791 | 44 | 0.0023 |
| 4 | 6.79 | 713 | 68 | 0.0226 |
| 5 | 3.25 | 685 | 46 | 0.0286 |
| 6 | 3.25 | 491 | 61 | 0.3026 |
| 7 | 2.46 | 576 | 47 | 0.1045 |
| 8 | 8.93 | 460 | 35 | 0.0285 |
| 9 | 6.81 | 759 | 34 | 0.0026 |
| 10 | 7.66 | 613 | 66 | 0.0420 |
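The probabilities in the table can be reproduced from the example coefficients (intercept 2.0, Income -0.3, Credit Score -0.01, Age 0.05). The following is a minimal sketch with hypothetical names; the inputs are the rounded values shown in the table, so results agree to about four decimal places.

```python
import math

# (borrower ID, income in $1000s, credit score, age), copied from the table.
borrowers = [
    (1, 5.00, 608, 41), (2, 9.61, 643, 63), (3, 7.86, 791, 44),
    (4, 6.79, 713, 68), (5, 3.25, 685, 46), (6, 3.25, 491, 61),
    (7, 2.46, 576, 47), (8, 8.93, 460, 35), (9, 6.81, 759, 34),
    (10, 7.66, 613, 66),
]

def default_probability(income: float, credit_score: float, age: float) -> float:
    """Apply the example logit model and convert the logit to a probability."""
    z = 2.0 - 0.3 * income - 0.01 * credit_score + 0.05 * age
    return 1.0 / (1.0 + math.exp(-z))

computed = {}
for borrower_id, income, score, age in borrowers:
    computed[borrower_id] = default_probability(income, score, age)
    print(f"{borrower_id:>2}  {computed[borrower_id]:.4f}")
```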
Insights
- Borrower 6 has the highest risk (about 30%) because of low income and a very low credit score.
- Borrowers 3 and 9 have very low risk (<0.3%) thanks to strong credit scores and decent income.
- Age effect: Older borrowers (like Borrower 10 at age 66) show slightly higher probabilities compared to younger ones with similar profiles.
- Income effect: Higher income generally reduces default risk (Borrower 2 with $9.61k/month has only 1.5% risk).
- Credit score effect: This is the strongest driver in this sample. Low scores sharply increase default probability even if income is moderate.
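One rough way to compare the three drivers is to multiply each coefficient by the range that variable spans among the ten borrowers, which gives the largest shift in the logit that variable can produce within this sample. The sketch below assumes the example coefficients and uses hypothetical names; it is a heuristic comparison, not a formal importance measure.

```python
# (income in $1000s, credit score, age) for the ten simulated borrowers.
borrowers = [
    (5.00, 608, 41), (9.61, 643, 63), (7.86, 791, 44), (6.79, 713, 68),
    (3.25, 685, 46), (3.25, 491, 61), (2.46, 576, 47), (8.93, 460, 35),
    (6.81, 759, 34), (7.66, 613, 66),
]
names = ("Income", "Credit Score", "Age")
coefficients = (-0.3, -0.01, 0.05)  # illustrative values from the example

swings = {}
for idx, (name, beta) in enumerate(zip(names, coefficients)):
    values = [row[idx] for row in borrowers]
    # Largest change in the logit this variable produces across the sample.
    swings[name] = abs(beta) * (max(values) - min(values))
    print(f"{name}: min = {min(values)}, max = {max(values)}, logit swing = {swings[name]:.2f}")
```

In this sample the credit score range (460 to 791) moves the logit by about 3.3, compared with roughly 2.1 for income and 1.7 for age.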
Step‑by‑Step in SPSS GUI
- Open your dataset
  - Make sure your dependent variable (Default) is coded as 0 = No Default, 1 = Default.
  - Independent variables: Income, CreditScore, Age.
- Navigate to Logistic Regression
  - Click on Analyze in the top menu.
  - Go to Regression → Binary Logistic…
- Select the Dependent Variable
  - In the dialog box, move Default into the Dependent field.
- Select the Independent Variables
  - Move Income, CreditScore, and Age into the Covariates field.
- Choose Method
  - By default, SPSS uses Enter (all predictors entered at once).
  - You can also choose Forward or Backward stepwise methods if you want SPSS to select variables automatically.
- Set Options (Optional)
  - Click Options… to request classification plots, the Hosmer-Lemeshow goodness-of-fit test, confidence intervals for Exp(B), etc.
  - Click Save… if you want SPSS to generate predicted probabilities or group membership for each case.
- Run the Analysis
  - Click OK to run the logistic regression.
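Binary logistic regression is estimated by maximum likelihood. To show what that estimation involves outside the GUI, here is a minimal pure-Python sketch that fits the same three-covariate model by Newton-Raphson. It is not SPSS's implementation: the function names (`sigmoid`, `solve`, `fit_logit`), the sample size, the uniform ranges, and the simulated data (drawn from the illustrative coefficients used earlier) are all assumptions made for illustration.

```python
import math
import random

def sigmoid(z):
    """Logistic function, with z clamped to avoid overflow in math.exp."""
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_logit(rows, y, max_iter=25, tol=1e-8):
    """Newton-Raphson maximum-likelihood fit; each row starts with 1.0 for the intercept."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(max_iter):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for x, yi in zip(rows, y):
            p = sigmoid(sum(b * xj for b, xj in zip(beta, x)))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (yi - p) * x[i]          # score (gradient of log-likelihood)
                for j in range(k):
                    hess[i][j] += w * x[i] * x[j]   # observed information matrix
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Simulated applicants (hypothetical data, not real bank records).
random.seed(0)
rows, y = [], []
for _ in range(2000):
    income = random.uniform(2.0, 10.0)     # $1000s per month
    score = random.uniform(300.0, 850.0)   # credit score
    age = random.uniform(21.0, 70.0)       # years
    p = sigmoid(2.0 - 0.3 * income - 0.01 * score + 0.05 * age)
    rows.append([1.0, income, score, age])
    y.append(1 if random.random() < p else 0)

beta = fit_logit(rows, y)
for name, b in zip(("Intercept", "Income", "CreditScore", "Age"), beta):
    print(f"{name:>12}: B = {b:+.4f}, Exp(B) = {math.exp(b):.4f}")
```

With enough simulated cases, the estimated coefficients should land near the values used to generate the data, with the same signs.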
SPSS Output
- Omnibus Tests of Model Coefficients → chi-square test of whether the predictors, taken together, improve the model over the intercept-only model.
- Model Summary → −2 Log Likelihood, Cox & Snell R², Nagelkerke R².
- Classification Table → how well the model predicts default vs no default.
- Variables in the Equation → coefficients (β, labeled B), standard errors, Wald tests, and odds ratios (Exp(β), labeled Exp(B)).
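Several of these quantities follow from simple formulas. The sketch below shows the standard definitions of the Wald statistic, Exp(B) with an approximate 95% confidence interval, and the Cox & Snell and Nagelkerke R² values. The function names and all input numbers are hypothetical, chosen only to illustrate the arithmetic.

```python
import math

def wald_statistic(b: float, se: float) -> float:
    """Wald chi-square for one coefficient: (B / S.E.) squared."""
    return (b / se) ** 2

def odds_ratio_with_ci(b: float, se: float, z: float = 1.96):
    """Exp(B) with an approximate 95% confidence interval."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

def pseudo_r_squared(neg2ll_null: float, neg2ll_model: float, n: int):
    """Cox & Snell and Nagelkerke R-squared from the two -2 log-likelihood values."""
    cox_snell = 1.0 - math.exp((neg2ll_model - neg2ll_null) / n)
    max_cox_snell = 1.0 - math.exp(-neg2ll_null / n)  # upper bound of Cox & Snell
    return cox_snell, cox_snell / max_cox_snell

# Hypothetical inputs: B = -0.3 with S.E. = 0.1; -2LL falls from 100 to 80 with n = 100.
wald = wald_statistic(-0.3, 0.1)
exp_b, ci_low, ci_high = odds_ratio_with_ci(-0.3, 0.1)
cox_snell, nagelkerke = pseudo_r_squared(100.0, 80.0, 100)
print(f"Wald = {wald:.2f}")
print(f"Exp(B) = {exp_b:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
print(f"Cox & Snell R2 = {cox_snell:.3f}, Nagelkerke R2 = {nagelkerke:.3f}")
```

A confidence interval for Exp(B) that excludes 1 points the same way as a significant Wald test for that coefficient.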
Suggestion:
Any bank can apply this model to evaluate
loan applicants by defining a set of independent variables and entering the
relevant data into SPSS. The analyzed results can then be reviewed to support
informed decision-making on whether to approve a loan for a particular
applicant.