Validation of Risk Recommendation model

Subhashini Sharma Tripathi
Apr 19, 2022

Dear reader, we at Pexitics have launched the ItP (Intention to Pay) Q12 customer survey, which helps profile borrowers across LMRS (Law Abidance, Morality, Responsibility and Self Preservation). This profile produces an LMRS score, which is segmented to create the decision recommendation: AVOID; CAUTION-High; CAUTION-Medium; CAUTION-Low; PROCEED.

The Psychology of Defaulters

Please check out the details of the ItP Q12 score at https://pexitics.com/index.php/pexitics/behavioral-risk-for-loans/

As mentioned, the ItP Q12 score is most useful for understanding the credit risk of New-to-Credit customers and of customers whose credit bureau score is sub-optimal (below the lending norms of an organization).

Today, in this exercise, I will be validating the accuracy of the decisions. I am using Python for this exercise.

Step 1: We start with Exploratory Data Analysis (EDA) to understand the dataset.

import pandas as pd

df = pd.read_csv("path to dataset/dataset.csv")

The pandas-profiling package, a separate library that builds on pandas (renamed ydata-profiling in later releases), is a very smart way to do the EDA.

If you do not have it installed, run this from a terminal: conda install pandas-profiling

# Exploratory analysis

import numpy as np

from pandas_profiling import ProfileReport

profile = ProfileReport(df, explorative=True)

# to view the result in a Jupyter notebook or Google Colab

profile.to_widgets()

# to save the results of pandas-profiling to an HTML file

profile.to_file("EDA.html")

This profiling lets me see the basics of the data, starting with:

1. Overview of the data

Dataset statistics

a. Number of variables: 14

b. Number of observations: 3944

c. Missing cells: 0

d. Missing cells (%): 0.0%

e. Duplicate rows: 0

f. Duplicate rows (%): 0.0%

g. Total size in memory: 1.5 MiB

h. Average record size in memory: 401.3 B

Variable types

i. NUM: 8

j. CAT: 6

2. A numeric and pictorial (graphical) description of each variable. Delinquency is the ‘Y’ variable, that is, the variable we want to predict from the variables scored through the ItP Q12: L, M, R, S (Law Abidance, Morality, Responsibility and Self Preservation). The combination of these variables creates the LMRS node. The Decision variable is an outcome of the micro-segmentation model, which helps create the 5 decisions.

Profiling in pandas

3. Correlations

4. Missing values per variable

5. Top rows printed

6. Bottom rows printed

Thus, this is a comprehensive report on Exploratory Data Analysis — easy to run and save.
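If the profiling package is not available in your environment, the dataset statistics in item 1 can be approximated with plain pandas. This is only a minimal sketch: the four rows below are made up for illustration, and only the column names DECISION and Delinquency are taken from the code used later in this post. The memory figure will not match the profile report exactly.

```python
import pandas as pd

# Tiny made-up frame standing in for the real dataset (illustrative values only)
df = pd.DataFrame({
    "DECISION": ["AVOID", "PROCEED", "CAUTION-Medium", "CAUTION-Low"],
    "Delinquency": [1, 0, 1, 0],
})

n_rows, n_cols = df.shape
missing_cells = int(df.isna().sum().sum())   # NaN count across all cells
duplicate_rows = int(df.duplicated().sum())  # rows identical to an earlier row

overview = {
    "Number of variables": n_cols,
    "Number of observations": n_rows,
    "Missing cells": missing_cells,
    "Missing cells (%)": round(100 * missing_cells / df.size, 1),
    "Duplicate rows": duplicate_rows,
    "Duplicate rows (%)": round(100 * duplicate_rows / n_rows, 1),
    "Total size in memory (bytes)": int(df.memory_usage(deep=True).sum()),
}

for name, value in overview.items():
    print(f"{name}: {value}")
```

On the real data, replace the made-up frame with the df loaded in Step 1.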

Step 2: We need to understand the efficiency of the LMRS Decision Triad. Read more on the LMRS Decision Triad (matrix) in the whitepaper here (https://pexitics.com/wp-content/uploads/2022/04/ItP-Q12-A-detailed-Whitepaper.pdf)

The default customer flag in this dataset is Delinquency. The important point is that although the decisions are divided into 5 parts, the actual on-ground decision is primarily divided into only 2: AVOID and PROCEED.

Let us build a frequency table to understand the effectiveness of the ItP Q12 recommendations.

# frequency table

table = pd.crosstab(df.DECISION, df.Delinquency, margins=True)

# export the table as a csv file

table.to_csv("table.csv")
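Raw counts are easier to compare across decisions when they are expressed as shares of each row. Here is a minimal sketch of that idea. I am assuming that Delinquency is coded 1 for delinquent and 0 otherwise (the coding is not stated above), and the rows are made up.

```python
import pandas as pd

# Made-up rows for illustration; assumes Delinquency is coded 1 = delinquent, 0 = not
df = pd.DataFrame({
    "DECISION": ["AVOID", "AVOID", "AVOID", "AVOID",
                 "PROCEED", "PROCEED", "PROCEED", "PROCEED"],
    "Delinquency": [1, 1, 1, 0, 0, 0, 0, 1],
})

# normalize="index" divides every row by its row total, so each cell is the
# share of that Delinquency value within one decision
rates = pd.crosstab(df["DECISION"], df["Delinquency"], normalize="index")
print(rates)
```

The column for Delinquency = 1 then reads directly as the delinquency rate within each decision.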

Let us create the confusion matrix to better understand effectiveness.

The ItP Q12 Decision Triad converted to final Recommendation for Lending

Note: The matrix is created ONLY for the Recommendation for Lending (AVOID + PROCEED) data. CAUTION-Medium, as per the ItP Q12 Decision Triad, is not used since it refers the case to credit. Thus, a clear recommendation is AVOID / PROCEED.
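One way to build such a matrix in pandas is sketched below. It rests on assumptions of mine: Delinquency is coded 1/0, AVOID is read as a prediction of delinquency, PROCEED as a prediction of non-delinquency, and every other decision is simply dropped. How CAUTION-High and CAUTION-Low fold into the final recommendation is not covered in this sketch, and the rows are made up.

```python
import pandas as pd

# Made-up rows for illustration only
df = pd.DataFrame({
    "DECISION": ["AVOID", "AVOID", "AVOID", "PROCEED", "PROCEED",
                 "PROCEED", "PROCEED", "CAUTION-Medium"],
    "Delinquency": [1, 1, 0, 0, 0, 0, 1, 1],
})

# Keep only the clear lending recommendations
clear = df[df["DECISION"].isin(["AVOID", "PROCEED"])].copy()

# Assumption: AVOID predicts delinquency (1), PROCEED predicts no delinquency (0)
clear["Predicted"] = (clear["DECISION"] == "AVOID").astype(int)

# Rows are the actual outcome, columns are the predicted outcome
confusion = pd.crosstab(
    clear["Delinquency"], clear["Predicted"],
    rownames=["Actual"], colnames=["Predicted"],
)
print(confusion)
```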

Insight: This model is very effective in its decision recommendations.

Insights

It has a high accuracy of 94% for predictions. Breaking this down, 80% of all delinquency predictions are correct. Also, the loss of business is minimal, as correct non-delinquent predictions stand at 98%.
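For readers who want to see how figures of this kind come out of the four cells of a confusion matrix, here is a small sketch. The helper name lending_metrics and the counts are hypothetical and are not the counts from this dataset. I am also reading "correct non-delinquent predictions" as the share of PROCEED predictions that turn out non-delinquent, which is an interpretation.

```python
def lending_metrics(tp, fp, tn, fn):
    """Summarise a 2x2 confusion matrix for AVOID / PROCEED recommendations.

    tp: AVOID and actually delinquent
    fp: AVOID but not delinquent (business lost)
    tn: PROCEED and not delinquent
    fn: PROCEED but actually delinquent
    """
    total = tp + fp + tn + fn
    return {
        # share of all recommendations that matched the outcome
        "accuracy": (tp + tn) / total,
        # share of delinquency predictions that were correct
        "delinquent_precision": tp / (tp + fp),
        # share of non-delinquent predictions that were correct
        "non_delinquent_precision": tn / (tn + fn),
    }


# Made-up counts for illustration only
metrics = lending_metrics(tp=40, fp=10, tn=190, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```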

Did you find this interesting? Continue reading my blog to understand if this effectiveness of the model is consistent across Loan products.

