Indi Assignment
So, talking in brief about the company, IBM (International Business Machines) is an American
multinational technology corporation headquartered in New York that operates in over 171 countries.
It produces and sells computer hardware, middleware and software, and provides hosting and
consulting services in areas ranging from mainframe computers to nanotechnology.
We have particularly tried to analyze and develop insights regarding the attrition problem.
Attrition is something that impacts all businesses, irrespective of geography, industry and size of
the company. It can lead to significant costs for a business and hurt employee morale. Even at IBM,
attrition has been rising above the industry average. The executives need an analysis of the key
factors causing attrition and a diagnosis of which departments and job functions are impacted the
most. This will help them track the trends in attrition and come up with strategies to minimize it.
We have captured the crucial terms involved in the dataset, and we can go through the terms that
are significant for understanding the analysis and insights. The data includes basic employee
demographics, the employee's department, and information related to the employee's job profile,
such as the role, the job level, and whether the employee was satisfied with the profile and the
environment. It also contains information about working hours, monthly income, employee
performance, the number of times the employee has been promoted, etc.
Lending Club is a peer-to-peer lending company based in San Francisco, California, specializing in
loans at low interest rates to borrowers. It is a platform that aims to connect people who need
money with those who have money to invest.
Here I have the dataset from Lending Club’s operations for the period 2009 to 2014, so it’s a
pretty old dataset. It contains a total of 113937 loans with 81 variables on each loan. It is a
financial dataset covering the loans, borrowers, lenders, interest rates and so on. I am using this
data from Lending Club to analyse it and find patterns in it.
So, basically, this is the dataset and, as you can see, sir, it is huge and detailed, and it would
be tedious to go through it entirely. To still understand its essence, I have captured the crucial
terms involved in the dataset, and we can go through the terms that are significant for
understanding the analysis and insights.
Firstly, we have Listing Key, Listing Number and Creation Date, which together give a unique key
for the listing, a number that uniquely identifies the transaction, and the date the listing was
created. Member Key is a unique numeric value associated with the borrower.
LoanKey is a unique key, or unique numeric value, associated with each loan.
Secondly, another very fundamental variable is Term, which is basically the length of the loan.
Next, we have Loan Status, which can be Cancelled, Chargedoff (a debt that the company believes it
will no longer collect because the borrower has become delinquent on payments), Completed, Current,
Defaulted, FinalPaymentInProgress or PastDue. Customers are categorised under these loan status
categories.
Then we have Closed Date, which applies to Cancelled, Completed, Chargedoff and other loan statuses
where the loan has closed.
Then there is the Borrower’s Annual Percentage Rate (APR), which is the yearly interest on a sum
that is charged to borrowers or paid to investors. APR is the annual rate of interest without
accounting for the compounding of interest within that year.
Lender Yield is the annual net return that an investor/lender earns on an investment.
Mathematically, it is equal to the interest rate on the loan less the servicing fee.
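As a minimal sketch of the lender yield relation just described, assuming both the interest rate and the servicing fee are expressed as annual fractions. The function name and the 1% fee below are illustrative assumptions, not values from the dataset.

```python
def lender_yield(borrower_rate: float, servicing_fee: float) -> float:
    """Lender yield = interest rate on the loan minus the servicing fee."""
    return borrower_rate - servicing_fee


# Illustrative numbers only: a 15% loan with an assumed 1% servicing fee.
print(f"Lender yield: {lender_yield(0.15, 0.01):.2%}")
```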
There is also a rating, assigned at the time the listing was created, which reflects the riskiness
of the security. The rating scale used is as follows: AA – A – B – C – D – E – HR. This is in
ascending order of riskiness.
There is a score assigned to each listing, which is basically a custom risk score built using
historical loan data. The score ranges from 1 to 11, with 11 being the best, or lowest-risk, score.
There is Listing Category, which is simply the purpose of the loan.
Then there are the occupation of the borrower, the employment status of the borrower, the length in
months of that employment status, and the monthly income or income range of the borrower. There is
also the credit score, which represents the creditworthiness of an individual as provided by a
consumer credit rating agency.
The dataset also records whether there are any delinquencies and what the total delinquent amount
is. Delinquent means being behind on payments; once you are delinquent for a certain period of
time, your lender will declare the loan to be in default.
Finally, there is information about the borrower's prior loans: the total number of loans he or she
had before the listing, the total on-time payments the borrower made on those loans, the total
principal borrowed at the time the listing was created, and the principal outstanding at the time
the listing was created. Needless to say, these values will be null if the borrower had no prior
loans.
-------------------------------------------------------------------------------------------------------------------------------------
The first question that needs to be asked is: HOW LONG A TERM DO PEOPLE USUALLY OPT FOR? Let’s
answer this question with a pie chart and stacked bars. As we can see, Lending Club offers loans
for a period of 1, 3 or 5 years only. We can see that no loans run for less than one year, and the
most popular loan term is 3 years, although some people do choose 5 years as well.
Based on the pie chart, we can see that 77.04% of the borrowers have opted for the 3-year loan
term, making it the most sought-after loan term.
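The percentages behind a pie chart like this can be computed with a simple count. The following is only a sketch: it assumes the loan term is available as a number of months (12, 36 or 60), and the sample list is made up for illustration, not taken from the dataset.

```python
from collections import Counter


def term_shares(terms):
    """Return the percentage share of each loan term, keyed by term."""
    counts = Counter(terms)
    total = sum(counts.values())
    return {term: 100 * n / total for term, n in sorted(counts.items())}


# Made-up sample of loan terms in months, not the real data.
sample_terms = [36, 36, 60, 36, 12, 36, 60, 36]
for term, share in term_shares(sample_terms).items():
    print(f"{term}-month loans: {share:.1f}%")
```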
This bar graph splits the number of loans into defaulted, completed, cancelled and still ongoing,
based on the color scheme. So, if we consider a term of 3 years, around 4800 loans have defaulted,
around 36000 loans are currently ongoing, around 34000 loans have been completed, around 10000
loans are close to default, and around 5 loans were cancelled.
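The counts behind a stacked bar chart like this are a cross-tabulation of term against loan status. Here is a minimal sketch of that tabulation; the function name and the sample rows are made up for illustration, and the real column names in the dataset may differ.

```python
from collections import Counter, defaultdict


def status_counts_by_term(loans):
    """Count loans per term and status; `loans` holds (term, status) pairs."""
    table = defaultdict(Counter)
    for term, status in loans:
        table[term][status] += 1
    return table


# Made-up rows, not the real data: (term in months, loan status).
sample_loans = [
    (36, "Completed"), (36, "Current"), (36, "Defaulted"),
    (36, "Current"), (60, "Current"), (12, "Completed"),
]
for term, counts in sorted(status_counts_by_term(sample_loans).items()):
    print(term, dict(counts))
```

Each inner counter corresponds to one bar, and each status within it to one colored segment.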
The final graph gives a snapshot of the status of all loans disbursed in the 5-year period. We can
see that 38074 loans have been completed and 56576 loans are current, whereas 11992 loans have
been charged off and 5018 have defaulted.
Lending Club can focus on the other two terms, 1 year and 5 years, in order to increase its
customer base.
-------------------------------------------------------------------------------------------------------------------------------------
For a financial institution, understanding which customers have been good with their repayments
and which have not is extremely important. That is the reason I delved further into customer
analysis. It also helps to understand which customer base the company should look to attract more
of in the future, and which segments of customers it should reduce its exposure to in order to make
its loan book stronger.
At the top you see a chart mapping the median borrower rate against the average default rate, with
the borrowers plotted by profession, and from it we are able to gather useful insights. Professions
such as doctors, computer programmers, judges and professors have the lowest borrower rates and
almost negligible default rates, which we attribute to their high-paying jobs. The company should
continue to have good exposure to such loans for safe cash flow.
In terms of providing riskier loans for higher profits, lower-income professionals such as nurses,
bus drivers and teacher's aides are much safer bets than students, with the exception of graduate
and senior students. These lower-paid professionals take out loans at a higher rate and have a much
lower default rate. We see similar rates being offered to drivers and military officers, who have
extremely low default rates compared to the various types of students. Hence, the company can look
into reducing the borrower rate for groups of people who have a solid past record. This might also
help the company grow in that customer base.
Focusing on students, I plotted the default rate for the various kinds of students. Sophomore and
junior college students have taken fewer loans but have an abnormally high default rate of around
10-15%. These are really risky loans, and exposure to them should be avoided by Lending Club or
kept to a limited quota through a thorough vetting process.
Another way to reduce defaulted or charged-off loans is to target the customer base of individuals
earning $50000 or higher, as they have an average default rate of 2%. Going below $50000 increases
that to about 5-5.5%, rising to almost 9% in the no-income group. Special marketing and ad
campaigns offering good benefits to high-income individuals should be created to bring in more such
customers and, in the process, reduce the chances of loan default.
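The default rates per occupation, student type or income range quoted above all come down to the same calculation: the share of loans in each group that ended badly. The sketch below shows one way to compute it. Treating both Defaulted and Chargedoff as "default" is an assumption on my part, and the group labels and rows are made up for illustration.

```python
from collections import defaultdict

# Assumption: these two statuses are what counts as a "default" here.
BAD_STATUSES = {"Defaulted", "Chargedoff"}


def default_rate_by_group(loans):
    """Return {group: default rate in %} for (group, status) pairs."""
    totals = defaultdict(int)
    bad = defaultdict(int)
    for group, status in loans:
        totals[group] += 1
        if status in BAD_STATUSES:
            bad[group] += 1
    return {group: 100 * bad[group] / totals[group] for group in totals}


# Made-up rows, not the real data: (income range, loan status).
sample_loans = [
    ("$50000 or higher", "Completed"), ("$50000 or higher", "Current"),
    ("Below $50000", "Chargedoff"), ("Below $50000", "Completed"),
    ("No income", "Defaulted"),
]
for group, rate in default_rate_by_group(sample_loans).items():
    print(f"{group}: {rate:.1f}% default rate")
```

The same function works for occupation or student type by passing that field as the group.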
Recommendations:
· Lower-income professionals should be targeted over students because they have lower
average default rates.
· If students are targeted at all, college seniors and graduate school students should be
focused on, as they have the lowest default rates among students.
· Loans to no-income borrowers should be kept to a minimum or a fixed quota should be
created, as they have an extremely high default rate.
So basically, this will be an important strategic choice for the company: based on the default rate
and income range of each cohort, it can decide which cohorts to pursue.