Neener Analytics Fair Lending Analysis
November 29, 2016
The principles and requirements of fair lending were established through the Equal Credit Opportunity Act (ECOA, implemented by Regulation B, 12 CFR Part 1002) and the Fair Housing Act (FHA, 42 U.S.C. 3601 et seq.), which prohibit discrimination in lending or credit transactions. It is vital that a lender’s practices do not unfairly affect borrowers based on their age, gender, ethnicity, religion, marital status, or national origin. Examples of unfair practices include differences in approval rates, interest rate availability, loan charges, or access based on any of these factors.
At Neener Analytics, we focus on providing risk evaluation and marketing solutions for all consumer-facing businesses, with special attention paid to methods of risk assessment for unbanked, thin-file, or no-file consumers and consumers otherwise challenged by traditional methods. These individuals have often already been differentially penalized by traditional rating measures due to their lack of opportunity (because of age or other factors) to establish a traditional financial history. Instead, we focus on understanding the borrower and, based on their personality, assessing their desire and capability to work with the lender to maintain their contractual obligation. However, even though the personality indicators that we look for are possessed in some measure by all human beings, it is critical that we assess whether our approach to understanding an individual's personality differentially affects individuals based on any of the factors laid out in the ECOA or FHA. To that end, we present the following analysis of our statistical analytics algorithms, performed under the guidance of Version 2.0 of the CFPB Supervision and Examination Manual.
The data for our analysis come from a sample of individuals who had been approved for a loan by various traditional lenders and had registered a social media account with Neener Analytics. A total of 1200 individuals met these criteria. For this analysis, we considered each individual's name, address, age, income, and monthly loan amount. Our borrowers' average age was 44.5 years, with a standard deviation of 12.1 years.
To determine a borrower's likely ethnic group, we used the WRU package (Imai & Khanna, 2016) for the R software language. The package uses 2010 U.S. Census data to estimate the probability that an individual is White, Black, Hispanic, Asian, or other. It takes an approach based on Bayesian statistics that assigns a probability to each category based on the individual's surname, age, and census tract. The underlying WRU model was trained using voter registration and census data from Florida. The model has a reported overall error rate of 15.8 percent. Within our population, 59.4% of individuals were classified as White, 23.6% as Black, 15.9% as Hispanic, <0.1% as Asian, and <0.1% as other. Other than the small number of individuals classified as Asian, the ethnic/racial distribution of our sample is reflective of the 2010 U.S. Census racial distribution. Because so few individuals were identified as Asian or other, these groups were removed from subsequent analysis.
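The Bayesian step described above can be sketched in a few lines of Python. This is a minimal illustration of the general idea, not the WRU implementation: the function name, the category labels, and every probability in the example tables are hypothetical, and the sketch conditions on surname and census tract only (age is left out for brevity). It assumes that surname and tract are independent given the category.

```python
def posterior_race(surname_probs, tract_probs):
    """Combine P(category | surname) with P(tract | category) via Bayes' rule.

    The posterior is proportional to P(category | surname) * P(tract | category),
    under the assumption that surname and tract are independent given category.
    """
    unnormalized = {
        category: surname_probs[category] * tract_probs[category]
        for category in surname_probs
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no category has a non-zero probability")
    # Normalize so the posterior probabilities sum to 1.
    return {category: value / total for category, value in unnormalized.items()}


# Hypothetical inputs, invented purely for illustration.
p_category_given_surname = {
    "white": 0.60, "black": 0.30, "hispanic": 0.08, "asian": 0.01, "other": 0.01,
}
p_tract_given_category = {
    "white": 0.001, "black": 0.004, "hispanic": 0.002, "asian": 0.001, "other": 0.001,
}

posterior = posterior_race(p_category_given_surname, p_tract_given_category)
# Assign the individual to the single most probable category.
most_likely = max(posterior, key=posterior.get)
```

With these invented numbers, the tract information shifts the most probable category away from the one the surname alone would suggest, which is the reason for combining the two sources.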
To determine a borrower's likely gender, we used the Gender package (Blevins & Mullen, 2015). This package determines an individual's likely gender using their first name and year of birth. Incorporating birth year is important for our data because our borrowers span a wide range of ages and some names have shifted between majority-male and majority-female over that period. Each individual was assigned the most likely gender for their name. In our population, 37.1% of the borrowers were male and 62.9% were female.
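The name-and-birth-year assignment can be illustrated with a small Python sketch. It is not the API of the package cited above: the lookup table, its counts, and the function name are all hypothetical, chosen only to show how the same first name can resolve to a different majority gender depending on birth year.

```python
# Hypothetical counts of births by (first name, birth year): (female, male).
# These values are invented for illustration; they are not real records.
NAME_COUNTS = {
    ("leslie", 1930): (200, 800),
    ("leslie", 1990): (900, 100),
}


def likely_gender(first_name, birth_year, counts=NAME_COUNTS):
    """Return (label, proportion) for the majority gender, or None if unknown."""
    key = (first_name.strip().lower(), birth_year)
    if key not in counts:
        return None
    female, male = counts[key]
    total = female + male
    if total == 0:
        return None
    # Assign the gender that accounts for the larger share of the name's uses.
    if female >= male:
        return ("female", female / total)
    return ("male", male / total)


older_borrower = likely_gender("Leslie", 1930)
younger_borrower = likely_gender("Leslie", 1990)
```

Under these made-up counts the older borrower is assigned "male" and the younger one "female", which is the kind of shift that makes the birth year informative.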
Our lending model identifies individuals who exhibit personality markers indicating that they are unlikely to fulfill a loan obligation. For each individual, we provide a prediction that serves as a risk proxy for lending to that individual. The prediction is derived by first identifying the individual's distribution on the Neener Personality Index©, which is a combination of several different personality measures. The individual's prediction is determined through automated software that looks at the individual's interactions on social media. A second statistical model is then used to identify the risk associated with lending to an individual with that particular personality profile.
Fair Lending Analysis
To conduct the fair lending analysis, we analyzed the prediction produced by our model for each individual and examined the degree to which race- or gender-specific factors could explain an individual's Neener prediction result. We used a stepwise linear regression procedure, starting with a model that included the main effects for gender, race, age, monthly income, and loan amount. We used an alpha value of .05 to determine the significance of the model fit. The overall model yielded a significant fit, p < .001. Age was the only significant predictor, p < .001. A graphical depiction of the obtained t-values for the individual variables is shown in Figure 1. We then tested a second model that added the pairwise interactions between the terms. A chi-square test for nested models did not show that the additional terms provided a significantly better fit to the data.
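One common form of a chi-square test for nested models is the likelihood-ratio test, sketched below in Python. The report does not state which form was used, so treating it as a likelihood-ratio test for Gaussian linear models is an assumption. The residual sums of squares, the sample size, and the number of added interaction terms are hypothetical values, and the function names are illustrative.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with df degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma function.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= z / (a + n)
        total += term
    log_lower = a * math.log(z) - z - math.lgamma(a) + math.log(total)
    return max(0.0, 1.0 - math.exp(log_lower))


def nested_lr_test(rss_reduced, rss_full, n_obs, extra_params):
    """Likelihood-ratio test of a reduced linear model against a fuller one.

    For Gaussian errors the statistic is n * ln(RSS_reduced / RSS_full), which is
    approximately chi-square with `extra_params` degrees of freedom.
    """
    statistic = n_obs * math.log(rss_reduced / rss_full)
    return statistic, chi2_sf(statistic, extra_params)


# Hypothetical values: main-effects model versus model with pairwise interactions.
statistic, p_value = nested_lr_test(
    rss_reduced=500.0, rss_full=495.0, n_obs=1000, extra_params=10
)
interactions_help = p_value < 0.05  # compare against the alpha of .05
```

With these invented inputs the p-value is above .05, so the interaction terms would not be retained, mirroring the kind of conclusion reported above.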
Looking in more detail at the relationship between age and an individual’s Neener prediction, we find a modest positive correlation of .21. However, this finding is well within the regulatory parameters established under the Equal Credit Opportunity Act in 12 CFR 1002.6(b)(2). In addition, our observed relationship is weaker than that seen in FICO scores (.34; Bernerth, 2012). We can correctly identify young, trustworthy individuals without unfairly biasing against older individuals.
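To make the size of these correlations concrete, the sketch below computes a Pearson correlation and the share of variance it implies (the squared coefficient). The helper names and the small age/prediction sample are hypothetical; only the coefficients .21 and .34 come from the text above.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def shared_variance(r):
    """Proportion of variance in one variable linearly associated with the other."""
    return r ** 2


# Hypothetical ages and predictions, invented to show the call.
ages = [25, 35, 45, 55, 65]
predictions = [0.30, 0.42, 0.38, 0.51, 0.47]
r_example = pearson_r(ages, predictions)

# Coefficients reported in the text.
neener_r2 = shared_variance(0.21)
fico_r2 = shared_variance(0.34)
```

Squaring the reported coefficients gives roughly 4.4% shared variance with age for a correlation of .21 and roughly 11.6% for a correlation of .34.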
Fair lending is very important to Neener Analytics; in fact, our whole purpose is to reduce the uncertainty in lending to thin-file or no-file consumers and consumers otherwise challenged by traditional methods. These individuals are most often from groups where disparate lending practices are common. Our analysis shows that the Neener prediction does not bias against individuals based on their race or gender. In addition, our scoring process does a better job of leveling the playing field for individuals across age groups than a more traditional FICO score. By focusing not on whether a borrower can pay back a loan but on whether they will, we can start to move away from questions of financial history. Instead, we focus on an intimate view of an individual’s personality and proclivity for paying back the loan, which allows us to reach disparate populations that are overlooked because of their thin-file, no-file, or challenged status.
Bernerth, J. B. (2012). Demographic Variables and Credit Scores: An empirical study of a controversial selection tool. International Journal of Selection and Assessment, 20(2).
Blevins, C., & Mullen, L. (2015). Jane, John ... Leslie? A Historical Method for Algorithmic Gender Prediction. Digital Humanities Quarterly, 9(3).
Imai, K., & Khanna, K. (2016). Improving Ecological Inference by Predicting Individual Ethnicity from Voter Registration Records. Political Analysis, 24(2), 263-272.