Originally published in KDNuggets.com
More than a decade ago, on joining a large US Credit Card company, it was surprising to see that Predictive Analytics was limited to multivariate regression and logistic models. This was in contrast to previous stints at Start-Ups funded by NASA / NIST, where a broader set of Machine Learning (ML) methods, including SVMs, NNs, Random Forests and Gradient Boosted Trees, were regularly applied.
There were a number of good reasons for using the simpler models in Retail Lending. Firstly, Decision Frameworks were already in place that made input feature selection a relatively simple exercise. For example, in Credit Decisioning, one could think in terms of the 5Cs of Credit (Character, Capacity, Capital, Collateral, Conditions) and search for Data variables that catered to them. It wasn’t as hard as using Deep Learning to create features from raw Images. Secondly, the relationship of the target variable with the inputs was not complex; for example, Credit risk has a smooth inverse relationship with Income. One did not really need a Radial Basis Function to transform Income into a higher dimensional space, as one needed for SVM based Image classification. Thirdly, unlike today, Training and Deployment Platforms were not amenable to complex methods. Finally, a commonly stated reason was Model explainability (though experienced users of advanced ML models will find this debatable).
With time, the above-mentioned ML methods started getting explored as open source packages became common and Data started arriving in different forms. However, the primary value for a Business still came from identifying powerful new Data that could significantly improve Customer level Decisions. While Alternate Data Sources will always be a focus area, there are certain specific business problems which can be handled much better with new ML Algorithms that have become prevalent only in the last few years. Here, we discuss three such algorithms, focusing on their application in Retail Lending.
RNNs / LSTMs:
Sequence data from Bank Deposit, Loan or Credit Card transactions can be used to generate powerful insights and actions. Some example use-cases:
Common Statistical and ML algorithms are not well structured to handle this type of data. While Statisticians have traditionally created features (for example, Averages over different time windows) that try to capture some of the trend information, LSTM (Long Short-Term Memory) networks are a class of Recurrent Neural Networks that are specifically built to learn from sequence data.
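To make the contrast concrete, here is a minimal NumPy sketch, not a production model. It computes hand-crafted window averages and then runs a single LSTM layer over the same sequence. The monthly spend figures are made up, the function names (`window_averages`, `lstm_forward`) are hypothetical, and the LSTM weights are random and untrained, so the output only illustrates the mechanics. In practice one would train such a layer with a Deep Learning framework.

```python
import numpy as np

def window_averages(amounts, windows=(3, 6, 12)):
    """Hand-crafted trend features: mean of the most recent k observations."""
    amounts = np.asarray(amounts, dtype=float)
    return {k: float(amounts[-k:].mean()) for k in windows}

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(sequence, W, U, b):
    """Run one LSTM layer over a sequence and return the final hidden state.

    sequence: (T, n_in), W: (4*n_hidden, n_in), U: (4*n_hidden, n_hidden),
    b: (4*n_hidden,). Stacked gate order: input, forget, output, candidate.
    """
    n_hidden = U.shape[1]
    h = np.zeros(n_hidden)  # hidden state (short-term memory)
    c = np.zeros(n_hidden)  # cell state (long-term memory)
    for x_t in sequence:
        z = W @ x_t + U @ h + b
        i = sigmoid(z[:n_hidden])                  # input gate
        f = sigmoid(z[n_hidden:2 * n_hidden])      # forget gate
        o = sigmoid(z[2 * n_hidden:3 * n_hidden])  # output gate
        g = np.tanh(z[3 * n_hidden:])              # candidate cell update
        c = f * c + i * g
        h = o * np.tanh(c)
    return h

# Hypothetical 12 months of scaled card spend (made-up numbers).
monthly_spend = np.array([1.0, 1.1, 0.9, 1.2, 1.3, 1.1,
                          1.4, 1.6, 1.5, 1.8, 2.0, 2.2])

# Traditional approach: summarize the sequence with a few window averages.
features = window_averages(monthly_spend)

# LSTM approach: consume the sequence step by step (untrained, random weights).
rng = np.random.default_rng(0)
n_in, n_hidden = 1, 4
W = rng.normal(scale=0.5, size=(4 * n_hidden, n_in))
U = rng.normal(scale=0.5, size=(4 * n_hidden, n_hidden))
b = np.zeros(4 * n_hidden)
final_state = lstm_forward(monthly_spend.reshape(-1, 1), W, U, b)
```

The 12-month average is the same whether the spend is rising or falling, whereas the LSTM state depends on the order of the observations. That order sensitivity is what lets a trained network pick up trend information without manual feature design.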
Recommender Systems:
Recommender Systems have been popularized by their use in Retail (Amazon), Web Streaming (Netflix) and Knowledge Sharing (Quora). Many implementations use Matrix Factorization, a traditional Linear Algebra formulation that was made feasible by faster computational capabilities. Following are a couple of use-cases that apply to Retail Lending:
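As a rough illustration of Matrix Factorization, the sketch below fits a low-rank model to a hypothetical Customer-by-Product affinity matrix using stochastic gradient descent on the observed entries only. The scores, the `factorize` function and all hyperparameters are assumptions chosen for illustration, and SGD is just one of several ways to fit such a model.

```python
import numpy as np

def factorize(R, mask, n_factors=2, lr=0.02, reg=0.01, n_epochs=1000, seed=0):
    """Approximate R by P @ Q.T using only the entries where mask is True."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = R.shape
    P = rng.normal(scale=0.1, size=(n_rows, n_factors))  # customer factors
    Q = rng.normal(scale=0.1, size=(n_cols, n_factors))  # product factors
    observed = np.argwhere(mask)
    for _ in range(n_epochs):
        rng.shuffle(observed)  # visit observed cells in random order
        for u, i in observed:
            err = R[u, i] - P[u] @ Q[i]
            p_old = P[u].copy()
            # Gradient step on squared error with L2 regularization.
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
    return P, Q

# Hypothetical affinity scores (1 to 5): rows are customers, columns are products.
R = np.array([[5.0, 4.0, 1.0, 1.0],
              [4.0, 5.0, 1.0, 2.0],
              [1.0, 1.0, 5.0, 4.0],
              [2.0, 1.0, 4.0, 5.0]])

# Treat two cells as unobserved; the model never sees their values.
mask = np.ones_like(R, dtype=bool)
mask[0, 3] = False
mask[2, 1] = False

P, Q = factorize(R, mask)
predicted = P @ Q.T  # includes estimates for the unobserved cells
rmse_observed = float(np.sqrt(np.mean((R[mask] - predicted[mask]) ** 2)))
```

The entries of `predicted` at the masked positions are the model's estimates of affinities that were never observed, and these are the values a recommender would rank on.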
Deep Learning:
Deep Learning has arguably been the most visible new age Machine Learning approach of the last 5 years, with marked success in generating insights from Large Unstructured Datasets of Images, Audio and Text. Some example use-cases for Retail Lending:
Modern end-to-end Big Data Platforms available today provide the computational power to train new age ML algorithms and streamline their deployment. Variable Importance measures like Partial Dependence and Distance to Decision Boundary can help in Model Explainability. It is still important to use an appropriate technique for a given analytical problem and to avoid unnecessary complexity. Model Robustness, Incremental Business value, Customer experience, Implementation and Governance should be considered paramount. Machine Learning and the deluge of Alternate Data Sources have certainly paved the way for more exciting “Modeling” times in Retail Lending.
About the Author:
Jayesh Ametha is a Retail Banking Professional with 15+ years in Business Strategy, Credit Risk and Advanced Analytics.