2019 Machine Learning in Credit Risk Report

INTRODUCTION

The adoption and implementation of Machine Learning (ML) continues to progress and gather momentum in the financial services sector, as it has in other sectors including agriculture, health, and marketing. Within financial services, prominent areas of application have included credit risk and the detection of money laundering and fraud.

The Institute of International Finance (IIF) has been analyzing financial institutions' applications of ML in credit risk through various surveys and research papers.[1] Building on our 1st comprehensive survey report in March 2018, this new report examines the continuing evolution and progress over the last year and a half.[2]

This refreshed study covered five broad topics:[3] state of maturity; area(s) of application; benefits; challenges; and regulatory engagement.

Our study finds that the adoption of ML in credit risk modeling and management has increased significantly in the last year, in particular with a sharp increase in the number of financial institutions (FIs) running pilot projects with these techniques. Although the number of FIs using ML in production has risen only modestly, the sophistication of these ML models has increased significantly. The breadth of application across customer segments has also progressed significantly. Whereas applications in 2018 were found to be primarily for credit decisioning in retail portfolios, with some credit monitoring in the large corporate segment, this year has seen a sharp increase in usage for small and medium-sized enterprise (SME) portfolios.

Our study found that the adoption of these ML techniques delivers numerous benefits to FIs, including improved model accuracy, overcoming data deficiencies and inconsistencies, and discovery of new risk segments or patterns.
The adoption of the technology also presents new challenges, particularly those centered around supervisory understanding of or consent to use new processes, difficulty explaining processes, IT-infrastructure-related problems, and lack of appropriate talent.

As with our 1st Edition survey, IIF staff surveyed a globally diverse sample of 60 firms, including 52 of the same firms that participated in 2018.[4] The relatively subtle sampling changes did not materially affect the overall results or any of the key trends presented in the report.[5] The sample of participants spanned multiple firms on all continents, and a broad spectrum of firm sizes.[6] Noting that there are some differing views in defining “machine learning,” a broad, inclusive scope was applied for the purpose of this study, including approaches that conform to at least some of the distinctive machine learning features.[7]

[1] Beyond the IIF credit risk surveys, other reports include IIF Machine Learning in Anti-Money Laundering, Explainability in Predictive Modeling and Bias and Ethical Implications in Machine Learning.
[2] This report presents an abbreviated public summary of the key themes of the IIF's Machine Learning in Credit Risk, 2nd Edition Detailed Report published on July 26, 2019. Distribution of that Detailed Report is limited to the official sector (supervisory community) and the 60 financial institutions that participated in the survey.
[3] All figures and tables contained in this report are from our 2018 and 2019 survey results, unless otherwise stated.
[4] IIF staff interviewed participant firms during the period of January to July 2019. While our survey and interviews were framed to be representative on a firm-wide basis, it is acknowledged that there may be limitations in some responses, given the scale of some of the participating firms, and the visibility of some individual interviewees.

MATURITY LEVELS

In terms of maturity levels, there has been a significant increase in the number of firms that either have ML models in production or active pilot projects. Looking only at the group of firms with pilot projects, we see a striking surge compared to last year's results. As in 2018, our results again indicate that adoption is not exclusive to developed economies, or to large firms (see Figure 1). Rather, maturity continues to be aligned to firms' own business strategy and innovation agenda.

Figure 1: Machine Learning Maturity Levels by Firm Size (Total Assets, USD)

ML use for credit risk has risen sharply across all geographies over the past year. One particularly interesting case is that of Japan, a region that in 2018 had little use of ML but has seen a drastic increase in the number of pilot projects. 42% of FIs in our sample are now using ML in production, with a further 45% running ML pilot projects, and 10% planning to start using ML in the next 6-12 months (see Figure 2). Only 3% of FIs in the sample (compared to 12% in 2018) have no plans to adopt machine learning in the credit risk function in the foreseeable future.

[5] The geographical composition of the 2019 sample (and changes relative to 2018) was United States 10 (an additional 2), Canada 5, South America 4, Euro Area 14 (an additional 1), Other Europe 7, Asia excluding Japan 3 (2 fewer), Japan 6, Australia 5, and Middle East and Africa 6 (1 fewer).
[6] The 60 participant firms include 16 with assets greater than $1t, 17 in the range of $500b to $1t, 14 in the range of $150b to $500b, and 13 below $150b.
[7] See Appendix for a description of the four attributes of machine learning applied.

[Figure 1 chart. Categories (total assets): $1t plus; $500b-$1t; $150b-$500b; under $150b. Series: Firms using ML in production; Firms experimenting (pilot projects); Firms planning to use ML; Firms with no ML plans. Scale: 0% to 100%.]

Figure 2: Maturity Level in Application of ML (2018-2019)

APPLICATION IN CREDIT RISK

The most common use and application cases of ML within credit risk are in the area of credit scoring and decisioning (see Figure 3). FIs have moved away from using ML for regulatory areas such as capital, stress testing, and provisioning, focusing ML application instead on areas such as credit monitoring and collections, restructuring, and recovering. Many FIs specified that existing regulatory requirements do not always align with the direct application of ML, as regulatory models need to be simple, whereas ML models can be more difficult (although not impossible) to interpret and explain.

Figure 3: Application of ML by Areas of Usage (2018-2019)

Key models developed through ML have been grouped into two main categories: those that use ML across multiple segments and products, and those that use ML for narrow segments and products.
The level of sophistication and complexity of the ML model used ranges from “lower-layer ML use” (i.e., where ML is used specifically for a function in the model development process), through “full ML model use,” to “full ML model use, including the use of unstructured data.”

[Figure 2 chart. Categories: Firms using ML in production; Firms experimenting (pilot projects); Firms planning to use ML; Firms with no ML plans. Series: 2019, 2018. Scale: 0 to 30.]

[Figure 3 chart. Categories: Regulatory Capital; Stress Testing; Economic Capital; Provisioning; Credit Scoring and Decisioning; Credit Monitoring, Incl. Early Warning Systems; Collections, Restructuring and Recovering. Series: 2019, 2018. Scale: 0% to 80%. Annotations: “FIs' strategic move has been to other credit risk areas”; “Only 10% in production compared to 15% in 2018”; “37% in production, compared to 23% in 2018”; “25% in production”; “22% in production”.]

FIs using ML across multiple segments and products are typically more mature in their use of ML techniques, having both a longer period of usage and covering several functions of the credit risk process (i.e., data cleaning/feature extraction, data exploration/segmentation, model development, and model validation). Additionally, most of these FIs have expanded the use of ML into additional portfolio types, such as SMEs or corporate portfolios.

FIs using ML in a narrow segment show more diversity in their level of sophistication: several firms use ML across several credit risk functions with more complex techniques but focus on only one segment or market, while others use ML only in a lower-layer capacity, primarily for variable selection and segmentation.

Firms continue to use ML in the model validation function, developing benchmark or “challenger” models built using competing modeling functions. However, ML's main use has been in model development, in particular for model building and variable selection.
One key insight is that ML's analytical power has allowed FIs to filter through many more variables in search of significant predictors.

The trend of focusing on existing retail portfolios continues, given that this is where most FIs possess larger volumes of standardized, high-quality data. The number of FIs deploying models in production for SME portfolios has risen drastically since 2018 (see Figure 4).

Figure 4: Application of ML by Portfolio Type (2018-2019)

[Figure 4 chart. Categories: Retail; SMEs; Corporate; Other non-retail. Series: 2019, 2018. Scale: 0% to 70%. Annotations: “28% started new pilots”; “23% in production, a sharp increase compared to 2018”; “22% new pilots started”.]

Another key insight is the increase in the number of FIs using natural language processing to analyze sources such as news feeds, annual reports, network analysis of supply chains, or social media-sourced information, in particular for models and pilots using ML to develop early-warning signals for monitoring deteriorating credit.

BENEFITS AND CHALLENGES

Our findings indicate relatively little change in the benefits identified across 2018 and 2019 (see Figure 5).[8]

[8] There were three benefit options that overlapped between the 2018 survey and 2019 survey: i) “increase model accuracy”; ii) “more efficiency in model development process”; and iii) “overcoming data deficiencies and inconsistencies.”

Figure 5: Change in Share of Overall Respondents Selecting Key Benefit Options* (% change from 2018 to 2019)

Key benefits include more accurate models, overcoming data deficiencies and inconsistencies, and discovery of new risk segments or patterns (see Figure 6). ML's analytical power allows FIs to filter through many more variables and use them in a holistic manner, enabling the extraction of greater predictive insights. Moreover, unlike traditional models, which require complete information, ML can operate with incomplete or inconsistent data.
This allows firms leveraging ML to better extract insights from large data sets that may contain deficiencies. Several institutions mentioned the ability of ML to develop strongly predictive models even when using highly correlated, skewed data.

Another benefit of leveraging ML is the ability to develop a multitude of models, targets, and design constructs, which allow FIs to identify complex patterns for segments of the population and gain a more granular understanding of these patterns, enabling firms to expand services to new customer segments. Indeed, machine learning can be a powerful force for financial inclusion and its potential is alluring.

[Figure 5 chart. Categories: Increased model accuracy; More efficiency in model development process; Overcoming data deficiencies and inconsistencies. Scale: -10% to 10%. * The three benefit options that were listed in both the 2018 and 2019 surveys.]

Figure 6: Key Benefits of Applying ML Compared to Previously Used Approaches (Share of Respondents)

[Figure 6 chart. Categories: Other; Cost savings; Ability to conduct holistic analysis of different data sources; Overcoming data deficiencies and inconsistencies; More efficiency in model development process; Discovery of new risk segments or patterns that were previously unnoticed; Increased model accuracy. Scale: 0% to 70%.]

While the benefits have been reasonably stable, firms' perceptions of their key challenges have evolved over the past year, with firms encountering more challenges as their knowledge of and familiarity with ML have increased (see Figure 7).
Of the seven challenge options across both survey editions, five were identified by more respondents this year.[9]

Figure 7: Change in Share of Overall Respondents Selecting Key Challenge Options* (% change from 2018 to 2019)

[Figure 7 chart. Categories: Lack of support from key stakeholders; Difficulty of validating results; Difficulty of explaining processes; Cost of implementing technology; Governance issues; IT infrastructure-related problems; Supervisory understanding of or consent to use new processes. Scale: -100% to 500%. * The seven challenge options that were listed in both the 2018 and 2019 surveys.]

Some of the most common challenges this year include supervisory understanding of or consent to use new processes, difficulty of explaining processes, IT infrastructure-related problems, and lack of appropriately skilled staff (see Figure 8).

Figure 8: Key Challenges of Using ML Compared to Previously Used Approaches (Share of Respondents)

[Figure 8 chart. Categories: Other; Lack of support from key stakeholders; Governance issues; Difficulty of validating results; Availability of appropriately skilled staff; IT infrastructure-related problems; Data quality; Cost of implementing technology; Difficulty of explaining processes; Supervisory understanding of or consent to use new processes. Scale: 0% to 80%.]

[9] There were seven challenge options that overlapped between the 2018 survey and 2019 survey: i) “difficulty of validating results”; ii) “governance issues”; iii) “supervisory understanding of or consent to use new processes”; iv) “lack of support from key stakeholders”; v) “difficulty of explaining processes”; vi) “cost of implementing technology”; and vii) “IT infrastructure-related problems.”

Supervisory understanding of or consent to use new processes was a major challenge for FIs this year. This increased focus is likely attributable in part to the developing awareness by FIs of their regulators' understanding and positions toward ML, thanks to the growing engagement between the two groups on this subject.
In fact, most of the FIs using ML in production have engaged their supervisor in their application of ML in credit risk, an increase of 56% in comparison to 2018. Most of the FIs that have not yet engaged with their supervisor indicated they plan to do so in the upcoming six months. While interactions have been commonly described as “constructive,” there are nevertheless some issues to overcome. Due to the significant potential and dramatic rise of ML in financial services, combined with the steep learning curve of the technology, it is unsurprising that some supervisors are backlogged and struggling to keep pace in this fast-moving space. This helps explain the sentiment among some FIs that there is a lack of clear guidance from supervisors on issues such as the level of “explainability” needed to deploy ML models into production. FIs once again noted t