The EBA identifies several main areas where ML may be used: risk differentiation, risk quantification and model validation (where ML models may serve as challenger models), as well as additional areas such as input preparation for the main IRB models (for instance, collateral valuation). Less regulatory focus is required when ML models are used at lower levels of the IRB calculation process. Other considerations may also play a role, such as legal and ethical aspects as well as consumer and data protection.
In the main areas, challenges arise around the availability, quality and representativeness of data. Human judgment, a regulatory requirement for a sound calculation process, may prevent the use of the more complex ML models because their outcomes are difficult to explain.
The complexity, reliability and interpretability of ML model results are a key challenge. For some types of ML models, documenting the underlying assumptions and theory as required by regulation is difficult. Similar explainability challenges appear during the validation process and in IRB governance.
As for the benefits, ML models may improve risk differentiation, both by increasing discriminatory power and by identifying all relevant risk drivers. Similarly, risk quantification benefits from improved predictive ability and better detection of material biases. ML tools are also useful in data collection and preparation processes.
Principles of Prudent Use of ML
Rather than prescribing which ML tools may be used in which parts of the IRB process, the EBA gives recommendations in the form of a principle-based approach. Banks must have an appropriate level of knowledge of the ML model's functioning in the model development (MD), credit risk and control (CRCU) and validation units. Senior management has to be in a position to understand the models. Furthermore, unnecessary complexity should be avoided when building the models.
ML models used in IRB processes should be properly documented, and if human judgment is applied during model development or to override the outcomes, the staff in charge should be in a position to assess the model assumptions.
Particular attention needs to be paid to the validation of these models. Sufficient data quality needs to be ensured, and unstructured data must be used carefully. The rationale behind the choice of hyperparameters, model stability and overfitting should be thoroughly assessed.
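One concrete way to assess overfitting, as called for above, is to compare in-sample and out-of-sample error across models of increasing complexity. The sketch below illustrates this with a simple holdout split; the data, polynomial degrees and the MSE metric are illustrative assumptions, not part of the EBA text.

```python
import numpy as np

# Hypothetical data: the true signal is quadratic, plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 60)
y = x**2 + rng.normal(0, 0.1, 60)

# Holdout split: first 40 points for fitting, last 20 for validation.
x_train, y_train = x[:40], y[:40]
x_hold, y_hold = x[40:], y[40:]

def mse(coefs, xs, ys):
    """Mean squared error of a fitted polynomial on (xs, ys)."""
    return float(np.mean((np.polyval(coefs, xs) - ys) ** 2))

# A growing train/holdout error gap flags overfitting.
for degree in (1, 2, 10):
    coefs = np.polyfit(x_train, y_train, degree)
    gap = mse(coefs, x_hold, y_hold) - mse(coefs, x_train, y_train)
    print(f"degree={degree:2d}  train/holdout error gap={gap:+.4f}")
```

An under-complex model (degree 1) misfits both samples, the well-specified model (degree 2) shows a small gap, and the over-complex model tends to show a larger positive gap, signalling that it has fitted noise in the training sample.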
The Experian Approach
Now that regulators are setting the path towards using ML models in the calculation of capital, a crucial step in the process is the ability to explain model outcomes. When consulted about the use of ML models, our clients often mention explainability as one of their major concerns. Experian has developed an advanced approach to improving the explainability of ML models, so that their outcomes can easily be explained to, and understood by, stakeholders and customers.
Explanations take the form of a variable importance ranking, which can then easily be translated into reason codes. This is achieved through advanced techniques that derive an importance score for each feature based on partial dependencies and SHAP values. SHAP quantifies the contribution that each feature makes to the prediction produced by the model. From this, a confidence level is derived to understand the true importance of individual features.
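The ranking-to-reason-codes step can be sketched in a few lines. Experian's actual plug-in targets tree-based models; as a self-contained illustration, the sketch below uses the closed-form SHAP values of a linear model with independent features, phi_i = coef_i * (x_i - E[x_i]). The feature names, coefficients and data are hypothetical.

```python
import numpy as np

# Hypothetical scorecard features and linear-model coefficients.
FEATURES = ["utilisation", "months_on_book", "num_delinquencies"]
COEFS = np.array([2.0, -0.5, 1.5])

def shap_linear(x, X_background, coefs):
    """Closed-form SHAP values for a linear model with independent features:
    each feature's contribution is its coefficient times its deviation
    from the background mean."""
    return coefs * (x - X_background.mean(axis=0))

def reason_codes(phi, feature_names, top_k=2):
    """Rank features by absolute SHAP value and return the top contributors,
    i.e. the variable importance ranking translated into reason codes."""
    order = np.argsort(-np.abs(phi))
    return [feature_names[i] for i in order[:top_k]]

# Background sample and one applicant (hypothetical data).
X_bg = np.array([[0.3, 24.0, 0.0],
                 [0.6, 12.0, 1.0],
                 [0.9, 36.0, 2.0]])
applicant = np.array([0.9, 12.0, 2.0])

phi = shap_linear(applicant, X_bg, COEFS)
print(reason_codes(phi, FEATURES))  # → ['months_on_book', 'num_delinquencies']
```

For tree-based models the same pipeline applies, with the closed-form step replaced by a tree-specific SHAP computation; the ranking and reason-code translation are unchanged.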
Experian has also developed a standardised framework for developing and deploying ML models with the required level of explainability, making the process 50% faster than standard modelling processes.
Experian’s Explainability plug-in is compatible with tree-based ML models.