Bias and Fairness of AI-based systems within Financial Crime
July 25th, 2022
When it comes to fighting financial crime, challenges exist that go beyond the scope of merely stopping fraudsters or other bad actors. Some of the newest, advanced technologies being launched have their own specific issues that must be considered during adoption. One of these concerns arises when AI-based systems are used to gain operational efficiencies in fighting financial crime: model unfairness and data bias can occur when a system is skewed for or against certain groups or categories in the data. Typically, this stems from erroneous or unrepresentative data being fed into a machine learning model. Biased AI systems represent a particularly serious threat when reputations are at stake. In fraud detection, as one example, biased data and predictive models could erroneously associate last names from other cultures with fraudulent accounts, or falsely decrease risk within population segments for certain types of financial activities.
Data bias occurs when the available data is not representative of the population or phenomenon being explored. It also arises when the data lacks variables that properly capture the phenomenon we want to predict, or when the data includes content produced by humans that may carry bias against groups of people. Systematic flaws in thinking and reasoning, usually inherited from cultural and personal experiences, lead to distorted perceptions when making decisions. And while data might seem objective, it is collected and analyzed by humans, and thus can be biased.
A biased dataset can be unrepresentative of society by over- or under-representing certain identities in a particular context. Biased datasets can also be accurate yet representative of an unjust society; in that case, they reflect the discrimination that particular groups face.
Data bias in machine learning (an integral part of any AI-based system) is a type of error in which certain elements of a dataset are more heavily weighted or represented than others. A biased dataset doesn’t accurately represent a model’s use case, resulting in skewed outcomes, low accuracy levels, and analytical errors. Machine learning bias is sometimes called algorithm bias or AI bias.
Predictive models may be trained on data containing human decisions or data that reflects second-order effects of societal or historical inequities. As a result, a model may produce unfair decisions or insights.
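One way to make such unfairness visible is to compare model outcomes across groups. The minimal sketch below, which uses synthetic data and hypothetical column names (`group`, `flagged`) rather than any real fraud system's output, computes per-group flag rates along with a demographic parity difference and a disparate impact ratio.

```python
import pandas as pd

# Hypothetical, synthetic scoring output: one row per account, with a coarse
# group attribute and the model's binary fraud flag.
scored = pd.DataFrame({
    "group":   ["A", "A", "A", "B", "B", "B", "B", "A"],
    "flagged": [1,    0,   0,   1,   1,   0,   1,   0],
})

# Flag rate per group: P(flagged = 1 | group)
flag_rates = scored.groupby("group")["flagged"].mean()

# Demographic parity difference: gap between the highest and lowest flag rates.
parity_diff = flag_rates.max() - flag_rates.min()

# Disparate impact ratio: lowest rate divided by highest rate
# (values well below 1.0 suggest one group is flagged far more often).
impact_ratio = flag_rates.min() / flag_rates.max()

print(flag_rates)
print(f"demographic parity difference: {parity_diff:.2f}")
print(f"disparate impact ratio: {impact_ratio:.2f}")
```

Metrics like these do not prove discrimination on their own, but large gaps are a signal that the training data or model design deserves closer review.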
While there isn’t a silver bullet for remediating the dangers of discrimination and unfairness in AI systems, or a permanent fix to the problem of fairness and bias mitigation in how machine learning models are architected and used, these issues are worth considering for both societal and business reasons.
Doing the Right Thing in AI
Addressing bias in AI-based systems is not only the right thing, but the smart thing for business—and the stakes for business leaders are high. Biased AI systems can:
- Allocate opportunities, resources, or information unfairly
- Infringe on civil liberties
- Pose a detriment to the safety of individuals
- Fail to provide the same quality of service to some people as others
- Impact a person’s well-being if perceived as disparaging or offensive
In fact, there are countless ways in which machine learning models can be trained on biased data, producing biased results that lead financial institutions down the wrong path.
It’s important for enterprises to understand the power and risks of AI bias. A biased AI-based system can lead businesses to make skewed predictions. The threat is that financial institutions using flawed models, or data that introduces race or gender bias into a lending decision, may not even be conscious that they are doing so. Some information, such as names and gender, can act as proxies that categorize and identify applicants in illegal ways. Even if the bias is unintentional, it still puts the organization at risk of failing to comply with regulatory requirements, and it could lead to certain groups of people being unfairly denied loans or lines of credit.
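As one way to surface such proxies, the sketch below checks whether a set of seemingly neutral features can predict a protected attribute. Everything here is an assumption for illustration: the data is synthetic, the correlation is deliberately planted, and the AUC threshold is only a rough guide, not a regulatory standard.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical applicant data: each row is an applicant, each column a
# seemingly neutral feature used in a lending model (e.g., postcode cluster,
# account tenure, transaction volume). All values are synthetic.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))

# Protected attribute the model must not use (e.g., a gender flag).
# It is synthetic and deliberately correlated with the first feature,
# which is exactly what a proxy check should detect.
protected = (X[:, 0] + 0.5 * rng.normal(size=500) > 0).astype(int)

# If the "neutral" features can predict the protected attribute well,
# they are acting as proxies and deserve closer review.
auc = cross_val_score(
    LogisticRegression(max_iter=1000), X, protected,
    cv=5, scoring="roc_auc",
).mean()

print(f"cross-validated AUC for predicting the protected attribute: {auc:.2f}")
# An AUC near 0.5 means little proxy signal; values well above that
# warrant investigating which features carry the protected information.
```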
AI-based systems are often used to help businesses forecast customer demand so they can offer appropriate services to targeted audiences. But bias can throw off such forecasts, leaving financial institutions with inaccurate views of which financial services and products are in demand.
The biggest barrier to the success of AI-based systems is AI adoption, and the biggest obstacle to AI adoption is lack of trust. One of the most frequent causes of poor AI results is bias. The time it takes to incorporate fair and unbiased machine learning insights into decision-making systems is directly proportional to realizing returns on AI-related investments.
Currently, many organizations don’t have the pieces in place to successfully mitigate bias in AI systems. But with AI increasingly being deployed within and across businesses to inform decisions affecting people’s lives, it’s vital that organizations strive to reduce bias, not just for moral reasons, but also to comply with regulatory requirements and build revenue.
“Fairness-Aware” Culture and Implementation
To have the most beneficial outcomes, look for solutions focused on fairness-aware design and implementation, both at the level of nontechnical self-assessment and at the level of technical controls and means of evaluation. Providers should have an analytical culture that treats responsible data acquisition, handling, and management as necessary components of algorithmic fairness, because if the results of an AI project are generated by biased, compromised, or skewed datasets, affected parties will not be adequately protected from discriminatory harm. These are the elements of data fairness that data science teams must keep in mind:
- Representativeness: Depending on the context, either underrepresentation or overrepresentation of disadvantaged or legally protected groups in the data sample may lead to the systematic disadvantaging of vulnerable parties in the outcomes of the trained model. To avoid this kind of sampling bias, domain expertise is crucial for assessing the fit between the data collected or acquired and the underlying population to be modeled. Technical team members should offer means of remediation to correct for representational flaws in the sampling; a simple check of this kind is sketched after this list.
- Fit-for-Purpose and Sufficiency: An important question to consider in the data collection and acquisition process is: will the amount of data collected be sufficient for the intended purpose of the project? The quantity of data collected or acquired has a significant impact on the accuracy and reasonableness of the outputs of a trained model. A data sample that is too small to represent with sufficient richness the significant or qualifying attributes of the members of the population to be classified may lead to unfair outcomes. Insufficient datasets may not equitably reflect the qualities that should be weighed to produce a justified outcome consistent with the desired purpose of the AI system. Accordingly, members of the project team with technical and policy competencies should collaborate to determine whether the data quantity is sufficient and fit-for-purpose.
- Source Integrity and Measurement Accuracy: Effective bias mitigation begins at the very start of the data extraction and collection processes. Both the sources and the tools of measurement may introduce discriminatory factors into a dataset. When incorporated as inputs in the training data, biased prior human decisions and judgments, such as prejudiced scoring, ranking, interview data, or evaluations, become the ‘ground truth’ of the model and replicate the bias in the outputs of the system. To guard against discriminatory harm, the data sample must have optimal source integrity. This involves securing or confirming that the data-gathering processes involved suitable, reliable, and impartial sources of measurement and robust methods of collection.
- Timeliness and Recency: If the dataset includes outdated data, then changes in the underlying data distribution may adversely affect the generalizability of the trained model. Where these distributional drifts reflect changing social relationships or group dynamics, this loss of accuracy regarding the actual characteristics of the underlying population may introduce bias into the AI system. To prevent discriminatory outcomes, the timeliness and recency of all elements of the dataset should be scrutinized; the sketch after this list also includes a basic drift check.
- Relevance, Appropriateness and Domain Knowledge: Understanding and using the most appropriate sources and types of data is crucial for building a robust and unbiased AI system. Solid domain knowledge of the underlying population distribution, and of the predictive goal of the project, is instrumental for selecting optimally relevant measurement inputs that contribute to a reasonable resolution of the defined problem. Domain experts should collaborate closely with data science teams to help determine optimally appropriate categories and sources of measurement.
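To make the representativeness and recency checks referenced above concrete, here is a minimal sketch. The group names, population shares, sample counts, and the planted distribution shift are all hypothetical assumptions; a real review would use the institution's own population benchmarks and data windows.

```python
import numpy as np
import pandas as pd
from scipy import stats

# --- Representativeness ---
# Hypothetical group counts in the training sample vs. known population shares.
sample_counts = pd.Series({"group_a": 720, "group_b": 200, "group_c": 80})
population_share = pd.Series({"group_a": 0.60, "group_b": 0.25, "group_c": 0.15})

# Chi-square goodness-of-fit: does the sample's group mix match the population?
expected = population_share * sample_counts.sum()
chi2, p_value = stats.chisquare(f_obs=sample_counts, f_exp=expected)
print(f"representativeness chi-square p-value: {p_value:.4f}")
# A small p-value indicates the sample's group mix departs from the population
# benchmark and may call for reweighting or further data collection.

# --- Timeliness and recency ---
# Hypothetical values of one model input, captured in an older window and a
# recent window; the recent window has a deliberate, synthetic shift.
rng = np.random.default_rng(1)
older = rng.normal(loc=0.0, scale=1.0, size=1000)
recent = rng.normal(loc=0.4, scale=1.0, size=1000)

# Two-sample Kolmogorov-Smirnov test as a simple drift signal.
ks_stat, ks_p = stats.ks_2samp(older, recent)
print(f"drift KS statistic: {ks_stat:.3f}, p-value: {ks_p:.4f}")
# A large KS statistic with a small p-value suggests the training data no
# longer reflects the current population and should be refreshed.
```

Checks like these are only a starting point; the judgment calls about which population benchmark applies and how much drift is tolerable still belong to the domain experts and compliance leaders described above.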
While AI-based systems that automate decision-making deliver cost savings, financial institutions considering AI as a solution must be vigilant to ensure biased decisions are not taking place. A financial institution’s compliance leaders should be in lockstep with their data science team to confirm that AI capabilities are responsible, effective, and free of bias. Having a strategy that champions responsible AI is the right thing to do, and it may also provide a path to compliance with future AI regulations.
To read more about how NICE Actimize uses AI to fight and detect fraud, visit our Enterprise Fraud Management webpage.