Preventing Bias in Predictive Analytics

At first glance, predictive analytics engines seem like an ideal way to remove human bias from decision-making. After all, these models draw conclusions from data, not stereotypes, so in theory they should be objective. In practice, researchers have found that predictive analytics can carry human biases and even amplify them.

Perhaps the most famous example of AI bias is Amazon’s failed recruitment algorithm. Developers found that the model had taught itself to prefer male candidates because it was trained mostly on men’s resumes. Implicit biases that people may not recognize in themselves can transfer to the algorithms they build.

As companies start to use predictive analytics in areas like creditworthiness and health care, AI bias becomes a more pressing issue. Developers and data scientists must learn to eliminate discrimination in these models.

Identifying Sources of Bias

The first step in preventing bias in predictive analytics is recognizing where it can come from. The most obvious source is misleading data, as in Amazon’s case, where the training data made it seem that top candidates were most often men. Data from unrepresentative samples, or statistics that don’t account for historical context, will cultivate discrimination in an algorithm just as they do in humans.
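One way to make the sampling concern concrete is to compare each group's share of the training data with a reference share. The sketch below is illustrative only: the function name `representation_gaps`, the `gender` field, the toy sample, and the 50/50 reference shares are all assumptions, not details from any real system.

```python
from collections import Counter


def representation_gaps(records, attribute, reference_shares):
    """Return (sample share - reference share) for each reference group."""
    counts = Counter(record[attribute] for record in records)
    total = sum(counts.values())
    return {
        group: counts.get(group, 0) / total - share
        for group, share in reference_shares.items()
    }


# Hypothetical training sample: 8 resumes from men, 2 from women.
sample = [{"gender": "male"}] * 8 + [{"gender": "female"}] * 2

# Assumed reference shares for the applicant pool (illustrative numbers).
gaps = representation_gaps(sample, "gender", {"male": 0.5, "female": 0.5})

for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

A large positive or negative gap suggests the sample over- or under-represents a group relative to whatever reference the team considers appropriate. Choosing that reference is itself a judgment call.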

Developers can also unintentionally generate bias in their algorithms by framing questions the wrong way. For example, one health care algorithm discriminated against Black patients because it used cost as its measure of who needed care. Focusing on cost trends led it to conclude that Black patients were less in need, since they have historically spent less on medical services.

Framing the issue this way fails to account for the years of restricted access to health care that cause these cost-related trends. In this instance, the data itself was not biased, but the way the algorithm analyzed it failed to account for that history.
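The effect of framing can be illustrated with a toy comparison in which the same records are ranked by two different targets. Everything here is hypothetical: the patient records, the field names `annual_cost` and `chronic_conditions`, and the helper `rank_by` are invented for illustration and do not describe the actual health care algorithm.

```python
# Hypothetical patients: p1 has the most health conditions but lower spending.
patients = [
    {"id": "p1", "chronic_conditions": 4, "annual_cost": 3000},
    {"id": "p2", "chronic_conditions": 2, "annual_cost": 6000},
    {"id": "p3", "chronic_conditions": 1, "annual_cost": 2000},
]


def rank_by(records, target):
    """Return patient ids ordered from highest to lowest value of `target`."""
    ordered = sorted(records, key=lambda record: record[target], reverse=True)
    return [record["id"] for record in ordered]


by_cost = rank_by(patients, "annual_cost")         # cost used as a proxy for need
by_need = rank_by(patients, "chronic_conditions")  # a more direct measure of need

print("Ranked by cost:", by_cost)
print("Ranked by need:", by_need)
```

In this toy data, the patient with the most conditions drops to second place when cost stands in for need. The records are identical in both rankings; only the question being asked changes.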

When developers understand where bias comes from, they can plan to avoid it. They can look for more representative data and ask more inclusive questions to produce fairer results.

Taking an Anti-Bias Approach to Development

As teams start to train a predictive analytics model, they need to take an anti-bias approach. It’s not enough to be unbiased. Instead, developers should consciously look for and address discrimination. Proactive measures will prevent implicit prejudices from going unnoticed.

One of the most critical steps in this process is maintaining diversity among the team. Collaborating with various people can compensate for blind spots that more uniform groups may have. Bringing in employees with diverse backgrounds and experiences can help highlight potentially problematic data sets or outcomes.

In some instances, teams can remove all protected variables like race and gender from data before training the algorithm. Scrubbing the data of these variables before training, instead of addressing concerns later, can produce fairer results from the beginning. When demographic information isn’t even a factor, algorithms won’t learn to draw misleading conclusions from it.
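A minimal sketch of this scrubbing step might look like the following. It assumes the training data is a list of dictionaries and that the protected fields are literally named `race` and `gender`. The `scrub` helper and the sample rows are hypothetical.

```python
# Assumed names of the protected columns; adjust to the actual schema.
PROTECTED = frozenset({"race", "gender"})


def scrub(rows, protected=PROTECTED):
    """Return copies of the rows with protected fields removed."""
    return [
        {key: value for key, value in row.items() if key not in protected}
        for row in rows
    ]


# Hypothetical applicant records.
applicants = [
    {"years_experience": 5, "gender": "female", "race": "Black", "hired": 1},
    {"years_experience": 3, "gender": "male", "race": "white", "hired": 0},
]

scrubbed = scrub(applicants)
print(scrubbed)
```

The helper returns new rows and leaves the originals untouched, so the protected fields remain available for later auditing. Dropping the columns does not remove other fields that may correlate with them, so reviewing the trained model's outputs still matters.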

Reviewing and Testing Analytics Models

After producing a predictive analytics engine, teams should continue to test and review it before implementation. Technicians and analysts should be skeptical, asking questions whenever something out of the ordinary arises. When an algorithm produces a result, they should ask “why” and look into how it came to that conclusion.

Teams should always test algorithms with dummy data representing real-life situations. The more closely this data resembles the real world, the easier it will be to spot any potential biases. Using diverse datasets in this process will help reveal a broader spectrum of potential issues.
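One simple check to run on dummy data is to compare how often the model returns a favorable outcome for each group. The sketch below assumes the model's decisions are already recorded as 0 or 1 in an `approved` field. The group labels, the data, and both helper functions are hypothetical.

```python
from collections import defaultdict


def selection_rates(records, group_key, outcome_key):
    """Share of favorable (1) outcomes within each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for record in records:
        totals[record[group_key]] += 1
        positives[record[group_key]] += record[outcome_key]
    return {group: positives[group] / totals[group] for group in totals}


def impact_ratio(rates):
    """Lowest group rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else 1.0


# Hypothetical model decisions on dummy data: 4 of 5 approved in A, 2 of 5 in B.
dummy_results = (
    [{"group": "A", "approved": 1}] * 4 + [{"group": "A", "approved": 0}]
    + [{"group": "B", "approved": 1}] * 2 + [{"group": "B", "approved": 0}] * 3
)

rates = selection_rates(dummy_results, "group", "approved")
ratio = impact_ratio(rates)
print(rates, ratio)
```

A ratio well below 1.0 does not prove discrimination on its own, but it marks a result worth investigating. Some teams use a rule of thumb such as the "four-fifths rule" (a ratio under 0.8) as a flag. The right threshold depends on context and is an assumption here.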

As mentioned earlier, removing protected variables can help in some instances. In other situations, though, it’s better to keep this information and use it to reveal and correct biases. Teams can measure how the algorithm’s results differ across demographic groups and then offset any disparity they find.
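As one possible illustration of offsetting a measured disparity, the sketch below applies a reweighing scheme. Each training record gets a weight chosen so that, in the weighted data, group membership and outcome are statistically independent. This is a swapped-in example technique, not a method the article prescribes. The data, field names, and helper functions are hypothetical.

```python
from collections import Counter


def reweighing_weights(records, group_key, label_key):
    """Weight per (group, label) = expected count under independence / observed count."""
    n = len(records)
    groups = Counter(r[group_key] for r in records)
    labels = Counter(r[label_key] for r in records)
    joint = Counter((r[group_key], r[label_key]) for r in records)
    return {
        (g, y): (groups[g] * labels[y]) / (n * count)
        for (g, y), count in joint.items()
    }


def weighted_positive_rate(records, weights, group_key, label_key, group):
    """Weighted share of positive (1) labels within one group."""
    rows = [r for r in records if r[group_key] == group]
    total = sum(weights[(r[group_key], r[label_key])] for r in rows)
    positive = sum(
        weights[(r[group_key], r[label_key])] for r in rows if r[label_key] == 1
    )
    return positive / total


# Hypothetical historical data: group A was favored (4 of 5) over group B (2 of 5).
history = (
    [{"group": "A", "label": 1}] * 4 + [{"group": "A", "label": 0}]
    + [{"group": "B", "label": 1}] * 2 + [{"group": "B", "label": 0}] * 3
)

weights = reweighing_weights(history, "group", "label")
rate_a = weighted_positive_rate(history, weights, "group", "label", "A")
rate_b = weighted_positive_rate(history, weights, "group", "label", "B")
print(weights, rate_a, rate_b)
```

After weighting, both groups have the same positive rate as the overall data. The weights could then be passed to a training routine that accepts per-sample weights. Whether this kind of correction is appropriate depends on the application, which is why the protected attributes need to be available for measurement in the first place.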