Tackling Bias and Explainability in Automated Machine Learning – TDWI

Automated machine learning is likely to introduce two critical problems. Fortunately, vendors are introducing tools to tackle both of them.

By Fern Halper, August 17, 2020

Adoption of automated machine learning — tools that help data scientists and business analysts (and even business users) automate the construction of machine learning models — is expected to increase over the next few years because these tools simplify model building. For example, in some of the tools, all the user needs to do is specify the outcome or target variable of interest along with the attributes believed to be predictive. The automated machine learning (autoML) platform picks the best model.

These tools offer several benefits. First, they can help data scientists become more productive. Second, autoML can help those who are not data scientists (e.g., modern data analysts) build models. At TDWI, we’ve recommended that organizations using these tools still have the skills to verify the insights produced. A few areas are critical for model builders to address, regardless of their skill level. These include bias and explainability.

Bias comes in many forms. For instance, on the data collection front:

Sample bias occurs when data doesn’t represent the environment (e.g., the problem space) where a model might be deployed
Prejudice bias arises when training data reflects existing prejudices or stereotypes, for example through information about race, gender, or nationality
Exclusion bias occurs when certain data might be removed from the training set because the data is deemed irrelevant

On the model front, you can find:

Measurement bias can be introduced when the training data differs from production data
Algorithmic bias might occur when a model was trained on data that results in unfair outcomes
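
One way to check whether training data differs from production data is to compare how a variable is distributed in each. The sketch below uses the population stability index (PSI), a common drift measure that the article itself does not name; the `channel` variable and its values are hypothetical, and the thresholds people apply to PSI (often 0.1 and 0.25) are rules of thumb rather than standards.

```python
import math
from collections import Counter


def category_shares(values, categories, floor=1e-6):
    """Return each category's share of `values`, floored to avoid log(0)."""
    counts = Counter(values)
    total = len(values)
    return {c: max(counts.get(c, 0) / total, floor) for c in categories}


def population_stability_index(train_values, prod_values):
    """PSI = sum over categories of (prod - train) * ln(prod / train)."""
    categories = set(train_values) | set(prod_values)
    train = category_shares(train_values, categories)
    prod = category_shares(prod_values, categories)
    return sum(
        (prod[c] - train[c]) * math.log(prod[c] / train[c]) for c in categories
    )


# Hypothetical data: the channel through which applications arrive.
train_channel = ["web"] * 70 + ["branch"] * 30
prod_channel = ["web"] * 40 + ["branch"] * 60

psi_same = population_stability_index(train_channel, train_channel)
psi_shift = population_stability_index(train_channel, prod_channel)

print(f"PSI, identical data: {psi_same:.3f}")
print(f"PSI, shifted data:   {psi_shift:.3f}")
```

A PSI near zero means the two distributions match; a larger value suggests the model is being applied to data that looks different from what it was trained on and deserves a closer look.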

Understanding and mitigating bias is crucial because machine learning models often make decisions that affect our lives — in medicine, criminal justice, hiring, and finance.

Explainability involves describing the why behind an ML prediction in a way a human can understand. For example, customers should be able to understand why their loan application was rejected; a doctor should understand why a system might have made a certain diagnosis. Aside from ethical and transparency factors, new regulations also require explainability. For instance, Article 22 of the GDPR gives individuals the right not to be subject to decisions based solely on automated processing and, where such decisions are allowed, the right to obtain human intervention and contest the decision. That requires that a model used to derive business decisions be understandable, which means explainable by those who created the model.

Bias, Explainability, and AutoML

According to TDWI research, if users stick to their plans, autoML adoption is expected to grow significantly over the next few years. That means that, theoretically, business analysts and even business users might be using these tools to build models. These models can be operationalized as part of a business process or they might be used to simply provide insights. Regardless, model builders will need to be able to explain the output and how biased data can affect it.

At a minimum, users need to understand the risk of bias in their data set because much of the bias in model building can be human bias. That doesn’t mean just throwing out variables, which, if done incorrectly, can lead to additional issues. Research in bias and explainability has grown in importance recently and tools are starting to reach the market to help. For instance, the AI Fairness 360 (AIF360) project, launched by IBM, provides open source bias mitigation algorithms developed by the research community. These include bias mitigation algorithms to help in the pre-processing, in-processing, and post-processing stages of machine learning. In other words, the algorithms operate over the data to identify and treat bias.
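
To make the pre-processing stage concrete, here is a minimal sketch of reweighing, a pre-processing technique from the research literature that assigns each training row a weight so that group membership and outcome look statistically independent. It is written in plain Python and does not use the AIF360 library or its API; the group names and labels are hypothetical.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Weight each row by P(group) * P(label) / P(group, label)."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


def weighted_positive_rate(groups, labels, weights, group):
    """Weighted share of favorable (label == 1) rows within one group."""
    favorable = sum(
        w for g, y, w in zip(groups, labels, weights) if g == group and y == 1
    )
    total = sum(w for g, w in zip(groups, weights) if g == group)
    return favorable / total


# Hypothetical training set: group "a" is approved far more often than "b".
groups = ["a"] * 6 + ["b"] * 4
labels = [1, 1, 1, 1, 0, 0] + [1, 0, 0, 0]

weights = reweighing_weights(groups, labels)
rate_a = weighted_positive_rate(groups, labels, weights, "a")
rate_b = weighted_positive_rate(groups, labels, weights, "b")

print(f"weighted approval rate, group a: {rate_a:.2f}")
print(f"weighted approval rate, group b: {rate_b:.2f}")
```

After reweighing, both groups show the same weighted approval rate, so a model trained with these sample weights sees no association between group and outcome in the training data. No variable is thrown out, which is the point made above.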

Vendors, including SAS and DataRobot, are providing features in their tools that help explain model output. One example is a bar chart that ranks each feature’s impact, which makes it easier to tell which features are important in the model. Some vendors provide three kinds of output that help with explainability and bias: feature importance, Shapley partial dependence plots (e.g., how much a feature value contributed to the prediction), and disparate impact analysis. Disparate impact analysis quantitatively measures the adverse treatment of protected classes (e.g., whether any class is being treated differently by race, age, or gender).
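
To illustrate what "how much a feature value contributed to the prediction" can mean, the sketch below computes exact Shapley values for a toy scoring function by averaging each feature's marginal contribution over every ordering of the features. The scoring function, feature names, and all-zero baseline are hypothetical, and this brute-force approach only works for a handful of features; vendor tools rely on approximations and are not shown here.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Average each feature's marginal contribution over all feature orderings."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)  # start from the baseline row
        previous = predict(current)
        for name in order:
            current[name] = instance[name]  # switch one feature to its real value
            value = predict(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_score(row):
    """Hypothetical credit score with one interaction term."""
    return (
        0.5 * row["income"]
        - 2.0 * row["late_payments"]
        + 0.1 * row["income"] * row["years_employed"]
    )


instance = {"income": 4.0, "late_payments": 1.0, "years_employed": 5.0}
baseline = {"income": 0.0, "late_payments": 0.0, "years_employed": 0.0}

contributions = shapley_values(toy_score, instance, baseline)
for name, value in contributions.items():
    print(f"{name}: {value:+.2f}")
```

The contributions add up to the difference between the score for this applicant and the score for the baseline, which is what lets a model builder say how much each input pushed the prediction up or down.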

With these features, model builders can examine the analysis and determine if their model is adversely impacting any group. This output can be used to determine whether the model is fair and what next steps need to be taken with the model.
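
A disparate impact check of the kind described above can be sketched as a ratio of favorable-outcome rates between two groups. The data, the group names, and the 0.8 cutoff below are assumptions for illustration: 0.8 is the "four-fifths" rule of thumb often cited in U.S. employment contexts, not a threshold any particular vendor tool is known to apply.

```python
def favorable_rate(predictions, groups, group):
    """Share of favorable predictions (1) among rows belonging to `group`."""
    selected = [p for p, g in zip(predictions, groups) if g == group]
    return sum(selected) / len(selected)


def disparate_impact_ratio(predictions, groups, protected, reference):
    """Favorable rate of the protected group divided by that of the reference group."""
    return favorable_rate(predictions, groups, protected) / favorable_rate(
        predictions, groups, reference
    )


# Hypothetical model output: 1 = loan approved, 0 = loan rejected.
groups = ["x"] * 10 + ["y"] * 10
predictions = [1] * 6 + [0] * 4 + [1] * 3 + [0] * 7

ratio = disparate_impact_ratio(predictions, groups, protected="y", reference="x")
flagged = ratio < 0.8  # assumed cutoff (four-fifths rule of thumb)

print(f"disparate impact ratio: {ratio:.2f}")
print(f"flag for review: {flagged}")
```

A ratio well below 1 means the protected group receives favorable outcomes less often than the reference group. That is a signal to investigate the model and its data, not proof of unfairness on its own.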

A Final Word

These are tough problems to solve and work is just beginning. The good news is that vendors and end users are both becoming aware of machine learning bias issues. Additionally, they are starting to care about biased model output and take it seriously — and not simply because of legal and compliance issues.

The first step is to become aware and educated about the problem and how to address it. This includes understanding human as well as technology approaches to help mitigate bias.

About the Author

Fern Halper, Ph.D., is well known in the analytics community, having published hundreds of articles, research reports, speeches, webinars, and more on data mining and information technology over the past 20 years. Halper is also co-author of several “Dummies” books on cloud computing, hybrid cloud, and big data. She is VP and senior research director, advanced analytics at TDWI Research, focusing on predictive analytics, social media analysis, text analytics, cloud computing, and “big data” analytics approaches. She has been a partner at industry analyst firm Hurwitz & Associates and a lead analyst for Bell Labs. Her Ph.D. is from Texas A&M University. You can reach her on Twitter @fhalper.