Researchers Investigating Popular Predictor Tool, Working to Mitigate Bias in Data

Researchers with the Northwestern Center for Advancing Safety of Machine Intelligence (CASMI) are working to understand what causes bias in a popular, quick-to-train prediction tool.

Romila Pradhan, assistant professor of computer and information technology at Purdue University, is the principal investigator of the CASMI-funded project, “Diagnosing, Understanding, and Fixing Data Biases for Trusted Data Science.” Her research team is investigating data from multiple domains (such as finance and criminal justice) to trace the source of discrimination observed in a commonly used tree-based machine learning (ML) model, the random forest classifier.

“What we observe is that on the unseen data, we might have some bias,” Pradhan said. “We had a model that worked fairly well in training, but it did not perform as well on test data. It may be favoring one gender over another or one race over another. I want to trace all of this bias back to different tiny steps in the machine learning process and eventually to the training data.”
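The kind of disparity Pradhan describes is often quantified by comparing a model's behavior across demographic groups on held-out data. The sketch below is purely illustrative (it is not the project's code, and the group names, predictions, and labels are hypothetical): it computes per-group positive-prediction rates and accuracy, and reports the gap between groups, one common measure of such bias.

```python
# Illustrative sketch, not the project's actual code: measuring per-group
# disparity in a model's held-out predictions. All data here is hypothetical.

def positive_rate(pairs):
    """Fraction of (prediction, label) pairs predicted positive (1)."""
    return sum(p for p, _ in pairs) / len(pairs)

def accuracy(pairs):
    """Fraction of (prediction, label) pairs where the prediction is correct."""
    return sum(p == y for p, y in pairs) / len(pairs)

# Hypothetical test-set predictions, keyed by a sensitive attribute
# (e.g., gender or race), as (prediction, true_label) pairs.
test = {
    "group_a": [(1, 1), (1, 0), (1, 1), (0, 0)],
    "group_b": [(0, 1), (0, 0), (1, 1), (0, 1)],
}

for group, pairs in test.items():
    print(group, "positive rate:", positive_rate(pairs),
          "accuracy:", accuracy(pairs))

# The gap in positive-prediction rates between groups (the "demographic
# parity difference") is one standard way such bias is summarized.
gap = positive_rate(test["group_a"]) - positive_rate(test["group_b"])
print("demographic parity difference:", gap)
```

A gap near zero would suggest the model treats the groups similarly on this measure; a large gap on test data, despite good training performance, is exactly the symptom Pradhan's team is tracing back through the ML pipeline to the training data.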
