The Science and Art of Predictive Analytics
Professor Ed Malthouse talks about his approach to teaching this vital skill as he takes over as the new instructor of the MSiA program’s introductory class.
The best way to understand the importance of the introductory predictive analytics course in Northwestern Engineering's Master of Science in Analytics (MSiA) program is to look at the consequences when the skills taught are applied incorrectly.
The case studies are plentiful:
- In November 2021, online real estate marketplace Zillow announced it would shut down its Zillow Offers operation and cut a quarter of its workforce – about 2,000 employees. The home-flipping unit had been dragged down by an error rate in the machine learning algorithm used to predict home prices.
- In October 2020, Public Health England, the United Kingdom's governing body responsible for tallying new COVID-19 cases, admitted that nearly 16,000 cases went unreported during seven days because of unpredicted data limitations in Microsoft Excel. The mistake hampered contact tracing efforts.
- In 2019, a study in the journal Science showed a healthcare prediction algorithm used by hospitals and insurance companies to identify patients in the United States in need of "high-risk care management" programs was far less likely to identify Black patients. The algorithm incorrectly used healthcare spending to determine an individual's healthcare needs, and Black patients were more likely to spend less because of income disparities, despite having similar medical needs.
Predictive Analytics I is being taught in the MSiA program for the first time by Professor Ed Malthouse, though he's taught the course's concepts for two decades across other programs at Northwestern. Malthouse said he's bringing real data sets with him for students to use in homework assignments and projects.
"I've run a research center devoted to developing business insights based on big behavioral data," Malthouse said. "What I teach in this class is the core set of models we've used in all of our projects to help real businesses make better decisions."
Malthouse's goal is to give students a solid foundation to build on in future predictive analytics classes so they become highly valuable forecasters for businesses in any industry. That growth can start with what's called linear predictive analysis, which can be as simple as measuring how fast a variable has changed over a period of time and assuming that rate will continue in order to predict where the variable is headed.
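For readers who want to see that idea in concrete terms, here is a minimal sketch in Python of a straight-line trend forecast. It is an illustration only, not material from the course: the function names and the monthly sales figures are hypothetical, and the least squares fit is just one common way to estimate a rate of change.

```python
def fit_linear_trend(values):
    """Least squares fit of values against time steps 0, 1, 2, ...

    Returns (slope, intercept), where slope is the estimated
    change per time step.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to estimate a trend")
    times = range(n)
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    # Covariance of time and value, and variance of time (both unnormalized).
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values))
    sxx = sum((t - mean_t) ** 2 for t in times)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return slope, intercept


def forecast(values, steps_ahead):
    """Extend the fitted line past the last observation.

    Assumes the historical rate of change keeps holding.
    """
    slope, intercept = fit_linear_trend(values)
    last_t = len(values) - 1
    return [intercept + slope * (last_t + k) for k in range(1, steps_ahead + 1)]


# Hypothetical data: four months of sales growing by 10 units a month.
monthly_sales = [100.0, 110.0, 120.0, 130.0]
slope, intercept = fit_linear_trend(monthly_sales)
next_two_months = forecast(monthly_sales, 2)
print(slope, next_two_months)
```

The forecast is only as good as the assumption baked into it: that the past rate of change will persist.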
But there's much more to the field than that, Malthouse said, and it's not all about the math.
"I think linear models are the most useful statistical tools, and I use them more than anything else," he said. "But it's a pleasure to teach the art of using them well."
That art requires careful examination of a predictive model's assumptions to better understand whether they will hold going forward. Just because something has grown at a certain percentage doesn't mean it will continue to do so if other market forces are changing. For example, think about the effects of faster-than-expected interest rate increases on home prices.
"Students should be able to understand what happens when assumptions are violated, take appropriate actions to fix the model or data when assumptions are violated, and evaluate the model with appropriate metrics," Malthouse said. "It also requires being able to understand what conclusions can be drawn from a model and identify biases and threats to validity."
That makes this course significant to students' futures in the predictive analytics business.
"The class allows decision makers to understand how different factors affect outcomes they care about," he said. "I want them to come away with an appreciation for all the things that parametric models can do for them, while also understanding their limitations."