Applying Machine Learning Methods in High-Stakes Settings

A Q&A with Northwestern Computer Science assistant professor of instruction Zach Wood-Doughty

Pursuing computer science (CS) was a no-brainer for Zach Wood-Doughty. A third-generation computer science professor following in his father's and grandfather's footsteps, he was hooked at a very early age.

Wood-Doughty’s undergraduate experience at Carleton College in Northfield, Minnesota — where he earned bachelor’s degrees in computer science and mathematics in 2014 — left a strong impression. He valued the classroom setting and the opportunity to explore a broad range of subjects, which ultimately inspired him to pursue a teaching academy certificate and other teaching preparation training during graduate school.

After earning a PhD and MSE in computer science from Johns Hopkins University, Wood-Doughty joined Northwestern CS in 2021, accepting positions as both a McCormick Teaching Fellow and a member of the teaching-track faculty.

During his one-year fellowship, Wood-Doughty taught COMP_SCI 349: Machine Learning while continuing his research, which focuses on using natural language processing methods in causal inference, or reasoning about cause and effect in a formal mathematical framework. In winter 2022, he launched COMP_SCI 396: Modeling Relationships with Causal Inference, a course designed to introduce students to the field.

Wood-Doughty officially started his first year as assistant professor of instruction in September 2022.

We asked Wood-Doughty about his experience working with the Northwestern community, his short- and long-term research goals, and his motivation for and approach to teaching.

What excites you about working with the Northwestern community?

The combined focus on undergraduate teaching and visionary research. It’s very rare to have a university that can excel at both.

The fact that so many different top-tier schools actually work together is really cool. The CS department stands out in that it collaborates across disciplinary areas in a way that may be harder for other core STEM fields. We have CS and law, CS and communication, CS and medicine. Those collaborations have been built and are being strengthened every day.

What are some key questions you seek to answer with your work, both in the short-term and the long-term?

The main question that I'm focusing on is: how do we apply the most recent machine learning methods — which are constantly evolving and getting more powerful, but also have limitations such as requiring large data sets and substantial computational resources — to high-stakes settings like clinical care?

In addition to the extensive documentation and clinical coding of a patient’s medical history in electronic health records (EHR), there is still a ton of information locked up in the clinical notes. For instance, your doctor orders some tests. That order and the results will show up in your chart, but relevant data — including what the doctor was thinking when they ordered those tests, whether they were worried about a possible condition, what conclusions they drew from any inconclusive test results, and whether the concern might be important going forward — may only appear in their notes.

Traditional causal inference research is interested in questions like “should you take aspirin in order to lower your risk of cardiovascular disease?” The US Preventive Services Task Force recently changed their recommendations because, even though it has a health benefit for cardiovascular disease, aspirin causes an offsetting risk of stroke. That's an example of a determination based on large data sets of health outcomes over a 20–30-year period.

Accounting for everything about a patient, based on their medical history and their doctors’ clinical notes, is a much more complicated proposition, one that involves using a neural network to model the text. A lot of my work focuses on determining what formal guarantees we can get by training or evaluating models in structured ways so that, even if the models have systematic biases or perform worse for certain subgroups, we can account for that in our overall analysis and still draw robust conclusions.

I want to be able to use neural networks to get a better look at longitudinal healthcare and understand the causes and consequences of various outcomes and treatments.

What’s one project you’re currently working on that you’re really excited about?

I’m working on a potential collaboration with immunology faculty at Northwestern University’s Feinberg School of Medicine, focusing specifically on the autoimmune disease lupus. The best way to improve long-term outcomes for lupus is to detect it early, but that can be very difficult because symptoms indicative of lupus are also indicative of many other diseases and disorders.

You can imagine a world in which we have a machine learning classifier that indicates a certain patient is at 20 percent risk of lupus. If we can predict with 80 percent accuracy, that would be amazing, but what are the implications of acting on those predictions to change patient care?

Traditional machine learning stops at the point of prediction, while clinical care starts there, and often the two are completely separated from each other. Most doctors aren't going to outsource a key component of their job. If a model can, on average, predict better than the doctor for some subpopulations, do you trust it for an arbitrary patient? In general, no, you don't. How do we determine the reliability of these models and engender that kind of trust with clinicians, who should be the first and last decision-makers?

I’m also wrapping up a project that I've been working on for quite a while related to one of the challenges with causal inference, which is that you can't validate it with real-world data. For example, we are very confident that smoking causes cancer. But, for any given patient, you can never claim definitively that smoking caused their cancer or that, if this patient had smoked, they would have developed cancer.

To get around this, we can use a synthetic data distribution which encodes a specific causal effect of smoking on cancer. By applying our models to this fake data, we can evaluate different causal models and understand whether they, at least in theory, can provide robust estimates. Could we have done as well using a simpler model? Does our model perform better in specific settings?
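The kind of synthetic-data check described above can be sketched in a few lines. This is a minimal, hypothetical simulation (the probabilities and the +0.10 "true effect" are invented for illustration, not drawn from any study): a confounder drives both treatment and outcome, a naive comparison overstates the effect, and adjusting for the confounder recovers the effect the simulation encoded.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical confounder (e.g., an underlying risk factor)
z = rng.binomial(1, 0.5, n)
# Treatment is much more likely when the confounder is present
t = rng.binomial(1, 0.2 + 0.6 * z)
# Outcome: the simulation encodes a true causal effect of +0.10
y = rng.binomial(1, 0.05 + 0.10 * t + 0.30 * z)

# Naive estimate: difference in outcome rates, biased upward by z
naive = y[t == 1].mean() - y[t == 0].mean()

# Backdoor adjustment: compare treated vs. untreated within each
# stratum of z, then average the stratum effects by their frequency
adjusted = sum(
    (z == v).mean()
    * (y[(t == 1) & (z == v)].mean() - y[(t == 0) & (z == v)].mean())
    for v in (0, 1)
)
```

Because the data generator is known, we can check that `adjusted` lands near the encoded +0.10 while `naive` does not — exactly the kind of validation that real-world data cannot provide.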

My work focuses on extending these synthetic datasets to involve text, as an analogue to clinical notes. We know that in many clinical settings, the text contains crucial information about a patient, but from a mathematical perspective, it's tricky to define exactly how it conveys that information. We want meaningful fake data in which the text is a proxy for other variables in a complex causal relationship. I use language models (similar to ChatGPT) to generate realistic fake data for testing recently proposed methods for causal inference with text data.
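A toy version of the text-as-proxy idea can be sketched as follows (the one-word "note" is a made-up stand-in, nothing like a real clinical note or a language-model output): the confounder is never observed directly, only leaked through word choice, so adjusting on the text-derived proxy removes only part of the confounding bias.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Same hypothetical setup: the encoded true effect is +0.10
z = rng.binomial(1, 0.5, n)
t = rng.binomial(1, 0.2 + 0.6 * z)
y = rng.binomial(1, 0.05 + 0.10 * t + 0.30 * z)

# Toy "clinical note": word choice leaks the confounder.
# Patients with z = 1 mention "fatigue" far more often.
p_word = np.where(z == 1, 0.9, 0.2)
note = np.where(rng.random(n) < p_word, "fatigue", "routine")

# A crude text model: infer the confounder from the note's wording
z_hat = (note == "fatigue").astype(int)

naive = y[t == 1].mean() - y[t == 0].mean()

# Adjust on the noisy, text-derived proxy instead of the truth
adjusted = sum(
    (z_hat == v).mean()
    * (y[(t == 1) & (z_hat == v)].mean() - y[(t == 0) & (z_hat == v)].mean())
    for v in (0, 1)
)
```

Because the proxy is imperfect, `adjusted` typically lands between the encoded +0.10 and the fully confounded `naive` estimate; quantifying and correcting that residual bias is the sort of question this line of work addresses.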

What motivated you to pursue your field of research?

I went to graduate school intending to focus on machine learning for noisy text data. I started with Twitter content, but then I also started to think about medical data and clinical notes, which are not so different from social media in terms of how hard they can be for an outsider to understand. There's lots of jargon and slang, and much of the text is typed quickly, leaving what's being said open to ambiguous interpretation.

I had no knowledge of causal inference going into graduate school. I took a class in it my first semester because I wanted a course with some statistical meat to it and thought it would probably be relevant for the computational social science research I was interested in, and I just really enjoyed it.

How do you approach the mentorship of your students, and how are you inspired by your interactions with students and trainees?

My machine learning class is very popular — it's a hot topic that is getting a lot of press — and that means I have students from a variety of different departments who are interested in understanding how it applies in their domain, not just as an academic curiosity, but also as a practical method for their own research. Students studying civil engineering, for instance, are thinking about how to use these methods for understanding transportation. I try to stay cognizant of what each student is looking for from the course.

I've been trying hard this year to make research opportunities available when I can, and I’m working with some really cool students who have been finding projects on their own.

I've been really impressed with the ability of the students to seek out these opportunities and put in a lot of effort to figure out what they need to know to get the most out of the work.

McCormick News Article