Advancing the Theoretical Foundations of Data Science

IDEAL received a five-year, $10 million Phase II award from NSF

Accelerating transformative advances in the theoretical foundations of data science, the Institute for Data, Econometrics, Algorithms, and Learning (IDEAL) unites a team of faculty, students, postdoctoral fellows, and researchers to study machine learning and optimization, high-dimensional data analysis and inference, and emerging topics including reliability, interpretability, privacy, and fairness.

The National Science Foundation (NSF) announced that IDEAL is one of two institutes receiving Harnessing the Data Revolution (HDR): Transdisciplinary Research in Principles of Data Science (TRIPODS) Phase II awards. The five-year, $10 million Phase II project combines and builds upon the complementary strengths of two Phase I TRIPODS institutes — IDEAL Phase I (2019-2022) and the University of Illinois at Chicago (UIC) Foundations of Data Science Institute.

Collaborating across disciplines and institutions

More than 60 investigators in computer science, economics, electrical engineering, law, mathematics, operations research, and statistics will collaborate across Northwestern, Google Research, the Illinois Institute of Technology (IIT), the Toyota Technological Institute at Chicago (TTIC), UIC, and the University of Chicago.

“Leveraging the power of data and computation to solve grand challenge problems of this century is central to this exciting new project,” said Samir Khuller, Peter and Adrienne Barris Chair of Computer Science and co-principal investigator of IDEAL. “In Phase I, we established a strong and collaborative working relationship with our partners at TTIC and the University of Chicago. The continued support from NSF to establish the larger institute with all of the Chicago-area research universities shows confidence in our progress over the last three years and we hope to be a lot more than the sum of our parts.”

IDEAL site director Aravindan Vijayaraghavan, associate professor of computer science and (by courtesy) industrial engineering and management sciences at Northwestern Engineering, will lead the Northwestern arm of the Phase II institute.

“We are very grateful to the NSF for making such a large investment in basic research related to data science foundations through the IDEAL institute,” Vijayaraghavan said. “This is a truly interdisciplinary undertaking. Our institute brings together exceptional researchers both in core data science disciplines and in fields outside the traditional scope.”

Northwestern faculty members in IDEAL
(top L to R): Auerbach, Auffinger, Bugni, Canay (bottom L to R): Gaudio, Golub, Guo, Horowitz
(top L to R): Hullman, Liang, Makarychev, Nocedal (bottom L to R): Linna, Wang, Wei

The Northwestern team also includes:

  • Eric Auerbach, assistant professor of economics at the Weinberg College of Arts and Sciences
  • Antonio (Tuca) Auffinger, professor of mathematics at Weinberg
  • Randall Berry, John A. Dever Chair of Electrical and Computer Engineering
  • Federico Bugni, professor of economics at Weinberg
  • Ivan Canay, HSBC Research Professor of Economics at Weinberg
  • Julia Gaudio, Patrick and Amy McCarter Fellow and research assistant professor of industrial engineering and management sciences and (by courtesy) computer science
  • Ben Golub, associate professor of economics at Weinberg and computer science
  • Dongning Guo, professor of electrical and computer engineering and (by courtesy) computer science
  • Jason Hartline, professor of computer science
  • Joel Horowitz, Charles E. and Emma H. Morrison Professor of Economics at Weinberg
  • Jessica Hullman, Ginni Rometty Professor of Computer Science
  • Annie Liang, assistant professor of economics at Weinberg and the Karr Family Assistant Professor of Computer Science
  • Konstantin Makarychev, professor of computer science
  • Jorge Nocedal, Walter P. Murphy Professor of Industrial Engineering and Management Sciences and (by courtesy) professor of engineering sciences and applied mathematics and director of the Center for Optimization and Statistical Learning
  • Daniel W. Linna Jr., senior lecturer and director of law and technology initiatives
  • Zhaoran Wang, assistant professor of industrial engineering and management sciences and (by courtesy) computer science
  • Ermin Wei, assistant professor of electrical and computer engineering and of industrial engineering and management sciences

“It is exciting to participate in this project, which brings together Chicago-area researchers in diverse areas, including computer science, economics, electrical engineering, and industrial engineering,” Berry said. “This multidisciplinary expertise is essential to further the foundation of data science.”

IDEAL leadership at partner organizations includes Avrim Blum, professor and chief academic officer at TTIC; Jinqiao (Jeffrey) Duan, professor of applied mathematics at IIT; Chao Gao, assistant professor in statistics at the University of Chicago; and Lev Reyzin, professor of mathematics, statistics, and computer science at UIC.

Expanding participation in data science

Through research, training, and outreach activities, IDEAL aims to broaden local and national participation in data science, build lasting research and educational infrastructure, and bridge the theoretical foundations of data science and the practice of data science in industry and applied science.

“We aim to establish the Chicago area as a center of excellence in the foundations of data science,” Khuller said. “Along with our unique partnerships with Discovery Partners Institute, P33, and industrial partners such as Google, we hope to engage students at all levels in our research and training efforts. I am truly excited to be part of this collaboration and we look forward to many years of impactful research.”

IDEAL’s five-year plan will lead to new theoretical frameworks, models, mathematical tools, and algorithms for analyzing high-dimensional data, inference, and learning.

“Data science is crucial for decision-making in almost every domain of business and engineering,” said Dean Julio M. Ottino. “The IDEAL partner organizations will both further the study of foundational problems of data science and engage a broader audience with this critical field.”

Building on Phase I foundations

The institute’s research centers on interdisciplinary topics, including deep learning and optimization, reinforcement learning, data science with strategic agents, trustworthy and reliable data science, and interpretability, privacy, and fairness. These topics will be the themes of special programs held each fall and winter.

As co-directors of IDEAL Phase I, Vijayaraghavan and Hartline introduced thematically focused “special quarters” to foster interdisciplinary, inter-institute collaborative research, such as the spring 2021 Special Quarter on Data Science and Law, co-organized with Linna.

“The program was incredibly productive,” Hartline said. “We ran a virtual graduate seminar, two ongoing research projects were accepted to the Second ACM Symposium on Computer Science and Law, and an international special interest group continues to hold virtual monthly meetings around a technical program organized by rotating hosts.”

Engaging students, postdocs, and researchers at all career stages, IDEAL Phase II will coordinate weekly research meetings, brainstorming sessions, graduate courses, co-mentoring, workshops, and reading groups. IDEAL’s first special program, “Machine Learning and Logic,” will launch in winter 2023.

“We are excited about the research activity that the institute will generate both at Northwestern and in the broader Chicago area, through new research projects and interdisciplinary collaborations, coordinated graduate coursework, short-term research visitors, and workshops,” Vijayaraghavan said.

Broadening educational pathways

To broaden and strengthen educational pathways, IDEAL will also lead activities targeted to undergraduate students, high school students, and the broader public. IDEAL will connect undergraduate students from underrepresented backgrounds and from institutions lacking resources for research to training opportunities within member institutions. The institute will also launch a program to prepare undergraduates to apply for and thrive in NSF-funded Research Experiences for Undergraduates (REU) sites.

In addition, IDEAL will coordinate undergraduate research workshops, data science workshops for high school science and math teachers, and public events, such as lectures and programming at the Museum of Science and Industry.

“The new 2022 TRIPODS awards address foundational challenges in data science at the core of data-driven discovery and decision making,” said NSF Division Director for Computing and Communication Foundations (CCF) Dilma Da Silva. “CCF is pleased to be able to support these impactful projects.”