COMP_SCI 496: Foundations of Reliable Machine Learning



CS 336, PhD student status, or instructor permission


Today’s data pose unprecedented challenges for machine learning, statistics, and data analysis. They may be incomplete, noisy, or exposed to some unknown source of corruption. Moreover, the recent phenomenon of (imperceptible) adversarial perturbations has demonstrated that robustness is important for the reliable and secure adoption of ML systems in practice.

In this course, we will learn about different theoretical frameworks to model and reason about various kinds of noise, corruptions and errors in machine learning and statistical tasks. We will also learn about different algorithmic techniques that are both efficient and robust, in the context of both supervised and unsupervised learning. Finally, we will also explore new topics like adversarial robustness and relations to topics like differential privacy.

**Pre-requisites:** This is a proof-based theory course that assumes sufficient mathematical sophistication and familiarity with topics like probability, linear algebra, and some optimization. The course is targeted at graduate students and advanced undergraduate students who are interested in some of these topics. Students are expected to have taken an undergraduate theory course like COMP_SCI 336 (Algorithms) in computer science, or an equivalent course in another discipline.

**Tentative list of topics:**

  1. Introduction, Robustness and preliminaries. 
  2. Random noise, SQ learning, and noise sensitivity. 
  3. Huber Contamination model, influence functions, breakdown point.  
  4. Filtering: high-dimensional mean estimation and regression, convex relaxation-based methods. 
  5. Robustness in Supervised Learning: Malicious noise, Agnostic learning, and Massart noise.   
  6. Semi-random models. 
  7. Adversarial robustness, certified robustness through randomized smoothing. 
  8. Adversarial perturbations, Distributional robustness, new directions. 
  9. Other notions of robustness like missing data, list-decodable learning, truncated statistics.
  10. Private machine learning. 

INSTRUCTOR: Prof. Aravindan Vijayaraghavan