COMP_SCI 396: Differential Privacy: from Foundations to Machine Learning



Description

The collection and use of data today raise unprecedented privacy concerns. What exactly are these concerns? Should we make the data private, or should we make the algorithms that process the data private?

In this course, we will start with several historical back-and-forths between attacks on data privacy and defenses against them, which eventually led to the rigorous approach that is popular today: differential privacy. We will then move on to the measurement of privacy and to useful algorithmic tools that accomplish data analysis and machine learning tasks privately. Along the way, we will learn about some beautiful mathematical tools that are useful in both private and non-private learning and statistics (some in surprising ways).

The course is designed to serve at least two purposes:
1. If you're interested in the modern approach to data privacy (in general or specifically differential privacy), this course is definitely for you.
2. If you want to get ready for the study of "the science" of machine learning, this course can be your entry point for the mathematical tools.

(Subject to change) Grades will be determined by:
- participation/quizzes
- assignments (mathematical proofs and Python notebooks)
- a potential 30-minute midterm
- final project (summary of a research paper/tech report, or experiments)

PRE-REQUISITES:

  • CS 212 (Discrete Math), or, if you are not from CS, equivalent mathematical maturity gained from any one of the following:
  • MATH 240/GEN_ENG 205/206 (Linear Algebra)
  • or MMSS 300
  • or IEMS 201/303/313
  • or STAT 210
  • or ELEC_ENG 302
  • or MATH 310
  • or MATH 368
  • or COMP_SCI grad standing
  • or instructor permission

The course is proof-based and theoretical, and it assumes sufficient mathematical maturity, particularly in probability and linear algebra. Important concepts will be reviewed. A good understanding of CS 212 (Mathematical Foundations of Computer Science) should be sufficient. We do not require much calculus.

Tentative list of topics:
  • Attempts at data privacy and some attacks
  • Introduction to differential privacy
  • Hypothesis testing
  • Building block algorithms
  • Some machine learning background, DP-ERM
  • Everything as exponential mechanisms
  • Isoperimetric inequality
  • Stability implies generalization
  • Adaptive data analysis

REQUIRED TEXTBOOK: None
REFERENCE TEXTBOOKS: None

COURSE COORDINATORS: Jinshuo Dong

COURSE INSTRUCTOR: Jinshuo Dong