# IEMS 304: Statistical Learning for Data Analysis

### Prerequisites

IEMS 303 or equivalent; CS 150 or equivalent

### Description

Predictive modeling of data using modern regression and classification methods. Multiple linear regression; logistic regression; pitfalls and diagnostics; nonparametric and nonlinear regression and classification such as trees, nearest neighbors, neural networks, and ensemble methods.

• This course counts as an IE/OR elective for Industrial Engineering.

### Learning Objectives

• Understand common data structures in modern predictive and explanatory modeling problems in business, engineering, and the sciences, and learn how to formulate appropriate solutions
• Learn R statistical software basics and how to use it for regression and classification problems
• Develop the ability to fit appropriate linear and logistic regression models, including model selection and diagnostics
• Develop the ability to interpret fitted linear and logistic regression models for explanatory and predictive purposes
• Learn fundamental concepts in nonlinear regression and classification, including maximum likelihood estimation, cross-validation, and ridge and lasso shrinkage
• Learn how to fit and interpret popular supervised learning models including trees, smoothers, nearest neighbors, random forests, and boosted trees

### Topics

• Multiple linear regression basics: model fitting, statistical inference, prediction
• Multiple linear regression: influence, residual diagnostics, multicollinearity, interactions, categorical predictors, variable selection, model evaluation criteria, ridge and lasso regression
• Logistic regression: model fitting and interpretation, statistical inference, diagnostics
• Nonlinear regression basics: maximum likelihood estimation, nonlinear least squares, cross-validation, bootstrapping
• Classification and regression trees
• Nearest neighbors for classification and regression
• Boosted trees and random forests
• R statistical software throughout the course

### Materials

Required: An Introduction to Statistical Learning: with Applications in R, by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani, Springer. ISBN 978-1-4614-7138-7. An electronic version is available for free.

Recommended: IEMS 304 Reference Guide