MSIT 423: Data Science for Business Intelligence

Description

In a rapidly changing business environment, with global competition and maturing markets, competitive advantage is extremely important. Businesses can exploit the massive amounts of data collected through operational processes as well as from external sources. This course introduces data mining techniques and their use in various business applications to enable business intelligence. The course combines hands-on experience, using state-of-the-art data mining tools to model business problems and discover interesting patterns for decision support, with several cases that discuss the strategies, outcomes, and organizational impact of using data mining.

REQUIRED TEXT: James, Witten, Hastie and Tibshirani, An Introduction to Statistical Learning, Springer, 2013.

REFERENCE TEXT:

Leskovec, Rajaraman, and Ullman, Mining of Massive Datasets
(Supplementary reading) Hastie, Tibshirani and Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer.

COURSE GOALS: Students will understand and manage the entire process of using data to make better business decisions: extraction, cleaning, understanding, modeling, and presenting. Students will also understand the limitations of data.

DETAILED COURSE TOPICS:

Week 1: Introduction to Predictive Analytics

Course introduction
Simple linear regression
Multiple linear regression, interpretation, and basic inference
Readings: JWHT, sections 3.1, 3.2, 3.6.1-3.6.3

Week 2: Model Accounting and Multicollinearity

Extra and partial sums of squares, R-squared
Newfood and Quality Control cases
Multicollinearity
Quality control case
Residual, QQ and influence plots
Readings: JWHT, section 3.3.3.6

Week 3: Diagnostics and Transformations

Transformations, the multiplicative model, polynomials
Business failure and purifier cases
Readings: JWHT 3.3.3.1-3.3.3.5

Week 4: Categorical Predictor Variables, Interactions and Logistic Regression

Dummy variables
Interactions
Logistic regression
Readings: JWHT 3.6.4, 3.6.6, 4.1-4.3 (skip discriminant analysis)

Week 5: Model Evaluation, Selection and Regularization

Confusion tables, ROC curves, AUC
Penalized measures of fit
Test sets and k-fold cross validation
Variable subset selection
Ridge regression and the lasso
Readings: JWHT 5.1, 5.3.1, 5.3.3; 6.1, 6.2, 6.5, 6.6

Week 6: Midterm and Smoothing

In-class midterm, 80 minutes, covers chapters 3 and 4 (not 5 and 6)
Bin smoothers, k-nearest neighbors
Step functions, piecewise linear models and cubic splines
Readings: JWHT sections 7.1-7.6

Week 7: GAMs and Trees

Generalized additive models
CART
Readings: JWHT sections 7.7, 8.1, 8.3.1, 8.3.2

Week 8: Bagging, Random Forests, Principal Components

Bagging and random forests
Stumps, shrubs, boosted trees as time permits
Principal component analysis
Readings: JWHT sections 8.2, 8.3.3, 10.1-10.2, 10.4

Week 9: Clustering and Recommendation Systems

K-means and hierarchical clustering
Distance metrics
Overview of recommendation systems: popularity, user-based, item-based, SVD as time permits
Readings: JWHT sections 10.3, 10.5; Ekstrand chapter on Canvas

Week 10: Project Presentations

Week 11: Final Exam Due

HOMEWORK ASSIGNMENTS: There will be weekly recommended homework problems (with answers). You will have an in-class midterm and a take-home final that must be completed individually.

GRADES:

Homework: 20%
Midterm: 25%
Project: 20%
Final: 35%

COURSE OBJECTIVES: As a result of this course, students will be able to:

1. Identify data-collection biases;
2. Design effective graphics presentations of data;
3. Estimate and interpret classical and data mining models using the R software package;
4. Draw conclusions about causal relationships and recommend actions based on an analysis.

Faculty Profile

Yuri Balasanov, PhD
