MSIA 431: Analytics for Big Data

Quarter Offered

Spring; Diego Klabjan


This course covers a variety of topics and tools in big data. Most of the topics concern the Hadoop ecosystem; however, select databases outside of Hadoop will also be considered.

The course is about how to use Hadoop, not how to install and maintain it. Prior knowledge of Java, SQL, optimization, predictive analytics and data mining, and database modeling is required.

Course Objectives:

At the end of this course, you should be able to:

1. Understand distributed computing and databases

2. Implement sophisticated algorithms in MapReduce

3. Use Pig and Hive

4. Use the NoSQL databases Aster and HBase

5. Learn the basics of other contemporary tools (Apache Storm, Mahout, Spark)