Unless otherwise noted, all homework and projects are due by the beginning of class on the day listed.

This page remembers which checkboxes you click, provided you access it from the same browser and computer (and don't clear your browser cache). So, check off assignments as you complete them to keep track of what you've done!

| Wk. | Dates | Topic | Readings | Projects due/Exams |
|---|---|---|---|---|
| 1 | Jan. 23 & 25 | Overview | Han Ch. 1 & 2; McKinney Ch. 1; Codecademy: Python | |
| 2 | Jan. 30 & Feb. 1 | NumPy, Pandas | McKinney Ch. 4 & 5 | |
| 3 | Feb. 6 & 8 | Reading/wrangling data | Han Ch. 2 & 3; McKinney Ch. 6 & 7 | |
| 4 | Feb. 13 & 15 | Wrangling/visualizing data | McKinney Ch. 8 | (Meeting in LSB 230 both days) |
| | Feb. 20 | No class: Presidents' Day | | |
| 5 | Feb. 22 | | | Exam 1 on Wed. 2/22 |
| 6 | Feb. 27 & Mar. 1 | Data aggregation | McKinney Ch. 9 | |
| 7 | Mar. 6 & 8 | Advanced data processing | DOT graph description language; Install nxpd | |
| | Mar. 11–19 | No classes: Spring break | | |
| 8 | Mar. 20 & 22 | Pattern mining | Han Ch. 6 | Project 1 (Wed. 3/22) |
| 9 | Mar. 27 & 29 | Cluster analysis, scikit-learn | Han Ch. 10; Intro to scikit-learn; scikit-learn clustering | Exam 2 on Wed. 3/29 |
| 10 | Apr. 3 & 5 | Intro to machine learning | Han Ch. 8; What is ML? (notes); Iris dataset (notes); Training models (notes) | Project 1 (Wed. 4/5) |
| 11 | Apr. 10 & 12 | Data splits, regression | Train/test splits (notes); Cross-validation (notes); Linear regression (notes) | |
| | Apr. 17 | No class: Patriots' Day | | |
| 12 | Apr. 19 | Evaluation, feature/parameter selection | Grid search (notes); Metrics (notes) | |
| 13 | Apr. 24 & 26 | More classification | Han Ch. 10 (you should have already read this chapter) | |
| 14 | May 1 & 3 | Review | | Exam 3 on Wed. 5/3 |
| | May 10 | Project presentations in LSB 312, 10:15am–12:15pm | | Project 2 due |



Most Mondays will begin with a short quiz based on the homework and material from the previous week. You may use paper notes, but not electronic resources. Not all quizzes will necessarily be graded.


Each week there will be a homework assignment based on readings and material from class the previous week. Homework is meant to help you make sure you understand the material. Multiple-choice homework questions on Canvas will be graded automatically, but other submitted material (usually extensions of what we worked on in class) will be graded based on apparent effort. I will provide solutions when possible. Homework is usually due before the start of class each Monday (see the schedule for due dates).


There will be two large projects due during the semester. The first project covers data exploration, which roughly corresponds to the first half of the semester. The second project is a continuation of the first, and involves training and evaluating predictive models on the same data set. More information can be found on the project assignment pages, listed below.

Late policy

As outlined in the syllabus, no late work will be accepted. Homework and projects must be submitted on time. This gives us the chance to discuss assignments in class right after their deadlines.


