ViTAL Lab Computational Behavior and Health Analytics

BMI 534 - Introduction to Machine Learning

Course schedule: Monday and Wednesday 2:30 - 3:45pm

Classroom: Rm 4004, Woodruff Memorial Research Building

Instructor:

Hyeokhyen Kwon, Ph.D.
Assistant Professor
Department of Biomedical Informatics
Office: Rm 4105, 4th Floor, Emory Woodruff Memorial Research Building (101 Woodruff Cir, Atlanta, GA 30322)

Teaching Assistant: TBD

Course Overview

Machine learning is transforming applications across our society, from autonomous vehicles to biomedicine and science. In this course, students will learn the fundamental theories (optimization, probability, linear algebra, etc.) and algorithms of machine learning (supervised and unsupervised learning, etc.), and will gain practical experience applying machine learning techniques and analysis to real-world problems in biomedical informatics.

Learning Objectives

This course will introduce students to fundamental theory and algorithms in machine learning through lectures, homework, and a semester-long project. By taking this course, students should be able to:

Prerequisites:

Course Logistics

Communication & Course Materials:

Textbook(s):

Expectations & Grading:

The final grade will be determined by a weighted average of all the graded items.

| Component     | Weight |
|---------------|--------|
| Participation | 10%    |
| Homeworks     | 50%    |
| Project       | 40%    |

Final grades may be curved up so that the class mean falls at least in the B range. The class median, mean, and standard deviation will be announced for each assignment and exam so that you have an idea of where you stand.
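As a minimal sketch, the weighted average above can be computed as follows (the component scores shown are hypothetical, assuming each is graded on a 0-100 scale):

```python
# Grading weights from the table above.
weights = {"Participation": 0.10, "Homeworks": 0.50, "Project": 0.40}

# Hypothetical component scores on a 0-100 scale.
scores = {"Participation": 95.0, "Homeworks": 88.0, "Project": 91.0}

# Weighted average of all graded items.
final_grade = sum(weights[c] * scores[c] for c in weights)
print(final_grade)  # 0.10*95 + 0.50*88 + 0.40*91 = 89.9
```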

University Policies and Academic Integrity

Any suspected violations of course rules or Emory's Honor Codes will be referred to the Honor Council for a hearing. This includes, but is not limited to, consulting electronic or printed materials during the midterm and plagiarism on homework or class projects. It is your responsibility to understand the Laney Graduate School Honor Code, the Emory College Honor Code, and the Department Statement of Policy on Computer Assignments.

(Tentative) Course Schedule

Topics may change, but the homework, midterm, and project deliverables are fixed. The reading material listed below is optional, and the lecture plan may deviate over the course of the semester.

| # | Date | Theme | Topic | Reference (Chapter) | Assignment |
|---|------|-------|-------|---------------------|------------|
| 1 | 8/27 | Intro + Course Logistics | Review syllabus, overview of course topics | Ch. 1 (Hastie et al.); Ch. 1 (Murphy); Ch. 3 (Welling) | Homework #0 out (Due 9/9) |
| — | 9/1 | Labor Day | | | |
| 2 | 9/3 | Intro to Optimization | | Convex optimization notes Parts I and II from Stanford's machine learning class; Rosenberg's abridged notes | |
| 3 | 9/8 | Intro to Statistics, Probability, and Random Variables | Random variables, probability density functions, conditional and joint distributions, Bayes rule | Handouts | |
| 4 | 9/10 | Statistical Decision Theory + Linear Regression | Mapping machine learning problems to statistical concepts, regression, ridge regression | Ch. 1-2; Ch. 3.1-3.4 (Hastie et al.); Ch. 17.1-17.2 (Barber); Prof. Carlos Carvalho's MLR Slides | Homework #1 out (Due 9/23) |
| 5 | 9/15 | Linear Regression + Naive Bayes | LASSO regression, elastic net regression | | |
| 6 | 9/17 | Linear Classification | Logistic regression, LDA, QDA | Ch. 2.1-2.4; Ch. 4.1-4.4 (Hastie et al.) | |
| 7 | 9/22 | Linear Classification + Bias-Variance Tradeoff | Training & test error, conditional and expected test error, bias-variance decomposition and tradeoff, training error optimism | Ch. 7.2-7.3 (Hastie et al.); Ch. 5.9 (Daumé III) | |
| 8 | 9/24 | Model Assessment + Error Measures | Validation as an estimation problem, cross validation, bias and variance of cross validation schemes, error measures, class imbalance, ROC analysis, precision-recall | Ch. 7.10 (Hastie et al.); Ch. 2.5-2.6 (Daumé III) | Homework #2 out (Due 10/7) |
| 9 | 9/29 | Model Selection | Effective number of parameters, Akaike and Bayes information criteria | Ch. 7 (Hastie et al.); Ch. 5.5-5.6 (Daumé III) | |
| 10 | 10/1 | Practical Issues | Preparing data, labeling issues, interpretation | Ch. 9-10 (Hastie et al.) | Project Proposal due 10/1 |
| 11 | 10/6 | Decision Trees | Decision trees, boosting | Ch. 9.2 (Hastie et al.); Ch. 1.3 (Daumé III) | |
| 12 | 10/8 | Perceptron + Support Vector Machines | Perceptron, SVM, kernel SVM | Ch. 12 (Hastie et al.); Ch. 4; Ch. 11 (Daumé III); Ch. 7-9 (Welling); Ch. 15 (Shalev-Shwartz & Ben-David); Stanford SVM notes; NYU SVM notes | Homework #3 out (Due 10/21) |
| — | 10/13 | Fall Break | | | |
| 13 | 10/15 | Neural Networks | Architectures, gradient optimization, back propagation | Ch. 11 (Hastie et al.); Ch. 1-3 (Nielsen); Ch. 20.1-20.3 (Shalev-Shwartz & Ben-David) | Project Spotlight Slides due 10/19 |
| 14 | 10/20 | Project Spotlight + Neural Networks | | | |
| 15 | 10/22 | Additive Models + Bootstrap | AdaBoost, gradient boosting | Ch. 7.11; Ch. 9.1 (Hastie et al.) | Homework #4 out (Due 11/4) |
| 16 | 10/27 | Boosting | | Ch. 10 (Hastie et al.) | |
| 17 | 10/29 | Random Forest | Ensemble methods, random forests | Ch. 15-16 (Hastie et al.); Breiman's paper | |
| 18 | 11/3 | Ensembles | | | |
| 19 | 11/5 | Prototype Methods + Challenges with High-dimensional Data + Dimensionality Reduction | KNN, curse of dimensionality, sparse representation | Ch. 13-14; Ch. 18 (Hastie et al.); Ch. 3.2-3.3 (Daumé III); Ch. 5 (Welling); Ch. 19.1-19.2; Ch. 23 (Shalev-Shwartz & Ben-David); Stanford PCA notes | |
| 20 | 11/10 | Dimensionality Reduction | Principal component analysis, locally-linear embedding, manifold learning | Ch. 14 (Hastie et al.) | |
| 21 | 11/12 | Clustering + Mixture Modeling | K-means, spectral clustering, expectation maximization | Ch. 14 (Hastie et al.) | |
| 22 | 11/17 | Reinforcement Learning | Markov decision processes | | |
| 23 | 11/19 | Reinforcement Learning | Q-learning | | |
| 24 | 11/24 | Bayesian Networks | Probabilistic graphical models | | |
| 25 | 11/26 | Filtering + Time-series Analysis | Kalman filter, hidden Markov model | | |
| 26 | 12/1 | TBD | | | |
| 27 | 12/3 | Ethics in AI | | | |
| 28 | 12/8 | Project Presentations | | | Final Report due by exam day assigned by school |