ViTAL Lab Computational Behavior and Health Analytics

BMI 534 - Introduction to Machine Learning

CS 534 - Machine Learning

The course is cross-listed as BMI 534 and CS 534, so please register for whichever listing has available seats.

Instructor:

Hyeokhyen Kwon, Ph.D.
Assistant Professor
Department of Biomedical Informatics
Office: Rm 4105, 4th Floor, Emory Woodruff Memorial Research Building (101 Woodruff Cir, Atlanta, GA 30322)

Teaching Assistant: Seyedeh Somayyeh Mousavi

Course Overview

Machine learning is transforming applications across society, from autonomous vehicles to biomedicine and science. In this course, students will learn the fundamental theory (optimization, probability, linear algebra, etc.) and algorithms (supervised and unsupervised learning, etc.) of machine learning, and gain practical experience applying machine learning techniques and analysis to real-world problems in biomedical informatics.

Learning Objectives

This course will introduce students to fundamental theory and algorithms in machine learning through lectures, homework, a midterm, and a semester-long project. After taking this course, students should be able to:

Prerequisites:

Course Logistics

Communication & Course Materials:

Textbook(s):

Expectations & Grading:

The final grade will be determined by a weighted average of all the graded items.

Component Weight
Participation 10%
Homeworks 35%
Midterm 15%
Project 40%

Final grades may be curved up so that the class mean falls at least in a B range. The class median, mean, and standard deviation will be announced for each assignment and exam so that you have an idea of where you stand.
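
As a worked example of the weighted average described above, the following minimal Python sketch combines the four component weights from the grading table. It assumes each component is scored on a 0-100 scale, which the syllabus does not state, and the student scores are hypothetical values chosen for illustration. It does not model any curve.

```python
# Component weights in percent, taken from the grading table above.
WEIGHTS_PERCENT = {
    "Participation": 10,
    "Homeworks": 35,
    "Midterm": 15,
    "Project": 40,
}


def final_grade(scores):
    """Return the weighted average of component scores.

    Assumes every score is on a 0-100 scale (an assumption, not a
    rule stated in the syllabus).
    """
    missing = set(WEIGHTS_PERCENT) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    weighted_sum = sum(WEIGHTS_PERCENT[name] * scores[name] for name in WEIGHTS_PERCENT)
    return weighted_sum / sum(WEIGHTS_PERCENT.values())


# Hypothetical scores for one student, for illustration only.
example_scores = {"Participation": 100, "Homeworks": 90, "Midterm": 80, "Project": 85}
print(final_grade(example_scores))  # prints 87.5
```

Because the project and homework together carry 75% of the weight, they dominate the result in this sketch.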

University Policies and Academic Integrity

Any suspected violations of course rules or Emory’s Honor Codes will be referred to the Honor Council for a hearing.
This includes, but is not limited to, consulting electronic or printed materials during the midterm and plagiarism on homework or class projects.
It is your responsibility to understand the Laney Graduate School Honor Code, the Emory College Honor Code, and the Department Statement of Policy on Computer Assignments.

(Tentative) Course Schedule

Topics may change, but the homework, midterm, and project deliverables are fixed. The reading material listed below is optional, and the lecture plan may deviate over the course of the semester.

| # | Date | Theme | Topic | Reference (Chapter) | Assignment |
| 1 | 1/17 | Intro + Course Logistics | Review syllabus, overview of course topics | Ch. 1 (Hastie et al.); Ch. 1 (Murphy); Ch. 3 (Welling) | Homework #0 out (Due 1/30) |
| 2 | 1/22 | Intro to Optimization | | Convex optimization notes Part I and II from Stanford’s machine learning class; Rosenberg’s abridged notes | |
| 3 | 1/24 | Intro to Statistics, Probability, and Random Variables | Random variables, probability density functions, conditional and joint distributions, Bayes rule | Handouts | |
| 4 | 1/29 | Statistical Decision Theory + Linear Regression | Mapping machine learning problems to statistical concepts, regression, ridge regression | Ch. 1 - 2; Ch. 3.1 - 3.4 (Hastie et al.); Ch. 17.1 - 17.2 (Barber); Prof. Carlos Carvalho’s MLR Slides | |
| 5 | 1/31 | Linear Regression + Naive Bayes | LASSO regression, elastic net regression | | Homework #1 out (Due 2/13) |
| 6 | 2/5 | Linear Classification | Logistic regression, LDA, QDA | Ch. 2.1 - 2.4; Ch. 4.1 - 4.4 (Hastie et al.) | |
| 7 | 2/7 | Linear Classification + Bias-Variance Tradeoff | Training & test error, conditional and expected test error, bias-variance decomposition and tradeoff, training error optimism | Ch. 7.2 - 7.3 (Hastie et al.); Ch. 5.9 (Daumé III) | |
| 8 | 2/12 | Model Assessment + Error Measures | Validation as an estimation problem, cross validation, bias and variance of cross validation schemes, error measures, class imbalance, ROC analysis, precision-recall | Ch. 7.10 (Hastie et al.); Ch. 2.5 - 2.6 (Daumé III) | |
| 9 | 2/14 | Model Selection | Effective number of parameters, Akaike and Bayesian information criteria | Ch. 7 (Hastie et al.); Ch. 5.5 - 5.6 (Daumé III) | Homework #2 out (Due 2/27) |
| 10 | 2/19 | Practical Issues | Preparing data, labeling issues, interpretation | Ch. 9 - 10 (Hastie et al.) | |
| 11 | 2/21 | Decision Trees | Decision trees, boosting | Ch. 9.2 (Hastie et al.); Ch. 1.3 (Daumé III) | |
| 12 | 2/26 | Perceptron + Support Vector Machines | Perceptron, SVM, kernel SVM | Ch. 12 (Hastie et al.); Ch. 4; Ch. 11 (Daumé III); Ch. 7 - 9 (Welling); Ch. 15 (Shalev-Shwartz & Ben-David); Stanford SVM notes; NYU SVM notes | |
| 13 | 2/28 | Neural Networks | Architectures, gradient optimization, backpropagation | Ch. 11 (Hastie et al.); Ch. 1 - 3 (Nielsen); Ch. 20.1 - 20.3 (Shalev-Shwartz & Ben-David) | Homework #3 out (Due 3/14) |
| 14 | 3/4 | Neural Networks | | | Project Proposal Due 3/5 |
| | 3/6 | Spring Break | | | |
| | 3/11 | Spring Break | | | |
| 15 | 3/13 | Additive Models + Bootstrap | AdaBoost, gradient boosting | Ch. 7.11; Ch. 9.1 (Hastie et al.) | |
| 16 | 3/18 | Boosting | | Ch. 10 (Hastie et al.) | Homework #4 out (Due 4/2) |
| 17 | 3/20 | Random Forest | Ensemble methods, random forests | Ch. 15 - 16 (Hastie et al.); Breiman’s paper | Project Spotlight Slides Due 3/24 |
| 18 | 3/25 | Project Spotlight + Ensembles | | | |
| 19 | 3/27 | Prototype Methods + Challenges with High-dimensional Data + Dimensionality Reduction | KNN, curse of dimensionality, sparse representation | Ch. 13 - 14; Ch. 18 (Hastie et al.); Ch. 3.2 - 3.3 (Daumé III); Ch. 5 (Welling); Ch. 19.1 - 19.2; Ch. 23 (Shalev-Shwartz & Ben-David); Stanford PCA notes | |
| 20 | 4/1 | Dimensionality Reduction | Principal component analysis, locally-linear embedding, manifold learning | Ch. 14 (Hastie et al.) | |
| 21 | 4/3 | Clustering + Mixture Modeling | K-means, spectral clustering, expectation maximization | Ch. 14 (Hastie et al.) | Homework #5 out (Due 4/16) |
| 22 | 4/8 | Reinforcement Learning | Markov Decision Process | | |
| 23 | 4/10 | Reinforcement Learning | Q-Learning | | |
| 24 | 4/15 | Bayesian Network | Probabilistic Graphical Model | | |
| 25 | 4/17 | Filtering + Time-series Analysis | Kalman Filter, Hidden Markov Model | | |
| 26 | 4/22 | Midterm Exam | | | |
| 27 | 4/24 | Ethics in AI | | | |
| 28 | 4/29 | Project Presentations | | | Final Report Due 5/10 |