ViTAL Lab Computational Behavior and Health Analytics

BMI 534 - Introduction to Machine Learning

CS 534 - Machine Learning

The course is cross-listed as BMI 534 and CS 534; please register for whichever listing has available seats.

This is a summarized version of the syllabus. The detailed version of the syllabus will be available on the Course Canvas.

Instructor:

Hyeokhyen Kwon, Ph.D.
Assistant Professor
Department of Biomedical Informatics
Office: Rm 4105, 4th Floor, Emory Woodruff Memorial Research Building (101 Woodruff Cir, Atlanta, GA 30322)

Teaching Assistant: TBD

Course Overview

Machine learning is transforming applications across society, from autonomous vehicles to biomedicine and science. In this course, students will learn the fundamental theory (optimization, probability, linear algebra, etc.) and algorithms (supervised and unsupervised learning, etc.) of machine learning, and gain practical experience applying machine learning techniques and analysis to real-world problems in biomedical informatics.

Learning Objectives

This course introduces students to fundamental theory and algorithms in machine learning through lectures, homework, a midterm, and a semester-long project. After taking this course, students should be able to:

Prerequisites:

Course Logistics

Communication & Course Materials:

Textbook(s):

Expectations & Grading:

Component Weight
Participation 10%
Homework 35%
Midterm 15%
Project 40%

The detailed version of the syllabus with instructions on each component and grading policy will be uploaded on Canvas.

University Policies and Academic Integrity

All class work is governed by the College Honor Code and Departmental Policy. Your submitted homework, including all code and writeups, must be written by you. Any code or writeup found to be similar to another student's is grounds for an honor code investigation by the Director of Graduate Studies, Laney Graduate School, and the honor council. Additional extensions on assignments will be granted with appropriate documentation from the Office of Undergraduate Education (OUE).

A syllabus with details of policies will be uploaded on Canvas.

(Tentative) Course Schedule

Topics may change but the homework, midterm, and project deliverables are fixed. The reading material listed below is optional and the lecture plan may deviate over the course of the semester.

# Date Theme Topic Reference (Chapter) Assignment
1 1/17 Intro + Course Logistics Review syllabus, Overview of course topics Ch. 1 (Hastie et al.)
Ch. 1 (Murphy)
Ch. 3 (Welling)
Homework #0 out (Due 1/30)
2 1/22 Intro to Optimization   Convex optimization notes Part I and II from Stanford’s machine learning class
Rosenberg’s abridged notes
 
3 1/24 Intro to Statistics, Probability, and Random Variables Random variables, probability density functions, conditional and joint distributions, Bayes rule Handouts  
4 1/29 Statistical Decision Theory + Linear Regression Mapping machine learning problems to statistical concepts, regression, ridge regression Ch. 1 - 2; Ch. 3.1 - 3.4 (Hastie et al.)
Ch. 17.1 - 17.2 (Barber)
Prof. Carlos Carvalho’s MLR Slides
 
5 1/31 Linear Regression + Naive Bayes LASSO regression, elastic net regression   Homework #1 out (Due 2/13)
6 2/5 Linear Classification Logistic regression, LDA, QDA Ch. 2.1 - 2.4; Ch. 4.1 - 4.4 (Hastie et al.)  
7 2/7 Linear Classification + Bias-Variance Tradeoff Training & test error, conditional and expected test error, bias-variance decomposition and tradeoff, training error optimism Ch. 7.2 - 7.3 (Hastie et al.)
Ch. 5.9 (Daumé III)
 
8 2/12 Model Assessment + Error Measures Validation as an estimation problem, cross validation, bias and variance of cross validation schemes, Error measures, class imbalance, ROC analysis, precision-recall Ch. 7.10 (Hastie et al.)
Ch. 2.5 - 2.6 (Daumé III)
 
9 2/14 Model Selection Effective number of parameters, Akaike and Bayes information criterion Ch. 7 (Hastie et al.)
Ch. 5.5 - 5.6 (Daumé III)
Homework #2 out (Due 2/27)
10 2/19 Practical Issues Preparing data, labeling issues, interpretation Ch. 9 - 10 (Hastie et al.)  
11 2/21 Decision Trees Decision trees, boosting Ch. 9.2 (Hastie et al.)
Ch. 1.3 (Daumé III)
 
12 2/26 Perceptron + Support Vector Machines Perceptron, SVM, kernel SVM Ch. 12 (Hastie et al.)
Ch. 4; Ch. 11 (Daumé III)
Ch. 7 - 9 (Welling)
Ch. 15 (Shalev-Shwartz & Ben-David)
Stanford SVM notes
NYU SVM notes
 
13 2/28 Neural Networks Architectures, gradient optimization, back propagation Ch. 11 (Hastie et al.)
Ch. 1-3 (Nielsen)
Ch. 20.1 - 20.3 (Shalev-Shwartz & Ben-David)
Homework #3 out (Due 3/14)
14 3/4 Neural Networks     Project Proposal due 3/5
  3/6 Spring Break      
  3/11 Spring Break      
15 3/13 Additive Models + Bootstrap AdaBoost, gradient boosting Ch. 7.11; Ch. 9.1 (Hastie et al.)  
16 3/18 Boosting   Ch. 10 (Hastie et al.) Homework #4 out (Due 4/2)
17 3/20 Random Forest Ensemble methods, random forests Ch. 15 - 16 (Hastie et al.)
Breiman’s paper
Project Spotlight Slides Due 3/24
18 3/25 Project Spotlight + Ensembles      
19 3/27 Prototype Methods + Challenges with High-dimensional Data + Dimensionality Reduction KNN, curse of dimensionality, sparse representation Ch. 13 - 14; Ch. 18 (Hastie et al.)
Ch. 3.2 - 3.3 (Daumé III)
Ch. 5 (Welling)
Ch. 19.1 - 19.2; Ch. 23 (Shalev-Shwartz & Ben-David)
Stanford PCA notes
 
20 4/1 Dimensionality Reduction Principal component analysis, locally-linear embedding, manifold learning Ch. 14 (Hastie et al.)  
21 4/3 Clustering + Mixture modeling K-means, spectral clustering, expectation maximization Ch. 14 (Hastie et al.) Homework #5 out (Due 4/16)
22 4/8 Reinforcement Learning Markov Decision Process    
23 4/10 Reinforcement Learning Q-Learning    
24 4/15 Bayesian Network Probabilistic Graphical Model    
25 4/17 Filtering + Time-series Analysis Kalman Filter, Hidden Markov Model    
26 4/22 Midterm Exam      
27 4/24 Ethics in AI      
28 4/29 Project Presentations     Final Report Due 5/10