Machine Learning, Spring 2018

COURSE SYLLABUS AND OUTLINE
CSE-5820

MACHINE LEARNING 

 

LECTURE

ROWE 320, MW 10:00am – 11:15am

 

INSTRUCTOR

Jinbo Bi

Phone: 486-1458

Office hour: by appointment

Office: ITEB 233

INSTRUCTOR

Jiangwen Sun

Phone: 486-0510

Office hour: M 11:30am-12:30pm

Office: ITEB 267

TEACHING ASSISTANT

Qianqian Tong

Phone: 486-0510

Office hour: F 3-4pm

Office: ITEB 213

 


PURPOSE AND APPROACH:

The objective of the course is to introduce basic and advanced concepts in machine learning, to enable students to apply machine learning methods to real-life applications, and to prepare them to review state-of-the-art literature in the field. The course covers basic topics such as supervised learning (classification, regression, feature selection, etc.) and unsupervised learning (clustering, dimension reduction or component analysis, etc.), as well as advanced topics such as scalable machine learning and deep learning. Time permitting, it will also expose students to emerging topics in the field, such as multi-task learning, multi-view data analysis, and structural learning; these topics are usually identified by reviewing the latest publications in top venues. Because of the diversity of machine learning topics, the materials covered in this course may vary from semester to semester.

The course consists of lectures, paper reviews, quizzes, homework assignments, and a project. Lectures serve as the vehicle for the instructors to introduce concepts and knowledge. Paper reviews inform students of the latest research topics and techniques. Quizzes test whether basic concepts have been mastered. The course may also include guest lectures by experts in related fields. A course project gives students profound hands-on experience by having them program machine learning algorithms identified in the recent literature. Active participation during lectures is encouraged.

Students are encouraged to form study groups to facilitate discussion; each group should consist of two to three students. For the paper review, each group selects a paper from the recent machine learning venues suggested by the instructor, such as the International Conference on Machine Learning or Neural Information Processing Systems, then studies and presents the paper in class. For the term project, each group either implements the algorithms discussed in the paper it presented, or selects another paper provided by the instructor and implements the algorithms in that paper. Each team is required to present in class and to submit a project report describing whether their implementation replicated the algorithm's results as stated in the original paper and, if not, the likely reasons why. This exercise helps each team gain much deeper insight into the chosen algorithms and promotes collaboration among team members.


COURSE OBJECTIVES:

  • Master basic concepts of machine learning, such as overfitting, statistical learning theory, and the different kinds of learning problems
  • Understand commonly used machine learning algorithms, such as k-means, logistic regression, and support vector machines
  • Become informed of the state of the art in the field
  • Experiment with machine learning methods to solve practical problems
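To give a flavor of the second objective, below is a minimal, illustrative sketch of Lloyd's k-means algorithm in plain NumPy. This is not course-provided code, and the synthetic two-blob dataset is invented for illustration; the lectures will develop the method properly.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Minimal Lloyd's algorithm: alternate point assignment and centroid update."""
    rng = np.random.default_rng(seed)
    # Initialize centroids by sampling k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assign each point to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned points.
        # (Assumes no cluster becomes empty, which holds for this toy data.)
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

# Two well-separated Gaussian blobs; k-means should recover them.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(5, 0.3, (50, 2))])
centroids, labels = kmeans(X, k=2)
```

On tight, well-separated clusters like these, the algorithm converges to one centroid per blob regardless of which data points seed the initialization.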

TEXTBOOKS:

  1. Introduction to Data Mining by Pang-Ning Tan, Michael Steinbach, and Vipin Kumar, ISBN-10: 0321321367 
  2. Pattern Recognition and Machine Learning (Information Science and Statistics) by Christopher M. Bishop, ISBN-10: 0387310738 
  3. Deep Learning (Adaptive Computation and Machine Learning Series) by Ian Goodfellow, Yoshua Bengio, and Aaron Courville, ISBN-10: 0262035618
  4. Pattern Classification (2nd Edition) by Richard O. Duda, Peter E. Hart and David G. Stork, ISBN-10: 0471056693

 

None of the textbooks is required; however, having one or two of them may complement and expand the materials discussed in lectures. Each lecture will be accompanied by slides and tutorial/review papers for students to study afterward.


GRADING:

  1. Open-book, open-notes quizzes (3 in-class quizzes and 1 final quiz): 40%
  2. Paper review (1): 10%
  3. Term project (1): 30%. Each team (two to three students) implements selected algorithms from a recent publication.
  4. Programming homework assignments (3): 15%, graded based on performance ranking
  5. Non-programming homework assignments (2-3): 5%

TENTATIVE SCHEDULE:

 

Week 1: Introduction. Understand the differences between supervised and unsupervised learning problems; introduce a variety of learning tasks.
Week 2: Review of basic mathematical/statistical skills. Basic linear algebra: vector and matrix computation, matrix decomposition, norms. Basic statistics: probability, multivariate distributions, mean and variance.
Week 3: Supervised learning (regression). Regression methods: least squares, ridge regression, LASSO, gradient descent, and probabilistic interpretations of these methods.
Week 4: Supervised learning (regression). Overfitting, regularization, and generalized gradient descent for non-smooth optimization (e.g., LASSO).
Week 5: Supervised learning (latest topics). Finish regression methods; new topic: multi-task learning. In-class quiz (30 min).
Week 6: Supervised learning (classification). Classification methods: k-nearest neighbors, linear discriminant analysis; classification evaluation metrics.
Week 7: Supervised learning (classification). Classification methods: logistic regression, support vector machines.
Week 8: Supervised learning (latest topics). Classification methods: (shallow) neural networks and backpropagation. In-class quiz (30 min).
Week 9: Paper review. Student teams present papers from the latest ML venues.
Week 10: Unsupervised learning (cluster analysis). Clustering methods: k-means, hierarchical clustering.
Week 11: Unsupervised learning (cluster analysis). Clustering methods: DBSCAN; clustering evaluation metrics and methods.
Week 12: Unsupervised learning (cluster analysis). Clustering methods: spectral clustering. In-class quiz (30 min).
Week 13: Unsupervised learning (latest topics). Multi-view clustering; deep learning methods: auto-encoders.
Week 14: Deep learning. Supervised deep learning methods: CNNs; generative methods: GANs; deep reinforcement learning.
Additional time: Project presentations. Student teams present their final term projects.
Final week: Final quiz. A comprehensive final quiz.
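As a preview of the Weeks 3-4 regression material, the following is a minimal, illustrative sketch (not course material) of fitting a least-squares model by batch gradient descent on invented synthetic data:

```python
import numpy as np

# Synthetic data: y = 2*x + 1 plus small Gaussian noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)
y = 2.0 * x + 1.0 + rng.normal(0, 0.05, 200)

# Design matrix with a bias column; loss is the mean squared error
# L(w) = ||Xw - y||^2 / n, whose gradient is (2/n) X^T (Xw - y).
X = np.column_stack([x, np.ones_like(x)])
w = np.zeros(2)
lr = 0.1  # step size, chosen small enough for convergence on this problem
for _ in range(2000):
    grad = 2.0 / len(y) * X.T @ (X @ w - y)
    w -= lr * grad

# After convergence, w approximates the true parameters (slope 2, intercept 1).
```

Ridge regression and LASSO, covered in the same weeks, modify this loss with L2 and L1 penalties respectively; LASSO's non-smooth penalty is what motivates the generalized gradient methods of Week 4.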

 


COURSE POLICY:

  1. Computers are allowed in the classroom for taking notes or any activity related to the current class meeting.
  2. Participating in the paper review session earns 50% of the credit for each review assignment; the quality of your presentation slides, as judged by the instructor and other students, constitutes the other 50%.
  3. Quizzes and homework assignments are graded by the teaching assistant.
  4. Final term projects will be graded by an instructor, the teaching assistant, and senior graduate students in Prof. Bi’s lab, based on the clarity and correctness of the project report and a comparison of all teams’ final presentations.
  5. We may use GitHub to manage the final projects, so students are encouraged to become familiar with it.

MAKEUP PLAN:

  1. If you cannot attend a review session at which you are scheduled to present, you must arrange for a team member to cover for you.
  2. If a quiz is missed, the student must arrange a time and place with the instructor to make it up.

HUSKYCT:

A HuskyCT site has been set up for the class; you can access it by logging in with your NetID and password. You must use HuskyCT to submit assignments. The instructor uses HuskyCT announcements to post class materials, grades, problem clarifications, schedule changes, and other class notices.


ACADEMIC INTEGRITY:

You are expected to adhere to the highest standards of academic honesty. Unless otherwise specified, collaboration on assignments is not allowed. Use of published materials is allowed, but the sources must be explicitly cited in your solutions. Violations will be reviewed and sanctioned according to the University Policy on Academic Integrity. Collaboration among team members is allowed only for the selected final term projects.

“Academic integrity is the pursuit of scholarly activity free from fraud and deception and is an educational objective of this institution. Academic dishonesty includes, but is not limited to, cheating, plagiarizing, fabricating of information or citations, facilitating acts of academic dishonesty by others, having unauthorized possession of examinations, submitting work for another person or work previously used without informing the instructor, or tampering with the academic work of other students.”


DISABILITY STATEMENT:

If you have a documented disability for which you are or may be requesting an accommodation, you are encouraged to contact the instructor and the Center for Students with Disabilities or the University Program for College Students with Learning Disabilities as soon as possible to better ensure that such accommodations are implemented in a timely fashion.


Jinbo Bi ©2018/1-2018/4
Last revised: 1/13/2018