**COURSE SYLLABUS AND OUTLINE**

**CSE-5820**

**MACHINE LEARNING**

LECTURE

ROWE 320, MW 10:00am – 11:15am

INSTRUCTOR
Phone: 486-1458; Office hours: by appointment; Office: ITEB 233

INSTRUCTOR
Phone: 486-0510; Office hours: M 11:30am-12:30pm; Office: ITEB 267

TEACHING ASSISTANT
Phone: 486-0510; Office hours: F 3-4pm; Office: ITEB 213

PURPOSE AND APPROACH:

The objective of the course is to introduce basic and advanced concepts in machine learning, enable students to use machine learning methods in real-life applications, and review the state-of-the-art literature in machine learning. The course covers basic topics, such as supervised learning (classification, regression, feature selection, etc.) and unsupervised learning (clustering, dimension reduction or component analysis, etc.), and some advanced topics, such as scalable machine learning or deep learning. If time allows, the course will also expose students to emerging topics in the field, such as multi-task learning, multi-view data analysis, and structural learning. These topics will usually be identified by reviewing the latest publications in top venues. Because machine learning topics are diverse, the materials covered in this course may vary among semesters.

The course consists of lectures, paper reviews, quizzes, homework assignments, and projects. Lectures serve as the vehicle for the instructor to introduce concepts and knowledge. Paper reviews inform students of the latest research topics and techniques. Quizzes test whether certain basic concepts have been mastered. The course may also include guest lectures by experts from related fields. A course project gives students in-depth hands-on experience by programming machine learning algorithms identified from the recent literature. Participation during lectures is encouraged.

Students are encouraged to form study groups to facilitate discussion. Each group is expected to consist of two to three students. During the paper review process, each group can select a paper from the recent machine learning venues provided by the instructor, such as the International Conference on Machine Learning or Neural Information Processing Systems, and then study and present the paper in class. As part of the course, students will work on a term project in which each group can either implement the algorithms discussed in the paper it chose to present or select one of the papers provided by the instructor and implement the algorithms in that paper. Each team is required to present in class and submit a project report describing whether its implementation could replicate the algorithm as stated in the original paper and, if not, any potential reasons why. This exercise will help the team gain much deeper insight into the algorithms and promote collaboration among team members.

COURSE OBJECTIVES:

- Master basic concepts of machine learning, such as overfitting, statistical learning theory, and different kinds of learning problems
- Understand some commonly used machine learning algorithms, such as k-means, logistic regression, and support vector machines
- Become informed about the state of the art in the field
- Experiment with machine learning methods in solving practical problems

TEXTBOOKS:

- Introduction to Data Mining by Pang-Ning Tan, Michael Steinbach, and Vipin Kumar, ISBN-10: 0321321367
- Pattern Recognition and Machine Learning (Information Science and Statistics) by Christopher M. Bishop, ISBN-10: 0387310738
- Deep Learning (Adaptive Computation and Machine Learning Series) by Ian Goodfellow, Yoshua Bengio, and Aaron Courville, ISBN-10: 0262035618
- Pattern Classification (2nd Edition) by Richard O. Duda, Peter E. Hart, and David G. Stork, ISBN-10: 0471056693

None of the textbooks will be required. However, having one or two of them may complement and expand the materials discussed in lectures. Lectures will come with slide files and tutorial/review papers for students to study after lectures.

GRADING:

- Open-book open-notes quizzes (3 in-class quizzes and 1 final quiz): 40%
- Paper review (1): 10%
- Term project (1), implementing selected algorithms from a recent publication; each team must consist of two to three students: 30%
- Programming homework assignments (3): 15%, based on performance ranking
- Other non-programming homework assignments (2-3): 5%

TENTATIVE SCHEDULE:

| Week | Topic | Details |
| --- | --- | --- |
| Week 1 | Introduction | Understand the differences between supervised and unsupervised learning problems; introduce a variety of learning tasks |
| Week 2 | Review of basic mathematical/statistical skills | Basic linear algebra – vector and matrix computation, matrix decomposition, norms; Basic statistics – probability, multivariate distribution, mean and variance |
| Week 3 | Supervised learning – regression | Regression methods: least squares, ridge regression, LASSO, gradient descent, probabilistic interpretations of these methods |
| Week 4 | Supervised learning – regression | Overfitting, regularization, generalized gradient descent for non-smooth optimization (e.g., LASSO) |
| Week 5 | Supervised learning – latest topics | Finish regression methods; New topic: multi-task learning; In-class (30min) quiz |
| Week 6 | Supervised learning – classification | Classification methods: k-nearest neighbor, linear discriminant analysis, classification evaluation metrics |
| Week 7 | Supervised learning – classification | Classification methods: logistic regression, support vector machine |
| Week 8 | Supervised learning – latest topics | Classification methods: (shallow) neural networks and backpropagation; In-class (30min) quiz |
| Week 9 | Paper review | Student teams present papers from the latest ML venues |
| Week 10 | Unsupervised learning – cluster analysis | Clustering methods: K-means, hierarchical clustering |
| Week 11 | Unsupervised learning – cluster analysis | Clustering methods: DBSCAN; Clustering evaluation metrics and methods |
| Week 12 | Unsupervised learning – cluster analysis | Clustering methods: spectral clustering; In-class (30min) quiz |
| Week 13 | Unsupervised learning – latest topics | Multi-view clustering; Deep learning methods: auto-encoder |
| Week 14 | Deep learning | Deep learning (supervised) methods: CNN; Deep learning generative methods: GAN; Deep reinforcement learning |
| Additional time | Project presentations | Student teams present their final term projects |
| Final Week | Final quiz | A final comprehensive quiz |

COURSE POLICY:

- Computers are allowed in classroom for taking notes or any activity related to the current class meeting.
- Participation in a paper review session earns 50% of the credit for that review assignment. The quality of your presentation slides, as judged by the instructor and other students, determines the other 50%.
- Quizzes and homework assignments are graded by the teaching assistant.
- Final term projects will be graded by an instructor, the teaching assistant, and senior graduate students in Prof. Bi’s lab based on the clarity and correctness of the project report and the comparison of final presentations of all teams.
- We may use GitHub to manage the final projects, so students are encouraged to become familiar with it.

MAKEUP PLAN:

- If you cannot attend a review session at which you are scheduled to present, you need to arrange for a team member to cover for you.
- If you miss a quiz, you need to contact the instructor to arrange a time and place to make it up.

HUSKYCT:

A HuskyCT site has been set up for the class. You can access it by logging in with your NetID and password. You must use HuskyCT for submitting assignments. The instructor uses HuskyCT announcements to post class materials, grades, problem clarifications, changes in class schedule, and other class news.

ACADEMIC INTEGRITY:

You are expected to adhere to the highest standards of academic honesty. Unless otherwise specified, collaboration on assignments is not allowed. Collaboration among team members is allowed only for the selected final term projects. Use of published materials is allowed, but the sources should be explicitly stated in your solutions. Violations will be reviewed and sanctioned according to the University Policy on Academic Integrity.

“Academic integrity is the pursuit of scholarly activity free from fraud and deception and is an educational objective of this institution. Academic dishonesty includes, but is not limited to, cheating, plagiarizing, fabricating of information or citations, facilitating acts of academic dishonesty by others, having unauthorized possession of examinations, submitting work for another person or work previously used without informing the instructor, or tampering with the academic work of other students.”

DISABILITY STATEMENT:

If you have a documented disability for which you are or may be requesting an accommodation, you are encouraged to contact the instructor and the Center for Students with Disabilities or the University Program for College Students with Learning Disabilities as soon as possible to better ensure that such accommodations are implemented in a timely fashion.