COURSE SYLLABUS AND OUTLINE
CSE-5820
MACHINE LEARNING
LECTURE
ROWE 320, MW 10:00am – 11:15am
INSTRUCTOR
Jinbo Bi | Phone: 486-1458 | Office hour: M 1-2pm | Office: ITEB 233 |
INSTRUCTOR
Jin Lu | Phone: 486-0510 | Office hour: W 1-2pm | Office: ITEB 267 |
TEACHING ASSISTANT
TBD | Phone: 486-0510 | Office hour: by appt | Office: ITEB 213 |
PURPOSE AND APPROACH:
The objective of the course is to introduce basic and advanced concepts in machine learning, enable students to apply machine learning methods in real-life applications, and review the state-of-the-art literature in the field. The course covers basic topics such as supervised learning (classification, regression, feature selection, etc.) and unsupervised learning (clustering, dimension reduction or component analysis, etc.), as well as some advanced topics such as scalable machine learning and deep learning. If time allows, the course will also expose students to emerging topics in the field, such as multi-task learning, multi-view data analysis, and structural learning. These topics will usually be identified by reviewing the latest publications in top machine learning venues. Because machine learning topics are so diverse, the materials covered in this course may vary from semester to semester.
The course consists of lectures, quizzes, homework assignments, paper reviews, and a term project. Lectures serve as the vehicle for the instructors to introduce concepts and knowledge. Quizzes test whether certain basic concepts have been mastered. The course may also include guest lectures by experts from related fields. Paper reviews inform students of the latest research topics and techniques and help them prepare a term project. The term project gives students in-depth hands-on experience by programming machine learning algorithms identified from the recent literature. Participation in lectures is encouraged.
Students are encouraged to form study groups to facilitate discussion. Each group should consist of two to three students. During the paper review process, each group selects a paper from a recent machine learning venue, such as the International Conference on Machine Learning or Neural Information Processing Systems, then studies it and presents it in class. For the term project, each group can either implement the algorithms discussed in the paper it chose to present, or select one of the papers provided by the instructor and implement the algorithms in that paper. Each team is required to present in class and submit a project report. The report describes whether the team's implementation could replicate the results stated in the original paper and, if not, the potential reasons why. This exercise will help the team gain much deeper insight into the chosen algorithms and promote collaboration among team members.
COURSE OBJECTIVES:
- Master basic concepts of machine learning, such as overfitting, statistical learning theory, and the different kinds of learning problems
- Understand some commonly used machine learning algorithms, such as k-means, logistic regression, and support vector machines
- Become informed about the state of the art in the field, such as deep learning
- Experiment with machine learning methods in solving practical problems
TEXTBOOKS:
- Introduction to Data Mining by Pang-Ning Tan, Michael Steinbach, and Vipin Kumar, ISBN-10: 0321321367
- Pattern Recognition and Machine Learning (Information Science and Statistics) by Christopher M. Bishop, ISBN-10: 0387310738
- Deep Learning (Adaptive Computation and Machine Learning Series) by Ian Goodfellow, Yoshua Bengio, and Aaron Courville, ISBN-10: 0262035618
None of the textbooks is required. However, having one or two of them may complement and expand on the materials discussed in lectures. Lectures will come with slide files and tutorial/review papers for students to study afterwards.
GRADING:
- Open-book open-notes quizzes (3 in-class quizzes): 30%
- Final quiz (closed-book and closed-notes): 20%
- Term project (1): 30% (implementing selected algorithms from a recent publication; a team must consist of two to three persons)
- Homework assignments (3-6): 20%. Assignments will not be graded, but material from them may be used at random in the quizzes
TENTATIVE SCHEDULE:
Week 1 | Introduction | Understand the differences between supervised and unsupervised learning problems; introduce a variety of learning tasks; student background evaluation |
Week 2 | Review of basic mathematical/statistical skills | Basic linear algebra – vector and matrix computation, matrix decomposition, norms; Basic statistics – probability, multivariate distribution, mean and variance |
Week 3 | Supervised learning – regression | Regression methods: least squares, ridge regression, LASSO, gradient descent, probabilistic interpretations of these methods |
Week 4 | Supervised learning – regression | Overfitting, regularization, generalized gradient descent for non-smooth optimization (e.g., LASSO) |
Week 5 | Supervised learning – regression and classification | Finish regression methods; Start classification methods: linear discriminant analysis, classification evaluation metrics (In-class quiz 30 min) |
Week 6 | Supervised learning – classification | Classification methods: logistic regression, support vector machine |
Week 7 | Supervised learning – classification | Classification methods: (shallow) neural networks and backpropagation |
Week 8 | Deep learning | Deep multi-layer perceptron, autoencoder, and stochastic gradient descent (In-class quiz 30 min) |
Week 9 | Paper review for the projects | Student teams present papers from the latest ML venues |
Week 10 | Deep learning | Convolutional neural networks |
Week 11 | Unsupervised learning – cluster analysis | Clustering methods: K-means, hierarchical clustering |
Week 12 | Unsupervised learning – cluster analysis | DBSCAN; Clustering evaluation metrics and methods |
Week 13 | Unsupervised learning – latest topics | Spectral clustering (In-class quiz 30 min) |
Week 14 | Some recent ML topics | Multi-task learning; Multi-view clustering; GANs, etc. |
Additional time | Project presentations | Student teams present their final term projects |
Final Week | Final quiz | A final comprehensive quiz (closed-book, closed-notes) |
COURSE POLICY:
- Computers are allowed in the classroom for taking notes or for any activity related to the current class meeting.
- Participation in a paper review session earns 50% of the credit for that review assignment. The quality of your presentation slides, as judged by the instructors and other students, constitutes the other 50%.
- Quizzes are graded by either of the instructors.
- Homework assignments will not be graded but will be checked.
- Final term projects will be graded by an instructor and, potentially, teaching assistants from Prof. Bi’s lab, based on the clarity and correctness of the project report and a comparison of the final presentations of all teams.
- We may use GitHub to manage the final projects, so students are encouraged to become familiar with GitHub.
MAKEUP PLAN:
- If you cannot attend a review session at which you are scheduled to present, you need to arrange for a team member to cover for you.
- If you miss a quiz, you need to contact the instructor to arrange a place and time to make it up.
HUSKYCT:
A HuskyCT site has been set up for the class. You can access it by logging in with your NetID and password. You must use HuskyCT to submit assignments. The instructors use HuskyCT announcements to post class materials, grades, problem clarifications, changes in the class schedule, and other class news.
ACADEMIC INTEGRITY:
You are expected to adhere to the highest standards of academic honesty. Unless otherwise specified, collaboration on assignments is not allowed; collaboration among team members is allowed only on the selected final term projects. Use of published materials is allowed, but the sources must be explicitly stated in your solutions. Violations will be reviewed and sanctioned according to the University Policy on Academic Integrity.
“Academic integrity is the pursuit of scholarly activity free from fraud and deception and is an educational objective of this institution. Academic dishonesty includes, but is not limited to, cheating, plagiarizing, fabricating of information or citations, facilitating acts of academic dishonesty by others, having unauthorized possession of examinations, submitting work for another person or work previously used without informing the instructor, or tampering with the academic work of other students.”
DISABILITY STATEMENT:
If you have a documented disability for which you are or may be requesting an accommodation, you are encouraged to contact the instructor and the Center for Students with Disabilities or the University Program for College Students with Learning Disabilities as soon as possible to better ensure that such accommodations are implemented in a timely fashion.
Jinbo Bi ©2019/1-2019/4
Last revised: 1/16/2019