CMPT 733: Big Data Programming II
Objectives
This course is designed for students who have completed CMPT 726 and CMPT 732 and want to further their knowledge and skills in data science and big data. It aims to bridge the gap between theoretical concepts and practical applications of machine learning and data engineering by exposing students to current trends and challenges in data science and big data.
The course will cover essential topics such as data wrangling, data visualization, data storytelling, and machine learning workflows, and introduce students to cutting-edge techniques and tools for dealing with large-scale and complex data. By the end of this course, students should be able to tackle real-world data problems, ask meaningful questions about data, design effective data-processing pipelines, and communicate their findings.
Topics
- Introduction to Data Science
- Data Preparation
- Visualization
- Statistics
- Deep Learning
- Practical Machine Learning (AutoML, Explainable AI, Feature Engineering)
- Anomaly Detection
- Cloud Computing
- Responsible Data Science
- Communication
Logistics
Instructors
TAs
- Chunyu Chen
- Paria Khoshtab
- Xubin Wang
You can reach the entire team of instructors and TAs by email at cmpt-733-g1-help@sfu.ca.
Lectures
- Time: Tue 10:30 AM - 12:20 PM
- Location: AQ5016
Labs
Lab G101:
- Time: Wed 11:30 AM to 1:20 PM and Fri 11:30 AM to 1:20 PM
- Location: SECB1010
Lab G102:
- Time: Wed 1:30 PM to 3:20 PM and Fri 1:30 PM to 3:20 PM
- Location: SECB1010
Grading
Tentative, to be finalized during week 1.
- Assignments: 10 × 4% = 40%
- In-lab team exercise: 5%
- Blog post: 8%
- Final Project: 47% (2% proposal + 15% milestone + 15% final presentation + 15% code & report & video)
Schedule
| Week | Date | Event Type | Description | Course Materials |
|---|---|---|---|---|
| wk 1 | Tue Jan 7 | Lecture 1 | Course Introduction | slides |
| | Wednesday January 15 2025 | A1-1 Due | Assignment #1-1 Due | A1-1 |
| | [No activity "A1-2"] | A1-2 Due | Assignment #1-2 Due | A1-2 |
| wk 2 | Tue Jan 14 | Lecture 2 | Data Preparation | slides |
| | Friday January 24 2025 | A2 Due | Assignment #2 Due | A2 |
| wk 3 | Tue Jan 21 | Lecture 3 | Statistics (Part I) | slides |
| | Monday February 03 2025 | A3 Due | Assignment #3 Due | A3 |
| wk 4 | Tue Jan 28 | Lecture 4 | Data Visualization (Part I) | slides |
| | Monday February 10 2025 | A4 Due | Assignment #4 Due | A4 |
| wk 5 | Tue Feb 4 | Lecture 5 | Practical Machine Learning (Part I) | slides |
| | Friday February 14 2025 | A5 Due | Assignment #5 Due | A5 |
| wk 6 | Tue Feb 11 | Lecture 6 | Deep Learning (Part I) | slides |
| | Monday February 10 2025 | Blog Post Due | Blog Post Task | |
| | Friday February 14 2025 | Proposal Due | Course Project Proposal Due | |
| | Tue Feb 18 | Reading Break | No Classes | |
| wk 7 | Tue Feb 25 | Lecture 7 | Data Visualization (Part II) | slides |
| | Monday March 17 2025 | A7 Due | Assignment #7 Due | A7 |
| wk 8 | Tue Mar 4 | Lecture 8 | Practical Machine Learning (Part II) | slides |
| | Fri Mar 7 | Milestone Presentation | | |
| | Friday March 21 2025 | A8 Due | Assignment #8 Due | A8-1, A8-2 |
| wk 9 | Tue Mar 11 | Lecture 9 | Statistics (Part II) | slides |
| | Monday March 31 2025 | A9 Due | Assignment #9 Due | A9-1, A9-2 |
| wk 10 | Tue Mar 18 | Lecture 10 | Deep learning (Part II), Natural Language Processing | slides |
| | Tuesday April 15 2025 | A10 Due | Assignment #10 Due | A10 lec10-nlp-ds |
| wk 11 | Tue Mar 25 | Lecture 11 | Responsible Data Science | slides |
| | [No activity "A11-1"] | A11 Due | Assignment #11 Due | A11-1, A11-2 |
| wk 12 | Tue Apr 1 | NO class - Final project prep | | |
| | Fri Apr 4 | | | |
| wk 13 | Tue Apr 8 | Final Project Presentation & Code | Course Project Presentation | |
| | Fri Apr 11 | Report & Video Due | Course Project Report Due | |