CMPT 733: Big Data Programming II

Objectives

This course is designed for students who have completed CMPT 726 and CMPT 732 and want to further their knowledge and skills in data science and big data. It aims to bridge the gap between theoretical concepts and practical applications of machine learning and data engineering by exposing students to current trends and challenges in data science and big data.

The course will cover essential topics such as data wrangling, data visualization, data storytelling, and machine learning workflows, and introduce students to cutting-edge techniques and tools for dealing with large-scale and complex data. By the end of this course, students should be able to tackle real-world data problems, ask meaningful questions about data, design effective data-processing pipelines, and communicate their findings.

Topics

  • Introduction to Data Science
  • Data Preparation
  • Visualization
  • Statistics
  • Deep Learning
  • Practical Machine Learning (AutoML, Explainable AI, Feature Engineering)
  • Anomaly Detection
  • Cloud Computing
  • Responsible Data Science
  • Communication

Logistics

Instructors

TAs

  • Chunyu Chen
  • Paria Khoshtab
  • Xubin Wang

You can reach the entire team of instructors and TAs by email at cmpt-733-g1-help@sfu.ca.

Lectures

  • Time: Tue 10:30 AM - 12:20 PM
  • Location: AQ5016

Labs

Lab G101:

  • Time: Wed 11:30 AM to 1:20 PM and Fri 11:30 AM to 1:20 PM
  • Location: SECB1010

Lab G102:

  • Time: Wed 1:30 PM to 3:20 PM and Fri 1:30 PM to 3:20 PM
  • Location: SECB1010

Grading

Tentative, to be finalized during week 1.

  1. Assignments: 10 × 4% = 40%
  2. In-lab team exercise: 5%
  3. Blog post: 8%
  4. Final Project: 47% (2% proposal + 15% milestone + 15% final presentation + 15% code & report & video)
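
As a quick check on the arithmetic above, the following minimal sketch confirms that the tentative weights add up to 100% and shows how a final grade would follow from them as a weighted sum. The weights are copied from the list; the function name, dictionary keys, and example scores are hypothetical and for illustration only, and the actual grade calculation may differ once the scheme is finalized.

```python
# Tentative component weights in percent, copied from the grading list above.
WEIGHTS = {
    "assignments": 10 * 4,               # ten assignments at 4% each
    "in_lab_team_exercise": 5,
    "blog_post": 8,
    "final_project": 2 + 15 + 15 + 15,   # proposal, milestone, presentation, code/report/video
}


def final_grade(scores):
    """Weighted sum of component scores (each a fraction in [0, 1]); result is in percent."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, not real data.
example_scores = {
    "assignments": 0.9,
    "in_lab_team_exercise": 1.0,
    "blog_post": 0.75,
    "final_project": 0.8,
}

# The weights should account for the whole grade.
assert sum(WEIGHTS.values()) == 100
print(f"Final grade: {final_grade(example_scores):.1f}%")
```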

Schedule

| Week | Date | Event Type | Description | Course Materials |
|---|---|---|---|---|
| wk 1 | Tue Jan 7 | Lecture 1 | Course Introduction | slides |
| | Wednesday January 15 2025 | A1-1 Due | Assignment #1-1 Due | A1-1 |
| | | A1-2 Due | Assignment #1-2 Due | A1-2 |
| wk 2 | Tue Jan 14 | Lecture 2 | Data Preparation | slides |
| | Friday January 24 2025 | A2 Due | Assignment #2 Due | A2 |
| wk 3 | Tue Jan 21 | Lecture 3 | Statistics (Part I) | slides |
| | Monday February 03 2025 | A3 Due | Assignment #3 Due | A3 |
| wk 4 | Tue Jan 28 | Lecture 4 | Data Visualization (Part I) | slides |
| | Monday February 10 2025 | A4 Due | Assignment #4 Due | A4 |
| wk 5 | Tue Feb 4 | Lecture 5 | Practical Machine Learning (Part I) | slides |
| | Friday February 14 2025 | A5 Due | Assignment #5 Due | A5 |
| wk 6 | Tue Feb 11 | Lecture 6 | Deep Learning (Part I) | slides |
| | Monday February 10 2025 | Blog Post Due | Blog Post Task | |
| | Friday February 14 2025 | Proposal Due | Course Project Proposal Due | |
| | Tue Feb 18 | Reading Break | No Classes | |
| wk 7 | Tue Feb 25 | Lecture 7 | Data Visualization (Part II) | slides |
| | Monday March 17 2025 | A7 Due | Assignment #7 Due | A7 |
| wk 8 | Tue Mar 4 | Lecture 8 | Practical Machine Learning (Part II) | slides |
| | Fri Mar 7 | Milestone Presentation | | |
| | Friday March 21 2025 | A8 Due | Assignment #8 Due | A8-1, A8-2 |
| wk 9 | Tue Mar 11 | Lecture 9 | Statistics (Part II) | slides |
| | Monday March 31 2025 | A9 Due | Assignment #9 Due | A9-1, A9-2 |
| wk 10 | Tue Mar 18 | Lecture 10 | Deep Learning (Part II), Natural Language Processing | slides |
| | Tuesday April 15 2025 | A10 Due | Assignment #10 Due | A10, lec10-nlp-ds |
| wk 11 | Tue Mar 25 | Lecture 11 | Responsible Data Science | slides |
| | | A11 Due | Assignment #11 Due | A11-1, A11-2 |
| wk 12 | Tue Apr 1 | No class - Final project prep | | |
| | Fri Apr 4 | | | |
| wk 13 | Tue Apr 8 | Final Project Presentation & Code | Course Project Presentation | |
| | Fri Apr 11 | Report & Video Due | Course Project Report Due | |

Final Project Showcase Examples

Updated Wed Jan. 22 2025, 12:35 by sbergner.