
Final Project

Project Overview

For your final project, envision yourselves as professional data scientists tasked with a real-world challenge. Your mission is to identify compelling questions, source appropriate datasets, and execute a robust data science workflow to derive meaningful answers. To successfully navigate this journey, please follow these steps:

  1. Assemble a data science team comprising 3-4 members.
  2. Develop and submit a detailed project proposal.
  3. Deliver a concise, 5-minute Milestone Presentation to showcase your progress.
  4. Effectively communicate your findings in a poster session, demonstrating the impact and insights of your project.
  5. Prepare a professional poster for digital and in-person presentations.
  6. Submit all relevant materials: the project code, a demonstrative video, and a comprehensive report.

Todo List

The following table summarizes the TODO list of the final project.

ID | What | When | Where
--- | --- | --- | ---
1 | Proposal | Tue Feb 14 at 11:59 PM | Submit the filled form to CourSys
2 | Milestone | Friday Mar 7 at 11:30 AM | Submit your presentation slides to CourSys
3a | Poster Submission | Tuesday Apr 8 at 8:00 AM | Submit your poster presentation to CourSys
3b | Poster Presentation | Tuesday Apr 8 at 9:00 AM | Present your project in ASB 10900
4 | Final Report | Friday Apr 12 at 11:59 PM | Submit your video and report to CourSys
5 | MPCS Innovation Prize Competition | Tuesday May 6 at 1:00 PM | Present your printed poster at the Segal Centre (Vancouver Campus)

Project Ideas

To evaluate whether your project is good or not, please ask yourself the following three questions:

  1. Is it important? (i.e., what impacts can your project make?)
  2. Is it challenging? (i.e., does a naive solution work very well?)
  3. What can I learn by doing the project? (i.e., new tools, new techniques, new domain knowledge, new methodologies).

A good project should be important and challenging, and it should push you to learn something you did not know before.
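One way to probe the second question is to measure how well a naive solution already does before you commit to an idea. The sketch below is only an illustration, assuming a classification task scored by accuracy; the helper names (`accuracy`, `majority_baseline_accuracy`) and the toy labels are hypothetical and not part of the course materials.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def majority_baseline_accuracy(y_train, y_test):
    """Accuracy of always predicting the most common training label."""
    majority_label = Counter(y_train).most_common(1)[0][0]
    return accuracy(y_test, [majority_label] * len(y_test))


# Hypothetical toy labels, for illustration only.
y_train = ["ok", "ok", "ok", "fault", "ok", "ok"]
y_test = ["ok", "fault", "ok", "ok"]
model_predictions = ["ok", "fault", "ok", "fault"]  # made-up model output

baseline = majority_baseline_accuracy(y_train, y_test)
model = accuracy(y_test, model_predictions)
print(f"naive baseline accuracy: {baseline:.2f}")
print(f"model accuracy:          {model:.2f}")
```

If the naive baseline scores about as well as your method, as it does in this toy example, either the problem is not challenging enough or the metric does not capture what makes it hard.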

Note that you need to conduct a deep analysis of the data. By deep analysis, I mean you have to think deeply about your analysis results, and report some insightful and reliable findings.

Please see our page on externally sponsored projects and get in touch if you and your team are interested in collaborating with an external client on the sponsored idea.

Below is a list of project ideas from previous years. I do not recommend selecting from the list, since these projects have already been done by former students.

  1. Machine learning based surveillance system using transfer learning for rare diseases (2021)

  2. Automatic Hierarchical Time-Series Forecast at Different Aggregation Levels for Fashion Products (2020)

  3. Elevator Anomaly Detection System (2020)

  4. Explain Data and Interpretable Machine Learning (2020)

  5. Explore the Impact of Weather on Short-time Demand Forecast for Fashion Retailers (2020)

  6. Incident Social Listening (2020)

  7. Machine Learning Applied to Web Scraping (2020)

  8. Medical Language Understanding (2020)

  9. Model Fairness & Transparency (2020)

  10. Object-Detection-in-X-Ray-Images (2020)

  11. Analyzing Social media user interaction (2019)

  12. LTF - Big Data Financial Analysis (2019)

  13. Measuring Observable Influence and Impact of Scientific Research Beyond Academia (2019)

  14. A prototype Canadian Natural Hazards Database (2018)

  15. Automated Feature Detection of Aerial Imagery from South Pacific (2018)

  16. Fall Detection using wearable sensor data (2018)

  17. Machine learning to detect misstated financial statements (2018)

  18. Predicting Soccer games and tournaments (2018)

  19. Predictive Maintenance on IOT devices (2018)

  20. Property value prediction with market data (2018)

  21. Topic modeling and visualization of news comments (2018)

Step-by-step Instruction

1. Proposal (5 points)

  • Download the Initial Plan form template
  • Submit the filled form to CourSys

2. Milestone Presentation (10 points)

Communication skills are super important for data scientists. Please use this opportunity to practice your communication skills.

You can think of this presentation as a mid-term report for your project. Your presentation should consist of three parts:

  1. Motivation (2)

    • Why is it an important project? (1)
    • Why is it challenging? (1)
  2. Progress Report (2)

    • What have you done so far? (1) You need to provide evidence (e.g., screenshots of repo and commits, an initial demo) for your progress.
    • Is it on schedule? (1) You need to show the entire project schedule and point to where you are.
  3. Future work (2)

    • What do you plan to do next? (1) You need to show the detailed schedule of the remaining part of the project.
    • How to mitigate risks? (1) You also need to discuss whether there is any risk to completing the project on time.

Imagine your manager (who knows little about the technical part of data science) is sitting in the audience. You need to explain your complex project to your manager in a simple way and make her/him feel excited about it.

  • Did you convey complex information in a simple way? (2)
  • Did you excite and motivate the audience? (2)

Search "how to give a good talk" on Google or ChatGPT. You will find a lot of good advice. Use it to improve your presentation.

Submission

  • Submit your presentation slides (PPT) to CourSys.

3. Poster Presentation, on-screen (20 points)

This is showtime! Make a poster to present your data product. Please make your poster look as professional as possible. Here are a few things that you can put on the poster (10 points):

  • Why do you do this project?

  • What questions do you try to answer?

  • What's your methodology to get the answers?

  • What datasets/tools do you use?

  • What's your data-science pipeline like?

  • Why is your solution good? Why does your result make sense?

  • What's your data product?

  • What have you learned through the project?

  • What do you plan to do if you have more time?

Design tips:

  • https://www.brightcarbon.com/blog/effective-academic-posters-powerpoint/
  • Use high-quality images that help draw attention and convey important information.
  • Keep it simple and uncluttered, use white space, limit text and images, and avoid distracting design elements.
  • Use a clear and concise title to convey the topic of your project.
  • Organize your content logically: Your poster should be organized in a way that is easy to follow and understand. Use headings, subheadings, and bullet points to guide the reader.
  • Use a hierarchy of font sizes with a font that is easy to read from a distance; sans-serif works well.

Poster Format and Presentation

The maximum poster size is 36" x 24" (landscape) or 24" x 36" (portrait). You can opt for either of the two. We encourage you to use the template, but a custom design is also possible if you prefer.

During the presentation session (10 points), you will be given 8 minutes to present your project. Please use at least 6 minutes of that, but do not exceed 8 minutes. TAs and instructors will ask a few questions after your presentation.

Submission

The project presentations are scheduled on Tuesday April 8th, 2025 from 9 am - 12 noon. Please upload your poster as a PDF (and optionally the PPTX if using PowerPoint) to CourSys before 8:00 AM. Check your time slot in the tentative final presentation schedule.

Also, note that you do NOT need to bring your poster to the poster session on April 8th, as the display will be digital.

However, please do bring a printed version to the MPCS Innovation Prize Competition on May 6th.

4. Video & Code & Report (30 points)

Code (10 points)

As in CMPT 732, you must use a Git repository for your project. SFU's GitHub server is a good way to get one (instructions at that link). Group members must commit their own contributions to the repo. You are encouraged to publicize and open-source your work on GitHub or similar.

In your repository, please include a file README.txt (or README.md if you prefer) indicating how we can actually test your project as well as other notes about things we should look for. If you created some kind of web frontend, please include a URL in the README as well.
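Because group members must commit their own contributions, it can help to check how commits are distributed across authors before you submit. The following is a minimal sketch, assuming `git` is installed and the script runs inside your project repository; the function names and the sample author names are hypothetical, and commit counts are only a rough indicator of contribution.

```python
import subprocess


def parse_shortlog(text):
    """Parse `git shortlog -s -n` output into {author: commit_count}."""
    counts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line looks like "<count><whitespace><author name>".
        count, author = line.split(None, 1)
        counts[author] = counts.get(author, 0) + int(count)
    return counts


def commits_per_author(repo_path="."):
    """Run git in repo_path and return commit counts per author."""
    result = subprocess.run(
        # Passing HEAD explicitly stops shortlog from reading stdin.
        ["git", "shortlog", "-s", "-n", "HEAD"],
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_shortlog(result.stdout)


# Hypothetical sample output, for illustration only.
sample_output = "    12\tAlice Example\n     7\tBob Example\n"
print(parse_shortlog(sample_output))
```

Calling `commits_per_author()` from the root of your repository returns the same kind of dictionary for your real history.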

Report (10 points)

You need to submit a report giving an overview of your project. The report should be about 2000 words long, with the following structure:

  • Project Title: Come up with an attractive project title (see this page for some tips);
  • Motivation and Background: Who cares about this project? Any related work?
  • Problem Statement: What questions do you want to answer? Why are they challenging?
  • Data Science Pipeline: What's your data-science pipeline like? Describe each component in detail.
  • Methodology: What tools or analysis methods did you use? Why did you choose them? How did you apply them to tackling each problem?
  • Evaluation: Why is your solution good? Why does your result make sense?
  • Data Product: What's your data product? Please demonstrate how it works.
  • Lessons Learnt: What did you learn from this project?
  • Summary: A high-level summary of your project. It should be self-contained and cover all the important aspects of your project.

Please choose A or B:

Video (10 points)

Please make an attractive video to introduce your project (the format is similar to Example 1 and Example 2). Here are some requirements:

  1. The video should ideally be 3 minutes long to hold viewer attention, but it can be up to 6 minutes.
    • As a fallback option, you may extract your project video from the recorded Zoom session on April 8th (we will share the recording)
    • Note the video is a graded component of your final presentation
  2. Explain why this is an important project
  3. List the questions you want to answer as well as the datasets you collected
  4. Give a high-level idea on how you use data science skills to answer those questions
  5. Show the conclusion of your project
  6. Put your contact information at the end of the video

You can get some inspiration from KDD 2017 Promotional Videos, KDD 2018 Promotional Videos, 2018 Project Showcase, and 2019 Project Showcase.

5. MPCS Innovation Prize Poster Competition – May 6

Select teams will be invited to present their work at the MPCS Student Project Innovation Prize Competition, showcasing innovation in Big Data, Visual Computing, and Cybersecurity.

Event Details

  • Date: Tuesday, May 6
  • Time: 1:00–3:00 PM
  • Location: Segal Centre (Vancouver Campus)
  • Format: In-person poster session with refreshments

Overview

  • Selected teams will present posters to the MPCS External Advisory Council
  • Judges will evaluate innovation, impact, and communication
  • 4 winning projects will be awarded $2,500 each
  • Winners announced after the event

This is a great opportunity to celebrate your work and network with industry professionals.

Submission

We will create a web page on the course website and put your projects there. On the page, we will put a project title, a project summary, and the three URLs that link to your codebase, video, and final report. Please submit your project title, project summary, final report (Medium URL or PDF), code (GitHub/GitLab URL), and video (YouTube URL) to CourSys.

Updated Tue April 08 2025, 12:57 by sbergner.