
Final Project

Project Overview

For your final project, envision yourselves as professional data scientists tasked with a real-world challenge. Your mission is to identify compelling questions, source appropriate datasets, and execute a robust data science workflow to derive meaningful answers. To successfully navigate this journey, please follow these steps:

  1. Assemble a data science team comprising 3-5 members.
  2. Develop and submit a detailed project proposal.
  3. Deliver a concise, 5-minute Milestone Presentation to showcase your progress.
  4. Effectively communicate your findings in a poster session, demonstrating the impact and insights of your project.
  5. Submit all relevant materials: the project code, a demonstrative video, and a comprehensive report.

Todo List

The following table summarizes the TODO list of the final project.

ID | What | When | Where
1 | Proposal | Tue 02/20 at 11:59 PM | Submit the filled form to CourSys
2 | Milestone | Fri 03/08 at 9:30 AM | Submit your poster and presentation video to CourSys
3 | Poster Presentation | Tue 04/09 at 8:00 AM; Tue 04/09 at 10:30 AM | Submit your poster to CourSys; present your project in ASB 10900
4 | Final Report | Fri 04/12 at 11:59 PM | Submit your video and report to CourSys

Project Ideas

To evaluate whether your project is good or not, please ask yourself the following three questions:

  1. Is it important? (i.e., what impacts can your project make?)
  2. Is it challenging? (i.e., does a naive solution work very well?)
  3. What can I learn by doing the project? (i.e., new tools, new techniques, new domain knowledge, new methodologies).

A good project should be important and challenging, and it should push you to learn something that you did not know before.
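One way to check the "is it challenging?" question is to measure how well a naive solution already does. The sketch below is only an illustration under stated assumptions: the function name `majority_baseline_accuracy`, the labels, and the toy data are hypothetical and not part of the course material. It computes the accuracy of always predicting the most common training label.

```python
from collections import Counter


def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of a naive model that always predicts the most common training label."""
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    hits = sum(1 for label in test_labels if label == most_common_label)
    return hits / len(test_labels)


# Hypothetical toy data: an imbalanced two-class problem.
train = ["ok"] * 90 + ["fault"] * 10
test = ["ok"] * 18 + ["fault"] * 2

baseline = majority_baseline_accuracy(train, test)
print(f"naive baseline accuracy: {baseline:.2f}")
```

If the naive baseline already scores close to what you hope to achieve, either the problem is not very challenging or the chosen metric hides the hard part (here, the rare "fault" cases), and the project question may need to be reframed.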

Note that you need to conduct a deep analysis of the data. By deep analysis, I mean you have to think deeply about your analysis results, and report some insightful and reliable findings.

If you and your team are interested in collaborating with an external client on a sponsored idea, please see our page on externally sponsored projects and get in touch.

Below is a list of project ideas from previous years. I do not recommend selecting from the list, since these projects have already been done by former students.

  1. Machine learning based surveillance system using transfer learning for rare diseases (2021)

  2. Automatic Hierarchical Time-Series Forecast at Different Aggregation Levels for Fashion Products (2020)

  3. Elevator Anomaly Detection System (2020)

  4. Explain Data and Interpretable Machine Learning (2020)

  5. Explore the Impact of Weather on Short-time Demand Forecast for Fashion Retailers (2020)

  6. Incident Social Listening (2020)

  7. Machine Learning Applied to Web Scraping (2020)

  8. Medical Language Understanding (2020)

  9. Model Fairness & Transparency (2020)

  10. Object-Detection-in-X-Ray-Images (2020)

  11. Analyzing Social media user interaction (2019)

  12. LTF - Big Data Financial Analysis (2019)

  13. Measuring Observable Influence and Impact of Scientific Research Beyond Academia (2019)

  14. A prototype Canadian Natural Hazards Database (2018)

  15. Automated Feature Detection of Aerial Imagery from South Pacific (2018)

  16. Fall Detection using wearable sensor data (2018)

  17. Machine learning to detect misstated financial statements (2018)

  18. Predicting Soccer games and tournaments (2018)

  19. Predictive Maintenance on IOT devices (2018)

  20. Property value prediction with market data (2018)

  21. Topic modeling and visualization of news comments (2018)

Step-by-step Instruction

1. Proposal (5 points)

  • Download the Initial Plan form template
  • Submit the filled form to CourSys

2. Milestone Presentation (10 points)

Communication skills are super important for data scientists. Please use this opportunity to practice your communication skills.

You can think of this presentation as a mid-term report for your project. Your presentation should consist of three parts:

  1. Motivation (2)

    • Why is it an important project? (1)
    • Why is it challenging? (1)
  2. Progress Report (2)

    • What have you done so far? (1) You need to provide evidence (e.g., screenshots of repo and commits, an initial demo) for your progress.
    • Is it on schedule? (1) You need to show the entire project schedule and point to where you are.
  3. Future work (2)

    • What do you plan to do next? (1) You need to show the detailed schedule of the remaining part of the project.
    • How to mitigate risks? (1) You also need to discuss whether there is any risk to completing the project on time.

Imagine that your manager (who knows little about the technical part of data science) is sitting in the audience. You need to explain your complex project in a simple way and make her/him feel excited about it.

  • Did you convey complex information in a simple way? (2)
  • Did you excite and motivate the audience? (2)

Search "how to give a good talk" on Google or ChatGPT. You will find a lot of good advice; use it to improve your presentation.

Submission

  • Make a video to record your presentation (The format is similar to Example 1 and Example 2).

  • The video length should be within 5 mins (but longer than 4 mins).

  • Submit your PPT and video (Youtube URL) to CourSys.

3. Poster Presentation, on-screen (20 points)

This is showtime! Make a poster to present your data product. Please make your poster look as professional as possible. Here are a few things that you can put on the poster (10 points):

  • Why do you do this project?

  • What questions do you try to answer?

  • What's your methodology to get the answers?

  • What datasets/tools do you use?

  • What's your data-science pipeline like?

  • Why is your solution good? Why does your result make sense?

  • What's your data product?

  • What have you learned through the project?

  • What do you plan to do if you have more time?

Design tips:

  • https://www.brightcarbon.com/blog/effective-academic-posters-powerpoint/
  • Use high-quality images that help draw attention and convey important information.
  • Keep it simple and uncluttered, use white space, limit text and images, and avoid distracting design elements.
  • Use a clear and concise title to convey the topic of your project.
  • Organize your content logically: Your poster should be organized in a way that is easy to follow and understand. Use headings, subheadings, and bullet points to guide the reader.
  • Use a hierarchy of font sizes with a font that is easy to read from a distance; sans-serif works well.

Poster Format and Presentation

The maximum poster sizes are 36" x 24" (landscape) and 24" x 36" (portrait). You can opt for either of the two. We encourage you to use the template, but a custom design is also possible if you prefer.

During the presentation session (10 points), you will be given 8 mins to present your project. Please utilize at least 6 minutes of that, but do not exceed 8 minutes. TAs and instructors will ask a few questions after your presentation.

Submission

The project presentations are scheduled for Tuesday, April 9th, 2024, 9 am - 1:30 pm. Please upload your poster to CourSys before 8:00 AM. Check your time slot in the tentative final presentation schedule.

Also, note that you do NOT need to bring your poster to the poster session on April 9th, as the display will be digital.

However, please do bring a printed version to CS Industry Day on April 25th.

4. Video & Code & Report (30 points)

Code (10 points)

As in CMPT 732, you must use a Git repository for your project. SFU's GitHub server is a good way to get one (instructions at that link). Group members must commit their own contributions to the repo. You are encouraged to publicize and open-source your work on GitHub or similar.

In your repository, please include a file README.txt (or README.md if you prefer) indicating how we can actually test your project, as well as other notes about things we should look for. If you created some kind of web frontend, please include a URL in the README as well.

Report (10 points)

You need to submit a report giving an overview of your project. The report should have at least 2500 words with the following structure:

  • Project Title: Come up with an attractive project title (see this page for some tips);
  • Motivation and Background: Who cares about this project? Any related work?
  • Problem Statement: What questions do you want to answer? Why are they challenging?
  • Data Science Pipeline: What's your data-science pipeline like? Describe each component in detail.
  • Methodology: What tools or analysis methods did you use? Why did you choose them? How did you apply them to tackling each problem?
  • Evaluation: Why is your solution good? Why does your result make sense?
  • Data Product: What's your data product? Please demonstrate how it works.
  • Lessons Learnt: What did you learn from this project?
  • Summary: A high-level summary of your project. It should be self-contained and cover all the important aspects of your project.

Please choose A or B:

Video (10 points)

Please make an attractive video to introduce your project. Here are some requirements:

  1. The video should be about 3 minutes long for ideal viewer attention, but can be up to 6 minutes.
  2. Explain why this is an important project
  3. List the questions you want to answer as well as the datasets you collected
  4. Give a high-level idea on how you use data science skills to answer those questions
  5. Show the conclusion of your project
  6. Put your contact information at the end of the video

You can get some inspiration from KDD 2017 Promotional Videos, KDD 2018 Promotional Videos, 2018 Project Showcase, and 2019 Project Showcase.

5. School of Computing Science Industry Day - April 25th

Present your poster at the CS Industry Day on April 25th, from 10 am to 4 pm, with the poster session from 2:30 pm to 4 pm, in Saywell Hall.

CS Industry Day will host industry experts from different companies for tech talks, panels, and round-table discussions. You will also be able to shine in the poster session.

Submission

We will create a web page on the course website and put your projects there. On the page, we will put a project title, a project summary, and the three URLs that link to your codebase, video, and final report. Please submit your project title, project summary, final report (Medium URL or PDF), code (Github/GitLab URL), and video (Youtube URL) to CourSys.

Updated Thu April 04 2024, 10:48 by sbergner.