Deep Learning Course (980)

Assignment Two: Convolutional Neural Networks

Assignment Goals:

  • Design and implementation of CNNs.
  • Understanding the different effects of linear and nonlinear activation functions.
  • CNN visualization.

In this assignment, you will be asked to learn a CNN model for an image dataset. Different experiments will help you achieve a better understanding of CNNs.

Dataset: the dataset consists of around 9,000 images (some grayscale, some RGB) belonging to 101 classes. Each image has shape (64, 64, 3) and is labeled with exactly one class; each image file is stored in a folder named after its class.
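Because the images are organized one folder per class, Keras can infer the labels from the directory names, one-hot encode them, and perform the 90/10 split in one call. A sketch (the function name `load_datasets`, the batch size, and the seed are choices made here, not part of the assignment):

```python
import tensorflow as tf

def load_datasets(data_dir):
    """Load the folder-per-class dataset with a 90/10 train/validation split.

    label_mode="categorical" makes Keras one-hot encode the labels it
    infers from the class-folder names; grayscale files are decoded to
    three channels by the default color_mode="rgb".
    """
    common = dict(label_mode="categorical", image_size=(64, 64),
                  validation_split=0.1, seed=42, batch_size=32)
    train_ds = tf.keras.utils.image_dataset_from_directory(
        data_dir, subset="training", **common)
    val_ds = tf.keras.utils.image_dataset_from_directory(
        data_dir, subset="validation", **common)
    return train_ds, val_ds

# One-hot encoding itself: each integer class id becomes a length-101 vector.
print(tf.one_hot([0, 3, 100], depth=101).shape)  # (3, 101)
```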

  1. We aim to learn a CNN on this dataset. Download the dataset and use TensorFlow to implement LeNet5 to classify the images. Use a one-hot encoding for the labels. Split the dataset into training (90 percent) and validation (10 percent) sets, and report the model's cross-entropy loss and accuracy on both. Try to improve validation accuracy by tuning the model hyperparameters; you can use regularization to prevent overfitting. Model performance is part of the overall evaluation (35 points). The LeNet5 configuration is:
    • Convolutional layer (kernel size 5 x 5, 32 filters, stride 1 x 1 and followed by ReLU)
    • Max Pooling layer (pool size 4 x 4, stride 4 x 4)
    • Convolutional layer (kernel size 5 x 5, 64 filters, stride 1 x 1 and followed by ReLU)
    • Max Pooling layer (pool size 4 x 4, stride 4 x 4)
    • Fully Connected layer with 7764 inputs and 1024 outputs, followed by ReLU
    • Fully Connected layer with 1024 inputs and 84 outputs, followed by ReLU
    • Fully Connected layer with 84 inputs and one output per class, followed by Softmax.
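This configuration can be sketched in Keras as follows (an illustrative sketch, not the required solution: "same" padding, the Adam optimizer, and the pixel rescaling are assumptions made here, and the first dense layer's input size is inferred by Flatten rather than hard-coded, since the flattened size depends on the padding choice):

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_CLASSES = 101  # one output node per class

model = tf.keras.Sequential([
    tf.keras.Input(shape=(64, 64, 3)),
    layers.Rescaling(1.0 / 255),   # scale pixel values to [0, 1] (an assumption)
    layers.Conv2D(32, 5, strides=1, padding="same", activation="relu"),
    layers.MaxPooling2D(pool_size=4, strides=4),
    layers.Conv2D(64, 5, strides=1, padding="same", activation="relu"),
    layers.MaxPooling2D(pool_size=4, strides=4),
    layers.Flatten(),              # input size of the next layer follows from the shapes above
    layers.Dense(1024, activation="relu"),
    layers.Dense(84, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",  # matches one-hot labels
              metrics=["accuracy"])
```

Training then reduces to `model.fit(train_ds, validation_data=val_ds, epochs=...)`, and regularization (e.g. dropout or weight decay) can be added while tuning.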
  2. What happens if we use a linear activation function in all convolutional and dense layers (except the softmax in the last fully connected layer)? Compare training and validation loss (cross-entropy) before and after changing the activation function. Can we compensate for removing the non-linear activation functions by adding more linear convolutional layers? Explain your answer. (20 points)
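For intuition on this question: a layer with a linear (identity) activation computes a linear map, and the composition of linear maps is itself a single linear map, so stacking more linear layers cannot add expressive power; since convolution is also a linear operation, the same argument covers linear convolutional layers. A minimal numeric check with arbitrary weight matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=5)          # an arbitrary input vector
W1 = rng.normal(size=(8, 5))    # weights of a first linear layer
W2 = rng.normal(size=(3, 8))    # weights of a second linear layer

two_layers = W2 @ (W1 @ x)      # applying the two layers in sequence
one_layer = (W2 @ W1) @ x       # one layer whose weights are the matrix product

print(np.allclose(two_layers, one_layer))  # True
```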
  3. There are several approaches to understanding and visualizing convolutional networks, including visualizing the activations and the layer weights. The most straightforward visualization technique is to show the activations of the network during the forward pass. The second most common strategy is to visualize the weights; weights are useful to visualize because well-trained networks usually display nice, smooth filters without noisy patterns. Please visualize the filters (i.e., the first-layer convolution weights) your CNN has learned for this task, using the trained non-linear CNN model you implemented in section 1. (Reference: for more information we recommend the course notes on "Visualizing what ConvNets learn", http://cs231n.github.io/understanding-cnn/. More advanced techniques can be found in the paper "Visualizing and Understanding Convolutional Networks" by Matthew D. Zeiler and Rob Fergus.) (35 points)
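One way to render the first-layer filters as a grid of small images (a sketch: the helper name `plot_first_layer_filters` is invented here, and it assumes a trained Keras model like the one from section 1):

```python
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

def plot_first_layer_filters(model, cols=8):
    """Show each first-layer convolution kernel as a small RGB image."""
    conv = next(l for l in model.layers
                if isinstance(l, tf.keras.layers.Conv2D))
    weights = conv.get_weights()[0]        # shape (kh, kw, in_channels, n_filters)
    # Normalize to [0, 1] so the kernels can be displayed as images.
    weights = (weights - weights.min()) / (weights.max() - weights.min())
    n_filters = weights.shape[-1]
    rows = int(np.ceil(n_filters / cols))
    fig, axes = plt.subplots(rows, cols, figsize=(cols, rows))
    for i, ax in enumerate(np.asarray(axes).flat):
        if i < n_filters:
            ax.imshow(weights[:, :, :, i]) # one kh x kw x 3 kernel per panel
        ax.axis("off")
    return fig

# plot_first_layer_filters(model)  # call with the trained model from section 1
```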

NOTE: Please use a Jupyter Notebook. The notebook should include the final code, the results, and your answers. Submit your notebook in .ipynb format and as an exported .pdf or .html file. (10 points)

Instructions:

The university policy on academic dishonesty and plagiarism (cheating) will be taken very seriously in this course. Everything submitted should be your own writing or coding. You must not let other students copy your work. Spelling and grammar count.

Your assignments will be marked on correctness, originality (the implementations and ideas are your own), clarity, and test performance.