Convolutional Neural Networks

Neural networks that use convolutions, rather than plain linear (fully connected) operations, as the processing applied to input images.

During training, the network learns which kernel values (i.e., which image-processing operations) to use in its convolutions, which are run both in parallel (multiple filters per layer) and in sequence (layer after layer).
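
A minimal sketch of this idea (the 8-filter layer and input size here are arbitrary choices for illustration): a Conv2d layer stores its kernel values as trainable weights.

import torch
import torch.nn as nn

# 1 input channel (grayscale), 8 output channels, 3x3 kernels:
# the 8*1*3*3 kernel values are parameters learned during training
conv = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, padding=1)

image = torch.randn(1, 1, 28, 28)   # (batch, channels, height, width)
out = conv(image)
print(out.shape)           # torch.Size([1, 8, 28, 28])
print(conv.weight.shape)   # torch.Size([8, 1, 3, 3]) -- the learned kernels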

In building a CNN for classification, the initial layers are Conv2d operations and the final layers are linear (fully connected) layers. This is because the final output must be one score per class, which can then be turned into probabilities.
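
Strictly speaking, the final linear layer produces raw scores (logits); during training, nn.CrossEntropyLoss applies the softmax internally, which is why the model below returns x without a softmax. A quick sketch of how softmax turns logits into probabilities:

import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, 0.5, -1.0]])  # raw scores from a final linear layer
probs = F.softmax(logits, dim=1)           # probabilities summing to 1
print(probs)  # approximately tensor([[0.7856, 0.1753, 0.0391]])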

There are also a few layers with no (or very few) trainable parameters — pooling, dropout, and batch normalization — that were introduced to improve robustness and efficiency and to prevent overfitting.
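
For example (a small sketch with arbitrary sizes), max pooling halves the spatial dimensions without any parameters to learn, and dropout randomly zeroes activations during training:

import torch
import torch.nn as nn

x = torch.randn(16, 32, 28, 28)  # (batch, channels, height, width)

# 2x2 max pooling halves height and width; no trainable parameters
pooled = nn.MaxPool2d(kernel_size=2)(x)
print(pooled.shape)  # torch.Size([16, 32, 14, 14])

# Dropout zeroes a random 25% of activations (in training mode)
dropped = nn.Dropout(p=0.25)(x)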

We can also effectively increase the size of the training dataset by applying random transformations (rotations, shifts, flips, etc.) to the existing data; this is data augmentation.
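
A sketch of such a pipeline with torchvision.transforms (the specific transforms and parameters are illustrative choices; the Normalize constants are the standard MNIST mean and std):

from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomRotation(degrees=10),                      # small random rotations
    transforms.RandomAffine(degrees=0, translate=(0.1, 0.1)),  # small random shifts
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),
])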

A model for MNIST using all of these non-trainable layers:

import torch
import torch.nn as nn
import torch.nn.functional as F

class MNIST_CNN_all(nn.Module):
    def __init__(self):
        super(MNIST_CNN_all, self).__init__()
        # Two convolutional layers
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        # Two fully connected layers
        # (after one 2x2 max pool, the feature map is 64*14*14 = 12544)
        self.fc1 = nn.Linear(64*14*14, 128)
        self.fc2 = nn.Linear(128, 10)
        # Batch normalization layers
        self.batch_norm1 = nn.BatchNorm2d(32)
        self.batch_norm2 = nn.BatchNorm2d(64)
        self.batch_norm3 = nn.BatchNorm1d(128)
        # Dropout layers
        self.dropout1 = nn.Dropout2d(0.25)
        self.dropout2 = nn.Dropout2d(0.5)
        self.dropout3 = nn.Dropout(0.25)
        # Max pooling (functional form, no parameters to learn)
        self.maxpool2d = F.max_pool2d

    def forward(self, x):
        # Pass input through first convolutional layer
        x = self.conv1(x)
        x = F.relu(x)
        x = self.batch_norm1(x)
        x = self.dropout1(x)
        # Pass output of first conv layer through second convolutional layer
        # Pooling only once, on the second layer (we could also do it on the first one)
        x = self.conv2(x)
        x = F.relu(x)
        x = self.batch_norm2(x)
        x = self.dropout2(x)
        x = self.maxpool2d(x, 2)
        # Flatten output of second conv layer
        x = x.view(-1, 64*14*14)
        # Pass flattened output through first fully connected layer
        x = self.fc1(x)
        x = F.relu(x)
        x = self.batch_norm3(x)
        x = self.dropout3(x)
        # Pass output of first fully connected layer through second fully connected layer
        x = self.fc2(x)
        return x
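
The loop below assumes an MNIST train_loader; a minimal way to build one (the batch size of 64 is an assumed choice):

from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),
])
train_dataset = datasets.MNIST(root="data", train=True, download=True,
                               transform=transform)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=64,
                                           shuffle=True)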

model = MNIST_CNN_all()

for inputs, labels in train_loader:
    out = model(inputs)
    print(out.shape)
    print(labels.shape)
    break
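
With the assumed batch size of 64, this prints torch.Size([64, 10]) (one raw score per digit class) and torch.Size([64]) for the labels.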

A quick history of remarkable CV models