Here are 3 Convolutional Neural Network (CNN) Layers That Help You Understand CNNs

Farooq Ahmed
4 min read, Jul 14, 2022

In this blog post you will learn about convolutional neural networks.

What is a Convolutional Neural Network?

Convolutional neural networks are most often used for classification and computer vision tasks. Prior to CNNs, manual, time-consuming feature extraction methods were used to identify objects in images. However, convolutional neural networks now provide a more scalable approach to image classification and object recognition tasks.

They can be computationally demanding, often requiring GPUs to train models.

There are 3 main types of layers:

Convolution Layer

Pooling Layer

Fully Connected Layer

The convolution layer is followed by a pooling layer, and the fully connected layer is the final layer. With each layer, the CNN increases in complexity, identifying greater portions of the image. Earlier layers focus on simple features such as colors and edges.

As the image data progresses through the layers of the CNN, it starts to recognize larger elements or shapes of the object until it finally identifies the intended object.

Convolution Layer

Convolution is basically filtering the image with a smaller pixel filter to decrease the size of the image without losing the relationship between pixels.

When we apply convolution to a 5x5 image using a 3x3 filter with a 1x1 stride and no padding, we end up with a 3x3 output: 9 values instead of 25, a 64% decrease in complexity.
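To check the arithmetic above, here is a minimal sketch of the usual output-size formula, (input + 2 x padding - kernel) / stride + 1, rounded down. The function name `conv_output_size` is just an illustrative choice, not something from a library.

```python
def conv_output_size(input_size: int, kernel_size: int,
                     stride: int = 1, padding: int = 0) -> int:
    """Output size of a convolution along one spatial dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


# The example from the text: 5x5 image, 3x3 filter, stride 1, no padding.
out = conv_output_size(input_size=5, kernel_size=3, stride=1)
reduction = 1 - (out * out) / (5 * 5)

print(out)                 # 3
print(f"{reduction:.0%}")  # 64%
```

With a padding of 1, the same call would return 5, so the output keeps the size of the input.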

It requires a few components: input data, a filter, and a feature map. Let's assume that the input will be a color image, which is made up of a 3D matrix of pixels.

This means that the input will have three dimensions (height, width, and depth), with depth corresponding to the RGB channels of the image.

We also have a feature detector, also known as a kernel or a filter, which moves across the receptive fields of the image, checking whether the feature is present. This process is known as convolution.
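The sliding of a filter over an image can be sketched in plain Python. To keep it short, this sketch assumes a single-channel image rather than a full RGB one, and both the image and the vertical-edge kernel are made-up values for illustration.

```python
def convolve2d(image, kernel, stride=1):
    """Slide a square kernel over a 2D image (no padding) and return
    the feature map: one sum of element-wise products per position."""
    k = len(kernel)
    out_h = (len(image) - k) // stride + 1
    out_w = (len(image[0]) - k) // stride + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(k):
                for dj in range(k):
                    total += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 5x5 image: dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hypothetical 3x3 filter that responds to vertical edges.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)  # each row is [3, 3, 0]
```

The feature map is large where the filter overlaps the edge and zero where the image is uniform, which is what "checking whether the feature is present" means in practice.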

Pooling Layer

Pooling layers, also known as downsampling layers, perform dimensionality reduction, reducing the number of parameters in the input.

Similar to the convolutional layer, the pooling operation sweeps a filter across the entire input, but the difference is that this filter does not have any weights.

Instead, the kernel applies an aggregation function to the values within the receptive field, populating the output array.

There are two main types of pooling:

Max pooling: As the filter moves across the input, it selects the pixel with the maximum value to send to the output array. As an aside, this approach tends to be used more often compared to average pooling.

Average pooling: As the filter moves across the input, it calculates the average value within the receptive field to send to the output array.

Pooling layers help to reduce complexity, improve efficiency, and limit the risk of overfitting.

Max Pooling

Max-pooling simply takes the maximum number in regions of images.

Typically for pooling, we use filters of size 2 by 2 with a stride of 2. In this example, we first consider the pink part (the first 2 by 2 region), which contains the numbers 3, 1, 6 and 0, and take its maximum value, 6.

Then we jump 2 places (the stride is set to 2) to the green part, taking 9 as the maximum.

We do the same for the lower part of the image, taking the maximum values 3 and 4.
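The walkthrough above can be sketched in a few lines of Python. Only the first region (3, 1, 6, 0) and the four maxima (6, 9, 3, 4) come from the example; the remaining values in the 4x4 grid are filled in hypothetically so that the numbers work out.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Take the maximum of each size x size window, moving by `stride`."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, stride):
            window = [feature_map[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Top-left region is 3, 1, 6, 0 as in the text; other values are assumed.
feature_map = [
    [3, 1, 2, 9],
    [6, 0, 4, 1],
    [1, 3, 4, 2],
    [0, 2, 1, 0],
]

pooled = max_pool2d(feature_map)
print(pooled)  # [[6, 9], [3, 4]]
```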

If convolutional layers learn the features, pooling layers perform feature selection: they select the strongest activations in the feature map and make the learning invariant to small translations (by considering only the largest values in patches of the image, the result does not change under small shifts).

Average pooling is another important pooling operator, typically used in the later stages of deep networks. It is very similar to max pooling: instead of taking the maximum value in a patch of the image, it takes the average value.
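For comparison, here is a minimal sketch of average pooling with the same 2 by 2 window and stride of 2. The input values are hypothetical and chosen so the averages are easy to verify by hand.

```python
def avg_pool2d(feature_map, size=2, stride=2):
    """Take the mean of each size x size window, moving by `stride`."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, stride):
            window = [feature_map[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map.
feature_map = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [0, 2, 1, 1],
    [4, 2, 3, 3],
]

pooled = avg_pool2d(feature_map)
print(pooled)  # [[4.0, 5.0], [2.0, 2.0]]
```

Unlike max pooling, every value in the window contributes to the result, so a single strong activation is smoothed out rather than passed through.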

Fully Connected Layer

In a CNN, the pixel values of the input image are not directly connected to the output layer in partially connected layers. However, in the fully connected layer, each node in the output layer connects directly to a node in the previous layer.

While convolution and pooling layers tend to use ReLU functions, FC layers usually leverage a softmax activation function to classify inputs appropriately, producing a probability from 0 to 1 for each class.
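As a rough sketch of that last step, the code below passes a flattened feature vector through one fully connected layer and then applies softmax. The feature values, weights, and biases are invented for illustration; in a real network they would be learned during training.

```python
import math


def fully_connected(inputs, weights, biases):
    """Each output node is a weighted sum of all inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracted for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical flattened features coming out of the last pooling layer.
features = [0.5, 1.0, 0.0, 2.0]

# Hypothetical weights and biases for 3 output classes.
weights = [
    [0.2, -0.1, 0.4, 0.3],
    [-0.3, 0.5, 0.1, 0.0],
    [0.1, 0.1, -0.2, 0.2],
]
biases = [0.0, 0.1, -0.1]

scores = fully_connected(features, weights, biases)
probabilities = softmax(scores)
predicted_class = probabilities.index(max(probabilities))

print(probabilities)
print(predicted_class)
```

Each probability lies between 0 and 1, and the class with the highest probability is taken as the prediction.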

Extra Notes

in_channels (int) means the number of channels in the input image

The default value of padding is 0

There are 2 ways of using convolution in PyTorch: in an object-oriented way as part of the torch.nn module, or in a functional way as part of torch.nn.functional.
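Here is a small sketch of both styles, assuming PyTorch is installed. The input is a random tensor standing in for one 5x5 RGB image, and the choice of 8 output channels is arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in input: a batch of one 5x5 image with 3 channels (random values).
x = torch.randn(1, 3, 5, 5)

# Object-oriented style: the layer object creates and stores its own weights.
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, stride=1, padding=0)
y_module = conv(x)

# Functional style: the weights and bias are passed in explicitly.
y_functional = F.conv2d(x, conv.weight, conv.bias, stride=1, padding=0)

print(y_module.shape)  # torch.Size([1, 8, 3, 3])
print(torch.allclose(y_module, y_functional, atol=1e-6))
```

The object-oriented form is usually more convenient inside a model class, while the functional form is handy when you want to manage the weights yourself.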
