Showing posts from April, 2022

Motivation

Convolution leverages three important ideas that can help improve a machine learning system: sparse interactions, parameter sharing, and equivariant representations. Moreover, convolution provides a means for working with inputs of variable size. We now describe each of these ideas in turn. Traditional neural network layers use matrix multiplication by a matrix of parameters, with a separate parameter describing the interaction between each input unit and each output unit. This means every output unit interacts with every input unit. Convolutional networks, however, typically have sparse interactions (also referred to as sparse connectivity or sparse weights). This is accomplished by making the kernel smaller than the input. For example, when processing an image, the input image might have thousands or millions of pixels, but we can detect small, meaningful features such as edges with kernels that occupy only tens or hundreds of pixels. This means that we need to store fewer parameter...
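To make the idea of sparse interactions concrete, here is a minimal NumPy sketch. It is an illustration only: the helper names `dense_param_count` and `conv1d_valid` are hypothetical, the input is one-dimensional for brevity, and the kernel is applied without flipping (cross-correlation), which is an assumption about the convention rather than something the text specifies.

```python
import numpy as np

def dense_param_count(n_inputs, n_outputs):
    # A fully connected layer has one parameter per (input, output) pair.
    return n_inputs * n_outputs

def conv1d_valid(x, kernel):
    # Slide the kernel over x without padding; each output unit
    # depends on only len(kernel) neighbouring inputs.
    k = len(kernel)
    n_out = len(x) - k + 1
    return np.array([np.dot(x[i:i + k], kernel) for i in range(n_out)])

x = np.arange(8, dtype=float)          # toy input with 8 units
kernel = np.array([-1.0, 0.0, 1.0])    # simple edge-like detector, 3 parameters

y = conv1d_valid(x, kernel)
dense_params = dense_param_count(len(x), len(y))
conv_params = len(kernel)

print(y.shape, dense_params, conv_params)
```

In this toy setting the dense layer mapping 8 inputs to 6 outputs would need 48 parameters, while the convolution reuses the same 3 kernel weights at every position.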

Padding

One observation is that the convolution operation reduces the size of the $(q + 1)$th layer in comparison with the size of the $q$th layer. This reduction in size is generally undesirable, because it tends to lose some information along the borders of the image (or of the feature map, in the case of hidden layers). This problem can be resolved by using padding. In padding, one adds $(F_q - 1)/2$ “pixels” all around the borders of the feature map in order to maintain the spatial footprint. Note that these pixels are really feature values in the case of padding hidden layers. Each padded feature value is set to 0, irrespective of whether the input or the hidden layers are being padded. As a result, the spatial height and width of the input volume both increase by $(F_q - 1)$, which is exactly the amount by which they are reduced (in the output volume) after the convolution is performed. The padded portions do ...
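The effect of padding on the spatial footprint can be checked with a small NumPy sketch. This is a minimal illustration under stated assumptions: a single-channel square input, an odd filter size `F` (so that $(F - 1)/2$ is an integer), stride 1, and no kernel flipping. The helper names `conv2d_valid` and `pad_same` are hypothetical and not taken from the text.

```python
import numpy as np

def conv2d_valid(image, kernel):
    # Plain sliding-window convolution with no padding and stride 1.
    H, W = image.shape
    F = kernel.shape[0]
    out = np.zeros((H - F + 1, W - F + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + F, j:j + F] * kernel)
    return out

def pad_same(image, F):
    # Add (F - 1) / 2 zero-valued "pixels" on every border (F assumed odd).
    p = (F - 1) // 2
    return np.pad(image, p, mode="constant", constant_values=0)

image = np.ones((7, 7))
kernel = np.ones((3, 3))

unpadded = conv2d_valid(image, kernel)              # shrinks by F - 1 = 2
padded = conv2d_valid(pad_same(image, 3), kernel)   # keeps the 7 x 7 footprint

print(unpadded.shape, padded.shape)
```

With a 3 × 3 filter, one zero is added on each side, so height and width each grow by 2 before the convolution and shrink by 2 after it. Border outputs such as `padded[0, 0]` overlap the zero padding and are therefore smaller than interior outputs.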