Local Response Normalization

A trick introduced in CNNs is local response normalization, which is always used immediately after the ReLU layer. The use of this trick aids generalization. The basic idea of this normalization approach is inspired by biological principles, and it is intended to create competition among different filters. First, we describe the normalization formula using all filters, and then we describe how it is actually computed using only a subset of filters. Consider a situation in which a layer contains $N$ filters, and the activation values of these $N$ filters at a particular spatial position $(x, y)$ are given by $a_1 \ldots a_N$. Then, each $a_i$ is converted into a normalized value $b_i$ using the following formula:

$b_i=\frac{a_i}{\left(k+\alpha \sum_{j=1}^{N} a_j^2\right)^\beta}$

The values of the underlying parameters used in the original paper are $k = 2$, $\alpha = 10^{-4}$, and $\beta = 0.75$. However, in practice, one does not normalize over all $N$ filters. Rather, the filters are ordered arbitrarily up front to define “adjacency” among filters. Then, the normalization is performed over each set of $n$ “adjacent” filters for some parameter $n$. The value of $n$ used in the original paper is 5. Therefore, we have the following formula:

$b_i=\frac{a_i}{\left(k+\alpha \sum_{j=\lfloor i-n/2 \rfloor}^{\lfloor i+n/2 \rfloor} a_j^2\right)^\beta}$

In the above formula, any value of $i − n/2$ that is less than 1 is set to 1, and any value of $i + n/2$ that is greater than $N$ is set to $N$.
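To make the computation concrete, here is a minimal sketch of this normalization in Python/NumPy, assuming the activations of a layer are stored as an array of shape $(N, H, W)$ with the $N$ filters along the first axis. The function name `lrn_forward` and the shape convention are illustrative choices, not taken from the original paper; the window is taken as the $n$ filters centered at index $i$ (clamped at the boundaries), which is one common reading of the bounds in the formula above.

```python
import numpy as np

def lrn_forward(activations, n=5, k=2.0, alpha=1e-4, beta=0.75):
    """Normalize each a_i by the squared activations of its n 'adjacent' filters."""
    N = activations.shape[0]
    normalized = np.empty_like(activations)
    for i in range(N):
        # Clamp the window around filter i to the valid indices 0..N-1
        # (the text indexes filters from 1 to N; NumPy arrays start at 0).
        lo = max(0, i - n // 2)
        hi = min(N - 1, i + n // 2)
        # Sum of squared activations over the adjacent filters at every spatial position
        sum_sq = np.sum(activations[lo:hi + 1] ** 2, axis=0)
        normalized[i] = activations[i] / (k + alpha * sum_sq) ** beta
    return normalized

# Example: 64 filters on a 32x32 feature map
feature_map = np.random.randn(64, 32, 32).astype(np.float32)
out = lrn_forward(feature_map)
print(out.shape)  # (64, 32, 32)
```

In AlexNet-style usage, this normalization would be applied to the output of the ReLU layer before the result is passed on to the next convolution or pooling layer.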
