
Recursive Neural Network

[Figure: computational graph of a recursive neural network, structured as a tree]
Recursive neural networks represent yet another generalization of recurrent networks, with a different kind of computational graph: it is structured as a deep tree rather than the chain-like structure of RNNs. The typical computational graph for a recursive network is illustrated in the figure below. Recursive neural networks were introduced by Pollack (1990), and their potential use for learning to reason was described by Bottou (2011). Recursive networks have been successfully applied to processing data structures as input to neural nets (Frasconi et al., 1997, 1998), in natural language processing (Socher et al., 2011a,c, 2013a), and in computer vision (Socher et al., 2011b). One clear advantage of recursive nets over recurrent nets is that for a sequence of the same length $\tau$, the depth (measured as the number of compositions of nonlinear operations) can be drastically reduced from $\tau$ to $O(\log \tau)$, which might help deal with long-term dependencies. An open question is how to best structure the tree.
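To make the depth argument concrete, here is a minimal NumPy sketch of a recursive net that reduces a sequence over a balanced binary tree with one shared composition function. The names (`compose`, `recursive_forward`), the odd-node handling, and all dimensions are illustrative assumptions, not taken from the cited works:

```python
import numpy as np

# Sketch of a recursive (tree-structured) network: pairs of children are
# composed bottom-up with a shared weight matrix, so a balanced tree over
# tau leaves needs only O(log tau) nonlinear compositions in depth.

rng = np.random.default_rng(0)
d = 8                                        # hidden/embedding dimension
W = rng.standard_normal((d, 2 * d)) * 0.1    # shared composition weights
b = np.zeros(d)

def compose(left, right):
    """Combine two child representations into one parent representation."""
    return np.tanh(W @ np.concatenate([left, right]) + b)

def recursive_forward(nodes):
    """Reduce a list of leaf vectors over a balanced binary tree."""
    while len(nodes) > 1:
        paired = [compose(nodes[i], nodes[i + 1])
                  for i in range(0, len(nodes) - 1, 2)]
        if len(nodes) % 2 == 1:              # carry an odd node up unchanged
            paired.append(nodes[-1])
        nodes = paired
    return nodes[0]                          # root representation of the sequence

leaves = [rng.standard_normal(d) for _ in range(16)]  # tau = 16 leaves
root = recursive_forward(leaves)             # only log2(16) = 4 composition levels
print(root.shape)                            # (8,)
```

A chain-structured RNN would apply 16 nonlinear compositions to the same sequence; the balanced tree above applies only 4 between any leaf and the root.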

Multilayer Recurrent Networks (Deep Recurrent Networks)

[Figure: a deep recurrent network with three hidden layers]
In all the aforementioned applications, a single-layer RNN architecture is used for ease of understanding. In practical applications, however, a multilayer architecture is used in order to build models of greater complexity. Furthermore, this multilayer architecture can be used in combination with advanced variants of the RNN, such as the LSTM architecture or the gated recurrent unit; these advanced architectures are introduced in later sections. An example of a deep network containing three layers is shown in the figure below. Note that nodes in higher-level layers receive input from those in lower-level layers. The relationships among the hidden states can be generalized directly from the single-layer network. First, we rewrite the recurrence equation of the hidden layer (for single-layer networks) in a form that can be adapted easily to multilayer networks:

$$h_t = \tanh\left(W \begin{bmatrix} x_t \\ h_{t-1} \end{bmatrix}\right)$$

Here, we have put together a larger matrix $W = [W_{xh}, W_{hh}]$ that includes the columns of $W_{xh}$ and $W_{hh}$. Similarly, we have created a larger column vector that stacks up the input vector at time $t$ and the hidden state at time $t-1$.
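As a sketch of how this stacked recurrence works in practice, the following NumPy code advances a three-layer RNN by one time step using the concatenated form $h_t = \tanh(W\,[\text{input};\, h_{t-1}])$ from above, where layer $k$ takes the hidden state of layer $k-1$ at time $t$ as its input. The helper name `step_deep_rnn` and all dimensions are illustrative assumptions:

```python
import numpy as np

# One time step of a three-layer (deep) RNN in the stacked form
# h_t = tanh(W [input; h_{t-1}]). Layer 0 sees x_t; layers 1..K-1 see
# the hidden state of the layer below at the current time step.

rng = np.random.default_rng(1)
d_in, d_h, n_layers = 4, 8, 3

# W[k] = [W_xh, W_hh] concatenated column-wise, as in the text.
Ws = [rng.standard_normal((d_h, (d_in if k == 0 else d_h) + d_h)) * 0.1
      for k in range(n_layers)]

def step_deep_rnn(x_t, h_prev):
    """Advance all layers one time step; h_prev is a list of per-layer states."""
    h_new, inp = [], x_t
    for k in range(n_layers):
        z = np.concatenate([inp, h_prev[k]])   # stacked vector [input; h_{t-1}]
        h_k = np.tanh(Ws[k] @ z)               # h_t^(k) = tanh(W^(k) z)
        h_new.append(h_k)
        inp = h_k                              # feeds the layer above
    return h_new

h = [np.zeros(d_h) for _ in range(n_layers)]
for x_t in rng.standard_normal((10, d_in)):    # a length-10 input sequence
    h = step_deep_rnn(x_t, h)
print([v.shape for v in h])                    # [(8,), (8,), (8,)]
```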

Introduction to Recurrent Neural Networks

Recurrent neural networks, or RNNs (Rumelhart et al., 1986a), are a family of neural networks for processing sequential data. Much as a convolutional network is a neural network specialized for processing a grid of values $X$ such as an image, a recurrent neural network is a neural network specialized for processing a sequence of values $x^{(1)}, \ldots, x^{(\tau)}$. Just as convolutional networks can readily scale to images with large width and height, and some convolutional networks can process images of variable size, recurrent networks can scale to much longer sequences than would be practical for networks without sequence-based specialization. Most recurrent networks can also process sequences of variable length. All the neural architectures discussed earlier are inherently designed for multidimensional data in which the attributes are largely independent of one another. However, certain data types such as time-series, text, and biological data contain sequential dependencies among the attributes.
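A minimal sketch may help fix ideas: the code below implements the basic recurrence $h^{(t)} = \tanh(W_{hh}\, h^{(t-1)} + W_{xh}\, x^{(t)} + b)$ in NumPy and shows the same shared weights handling sequences of different lengths. All names and sizes here are illustrative assumptions:

```python
import numpy as np

# The classic single-layer RNN recurrence, applied step by step with the
# same weights at every position; this parameter sharing is what lets one
# network process sequences of any length tau.

rng = np.random.default_rng(2)
d_in, d_h = 3, 5
W_xh = rng.standard_normal((d_h, d_in)) * 0.1  # input-to-hidden weights
W_hh = rng.standard_normal((d_h, d_h)) * 0.1   # hidden-to-hidden weights
b = np.zeros(d_h)

def rnn_forward(xs):
    """Run the recurrence over a sequence xs of shape (tau, d_in)."""
    h = np.zeros(d_h)                          # initial state h^(0)
    for x_t in xs:                             # same weights at every step
        h = np.tanh(W_hh @ h + W_xh @ x_t + b)
    return h                                   # final state summarizes xs

# The same network handles sequences of different lengths.
short = rng.standard_normal((4, d_in))
long = rng.standard_normal((50, d_in))
print(rnn_forward(short).shape, rnn_forward(long).shape)  # (5,) (5,)
```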