Recursive Neural Network


Recursive neural networks represent yet another generalization of recurrent networks, with a different kind of computational graph, which is structured as a deep tree rather than the chain-like structure of RNNs. The typical computational graph for a recursive network is illustrated in the figure below. Recursive neural networks were introduced by Pollack (1990), and their potential use for learning to reason was described by Bottou (2011). Recursive networks have been successfully applied to processing data structures as input to neural nets (Frasconi et al., 1997, 1998), in natural language processing (Socher et al., 2011a,c, 2013a), as well as in computer vision (Socher et al., 2011b).



One clear advantage of recursive nets over recurrent nets is that for a sequence of length $τ$, the depth (measured as the number of compositions of nonlinear operations) can be drastically reduced from $τ$ to $O(\log τ)$, which might help deal with long-term dependencies. An open question is how to best structure the tree. One option is to have a tree structure that does not depend on the data, such as a balanced binary tree. In some application domains, external methods can suggest the appropriate tree structure. For example, when processing natural language sentences, the tree structure for the recursive network can be fixed to the structure of the parse tree of the sentence provided by a natural language parser (Socher et al., 2011a, 2013a). Ideally, one would like the learner itself to discover and infer the tree structure that is appropriate for any given input, as suggested by Bottou (2011).
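The depth reduction can be seen in a minimal sketch. Here the learned weight matrix of a real recursive network is replaced by a hypothetical elementwise composition, and we simply count the nonlinear compositions along the deepest path of a balanced binary tree:

```python
import math

def combine(left, right):
    # Hypothetical composition step: a single elementwise weight
    # stands in for the learned weight matrix of a real network.
    return [math.tanh(0.5 * (l + r)) for l, r in zip(left, right)]

def compose_balanced(leaves):
    # Recursively merge the two halves of the sequence, counting the
    # number of nonlinear compositions along the deepest path.
    if len(leaves) == 1:
        return leaves[0], 0
    mid = len(leaves) // 2
    lvec, ldepth = compose_balanced(leaves[:mid])
    rvec, rdepth = compose_balanced(leaves[mid:])
    return combine(lvec, rvec), max(ldepth, rdepth) + 1

# Eight toy word vectors: the balanced tree needs only 3 compositions,
# while a chain-structured RNN would need 7.
words = [[0.1 * i, -0.1 * i] for i in range(1, 9)]
root, depth = compose_balanced(words)
print(depth)  # 3
```

With 8 inputs the deepest path has log2(8) = 3 compositions, versus 7 recurrent steps in a chain.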

One application of Recursive Neural Networks is sentiment analysis of natural language sentences. Sentiment analysis is one of the most important tasks in Natural Language Processing (NLP): it identifies the writing tone and the sentiment that the writer expresses in a particular sentence. For example, it identifies whether the sentence showcases a constructive form of writing or negative word choices. To do this, we want to identify the smaller components, such as noun and verb phrases, and order them in a syntactic hierarchy.

A variable called 'score' is computed at each node traversal; it tells us which pair of words or phrases we should combine next to form the best syntactic tree for a given sentence.
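This score-guided merging can be sketched as a greedy bottom-up procedure. Both the scoring function and the composition below are hypothetical placeholders; in a trained model, the score would be predicted from the candidate parent's hidden representation:

```python
import math

def combine(left, right):
    # Hypothetical composition of two child vectors into a parent.
    return [math.tanh(0.5 * (l + r)) for l, r in zip(left, right)]

def score(left, right):
    # Toy scoring function (assumption): prefer merging children with
    # similar magnitudes. A real model learns this score.
    return -abs(sum(left) - sum(right))

def greedy_parse(vectors):
    # Repeatedly merge the adjacent pair with the highest score,
    # building a binary tree bottom-up until one root vector remains.
    nodes = list(vectors)
    while len(nodes) > 1:
        best = max(range(len(nodes) - 1),
                   key=lambda i: score(nodes[i], nodes[i + 1]))
        merged = combine(nodes[best], nodes[best + 1])
        nodes[best:best + 2] = [merged]
    return nodes[0]

root = greedy_parse([[0.1, 0.2], [0.3, -0.1], [0.0, 0.5], [-0.2, 0.4]])
```

Each merge replaces two adjacent nodes with their parent, so n words produce a binary tree after n-1 merges.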

Let us consider the representation of the phrase "a lot of fun" in the following sentence.

Programming is a lot of fun.

An RNN representation of this phrase would not be suitable because a recurrent network considers only sequential relations: each hidden state depends on the representations of all the preceding words. As a result, a subsequence that does not occur at the beginning of the sentence has no dedicated representation. With an RNN, by the time the word 'fun' is processed, the hidden state represents the whole sentence.

However, with a Recursive Neural Network (RvNN), the hierarchical architecture can store the representation of the exact phrase: it lies in the hidden state of the node R_{a\ lot\ of\ fun}. This is why Recursive Neural Networks are a natural fit for syntactic parsing.
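A minimal sketch of this idea, using toy word vectors and a hypothetical elementwise composition in place of trained embeddings and a learned weight matrix, shows how each phrase in the parse tree gets its own hidden state:

```python
import math

def combine(left, right):
    # Hypothetical composition: elementwise weights stand in for the
    # learned weight matrix of a real recursive network.
    return [math.tanh(0.5 * (l + r)) for l, r in zip(left, right)]

# Toy word vectors (assumed, not trained embeddings).
vec = {w: [0.1 * len(w), -0.05 * len(w)]
       for w in ["Programming", "is", "a", "lot", "of", "fun"]}

# Build the parse-tree nodes bottom-up; unlike a chain RNN, each
# intermediate phrase has its own hidden state.
nodes = {}
nodes["a lot"] = combine(vec["a"], vec["lot"])
nodes["of fun"] = combine(vec["of"], vec["fun"])
nodes["a lot of fun"] = combine(nodes["a lot"], nodes["of fun"])
nodes["is a lot of fun"] = combine(vec["is"], nodes["a lot of fun"])
root = combine(vec["Programming"], nodes["is a lot of fun"])

# The phrase "a lot of fun" has a dedicated representation:
print(nodes["a lot of fun"])
```

The hidden state of the node for "a lot of fun" can be read off directly, which is exactly what the chain-structured RNN cannot provide.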

Benefits of RvNNs for Natural Language Processing
The two significant advantages of Recursive Neural Networks for Natural Language Processing are their tree structure and their reduced network depth.

As already explained, the tree structure of Recursive Neural Networks lets them handle hierarchical data, as in parsing problems.
 
Another benefit of RvNNs is that the trees can have logarithmic height. Given O(n) input words, a Recursive Neural Network can represent them in a binary tree of height O(\log n). This shortens the distance between the first and last input elements, so long-term dependencies become shorter and easier to capture.
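The shortened dependency path can be made concrete with a small calculation. In a chain RNN the first and last words are n-1 steps apart, while in a balanced binary tree both are at most ceil(log2 n) compositions from the root:

```python
import math

def chain_distance(n):
    # In a chain-structured RNN, the first and last word are
    # connected through n - 1 recurrent steps.
    return n - 1

def balanced_tree_distance(n):
    # In a balanced binary RvNN, each word is at most ceil(log2 n)
    # compositions from the root, so the path between the first and
    # last word is at most twice that.
    return 2 * math.ceil(math.log2(n))

for n in (8, 64, 1024):
    print(n, chain_distance(n), balanced_tree_distance(n))
# 8 words: 7 vs 6; 64 words: 63 vs 12; 1024 words: 1023 vs 20
```

The gap widens quickly: for 1024 words, gradients travel 20 compositions in the tree instead of 1023 recurrent steps.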

Disadvantages of RvNNs for Natural Language Processing
The main disadvantage of recursive neural networks is the tree structure itself. Using a tree structure introduces a strong inductive bias into the model: the assumption that the data follow a tree hierarchy. When this assumption does not hold, the network may fail to learn the patterns that are actually present.

Another disadvantage of Recursive Neural Networks is that sentence parsing can be slow and ambiguous: a single sentence can have many parse trees.

Also, labeling the training data for recursive neural networks is more time-consuming and labor-intensive than for recurrent neural networks: manually parsing a sentence into short components is far more tedious than assigning a single label to a sentence.
