Improved semantic representations from tree-structured long short-term memory networks

KS Tai, R Socher, CD Manning - arXiv preprint arXiv:1503.00075, 2015 - arxiv.org
Because of their superior ability to preserve sequence information over time, Long Short-Term Memory (LSTM) networks, a type of recurrent neural network with a more complex computational unit, have obtained strong results on a variety of sequence modeling tasks. The only underlying LSTM structure that has been explored so far is a linear chain. However, natural language exhibits syntactic properties that would naturally combine words to phrases. We introduce the Tree-LSTM, a generalization of LSTMs to tree-structured network topologies. Tree-LSTMs outperform all existing systems and strong LSTM baselines on two tasks: predicting the semantic relatedness of two sentences (SemEval 2014, Task 1) and sentiment classification (Stanford Sentiment Treebank).
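The core of the approach is a node update that composes a parent's state from its children's states rather than from a single predecessor. The sketch below is a minimal NumPy illustration of a Child-Sum Tree-LSTM cell in the spirit of this generalization; the class name, parameter names (W, U, b), and toy dimensions are illustrative assumptions, not the authors' released implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ChildSumTreeLSTMCell:
    """Illustrative Child-Sum Tree-LSTM node update (assumed sketch)."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        init = lambda rows, cols: rng.normal(0.0, 0.1, size=(rows, cols))
        # Input-to-hidden (W) and hidden-to-hidden (U) weights plus biases
        # for the input (i), forget (f), output (o), and update (u) gates.
        self.W = {g: init(hidden_dim, input_dim) for g in "ifou"}
        self.U = {g: init(hidden_dim, hidden_dim) for g in "ifou"}
        self.b = {g: np.zeros(hidden_dim) for g in "ifou"}

    def forward(self, x, child_h, child_c):
        """Compose a node from its input x and its children's (h, c) states.

        child_h, child_c: lists of child hidden/cell state vectors
        (empty lists for leaf nodes).
        """
        # Sum of children's hidden states: the "child-sum" composition.
        h_tilde = np.sum(child_h, axis=0) if child_h else np.zeros_like(self.b["i"])
        i = sigmoid(self.W["i"] @ x + self.U["i"] @ h_tilde + self.b["i"])
        o = sigmoid(self.W["o"] @ x + self.U["o"] @ h_tilde + self.b["o"])
        u = np.tanh(self.W["u"] @ x + self.U["u"] @ h_tilde + self.b["u"])
        # One forget gate per child, conditioned on that child's hidden state,
        # so the cell can selectively keep or discard each subtree's memory.
        c = i * u
        for h_k, c_k in zip(child_h, child_c):
            f_k = sigmoid(self.W["f"] @ x + self.U["f"] @ h_k + self.b["f"])
            c = c + f_k * c_k
        h = o * np.tanh(c)
        return h, c

# Toy usage: combine two leaf nodes into a parent node.
cell = ChildSumTreeLSTMCell(input_dim=4, hidden_dim=3)
x1, x2, x_parent = np.ones(4), np.zeros(4), 0.5 * np.ones(4)
h1, c1 = cell.forward(x1, [], [])
h2, c2 = cell.forward(x2, [], [])
h_p, c_p = cell.forward(x_parent, [h1, h2], [c1, c2])
```

Applied bottom-up over a parse tree, the root's hidden state serves as the sentence representation used for the relatedness and sentiment tasks.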