The document provides an outline for a tutorial on neural network-based models for semantic composition. It will begin with an introduction to traditional approaches and then focus on distributed representation models that jointly learn word representations and composition operations from data. The tutorial will cover convolutional, recurrent and recursive composition architectures as well as unsupervised models and conclude with a discussion of limitations and future directions.
Learning representations to model the meaning of text has been a core problem in natural language processing (NLP). The last several years have seen extensive interest in distributional approaches, in which text spans of different granularities are encoded as continuous vectors. If properly learned, such representations have been shown to help achieve state-of-the-art performance on a variety of NLP problems.

In this tutorial, we will cover the fundamentals and selected research topics on neural network-based modeling for semantic composition, which aims to learn distributed representations for larger spans of text, e.g., phrases (Yin and Schütze, 2014) and sentences (Zhu et al., 2016; Chen et al., 2016; Zhu et al., 2015b,a; Tai et al., 2015; Kalchbrenner et al., 2014; Irsoy and Cardie, 2014; Socher et al., 2012), from the meaning representations of their parts, e.g., word embeddings.

We first introduce traditional approaches, including logic-based formal semantic approaches and simple arithmetic operations over vectors based on corpus word counts (Mitchell and Lapata, 2008; Landauer and Dumais, 1997).

Our main focus, however, will be on distributed representation-based modeling, whereby the representations of words and the operations composing them are jointly learned from a training objective. We cover the generic ideas behind neural network-based semantic composition and dive into the details of three typical composition architectures: the convolutional composition models (Kalchbrenner et al., 2014; Zhang et al., 2015), recurrent composition models (Zhu et al., 2016), and recursive composition models (Irsoy and Cardie, 2014; Socher et al., 2012; Zhu et al., 2015b; Tai et al., 2015). After that, we will discuss several unsupervised approaches (Le and Mikolov, 2014; Kiros et al., 2014; Bowman et al., 2016; Miao et al., 2016).

We then move on to selected topics. We first cover the models that consider compositional together with non-compositional (e.g., holistically learned) semantics (Zhu et al., 2016, 2015a). Next, we discuss composition models that integrate multiple architectures of neural networks. We also discuss semantic composition and decomposition (Turney, 2014). Finally, we briefly discuss sub-word neural-network-based composition models (Zhang et al., 2015; Sennrich et al., 2016).

We will then summarize the tutorial, flesh out limitations of current approaches, and discuss future directions that are interesting to us.

2 Tutorial Outline

• Introduction
  ◦ Definition of semantic composition
  ◦ Conventional and basic approaches
    - Formal semantics
• Parametrising Composition Functions
  ◦ Convolutional composition models
  ◦ Recurrent composition models
  ◦ Recursive composition models
    - TreeRNN/TreeLSTM
    - SPINN and RL-SPINN
  ◦ Unsupervised models
    - Skip-thought vectors and paragraph vectors
    - Variational auto-encoders for text
• Selected Topics
  ◦ Incorporating compositional and non-compositional (e.g., holistically learned) semantics
  ◦ Integrating multiple composition architectures
  ◦ Semantic composition and decomposition
  ◦ Sub-word composition models
• Summary

Xiaodan Zhu, Researcher, National Research Council Canada.

Xiaodan Zhu is a Research Officer at the National Research Council Canada. His research interests are in Natural Language Processing and Machine Learning. His recent work has focused on deep learning, semantic composition, sentiment analysis, and natural language inference. Xiaodan has taught a tutorial at EMNLP ’14.

Edward Grefenstette, Senior Research Scientist, DeepMind.

Edward Grefenstette is a Senior Research Scientist at DeepMind. His research covers the intersection of Machine Learning, Computer Reasoning, and Natural Language Understanding. Recent publications cover the topics of neural computation, representation learning at the sentence level, recognising textual entailment, and machine reading.

References

Samuel R. Bowman, Jon Gauthier, Abhinav Rastogi, Raghav Gupta, Christopher D. Manning, and Christopher Potts. 2016. A fast unified model for parsing and sentence understanding. In ACL.

Qian Chen, Xiaodan Zhu, Zhenhua Ling, Si Wei, and Hui Jiang. 2016. Enhancing and combining sequential and tree LSTM for natural language inference. In arXiv:1609.06038v1.

Ozan Irsoy and Claire Cardie. 2014. Deep recursive neural networks for compositionality in language. In NIPS.

Yishu Miao, Lei Yu, and Phil Blunsom. 2016. Neural variational inference for text processing. In ICML.

Jeff Mitchell and Mirella Lapata. 2008. Vector-based models of semantic composition. In ACL.

Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016. Neural machine translation of rare words with subword units. In ACL.

Richard Socher, Brody Huval, Christopher D. Manning, and Andrew Y. Ng. 2012. Semantic compositionality through recursive matrix-vector spaces. In EMNLP.

Kai Sheng Tai, Richard Socher, and Christopher D. Manning. 2015. Improved semantic representations from tree-structured long short-term memory networks. In ACL.

Peter Turney. 2014. Semantic composition and decomposition: From recognition to generation. In arXiv:1405.7908.

Wenpeng Yin and Hinrich Schütze. 2014. An exploration of embeddings for generalized phrases. In ACL 2014 Student Research Workshop.

Xiang Zhang, Junbo Zhao, and Yann LeCun. 2015. Character-level convolutional networks for text classification. In NIPS.

Xiaodan Zhu, Hongyu Guo, and Parinaz Sobhani. 2015a. Neural networks for integrating compositional and non-compositional sentiment in sentiment composition. In *SEM.

Xiaodan Zhu, Parinaz Sobhani, and Hongyu Guo. 2015b. Long short-term memory over recursive structures. In ICML.

Xiaodan Zhu, Parinaz Sobhani, and Hongyu Guo. 2016. DAG-structured long short-term memory for semantic compositionality. In NAACL.
