
Showing 1–27 of 27 results for author: Smolensky, P

Searching in archive cs.

  1. arXiv:2311.01460  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Implicit Chain of Thought Reasoning via Knowledge Distillation

    Authors: Yuntian Deng, Kiran Prasad, Roland Fernandez, Paul Smolensky, Vishrav Chaudhary, Stuart Shieber

    Abstract: To augment language models with the ability to reason, researchers usually prompt or finetune them to produce chain of thought reasoning steps before producing the final answer. However, although people use natural language to reason effectively, it may be that LMs could reason more effectively with some intermediate computation that is not in natural language. In this work, we explore an alternat…

    Submitted 2 November, 2023; originally announced November 2023.

  2. arXiv:2306.00751  [pdf, other]

    cs.CL cs.LG

    Differentiable Tree Operations Promote Compositional Generalization

    Authors: Paul Soulos, Edward Hu, Kate McCurdy, Yunmo Chen, Roland Fernandez, Paul Smolensky, Jianfeng Gao

    Abstract: In the context of structure-to-structure transformation tasks, learning sequences of discrete symbolic operations poses significant challenges due to their non-differentiability. To facilitate the learning of these symbolic sequences, we introduce a differentiable tree interpreter that compiles high-level symbolic tree operations into subsymbolic matrix operations on tensors. We present a novel Di…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: ICML 2023. Code available at https://github.com/psoulos/dtm

  3. arXiv:2212.10769  [pdf, other]

    cs.CL

    Uncontrolled Lexical Exposure Leads to Overestimation of Compositional Generalization in Pretrained Models

    Authors: Najoung Kim, Tal Linzen, Paul Smolensky

    Abstract: Human linguistic capacity is often characterized by compositionality and the generalization it enables -- human learners can produce and comprehend novel complex expressions by composing known parts. Several benchmarks exploit distributional control across training and test to gauge compositional generalization, where certain lexical items only occur in limited contexts during training. While rece…

    Submitted 21 December, 2022; originally announced December 2022.

    Comments: Preprint

  4. arXiv:2208.06061  [pdf, other]

    cs.CL

    Structural Biases for Improving Transformers on Translation into Morphologically Rich Languages

    Authors: Paul Soulos, Sudha Rao, Caitlin Smith, Eric Rosen, Asli Celikyilmaz, R. Thomas McCoy, Yichen Jiang, Coleman Haley, Roland Fernandez, Hamid Palangi, Jianfeng Gao, Paul Smolensky

    Abstract: Machine translation has seen rapid progress with the advent of Transformer-based models. These models have no explicit linguistic structure built into them, yet they may still implicitly learn structured relationships by attending to relevant tokens. We hypothesize that this structural learning could be made more robust by explicitly endowing Transformers with a structural bias, and we investigate…

    Submitted 11 August, 2022; originally announced August 2022.

    Comments: Revised edition to 4th Workshop on Technologies for MT of Low Resource Languages

    Journal ref: Proceedings of the 4th Workshop on Technologies for MT of Low Resource Languages (LoResMT2021)

  5. arXiv:2205.01128  [pdf, other]

    cs.AI cs.NE cs.SC

    Neurocompositional computing: From the Central Paradox of Cognition to a new generation of AI systems

    Authors: Paul Smolensky, R. Thomas McCoy, Roland Fernandez, Matthew Goldrick, Jianfeng Gao

    Abstract: What explains the dramatic progress from 20th-century to 21st-century AI, and how can the remaining limitations of current AI be overcome? The widely accepted narrative attributes this progress to massive increases in the quantity of computational and data resources available to support statistical learning in deep artificial neural networks. We show that an additional crucial factor is the develo…

    Submitted 2 May, 2022; originally announced May 2022.

    Comments: 21 pages, 6 figures. For a general AI audience: to appear in AI Magazine. A more extensive presentation of this work is "Neurocompositional computing in human and machine intelligence: A tutorial", Microsoft Technical Report MSR-TR-2022-5; see https://www.microsoft.com/en-us/research/publication/neurocompositional-computing-in-human-and-machine-intelligence-a-tutorial/

  6. arXiv:2111.09509  [pdf, other]

    cs.CL

    How much do language models copy from their training data? Evaluating linguistic novelty in text generation using RAVEN

    Authors: R. Thomas McCoy, Paul Smolensky, Tal Linzen, Jianfeng Gao, Asli Celikyilmaz

    Abstract: Current language models can generate high-quality text. Are they simply copying text they have seen before, or have they learned generalizable linguistic abstractions? To tease apart these possibilities, we introduce RAVEN, a suite of analyses for assessing the novelty of generated text, focusing on sequential structure (n-grams) and syntactic structure. We apply these analyses to four neural lang…

    Submitted 17 November, 2021; originally announced November 2021.

    Comments: 10 pages, plus 39 pages of appendices

  7. arXiv:2110.12342  [pdf, other]

    cs.CL

    Distributed neural encoding of binding to thematic roles

    Authors: Matthias Lalisse, Paul Smolensky

    Abstract: A framework and method are proposed for the study of constituent composition in fMRI. The method produces estimates of neural patterns encoding complex linguistic structures, under the assumption that the contributions of individual constituents are additive. Like usual techniques for modeling compositional structure in fMRI, the proposed method employs pattern superposition to synthesize complex…

    Submitted 23 October, 2021; originally announced October 2021.

    Comments: Originally presented as a poster at MACSIM 8 (2019)

  8. arXiv:2110.12341  [pdf, other]

    cs.CL

    Scalable knowledge base completion with superposition memories

    Authors: Matthias Lalisse, Eric Rosen, Paul Smolensky

    Abstract: We present Harmonic Memory Networks (HMem), a neural architecture for knowledge base completion that models entities as weighted sums of pairwise bindings between an entity's neighbors and corresponding relations. Since entities are modeled as aggregated neighborhoods, representations of unseen entities can be generated on the fly. We demonstrate this with two new datasets: WNGen and FBGen. Experi…

    Submitted 23 October, 2021; originally announced October 2021.

  9. arXiv:2106.01317  [pdf, other]

    cs.CL cs.AI cs.LG

    Enriching Transformers with Structured Tensor-Product Representations for Abstractive Summarization

    Authors: Yichen Jiang, Asli Celikyilmaz, Paul Smolensky, Paul Soulos, Sudha Rao, Hamid Palangi, Roland Fernandez, Caitlin Smith, Mohit Bansal, Jianfeng Gao

    Abstract: Abstractive summarization, the task of generating a concise summary of input documents, requires: (1) reasoning over the source document to determine the salient pieces of information scattered across the long document, and (2) composing a cohesive text by reconstructing these salient facts into a shorter summary that faithfully reflects the complex relations connecting these facts. In this paper,…

    Submitted 2 June, 2021; originally announced June 2021.

    Comments: NAACL 2021 (14 pages)

  10. arXiv:2105.08961  [pdf, other]

    cs.LG cs.AI cs.CL

    Compositional Processing Emerges in Neural Networks Solving Math Problems

    Authors: Jacob Russin, Roland Fernandez, Hamid Palangi, Eric Rosen, Nebojsa Jojic, Paul Smolensky, Jianfeng Gao

    Abstract: A longstanding question in cognitive science concerns the learning mechanisms underlying compositionality in human cognition. Humans can infer the structured relationships (e.g., grammatical rules) implicit in their sensory observations (e.g., auditory speech), and use this knowledge to guide the composition of simpler meanings into complex wholes. Recent progress in artificial neural networks has…

    Submitted 19 May, 2021; originally announced May 2021.

    Comments: 7 pages, 2 figures, Accepted to CogSci 2021 for poster presentation

  11. arXiv:2011.09530  [pdf, other]

    cs.CV cs.AI eess.IV

    Neuro-Symbolic Representations for Video Captioning: A Case for Leveraging Inductive Biases for Vision and Language

    Authors: Hassan Akbari, Hamid Palangi, Jianwei Yang, Sudha Rao, Asli Celikyilmaz, Roland Fernandez, Paul Smolensky, Jianfeng Gao, Shih-Fu Chang

    Abstract: Neuro-symbolic representations have proved effective in learning structure information in vision and language. In this paper, we propose a new model architecture for learning multi-modal neuro-symbolic representations for video captioning. Our approach uses a dictionary learning-based method of learning relations between videos and their paired text descriptions. We refer to these relations as rel…

    Submitted 18 November, 2020; originally announced November 2020.

  12. arXiv:2006.16324  [pdf, other]

    cs.CL cs.LG

    Universal linguistic inductive biases via meta-learning

    Authors: R. Thomas McCoy, Erin Grant, Paul Smolensky, Thomas L. Griffiths, Tal Linzen

    Abstract: How do learners acquire languages from the limited data available to them? This process must involve some inductive biases - factors that affect how a learner generalizes - but it is unclear which inductive biases can explain observed patterns in language acquisition. To facilitate computational modeling aimed at addressing this question, we introduce a framework for giving particular linguistic i…

    Submitted 29 June, 2020; originally announced June 2020.

    Comments: To appear in the Proceedings of the 42nd Annual Conference of the Cognitive Science Society

  13. arXiv:1910.12647  [pdf, other]

    cs.CL cs.LG stat.ML

    HUBERT Untangles BERT to Improve Transfer across NLP Tasks

    Authors: Mehrad Moradshahi, Hamid Palangi, Monica S. Lam, Paul Smolensky, Jianfeng Gao

    Abstract: We introduce HUBERT which combines the structured-representational power of Tensor-Product Representations (TPRs) and BERT, a pre-trained bidirectional Transformer language model. We show that there is shared structure between different NLP datasets that HUBERT, but not BERT, is able to learn and leverage. We validate the effectiveness of our model on the GLUE benchmark and HANS dataset. Our exper…

    Submitted 25 April, 2021; v1 submitted 25 October, 2019; originally announced October 2019.

  14. Discovering the Compositional Structure of Vector Representations with Role Learning Networks

    Authors: Paul Soulos, Tom McCoy, Tal Linzen, Paul Smolensky

    Abstract: How can neural networks perform so well on compositional tasks even though they lack explicit compositional representations? We use a novel analysis technique called ROLE to show that recurrent neural networks perform well on such tasks by converging to solutions which implicitly represent symbolic structure. This method uncovers a symbolic structure which, when properly embedded in vector space,…

    Submitted 16 November, 2020; v1 submitted 20 October, 2019; originally announced October 2019.

  15. arXiv:1910.06611  [pdf, other]

    cs.LG stat.ML

    Enhancing the Transformer with Explicit Relational Encoding for Math Problem Solving

    Authors: Imanol Schlag, Paul Smolensky, Roland Fernandez, Nebojsa Jojic, Jürgen Schmidhuber, Jianfeng Gao

    Abstract: We incorporate Tensor-Product Representations within the Transformer in order to better support the explicit representation of relation structure. Our Tensor-Product Transformer (TP-Transformer) sets a new state of the art on the recently-introduced Mathematics Dataset containing 56 categories of free-form math word-problems. The essential component of the model is a novel attention mechanism, cal…

    Submitted 4 November, 2020; v1 submitted 15 October, 2019; originally announced October 2019.

  16. arXiv:1910.02339  [pdf, other]

    cs.CL cs.LG

    Mapping Natural-language Problems to Formal-language Solutions Using Structured Neural Representations

    Authors: Kezhen Chen, Qiuyuan Huang, Hamid Palangi, Paul Smolensky, Kenneth D. Forbus, Jianfeng Gao

    Abstract: Generating formal-language programs represented by relational tuples, such as Lisp programs or mathematical operations, to solve problems stated in natural language is a challenging task because it requires explicitly capturing discrete symbolic structural information implicit in the input. However, most general neural sequence models do not explicitly capture such structural information, limiting…

    Submitted 1 August, 2020; v1 submitted 5 October, 2019; originally announced October 2019.

  17. arXiv:1812.08718  [pdf, other]

    cs.CL

    RNNs Implicitly Implement Tensor Product Representations

    Authors: R. Thomas McCoy, Tal Linzen, Ewan Dunbar, Paul Smolensky

    Abstract: Recurrent neural networks (RNNs) can learn continuous vector representations of symbolic structures such as sequences and sentences; these representations often exhibit linear regularities (analogies). Such regularities motivate our hypothesis that RNNs that show such regularities implicitly compile symbolic structures into tensor product representations (TPRs; Smolensky, 1990), which additively c…

    Submitted 5 March, 2019; v1 submitted 20 December, 2018; originally announced December 2018.

    Comments: Accepted to ICLR 2019

  18. arXiv:1811.01062  [pdf, other]

    cs.CL

    Augmenting Compositional Models for Knowledge Base Completion Using Gradient Representations

    Authors: Matthias Lalisse, Paul Smolensky

    Abstract: Neural models of Knowledge Base data have typically employed compositional representations of graph objects: entity and relation embeddings are systematically combined to evaluate the truth of a candidate Knowledge Base entry. Using a model inspired by Harmonic Grammar, we propose to tokenize triplet embeddings by subjecting them to a process of optimization with respect to learned well-formedness…

    Submitted 12 August, 2019; v1 submitted 2 November, 2018; originally announced November 2018.

    Comments: 10 pages, 2 figures, To appear in proceedings of the Society for Computation in Linguistics (SCIL 2019)

  19. arXiv:1810.12456  [pdf, other]

    cs.NE cs.LG cs.SC

    A Simple Recurrent Unit with Reduced Tensor Product Representations

    Authors: Shuai Tang, Paul Smolensky, Virginia R. de Sa

    Abstract: Widely used recurrent units, including Long-short Term Memory (LSTM) and the Gated Recurrent Unit (GRU), perform well on natural language tasks, but their ability to learn structured representations is still questionable. Exploiting reduced Tensor Product Representations (TPRs) --- distributed representations of symbolic structure in which vector-embedded symbols are bound to vector-embedded struct…

    Submitted 5 November, 2019; v1 submitted 29 October, 2018; originally announced October 2018.

  20. arXiv:1809.07889  [pdf, other]

    cs.CL

    Predicting the Argumenthood of English Prepositional Phrases

    Authors: Najoung Kim, Kyle Rawlins, Benjamin Van Durme, Paul Smolensky

    Abstract: Distinguishing between arguments and adjuncts of a verb is a longstanding, nontrivial problem. In natural language processing, argumenthood information is important in tasks such as semantic role labeling (SRL) and prepositional phrase (PP) attachment disambiguation. In theoretical linguistics, many diagnostic tests for argumenthood exist but they often yield conflicting and potentially gradient r…

    Submitted 14 April, 2019; v1 submitted 20 September, 2018; originally announced September 2018.

    Comments: AAAI-19

  21. arXiv:1803.03834  [pdf, other]

    cs.AI

    Learning and analyzing vector encoding of symbolic representations

    Authors: Roland Fernandez, Asli Celikyilmaz, Rishabh Singh, Paul Smolensky

    Abstract: We present a formal language with expressions denoting general symbol structures and queries which access information in those structures. A sequence-to-sequence network processing this language learns to encode symbol structures and query them. The learned representation (approximately) shares a simple linearity property with theoretical techniques for performing this task.

    Submitted 10 March, 2018; originally announced March 2018.

  22. arXiv:1801.03562  [pdf, ps, other]

    cs.CL

    Discrete symbolic optimization and Boltzmann sampling by continuous neural dynamics: Gradient Symbolic Computation

    Authors: Paul Tupper, Paul Smolensky, Pyeong Whan Cho

    Abstract: Gradient Symbolic Computation is proposed as a means of solving discrete global optimization problems using a neurally plausible continuous stochastic dynamical system. Gradient symbolic dynamics involves two free parameters that must be adjusted as a function of time to obtain the global maximizer at the end of the computation. We provide a summary of what is known about the GSC dynamics for spec…

    Submitted 4 January, 2018; originally announced January 2018.

    MSC Class: 49D10; 60J70; 91F20

  23. arXiv:1710.11475  [pdf, other]

    cs.CL

    A Neural-Symbolic Approach to Design of CAPTCHA

    Authors: Qiuyuan Huang, Paul Smolensky, Xiaodong He, Li Deng, Dapeng Wu

    Abstract: CAPTCHAs based on reading text are susceptible to machine-learning-based attacks due to recent significant advances in deep learning (DL). To address this, this paper promotes image/visual captioning based CAPTCHAs, which are robust against machine-learning-based attacks. To develop image/visual-captioning-based CAPTCHAs, this paper proposes a new image captioning architecture by exploiting tensor…

    Submitted 25 September, 2018; v1 submitted 29 October, 2017; originally announced October 2017.

    Comments: arXiv admin note: substantial text overlap with arXiv:1709.09118

  24. arXiv:1709.09118  [pdf, other]

    cs.CV cs.CL

    Tensor Product Generation Networks for Deep NLP Modeling

    Authors: Qiuyuan Huang, Paul Smolensky, Xiaodong He, Li Deng, Dapeng Wu

    Abstract: We present a new approach to the design of deep networks for natural language processing (NLP), based on the general technique of Tensor Product Representations (TPRs) for encoding and processing symbol structures in distributed neural networks. A network architecture --- the Tensor Product Generation Network (TPGN) --- is proposed which is capable in principle of carrying out TPR computation, but…

    Submitted 16 December, 2017; v1 submitted 26 September, 2017; originally announced September 2017.

  25. arXiv:1705.08432  [pdf, other]

    cs.CL

    Question-Answering with Grammatically-Interpretable Representations

    Authors: Hamid Palangi, Paul Smolensky, Xiaodong He, Li Deng

    Abstract: We introduce an architecture, the Tensor Product Recurrent Network (TPRN). In our application of TPRN, internal representations learned by end-to-end optimization in a deep neural network performing a textual question-answering (QA) task can be interpreted using basic concepts from linguistic theory. No performance penalty need be paid for this increased interpretability: the proposed model perfor…

    Submitted 25 September, 2017; v1 submitted 23 May, 2017; originally announced May 2017.

  26. arXiv:1601.02745  [pdf]

    cs.AI

    Basic Reasoning with Tensor Product Representations

    Authors: Paul Smolensky, Moontae Lee, Xiaodong He, Wen-tau Yih, Jianfeng Gao, Li Deng

    Abstract: In this paper we present the initial development of a general theory for mapping inference in predicate logic to computation over Tensor Product Representations (TPRs; Smolensky (1990), Smolensky & Legendre (2006)). After an initial brief synopsis of TPRs (Section 0), we begin with particular examples of inference with TPRs in the 'bAbI' question-answering task of Weston et al. (2015) (Section 1).…

    Submitted 12 January, 2016; originally announced January 2016.

  27. arXiv:1511.06426  [pdf, ps, other]

    cs.CL

    Reasoning in Vector Space: An Exploratory Study of Question Answering

    Authors: Moontae Lee, Xiaodong He, Wen-tau Yih, Jianfeng Gao, Li Deng, Paul Smolensky

    Abstract: Question answering tasks have shown remarkable progress with distributed vector representation. In this paper, we investigate the recently proposed Facebook bAbI tasks which consist of twenty different categories of questions that require complex reasoning. Because the previous work on bAbI are all end-to-end models, errors could come from either an imperfect understanding of semantics or in certa…

    Submitted 26 February, 2016; v1 submitted 19 November, 2015; originally announced November 2015.