
Showing 1–14 of 14 results for author: Hennigen, L T

  1. arXiv:2312.17710

    cs.CL cs.LG

    Principled Gradient-based Markov Chain Monte Carlo for Text Generation

    Authors: Li Du, Afra Amini, Lucas Torroba Hennigen, Xinyan Velocity Yu, Jason Eisner, Holden Lee, Ryan Cotterell

    Abstract: Recent papers have demonstrated the possibility of energy-based text generation by adapting gradient-based sampling algorithms, a paradigm of MCMC algorithms that promises fast convergence. However, as we show in this paper, previous attempts on this approach to text generation all fail to sample correctly from the target language model distributions. To address this limitation, we consider the pr…

    Submitted 29 December, 2023; originally announced December 2023.

    Comments: Preprint

  2. arXiv:2311.09188

    cs.CL cs.AI cs.LG

    Towards Verifiable Text Generation with Symbolic References

    Authors: Lucas Torroba Hennigen, Shannon Shen, Aniruddha Nrusimha, Bernhard Gapp, David Sontag, Yoon Kim

    Abstract: LLMs are vulnerable to hallucinations, and thus their outputs generally require laborious human verification for high-stakes applications. To this end, we propose symbolically grounded generation (SymGen) as a simple approach for enabling easier manual validation of an LLM's output. SymGen prompts an LLM to interleave its regular output text with explicit symbolic references to fields present in s…

    Submitted 15 April, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: 57 pages, 8 figures, 8 tables

  3. arXiv:2307.03056

    cs.LG cs.AI cs.CL

    Generalizing Backpropagation for Gradient-Based Interpretability

    Authors: Kevin Du, Lucas Torroba Hennigen, Niklas Stoehr, Alexander Warstadt, Ryan Cotterell

    Abstract: Many popular feature-attribution methods for interpreting deep neural networks rely on computing the gradients of a model's output with respect to its inputs. While these methods can indicate which input features may be important for the model's prediction, they reveal little about the inner workings of the model itself. In this paper, we observe that the gradient computation of a model is a speci…

    Submitted 6 July, 2023; originally announced July 2023.

    Comments: Long paper accepted at ACL 2023

  4. arXiv:2305.15501

    cs.CL

    Deriving Language Models from Masked Language Models

    Authors: Lucas Torroba Hennigen, Yoon Kim

    Abstract: Masked language models (MLM) do not explicitly define a distribution over language, i.e., they are not language models per se. However, recent work has implicitly treated them as such for the purposes of generation and scoring. This paper studies methods for deriving explicit joint distributions from MLMs, focusing on distributions over two tokens, which makes it possible to calculate exact distri…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023

  5. arXiv:2303.00980

    cs.LG

    Learning to Grow Pretrained Models for Efficient Transformer Training

    Authors: Peihao Wang, Rameswar Panda, Lucas Torroba Hennigen, Philip Greengard, Leonid Karlinsky, Rogerio Feris, David Daniel Cox, Zhangyang Wang, Yoon Kim

    Abstract: Scaling transformers has led to significant breakthroughs in many domains, leading to a paradigm in which larger versions of existing models are trained and released on a periodic basis. New instances of such models are typically trained completely from scratch, despite the fact that they are often just scaled-up versions of their smaller counterparts. How can we use the implicit knowledge in the…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: International Conference on Learning Representations (ICLR), 2023

  6. A Measure-Theoretic Characterization of Tight Language Models

    Authors: Li Du, Lucas Torroba Hennigen, Tiago Pimentel, Clara Meister, Jason Eisner, Ryan Cotterell

    Abstract: Language modeling, a central task in natural language processing, involves estimating a probability distribution over strings. In most cases, the estimated distribution sums to 1 over all finite strings. However, in some pathological cases, probability mass can ``leak'' onto the set of infinite sequences. In order to characterize the notion of leakage more precisely, this paper offers a measure-th…

    Submitted 21 August, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: 25 pages; ACL 2023 camera ready

  7. arXiv:2210.03971

    cs.LG stat.AP

    An Ordinal Latent Variable Model of Conflict Intensity

    Authors: Niklas Stoehr, Lucas Torroba Hennigen, Josef Valvoda, Robert West, Ryan Cotterell, Aaron Schein

    Abstract: Measuring the intensity of events is crucial for monitoring and tracking armed conflict. Advances in automated event extraction have yielded massive data sets of "who did what to whom" micro-records that enable data-driven approaches to monitoring conflict. The Goldstein scale is a widely-used expert-based measure that scores events on a conflictual-cooperative scale. It is based only on the actio…

    Submitted 4 June, 2023; v1 submitted 8 October, 2022; originally announced October 2022.

    Comments: Long Paper at ACL 2023

  8. arXiv:2205.03608

    cs.CL

    UniMorph 4.0: Universal Morphology

    Authors: Khuyagbaatar Batsuren, Omer Goldman, Salam Khalifa, Nizar Habash, Witold Kieraś, Gábor Bella, Brian Leonard, Garrett Nicolai, Kyle Gorman, Yustinus Ghanggo Ate, Maria Ryskina, Sabrina J. Mielke, Elena Budianskaya, Charbel El-Khaissi, Tiago Pimentel, Michael Gasser, William Lane, Mohit Raj, Matt Coler, Jaime Rafael Montoya Samame, Delio Siticonatzi Camaiteri, Benoît Sagot, Esaú Zumaeta Rojas, Didier López Francis, Arturo Oncevay, et al. (71 additional authors not shown)

    Abstract: The Universal Morphology (UniMorph) project is a collaborative effort providing broad-coverage instantiated normalized morphological inflection tables for hundreds of diverse world languages. The project comprises two major thrusts: a language-independent feature schema for rich morphological annotation and a type-level resource of annotated data in diverse languages realizing that schema. This pa…

    Submitted 19 June, 2022; v1 submitted 7 May, 2022; originally announced May 2022.

    Comments: LREC 2022; The first two authors made equal contributions

  9. arXiv:2205.02023

    cs.CL

    Same Neurons, Different Languages: Probing Morphosyntax in Multilingual Pre-trained Models

    Authors: Karolina Stańczak, Edoardo Ponti, Lucas Torroba Hennigen, Ryan Cotterell, Isabelle Augenstein

    Abstract: The success of multilingual pre-trained models is underpinned by their ability to learn representations shared by multiple languages even in absence of any explicit supervision. However, it remains unclear how these models learn to generalise across languages. In this work, we conjecture that multilingual pre-trained models can derive language-universal abstractions about grammar. In particular, w…

    Submitted 8 May, 2022; v1 submitted 4 May, 2022; originally announced May 2022.

    Comments: Accepted at NAACL 2022 (Main Conference)

  10. arXiv:2201.08214

    cs.CL

    A Latent-Variable Model for Intrinsic Probing

    Authors: Karolina Stańczak, Lucas Torroba Hennigen, Adina Williams, Ryan Cotterell, Isabelle Augenstein

    Abstract: The success of pre-trained contextualized representations has prompted researchers to analyze them for the presence of linguistic information. Indeed, it is natural to assume that these pre-trained representations do encode some level of linguistic knowledge as they have brought about large empirical improvements on a wide variety of NLP tasks, which suggests they are learning true linguistic gene…

    Submitted 11 July, 2024; v1 submitted 20 January, 2022; originally announced January 2022.

  11. arXiv:2110.08388

    cs.CL

    Probing as Quantifying Inductive Bias

    Authors: Alexander Immer, Lucas Torroba Hennigen, Vincent Fortuin, Ryan Cotterell

    Abstract: Pre-trained contextual representations have led to dramatic performance improvements on a range of downstream tasks. Such performance improvements have motivated researchers to quantify and understand the linguistic information encoded in these representations. In general, researchers quantify the amount of linguistic information through probing, an endeavor which consists of training a supervised…

    Submitted 24 March, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: ACL 2022

  12. arXiv:2109.12860

    cs.CL cs.LG cs.SI

    Classifying Dyads for Militarized Conflict Analysis

    Authors: Niklas Stoehr, Lucas Torroba Hennigen, Samin Ahbab, Robert West, Ryan Cotterell

    Abstract: Understanding the origins of militarized conflict is a complex, yet important undertaking. Existing research seeks to build this understanding by considering bi-lateral relationships between entity pairs (dyadic causes) and multi-lateral relationships among multiple entities (systemic causes). The aim of this work is to compare these two causes in terms of how they correlate with conflict between…

    Submitted 27 September, 2021; originally announced September 2021.

    Comments: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)

  13. arXiv:2010.02812

    cs.CL

    Intrinsic Probing through Dimension Selection

    Authors: Lucas Torroba Hennigen, Adina Williams, Ryan Cotterell

    Abstract: Most modern NLP systems make use of pre-trained contextual representations that attain astonishingly high performance on a variety of tasks. Such high performance should not be possible unless some form of linguistic structure inheres in these representations, and a wealth of research has sprung up on probing for it. In this paper, we draw a distinction between intrinsic probing, which examines ho…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: To appear at EMNLP 2020

  14. arXiv:2006.11572

    cs.CL

    SIGMORPHON 2020 Shared Task 0: Typologically Diverse Morphological Inflection

    Authors: Ekaterina Vylomova, Jennifer White, Elizabeth Salesky, Sabrina J. Mielke, Shijie Wu, Edoardo Ponti, Rowan Hall Maudslay, Ran Zmigrod, Josef Valvoda, Svetlana Toldova, Francis Tyers, Elena Klyachko, Ilya Yegorov, Natalia Krizhanovsky, Paula Czarnowska, Irene Nikkarinen, Andrew Krizhanovsky, Tiago Pimentel, Lucas Torroba Hennigen, Christo Kirov, Garrett Nicolai, Adina Williams, Antonios Anastasopoulos, Hilaria Cruz, Eleanor Chodroff, et al. (3 additional authors not shown)

    Abstract: A broad goal in natural language processing (NLP) is to develop a system that has the capacity to process any natural language. Most systems, however, are developed using data from just one language such as English. The SIGMORPHON 2020 shared task on morphological reinflection aims to investigate systems' ability to generalize across typologically distinct languages, many of which are low resource…

    Submitted 14 July, 2020; v1 submitted 20 June, 2020; originally announced June 2020.

    Comments: 39 pages, SIGMORPHON