Showing 1–2 of 2 results for author: Suresh, N

Searching in archive cs.
  1. arXiv:2205.05124

    cs.CL cs.AI cs.LG

    Extracting Latent Steering Vectors from Pretrained Language Models

    Authors: Nishant Subramani, Nivedita Suresh, Matthew E. Peters

    Abstract: Prior work on controllable text generation has focused on learning how to control language models through trainable decoding, smart-prompt design, or fine-tuning based on a desired objective. We hypothesize that the information needed to steer the model to generate a target sentence is already encoded within the model. Accordingly, we explore a different approach altogether: extracting latent vect…

    Submitted 10 May, 2022; originally announced May 2022.

    Comments: Accepted to ACL2022 Findings; 16 pages (9 pages plus references and appendices); Code: https://github.com/nishantsubramani/steering_vectors; Some text overlap with arXiv:2008.09049

  2. arXiv:2008.09049

    cs.CL cs.LG

    Discovering Useful Sentence Representations from Large Pretrained Language Models

    Authors: Nishant Subramani, Nivedita Suresh

    Abstract: Despite the extensive success of pretrained language models as encoders for building NLP systems, they haven't seen prominence as decoders for sequence generation tasks. We explore the question of whether these models can be adapted to be used as universal decoders. To be considered "universal," a decoder must have an implicit representation for any target sentence $s$, such that it can recover th…

    Submitted 20 August, 2020; originally announced August 2020.

    Comments: 13 pages, 11 figures, 2 tables