Showing 1–23 of 23 results for author: Hewitt, J

Searching in archive cs.
  1. Learning Translations via Matrix Completion

    Authors: Derry Wijaya, Brendan Callahan, John Hewitt, Jie Gao, Xiao Ling, Marianna Apidianaki, Chris Callison-Burch

    Abstract: Bilingual Lexicon Induction is the task of learning word translations without bilingual parallel corpora. We model this task as a matrix completion problem, and present an effective and extendable framework for completing the matrix. This method harnesses diverse bilingual and monolingual signals, each of which may be incomplete or noisy. Our model achieves state-of-the-art performance for both hi…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: This is a late posting of an old paper as Google Scholar somehow misses indexing the ACL anthology version of the paper

    ACM Class: I.2.7

    Journal ref: Volume: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Year: 2017, Pages: 1452-1463

  2. arXiv:2402.06155  [pdf, other]

    cs.CL

    Model Editing with Canonical Examples

    Authors: John Hewitt, Sarah Chen, Lanruo Lora Xie, Edward Adams, Percy Liang, Christopher D. Manning

    Abstract: We introduce model editing with canonical examples, a setting in which (1) a single learning example is provided per desired behavior, (2) evaluation is performed exclusively out-of-distribution, and (3) deviation from an initial model is strictly limited. A canonical example is a simple instance of good behavior (e.g., The capital of Mauritius is Port Louis) or bad behavior (e.g., An aspect of re…

    Submitted 8 February, 2024; originally announced February 2024.

  3. arXiv:2312.10944  [pdf]

    cs.CV

    From Whole-slide Image to Biomarker Prediction: A Protocol for End-to-End Deep Learning in Computational Pathology

    Authors: Omar S. M. El Nahhas, Marko van Treeck, Georg Wölflein, Michaela Unger, Marta Ligero, Tim Lenz, Sophia J. Wagner, Katherine J. Hewitt, Firas Khader, Sebastian Foersch, Daniel Truhn, Jakob Nikolas Kather

    Abstract: Hematoxylin- and eosin (H&E) stained whole-slide images (WSIs) are the foundation of diagnosis of cancer. In recent years, development of deep learning-based methods in computational pathology enabled the prediction of biomarkers directly from WSIs. However, accurately linking tissue phenotype to biomarkers at scale remains a crucial challenge for democratizing complex biomarkers in precision onco…

    Submitted 18 December, 2023; originally announced December 2023.

  4. arXiv:2310.12751  [pdf, other]

    cs.CL

    Character-level Chinese Backpack Language Models

    Authors: Hao Sun, John Hewitt

    Abstract: The Backpack is a Transformer alternative shown to improve interpretability in English language modeling by decomposing predictions into a weighted sum of token sense components. However, Backpacks' reliance on token-defined meaning raises questions as to their potential for languages other than English, a language for which subword tokenization provides a reasonable approximation for lexical item…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: BlackboxNLP 2023 Camera-Ready

  5. arXiv:2310.01693  [pdf, other]

    cs.CL

    Closing the Curious Case of Neural Text Degeneration

    Authors: Matthew Finlayson, John Hewitt, Alexander Koller, Swabha Swayamdipta, Ashish Sabharwal

    Abstract: Despite their ubiquity in language generation, it remains unknown why truncation sampling heuristics like nucleus sampling are so effective. We provide a theoretical explanation for the effectiveness of the truncation sampling by proving that truncation methods that discard tokens below some probability threshold (the most common type of truncation) can guarantee that all sampled tokens have nonze…

    Submitted 2 October, 2023; originally announced October 2023.

    MSC Class: 68T50 ACM Class: I.2.7

  6. arXiv:2307.03172  [pdf, other]

    cs.CL

    Lost in the Middle: How Language Models Use Long Contexts

    Authors: Nelson F. Liu, Kevin Lin, John Hewitt, Ashwin Paranjape, Michele Bevilacqua, Fabio Petroni, Percy Liang

    Abstract: While recent language models have the ability to take long contexts as input, relatively little is known about how well they use longer context. We analyze the performance of language models on two tasks that require identifying relevant information in their input contexts: multi-document question answering and key-value retrieval. We find that performance can degrade significantly when changing t…

    Submitted 20 November, 2023; v1 submitted 6 July, 2023; originally announced July 2023.

    Comments: 18 pages, 16 figures. Accepted for publication in Transactions of the Association for Computational Linguistics (TACL), 2023

  7. arXiv:2305.16765  [pdf, other]

    cs.CL

    Backpack Language Models

    Authors: John Hewitt, John Thickstun, Christopher D. Manning, Percy Liang

    Abstract: We present Backpacks: a new neural architecture that marries strong modeling performance with an interface for interpretability and control. Backpacks learn multiple non-contextual sense vectors for each word in a vocabulary, and represent a word in a sequence as a context-dependent, non-negative linear combination of sense vectors in this sequence. We find that, after training, sense vectors spec…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: ACL 2023 Camera-Ready

  8. arXiv:2304.05153  [pdf]

    cs.CV cs.AI

    Regression-based Deep-Learning predicts molecular biomarkers from pathology slides

    Authors: Omar S. M. El Nahhas, Chiara M. L. Loeffler, Zunamys I. Carrero, Marko van Treeck, Fiona R. Kolbinger, Katherine J. Hewitt, Hannah S. Muti, Mara Graziani, Qinghe Zeng, Julien Calderaro, Nadina Ortiz-Brüchle, Tanwei Yuan, Michael Hoffmeister, Hermann Brenner, Alexander Brobeil, Jorge S. Reis-Filho, Jakob Nikolas Kather

    Abstract: Deep Learning (DL) can predict biomarkers from cancer histopathology. Several clinically approved applications use this technology. Most approaches, however, predict categorical labels, whereas biomarkers are often continuous measurements. We hypothesized that regression-based DL outperforms classification-based DL. Therefore, we developed and evaluated a new self-supervised attention-based weakly…

    Submitted 11 April, 2023; originally announced April 2023.

  9. A Machine Learning Approach for Player and Position Adjusted Expected Goals in Football (Soccer)

    Authors: James H. Hewitt, Oktay Karakuş

    Abstract: Football is a very result-driven industry, with goals being rarer than in most sports, so having further parameters to judge the performance of teams and individuals is key. Expected Goals (xG) allow further insight than just a scoreline. To tackle the need for further analysis in football, this paper uses machine learning applications that are developed and applied to Football Event data. From th…

    Submitted 2 May, 2023; v1 submitted 19 January, 2023; originally announced January 2023.

    Comments: 16 pages, 8 tables, 6 figures

  10. arXiv:2212.03419  [pdf, other]

    cs.CL cs.LG

    JamPatoisNLI: A Jamaican Patois Natural Language Inference Dataset

    Authors: Ruth-Ann Armstrong, John Hewitt, Christopher Manning

    Abstract: JamPatoisNLI provides the first dataset for natural language inference in a creole language, Jamaican Patois. Many of the most-spoken low-resource languages are creoles. These languages commonly have a lexicon derived from a major world language and a distinctive grammar reflecting the languages of the original speakers and the process of language birth by creolization. This gives them a distincti…

    Submitted 6 December, 2022; originally announced December 2022.

    Comments: 14 pages, 3 figures, Findings of EMNLP 2022

    ACM Class: I.2.7

  11. arXiv:2210.15191  [pdf, other]

    cs.CL

    Truncation Sampling as Language Model Desmoothing

    Authors: John Hewitt, Christopher D. Manning, Percy Liang

    Abstract: Long samples of text from neural language models can be of poor quality. Truncation sampling algorithms -- like top-$p$ or top-$k$ -- address this by setting some words' probabilities to zero at each step. This work provides framing for the aim of truncation, and an improved algorithm for that aim. We propose thinking of a neural language model as a mixture of a true distribution and a smoothing dis…

    Submitted 27 October, 2022; originally announced October 2022.

    Comments: Findings of EMNLP, + small fixes

  12. arXiv:2206.10033  [pdf, other]

    cs.CV

    Test Time Transform Prediction for Open Set Histopathological Image Recognition

    Authors: Adrian Galdran, Katherine J. Hewitt, Narmin L. Ghaffari, Jakob N. Kather, Gustavo Carneiro, Miguel A. González Ballester

    Abstract: Tissue typology annotation in Whole Slide histological images is a complex and tedious, yet necessary task for the development of computational pathology models. We propose to address this problem by applying Open Set Recognition techniques to the task of jointly classifying tissue that belongs to a set of annotated classes, e.g. clinically relevant tissue categories, while rejecting in test time…

    Submitted 27 June, 2022; v1 submitted 20 June, 2022; originally announced June 2022.

    Comments: Accepted to MICCAI 2022

  13. arXiv:2109.09234  [pdf, other]

    cs.CL

    Conditional probing: measuring usable information beyond a baseline

    Authors: John Hewitt, Kawin Ethayarajh, Percy Liang, Christopher D. Manning

    Abstract: Probing experiments investigate the extent to which neural representations make properties -- like part-of-speech -- predictable. One suggests that a representation encodes a property if probing that representation produces higher accuracy than probing a baseline representation like non-contextual word embeddings. Instead of using baselines as a point of comparison, we're interested in measuring i…

    Submitted 19 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021 + typo fixes

  14. arXiv:2108.07258  [pdf, other]

    cs.LG cs.AI cs.CY

    On the Opportunities and Risks of Foundation Models

    Authors: Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, et al. (89 additional authors not shown)

    Abstract: AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their cap…

    Submitted 12 July, 2022; v1 submitted 16 August, 2021; originally announced August 2021.

    Comments: Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI). Report page with citation guidelines: https://crfm.stanford.edu/report.html

  15. arXiv:2104.09635  [pdf, other]

    cs.CL

    Refining Targeted Syntactic Evaluation of Language Models

    Authors: Benjamin Newman, Kai-Siang Ang, Julia Gong, John Hewitt

    Abstract: Targeted syntactic evaluation of subject-verb number agreement in English (TSE) evaluates language models' syntactic knowledge using hand-crafted minimal pairs of sentences that differ only in the main verb's conjugation. The method evaluates whether language models rate each grammatical sentence as more likely than its ungrammatical counterpart. We identify two distinct goals for TSE. First, eval…

    Submitted 19 April, 2021; originally announced April 2021.

    Comments: 14 pages, 5 figures, 3 tables. To appear at NAACL 2021

    ACM Class: I.2.7

  16. arXiv:2104.08197  [pdf, other]

    cs.LG cs.CL

    Probing artificial neural networks: insights from neuroscience

    Authors: Anna A. Ivanova, John Hewitt, Noga Zaslavsky

    Abstract: A major challenge in both neuroscience and machine learning is the development of useful tools for understanding complex information processing systems. One such tool is probes, i.e., supervised models that relate features of interest to activation patterns arising in biological or artificial neural networks. Neuroscience has paved the way in using such models through numerous studies conducted in…

    Submitted 16 April, 2021; originally announced April 2021.

    Comments: ICLR 2021 Workshop: How Can Findings About The Brain Improve AI Systems?

  17. arXiv:2010.07515  [pdf, other]

    cs.CL

    RNNs can generate bounded hierarchical languages with optimal memory

    Authors: John Hewitt, Michael Hahn, Surya Ganguli, Percy Liang, Christopher D. Manning

    Abstract: Recurrent neural networks empirically generate natural language with high syntactic fidelity. However, their success is not well-understood theoretically. We provide theoretical insight into this success, proving in a finite-precision setting that RNNs can efficiently generate bounded hierarchical languages that reflect the scaffolding of natural language syntax. We introduce Dyck-($k$,$m$), the l…

    Submitted 15 October, 2020; originally announced October 2020.

    Comments: EMNLP2020 + appendix typo fixes

  18. arXiv:2010.07174  [pdf, other]

    cs.CL

    The EOS Decision and Length Extrapolation

    Authors: Benjamin Newman, John Hewitt, Percy Liang, Christopher D. Manning

    Abstract: Extrapolation to unseen sequence lengths is a challenge for neural generative models of language. In this work, we characterize the effect on length extrapolation of a modeling decision often overlooked: predicting the end of the generative process through the use of a special end-of-sequence (EOS) vocabulary item. We study an oracle setting - forcing models to generate to the correct sequence len…

    Submitted 14 October, 2020; originally announced October 2020.

    Comments: 16 pages, 7 Figures, 9 Tables, Blackbox NLP Workshop at EMNLP 2020

  19. arXiv:2005.04511  [pdf, other]

    cs.CL cs.LG

    Finding Universal Grammatical Relations in Multilingual BERT

    Authors: Ethan A. Chi, John Hewitt, Christopher D. Manning

    Abstract: Recent work has found evidence that Multilingual BERT (mBERT), a transformer-based multilingual masked language model, is capable of zero-shot cross-lingual transfer, suggesting that some aspects of its representations are shared cross-lingually. To better understand this overlap, we extend recent work on finding syntactic trees in neural networks' internal representations to the multilingual sett…

    Submitted 20 May, 2020; v1 submitted 9 May, 2020; originally announced May 2020.

    Comments: To appear in ACL 2020; Farsi typo corrected

    ACM Class: I.2.7

  20. arXiv:1909.03368  [pdf, other]

    cs.CL

    Designing and Interpreting Probes with Control Tasks

    Authors: John Hewitt, Percy Liang

    Abstract: Probes, supervised models trained to predict properties (like parts-of-speech) from representations (like ELMo), have achieved high accuracy on a range of linguistic tasks. But does this mean that the representations encode linguistic structure or just that the probe has learned the linguistic task? In this paper, we propose control tasks, which associate word types with random outputs, to complem…

    Submitted 7 September, 2019; originally announced September 2019.

    Comments: EMNLP 2019

  21. arXiv:1903.08268  [pdf, other]

    cs.CL cs.LG

    Simple, Fast, Accurate Intent Classification and Slot Labeling for Goal-Oriented Dialogue Systems

    Authors: Arshit Gupta, John Hewitt, Katrin Kirchhoff

    Abstract: With the advent of conversational assistants, like Amazon Alexa, Google Now, etc., dialogue systems are gaining a lot of traction, especially in industrial setting. These systems typically consist of Spoken Language understanding component which, in turn, consists of two tasks - Intent Classification (IC) and Slot Labeling (SL). Generally, these two tasks are modeled together jointly to achieve be…

    Submitted 17 July, 2019; v1 submitted 19 March, 2019; originally announced March 2019.

    Comments: SIGDIAL 2019

  22. arXiv:1803.00188  [pdf, ps, other]

    cs.CL

    XNMT: The eXtensible Neural Machine Translation Toolkit

    Authors: Graham Neubig, Matthias Sperber, Xinyi Wang, Matthieu Felix, Austin Matthews, Sarguna Padmanabhan, Ye Qi, Devendra Singh Sachan, Philip Arthur, Pierre Godard, John Hewitt, Rachid Riad, Liming Wang

    Abstract: This paper describes XNMT, the eXtensible Neural Machine Translation toolkit. XNMT distinguishes itself from other open-source NMT toolkits by its focus on modular code design, with the purpose of enabling fast iteration in research and replicable, reliable results. In this paper we describe the design of XNMT and its experiment configuration system, and demonstrate its utility on the tasks of m…

    Submitted 28 February, 2018; originally announced March 2018.

    Comments: To be presented at AMTA 2018 Open Source Software Showcase

  23. arXiv:1501.05992  [pdf, other]

    astro-ph.IM cs.CE

    The Murchison Widefield Array Correlator

    Authors: S. M. Ord, B. Crosse, D. Emrich, D. Pallot, R. B. Wayth, M. A. Clark, S. E. Tremblay, W. Arcus, D. Barnes, M. Bell, G. Bernardi, N. D. R. Bhat, J. D. Bowman, F. Briggs, J. D. Bunton, R. J. Cappallo, B. E. Corey, A. A. Deshpande, L. deSouza, A. Ewell-Wice, L. Feng, R. Goeke, L. J. Greenhill, B. J. Hazelton, D. Herne, et al. (42 additional authors not shown)

    Abstract: The Murchison Widefield Array (MWA) is a Square Kilometre Array (SKA) Precursor. The telescope is located at the Murchison Radio-astronomy Observatory (MRO) in Western Australia (WA). The MWA consists of 4096 dipoles arranged into 128 dual polarisation aperture arrays forming a connected element interferometer that cross-correlates signals from all 256 inputs. A hybrid approach to the correlation…

    Submitted 23 January, 2015; originally announced January 2015.

    Comments: 17 pages, 9 figures. Accepted for publication in PASA. Some figures altered to meet astro-ph submission requirements