Showing 1–21 of 21 results for author: Creswell, A

Searching in archive cs.
  1. arXiv:2211.14275  [pdf, other]

    cs.LG cs.AI cs.CL

    Solving math word problems with process- and outcome-based feedback

    Authors: Jonathan Uesato, Nate Kushman, Ramana Kumar, Francis Song, Noah Siegel, Lisa Wang, Antonia Creswell, Geoffrey Irving, Irina Higgins

    Abstract: Recent work has shown that asking language models to generate reasoning steps improves performance on many reasoning tasks. When moving beyond prompting, this raises the question of how we should supervise such models: outcome-based approaches which supervise the final result, or process-based approaches which supervise the reasoning process itself? Differences between these approaches might natur…

    Submitted 25 November, 2022; originally announced November 2022.

  2. arXiv:2208.14271  [pdf, other]

    cs.AI cs.CL

    Faithful Reasoning Using Large Language Models

    Authors: Antonia Creswell, Murray Shanahan

    Abstract: Although contemporary large language models (LMs) demonstrate impressive question-answering capabilities, their answers are typically the product of a single call to the model. This entails an unwelcome degree of opacity and compromises performance, especially on problems that are inherently multi-step. To address these limitations, we show how LMs can be made to perform faithful multi-step reason…

    Submitted 30 August, 2022; originally announced August 2022.

  3. arXiv:2207.07051  [pdf, other]

    cs.CL cs.AI cs.LG

    Language models show human-like content effects on reasoning tasks

    Authors: Ishita Dasgupta, Andrew K. Lampinen, Stephanie C. Y. Chan, Hannah R. Sheahan, Antonia Creswell, Dharshan Kumaran, James L. McClelland, Felix Hill

    Abstract: Abstract reasoning is a key ability for an intelligent system. Large language models (LMs) achieve above-chance performance on abstract reasoning tasks, but exhibit many imperfections. However, human abstract reasoning is also imperfect. For example, human reasoning is affected by our real-world knowledge and beliefs, and shows notable "content effects"; humans reason more reliably when the semant…

    Submitted 30 October, 2023; v1 submitted 14 July, 2022; originally announced July 2022.

  4. arXiv:2205.09712  [pdf, other]

    cs.AI cs.CL

    Selection-Inference: Exploiting Large Language Models for Interpretable Logical Reasoning

    Authors: Antonia Creswell, Murray Shanahan, Irina Higgins

    Abstract: Large language models (LLMs) have been shown to be capable of impressive few-shot generalisation to new tasks. However, they still tend to perform poorly on multi-step logical reasoning problems. Here we carry out a comprehensive evaluation of LLMs on 50 tasks that probe different aspects of logical reasoning. We show that language models tend to perform fairly well at single step inference or ent…

    Submitted 19 May, 2022; originally announced May 2022.

  5. arXiv:2204.02329  [pdf, other]

    cs.CL cs.AI cs.LG

    Can language models learn from explanations in context?

    Authors: Andrew K. Lampinen, Ishita Dasgupta, Stephanie C. Y. Chan, Kory Matthewson, Michael Henry Tessler, Antonia Creswell, James L. McClelland, Jane X. Wang, Felix Hill

    Abstract: Language Models (LMs) can perform new tasks by adapting to a few in-context examples. For humans, explanations that connect examples to task principles can improve learning. We therefore investigate whether explanations of few-shot examples can help LMs. We annotate questions from 40 challenging tasks with answer explanations, and various matched control explanations. We evaluate how different typ…

    Submitted 10 October, 2022; v1 submitted 5 April, 2022; originally announced April 2022.

    Comments: Findings of EMNLP 2022

  6. arXiv:2112.11446  [pdf, other]

    cs.CL cs.AI

    Scaling Language Models: Methods, Analysis & Insights from Training Gopher

    Authors: Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor, et al. (55 additional authors not shown)

    Abstract: Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world. In this paper, we present an analysis of Transformer-based language model performance across a wide range of model scales -- from models with tens of millions of parameters up to a 280 billion parameter model called Gop…

    Submitted 21 January, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

    Comments: 120 pages

  7. arXiv:2106.03849  [pdf, other]

    cs.CV cs.LG

    SIMONe: View-Invariant, Temporally-Abstracted Object Representations via Unsupervised Video Decomposition

    Authors: Rishabh Kabra, Daniel Zoran, Goker Erdogan, Loic Matthey, Antonia Creswell, Matthew Botvinick, Alexander Lerchner, Christopher P. Burgess

    Abstract: To help agents reason about scenes in terms of their building blocks, we wish to extract the compositional structure of any given scene (in particular, the configuration and characteristics of objects comprising the scene). This problem is especially difficult when scene structure needs to be inferred while also estimating the agent's location/viewpoint, as the two variables jointly give rise to t…

    Submitted 6 December, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: Animated figures are available at https://sites.google.com/view/simone-scene-understanding/

  8. arXiv:2103.04693  [pdf, other]

    cs.CV cs.AI

    Unsupervised Object-Based Transition Models for 3D Partially Observable Environments

    Authors: Antonia Creswell, Rishabh Kabra, Chris Burgess, Murray Shanahan

    Abstract: We present a slot-wise, object-based transition model that decomposes a scene into objects, aligns them (with respect to a slot-wise object memory) to maintain a consistent order across time, and predicts how those objects evolve over successive frames. The model is trained end-to-end without supervision using losses at the level of the object-structured representation rather than pixels. Thanks t…

    Submitted 8 March, 2021; originally announced March 2021.

  9. arXiv:2007.08973  [pdf, other]

    cs.CV cs.AI cs.LG

    AlignNet: Unsupervised Entity Alignment

    Authors: Antonia Creswell, Kyriacos Nikiforou, Oriol Vinyals, Andre Saraiva, Rishabh Kabra, Loic Matthey, Chris Burgess, Malcolm Reynolds, Richard Tanburn, Marta Garnelo, Murray Shanahan

    Abstract: Recently developed deep learning models are able to learn to segment scenes into component objects without supervision. This opens many new and exciting avenues of research, allowing agents to take objects (or entities) as inputs, rather than pixels. Unfortunately, while these models provide excellent segmentation of a single frame, they do not keep track of how objects segmented at one time-step…

    Submitted 21 July, 2020; v1 submitted 17 July, 2020; originally announced July 2020.

  10. arXiv:1905.10307  [pdf, other]

    cs.LG stat.ML

    An Explicitly Relational Neural Network Architecture

    Authors: Murray Shanahan, Kyriacos Nikiforou, Antonia Creswell, Christos Kaplanis, David Barrett, Marta Garnelo

    Abstract: With a view to bridging the gap between deep learning and symbolic AI, we present a novel end-to-end neural network architecture that learns to form propositional representations with an explicitly relational structure from raw pixel data. In order to evaluate and analyse the architecture, we introduce a family of simple visual relational reasoning tasks of varying complexity. We show that the pro…

    Submitted 23 June, 2020; v1 submitted 24 May, 2019; originally announced May 2019.

    Comments: In Proceedings ICML 2020

  11. arXiv:1802.05701  [pdf, other]

    cs.CV

    Inverting The Generator Of A Generative Adversarial Network (II)

    Authors: Antonia Creswell, Anil A Bharath

    Abstract: Generative adversarial networks (GANs) learn a deep generative model that is able to synthesise novel, high-dimensional data samples. New data samples are synthesised by passing latent samples, drawn from a chosen prior distribution, through the generative model. Once trained, the latent space exhibits interesting properties, that may be useful for down stream tasks such as classification or retri…

    Submitted 15 February, 2018; originally announced February 2018.

    Comments: Under review at IEEE TNNLS

  12. arXiv:1801.00693  [pdf, other]

    cs.CV

    Denoising Adversarial Autoencoders: Classifying Skin Lesions Using Limited Labelled Training Data

    Authors: Antonia Creswell, Alison Pouplin, Anil A Bharath

    Abstract: We propose a novel deep learning model for classifying medical images in the setting where there is a large amount of unlabelled medical data available, but labelled data is in limited supply. We consider the specific case of classifying skin lesions as either malignant or benign. In this setting, the proposed approach -- the semi-supervised, denoising adversarial autoencoder -- is able to utilise…

    Submitted 2 January, 2018; originally announced January 2018.

    Comments: Under consideration for the IET Computer Vision Journal special issue on "Computer Vision in Cancer Data Analysis"

  13. arXiv:1711.05175  [pdf, other]

    cs.CV

    Adversarial Information Factorization

    Authors: Antonia Creswell, Yumnah Mohamied, Biswa Sengupta, Anil A Bharath

    Abstract: We propose a novel generative model architecture designed to learn representations for images that factor out a single attribute from the rest of the representation. A single object may have many attributes which when altered do not change the identity of the object itself. Consider the human face; the identity of a particular person is independent of whether or not they happen to be wearing glass…

    Submitted 28 September, 2018; v1 submitted 14 November, 2017; originally announced November 2017.

  14. arXiv:1711.02879  [pdf, other]

    cs.LG cs.CR

    LatentPoison - Adversarial Attacks On The Latent Space

    Authors: Antonia Creswell, Anil A. Bharath, Biswa Sengupta

    Abstract: Robustness and security of machine learning (ML) systems are intertwined, wherein a non-robust ML system (classifiers, regressors, etc.) can be subject to attacks using a wide variety of exploits. With the advent of scalable deep learning methodologies, a lot of emphasis has been put on the robustness of supervised, unsupervised and reinforcement learning algorithms. Here, we study the robustness…

    Submitted 8 November, 2017; originally announced November 2017.

    Comments: Submitted to ICLR 2018

  15. Generative Adversarial Networks: An Overview

    Authors: Antonia Creswell, Tom White, Vincent Dumoulin, Kai Arulkumaran, Biswa Sengupta, Anil A Bharath

    Abstract: Generative adversarial networks (GANs) provide a way to learn deep representations without extensively annotated training data. They achieve this through deriving backpropagation signals through a competitive process involving a pair of networks. The representations that can be learned by GANs may be used in a variety of applications, including image synthesis, semantic image editing, style transf…

    Submitted 19 October, 2017; originally announced October 2017.

    Comments: Accepted in the IEEE Signal Processing Magazine Special Issue on Deep Learning for Visual Understanding

  16. arXiv:1708.08487  [pdf, other]

    cs.CV cs.LG stat.ML

    On denoising autoencoders trained to minimise binary cross-entropy

    Authors: Antonia Creswell, Kai Arulkumaran, Anil A. Bharath

    Abstract: Denoising autoencoders (DAEs) are powerful deep learning models used for feature extraction, data generation and network pre-training. DAEs consist of an encoder and decoder which may be trained simultaneously to minimise a loss (function) between an input and the reconstruction of a corrupted version of the input. There are two common loss functions used for training autoencoders, these include t…

    Submitted 9 October, 2017; v1 submitted 28 August, 2017; originally announced August 2017.

    Comments: Submitted to Pattern Recognition Letters

  17. arXiv:1703.01220  [pdf, other]

    cs.CV cs.LG stat.ML

    Denoising Adversarial Autoencoders

    Authors: Antonia Creswell, Anil Anthony Bharath

    Abstract: Unsupervised learning is of growing interest because it unlocks the potential held in vast amounts of unlabelled data to learn useful representations for inference. Autoencoders, a form of generative model, may be trained by learning to reconstruct unlabelled input data from a latent representation space. More robust representations may be produced by an autoencoder if it learns to recover clean i…

    Submitted 4 January, 2018; v1 submitted 3 March, 2017; originally announced March 2017.

    Comments: submitted to journal

  18. arXiv:1611.05644  [pdf, other]

    cs.CV cs.LG

    Inverting The Generator Of A Generative Adversarial Network

    Authors: Antonia Creswell, Anil Anthony Bharath

    Abstract: Generative adversarial networks (GANs) learn to synthesise new samples from a high-dimensional distribution by passing samples drawn from a latent space through a generative network. When the high-dimensional distribution describes images of a particular data set, the network should learn to generate visually similar image samples for latent variables that are close to each other in the latent spa…

    Submitted 17 November, 2016; originally announced November 2016.

    Comments: Accepted at NIPS 2016 Workshop on Adversarial Training

  19. arXiv:1610.09296  [pdf, other]

    cs.LG cs.AI stat.ML

    Improving Sampling from Generative Autoencoders with Markov Chains

    Authors: Antonia Creswell, Kai Arulkumaran, Anil Anthony Bharath

    Abstract: We focus on generative autoencoders, such as variational or adversarial autoencoders, which jointly learn a generative model alongside an inference model. Generative autoencoders are those which are trained to softly enforce a prior on the latent distribution learned by the inference model. We call the distribution to which the inference model maps observed samples, the learned latent distribution…

    Submitted 12 January, 2017; v1 submitted 28 October, 2016; originally announced October 2016.

  20. arXiv:1609.08661  [pdf, other]

    cs.CV

    Task Specific Adversarial Cost Function

    Authors: Antonia Creswell, Anil A. Bharath

    Abstract: The cost function used to train a generative model should fit the purpose of the model. If the model is intended for tasks such as generating perceptually correct samples, it is beneficial to maximise the likelihood of a sample drawn from the model, Q, coming from the same distribution as the training data, P. This is equivalent to minimising the Kullback-Leibler (KL) distance, KL[Q||P]. However,…

    Submitted 27 September, 2016; originally announced September 2016.

    Comments: Submitted to TPAMI

  21. Adversarial Training For Sketch Retrieval

    Authors: Antonia Creswell, Anil Anthony Bharath

    Abstract: Generative Adversarial Networks (GAN) are able to learn excellent representations for unlabelled data which can be applied to image generation and scene classification. Representations learned by GANs have not yet been applied to retrieval. In this paper, we show that the representations learned by GANs can indeed be used for retrieval. We consider heritage documents that contain unlabelled Mercha…

    Submitted 23 August, 2016; v1 submitted 10 July, 2016; originally announced July 2016.

    Comments: Accepted to ECCV2016 VisArt Workshop