
Showing 1–50 of 123 results for author: Lukasiewicz, T

  1. arXiv:2407.01163  [pdf, other]

    cs.LG cs.CV

    Benchmarking Predictive Coding Networks -- Made Simple

    Authors: Luca Pinchetti, Chang Qi, Oleh Lokshyn, Gaspard Olivers, Cornelius Emde, Mufeng Tang, Amine M'Charrak, Simon Frieder, Bayar Menzat, Rafal Bogacz, Thomas Lukasiewicz, Tommaso Salvatori

    Abstract: In this work, we tackle the problems of efficiency and scalability for predictive coding networks in machine learning. To do so, we first propose a library called PCX, whose focus lies on performance and simplicity, and provides a user-friendly, deep-learning oriented interface. Second, we use PCX to implement a large set of benchmarks for the community to use for their experiments. As most works…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 33 pages, 25 figures

    ACM Class: I.2.6

  2. arXiv:2405.13922  [pdf, other]

    cs.LG stat.ML

    Towards Certification of Uncertainty Calibration under Adversarial Attacks

    Authors: Cornelius Emde, Francesco Pinto, Thomas Lukasiewicz, Philip H. S. Torr, Adel Bibi

    Abstract: Since neural classifiers are known to be sensitive to adversarial perturbations that alter their accuracy, certification methods have been developed to provide provable guarantees on the insensitivity of their predictions to such perturbations. Furthermore, in safety-critical applications, the frequentist interpretation of the confidence of a classifier (also known as model calibration) c…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 11 pages main paper, appendix included

  3. arXiv:2402.18285  [pdf, other]

    cs.LG cs.AI cs.LO

    PiShield: A PyTorch Package for Learning with Requirements

    Authors: Mihaela Cătălina Stoian, Alex Tatomir, Thomas Lukasiewicz, Eleonora Giunchiglia

    Abstract: Deep learning models have shown their strengths in various application domains, however, they often struggle to meet safety requirements for their outputs. In this paper, we introduce PiShield, the first package ever allowing for the integration of the requirements into the neural networks' topology. PiShield guarantees compliance with these requirements, regardless of input. Additionally, it allo…

    Submitted 14 May, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: Demo paper, accepted at IJCAI 2024

  4. arXiv:2402.11362  [pdf]

    cs.LG cs.CV

    Exploiting T-norms for Deep Learning in Autonomous Driving

    Authors: Mihaela Cătălina Stoian, Eleonora Giunchiglia, Thomas Lukasiewicz

    Abstract: Deep learning has been at the core of the autonomous driving field development, due to the neural networks' success in finding patterns in raw data and turning them into accurate predictions. Moreover, recent neuro-symbolic works have shown that incorporating the available background knowledge about the problem at hand in the loss function via t-norms can further improve the deep learning models'…

    Submitted 17 February, 2024; originally announced February 2024.

    Comments: Published in Proceedings of the 17th International Workshop on Neural-Symbolic Learning and Reasoning, 2023 (NeSy 2023)

  5. arXiv:2402.10814  [pdf, other]

    cs.LG cs.CV

    Associative Memories in the Feature Space

    Authors: Tommaso Salvatori, Beren Millidge, Yuhang Song, Rafal Bogacz, Thomas Lukasiewicz

    Abstract: An autoassociative memory model is a function that, given a set of data points, takes as input an arbitrary vector and outputs the most similar data point from the memorized set. However, popular memory models fail to retrieve images even when the corruption is mild and easy to detect for a human evaluator. This is because similarities are evaluated in the raw pixel space, which does not contain a…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: 8 Pages, 4 Figures, accepted for publication at ECAI 2023

  6. arXiv:2402.04823  [pdf, other]

    cs.LG

    How Realistic Is Your Synthetic Data? Constraining Deep Generative Models for Tabular Data

    Authors: Mihaela Cătălina Stoian, Salijona Dyrmishi, Maxime Cordy, Thomas Lukasiewicz, Eleonora Giunchiglia

    Abstract: Deep Generative Models (DGMs) have been shown to be powerful tools for generating tabular data, as they have been increasingly able to capture the complex distributions that characterize them. However, to generate realistic synthetic data, it is often not enough to have a good approximation of their distribution, as it also requires compliance with constraints that encode essential background know…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: Accepted at ICLR 2024

  7. Pre-training and Diagnosing Knowledge Base Completion Models

    Authors: Vid Kocijan, Myeongjun Erik Jang, Thomas Lukasiewicz

    Abstract: In this work, we introduce and analyze an approach to knowledge transfer from one collection of facts to another without the need for entity or relation matching. The method works for both canonicalized knowledge bases and uncanonicalized or open knowledge bases, i.e., knowledge bases where more than one copy of a real-world entity or relation may exist. The main contribution is a method that can…

    Submitted 27 January, 2024; originally announced January 2024.

    Comments: Accepted to AIJ, reference to follow. arXiv admin note: substantial text overlap with arXiv:2108.13073

  8. arXiv:2312.04556  [pdf, other]

    cs.CL cs.AI cs.LG math.HO

    Large Language Models for Mathematicians

    Authors: Simon Frieder, Julius Berner, Philipp Petersen, Thomas Lukasiewicz

    Abstract: Large language models (LLMs) such as ChatGPT have received immense interest for their general-purpose language understanding and, in particular, their ability to generate high-quality text or computer code. For many professions, LLMs represent an invaluable tool that can speed up and improve the quality of work. In this note, we discuss to what extent they can aid professional mathematicians. We f…

    Submitted 2 April, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Journal ref: International Mathematical News 254 (2023) 1-20

  9. arXiv:2312.00277  [pdf, other]

    cs.LG cs.CL

    Text Attribute Control via Closed-Loop Disentanglement

    Authors: Lei Sha, Thomas Lukasiewicz

    Abstract: Changing an attribute of a text without changing the content usually requires to first disentangle the text into irrelevant attributes and content representations. After that, in the inference phase, the representation of one attribute is tuned to a different value, expecting that the corresponding attribute of the text can also be changed accordingly. The usual way of disentanglement is to add so…

    Submitted 30 November, 2023; originally announced December 2023.

    Comments: accepted by TACL 2023

  10. arXiv:2310.15541  [pdf, other]

    cs.CL

    Improving Language Models Meaning Understanding and Consistency by Learning Conceptual Roles from Dictionary

    Authors: Myeongjun Erik Jang, Thomas Lukasiewicz

    Abstract: The non-humanlike behaviour of contemporary pre-trained language models (PLMs) is a leading cause undermining their trustworthiness. A striking phenomenon of such faulty behaviours is the generation of inconsistent predictions, which produces logically contradictory results, such as generating different predictions for texts delivering the same meaning or violating logical properties. Previous stu…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: 15 pages

    Journal ref: The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023)

  11. arXiv:2310.05355  [pdf, other]

    cs.CV

    C^2M-DoT: Cross-modal consistent multi-view medical report generation with domain transfer network

    Authors: Ruizhi Wang, Xiangtao Wang, Jie Zhou, Thomas Lukasiewicz, Zhenghua Xu

    Abstract: In clinical scenarios, multiple medical images with different views are usually generated simultaneously, and these images have high semantic consistency. However, most existing medical report generation methods only consider single-view data. The rich multi-view mutual information of medical images can help generate more accurate reports, however, the dependence of multi-view models on multi-view…

    Submitted 8 October, 2023; originally announced October 2023.

  12. arXiv:2309.04312  [pdf, other]

    cs.CV

    AMLP:Adaptive Masking Lesion Patches for Self-supervised Medical Image Segmentation

    Authors: Xiangtao Wang, Ruizhi Wang, Jie Zhou, Thomas Lukasiewicz, Zhenghua Xu

    Abstract: Self-supervised masked image modeling has shown promising results on natural images. However, directly applying such methods to medical images remains challenging. This difficulty stems from the complexity and distinct characteristics of lesions compared to natural images, which impedes effective representation learning. Additionally, conventional high fixed masking ratios restrict reconstructing…

    Submitted 8 September, 2023; originally announced September 2023.

  13. arXiv:2308.07870  [pdf, other]

    cs.AI cs.LG cs.NE

    Brain-Inspired Computational Intelligence via Predictive Coding

    Authors: Tommaso Salvatori, Ankur Mali, Christopher L. Buckley, Thomas Lukasiewicz, Rajesh P. N. Rao, Karl Friston, Alexander Ororbia

    Abstract: Artificial intelligence (AI) is rapidly becoming one of the key technologies of this century. The majority of results in AI thus far have been achieved using deep neural networks trained with the error backpropagation learning algorithm. However, the ubiquitous adoption of this approach has highlighted some important limitations such as substantial computational cost, difficulty in quantifying unc…

    Submitted 15 August, 2023; originally announced August 2023.

    Comments: 37 Pages, 9 Figures

  14. arXiv:2308.02866  [pdf, other]

    cs.CV cs.LG

    NP-SemiSeg: When Neural Processes meet Semi-Supervised Semantic Segmentation

    Authors: Jianfeng Wang, Daniela Massiceti, Xiaolin Hu, Vladimir Pavlovic, Thomas Lukasiewicz

    Abstract: Semi-supervised semantic segmentation involves assigning pixel-wise labels to unlabeled images at training time. This is useful in a wide range of real-world applications where collecting pixel-wise labels is not feasible in time or cost. Current approaches to semi-supervised semantic segmentation work by predicting pseudo-labels for each pixel from a class-wise probability distribution output by…

    Submitted 5 August, 2023; originally announced August 2023.

    Comments: Appear at ICML2023. Source codes are available at: https://github.com/Jianf-Wang/NP-SemiSeg

  15. arXiv:2306.15479  [pdf, other]

    cs.LG

    Predictive Coding beyond Correlations

    Authors: Tommaso Salvatori, Luca Pinchetti, Amine M'Charrak, Beren Millidge, Thomas Lukasiewicz

    Abstract: Recently, there has been extensive research on the capabilities of biologically plausible algorithms. In this work, we show how one of such algorithms, called predictive coding, is able to perform causal inference tasks. First, we show how a simple change in the inference process of predictive coding enables to compute interventions without the need to mutilate or redefine a causal graph. Then, we…

    Submitted 3 June, 2024; v1 submitted 27 June, 2023; originally announced June 2023.

    Comments: 44 Pages, 24 Figures. Changed title and abstract, following the ICML accepted version

  16. arXiv:2306.14937  [pdf, other]

    cs.CV

    Minimum Description Length Clustering to Measure Meaningful Image Complexity

    Authors: Louis Mahon, Thomas Lukasiewicz

    Abstract: Existing image complexity metrics cannot distinguish meaningful content from noise. This means that white noise images, which contain no meaningful information, are judged as highly complex. We present a new image complexity metric through hierarchical clustering of patches. We use the minimum description length principle to determine the number of clusters and designate certain points as outliers…

    Submitted 19 August, 2023; v1 submitted 26 June, 2023; originally announced June 2023.

  17. arXiv:2306.04067  [pdf, other]

    cs.CL

    An Empirical Analysis of Parameter-Efficient Methods for Debiasing Pre-Trained Language Models

    Authors: Zhongbin Xie, Thomas Lukasiewicz

    Abstract: The increasingly large size of modern pretrained language models not only makes them inherit more human-like biases from the training corpora, but also makes it computationally expensive to mitigate such biases. In this paper, we investigate recent parameter-efficient methods in combination with counterfactual data augmentation (CDA) for bias mitigation. We conduct extensive experiments with prefi…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: accepted to ACL 2023

  18. arXiv:2306.02980  [pdf, other]

    cs.CL cs.AI

    KNOW How to Make Up Your Mind! Adversarially Detecting and Alleviating Inconsistencies in Natural Language Explanations

    Authors: Myeongjun Jang, Bodhisattwa Prasad Majumder, Julian McAuley, Thomas Lukasiewicz, Oana-Maria Camburu

    Abstract: While recent works have been considerably improving the quality of the natural language explanations (NLEs) generated by a model to justify its predictions, there is very limited research in detecting and alleviating inconsistencies among generated NLEs. In this work, we leverage external knowledge bases to significantly improve on an existing adversarial attack for detecting inconsistent NLEs. We…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: Short paper, ACL 2023

    Journal ref: The 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023)

  19. arXiv:2306.01694  [pdf, other]

    cs.LG cs.HC

    Evaluating Language Models for Mathematics through Interactions

    Authors: Katherine M. Collins, Albert Q. Jiang, Simon Frieder, Lionel Wong, Miri Zilka, Umang Bhatt, Thomas Lukasiewicz, Yuhuai Wu, Joshua B. Tenenbaum, William Hart, Timothy Gowers, Wenda Li, Adrian Weller, Mateja Jamnik

    Abstract: There is much excitement about the opportunity to harness the power of large language models (LLMs) when building problem-solving assistants. However, the standard methodology of evaluating LLMs relies on static pairs of inputs and outputs, and is insufficient for making an informed decision about which LLMs and under which assistive settings can they be sensibly used. Static assessment fails to a…

    Submitted 5 November, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

  20. arXiv:2305.18029  [pdf, other]

    cs.CL cs.AI

    Faithfulness Tests for Natural Language Explanations

    Authors: Pepa Atanasova, Oana-Maria Camburu, Christina Lioma, Thomas Lukasiewicz, Jakob Grue Simonsen, Isabelle Augenstein

    Abstract: Explanations of neural models aim to reveal a model's decision-making process for its predictions. However, recent work shows that current methods giving explanations such as saliency maps or counterfactuals can be misleading, as they are prone to present reasons that are unfaithful to the model's inner workings. This work explores the challenging question of evaluating the faithfulness of natural…

    Submitted 30 June, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: Short paper, ACL 2023

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: The 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023)

  21. arXiv:2304.07465  [pdf, other]

    cs.CV

    MvCo-DoT:Multi-View Contrastive Domain Transfer Network for Medical Report Generation

    Authors: Ruizhi Wang, Xiangtao Wang, Zhenghua Xu, Wenting Xu, Junyang Chen, Thomas Lukasiewicz

    Abstract: In clinical scenarios, multiple medical images with different views are usually generated at the same time, and they have high semantic consistency. However, the existing medical report generation methods cannot exploit the rich multi-view mutual information of medical images. Therefore, in this work, we propose the first multi-view medical report generation model, called MvCo-DoT. Specifically, M…

    Submitted 14 April, 2023; originally announced April 2023.

    Comments: Received by the ICASSP2023

  22. arXiv:2304.03674  [pdf, other]

    cs.LG cs.AI cs.SE

    Machine Learning with Requirements: a Manifesto

    Authors: Eleonora Giunchiglia, Fergus Imrie, Mihaela van der Schaar, Thomas Lukasiewicz

    Abstract: In the recent years, machine learning has made great advancements that have been at the root of many breakthroughs in different application domains. However, it is still an open issue how make them applicable to high-stakes or safety-critical application domains, as they can often be brittle and unreliable. In this paper, we argue that requirements definition and satisfaction can go a long way to…

    Submitted 2 February, 2024; v1 submitted 7 April, 2023; originally announced April 2023.

  23. arXiv:2304.02335  [pdf, other]

    cs.LG

    Correcting Flaws in Common Disentanglement Metrics

    Authors: Louis Mahon, Lei Shah, Thomas Lukasiewicz

    Abstract: Recent years have seen growing interest in learning disentangled representations, in which distinct features, such as size or shape, are represented by distinct neurons. Quantifying the extent to which a given representation is disentangled is not straightforward; multiple metrics have been proposed. In this paper, we identify two failings of existing metrics, which mean they can assign a high sco…

    Submitted 5 April, 2023; originally announced April 2023.

  24. arXiv:2303.16521  [pdf, other]

    cs.LG

    Hard Regularization to Prevent Deep Online Clustering Collapse without Data Augmentation

    Authors: Louis Mahon, Thomas Lukasiewicz

    Abstract: Online deep clustering refers to the joint use of a feature extraction network and a clustering model to assign cluster labels to each new data point or batch as it is processed. While faster and more versatile than offline methods, online clustering can easily reach the collapsed solution where the encoder maps all inputs to the same point and all are put into a single cluster. Successful existin…

    Submitted 13 March, 2024; v1 submitted 29 March, 2023; originally announced March 2023.

  25. arXiv:2303.06273  [pdf, other]

    cs.CL cs.AI

    Consistency Analysis of ChatGPT

    Authors: Myeongjun Erik Jang, Thomas Lukasiewicz

    Abstract: ChatGPT has gained a huge popularity since its introduction. Its positive aspects have been reported through many media platforms, and some analyses even showed that ChatGPT achieved a decent grade in professional exams, adding extra support to the claim that AI can now assist and even replace humans in industrial fields. Others, however, doubt its reliability and trustworthiness. This paper inves…

    Submitted 13 November, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

    Comments: 15 pages

    Journal ref: The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023)

  26. arXiv:2302.13699  [pdf, other]

    cs.CV

    MPS-AMS: Masked Patches Selection and Adaptive Masking Strategy Based Self-Supervised Medical Image Segmentation

    Authors: Xiangtao Wang, Ruizhi Wang, Biao Tian, Jiaojiao Zhang, Shuo Zhang, Junyang Chen, Thomas Lukasiewicz, Zhenghua Xu

    Abstract: Existing self-supervised learning methods based on contrastive learning and masked image modeling have demonstrated impressive performances. However, current masked image modeling methods are mainly utilized in natural images, and their applications in medical images are relatively lacking. Besides, their fixed high masking strategy limits the upper bound of conditional mutual information, and the…

    Submitted 27 February, 2023; originally announced February 2023.

    Comments: 6 pages, 3 figures, Received by the ICASSP2023

  27. arXiv:2302.11106  [pdf, other]

    cs.CV

    Multi-Head Feature Pyramid Networks for Breast Mass Detection

    Authors: Hexiang Zhang, Zhenghua Xu, Dan Yao, Shuo Zhang, Junyang Chen, Thomas Lukasiewicz

    Abstract: Analysis of X-ray images is one of the main tools to diagnose breast cancer. The ability to quickly and accurately detect the location of masses from the huge amount of image data is the key to reducing the morbidity and mortality of breast cancer. Currently, the main factor limiting the accuracy of breast mass detection is the unequal focus on the mass boxes, leading the network to focus too much…

    Submitted 21 February, 2023; originally announced February 2023.

    Comments: 7 pages, 3 figures, Received by the ICASSP2023

  28. arXiv:2302.05674  [pdf, other]

    cs.CL cs.AI

    Counter-GAP: Counterfactual Bias Evaluation through Gendered Ambiguous Pronouns

    Authors: Zhongbin Xie, Vid Kocijan, Thomas Lukasiewicz, Oana-Maria Camburu

    Abstract: Bias-measuring datasets play a critical role in detecting biased behavior of language models and in evaluating progress of bias mitigation methods. In this work, we focus on evaluating gender bias through coreference resolution, where previous datasets are either hand-crafted or fail to reliably measure an explicitly defined bias. To overcome these shortcomings, we propose a novel method to collec…

    Submitted 11 February, 2023; originally announced February 2023.

    Comments: Long Paper at EACL 2023

  29. arXiv:2301.13867  [pdf, other]

    cs.LG cs.AI cs.CL

    Mathematical Capabilities of ChatGPT

    Authors: Simon Frieder, Luca Pinchetti, Alexis Chevalier, Ryan-Rhys Griffiths, Tommaso Salvatori, Thomas Lukasiewicz, Philipp Christian Petersen, Julius Berner

    Abstract: We investigate the mathematical capabilities of two iterations of ChatGPT (released 9-January-2023 and 30-January-2023) and of GPT-4 by testing them on publicly available datasets, as well as hand-crafted ones, using a novel methodology. In contrast to formal mathematics, where large databases of formal proofs are available (e.g., the Lean Mathematical Library), current datasets of natural-languag…

    Submitted 20 July, 2023; v1 submitted 31 January, 2023; originally announced January 2023.

    Comments: Added further evaluations on another ChatGPT version and on GPT-4. The GHOSTS and miniGHOSTS datasets are available at https://github.com/xyfrieder/science-GHOSTS

    Journal ref: NeurIPS 2023 Datasets and Benchmarks

  30. arXiv:2301.13569  [pdf, other]

    cs.CV cs.LG

    NP-Match: Towards a New Probabilistic Model for Semi-Supervised Learning

    Authors: Jianfeng Wang, Xiaolin Hu, Thomas Lukasiewicz

    Abstract: Semi-supervised learning (SSL) has been widely explored in recent years, and it is an effective way of leveraging unlabeled data to reduce the reliance on labeled data. In this work, we adjust neural processes (NPs) to the semi-supervised image classification task, resulting in a new method named NP-Match. NP-Match is suited to this task for two reasons. Firstly, NP-Match implicitly compares data…

    Submitted 25 June, 2023; v1 submitted 31 January, 2023; originally announced January 2023.

    Comments: An extended version of our previous ICML 2022 paper arXiv:2207.01066 with more experiments

  31. Rationalizing Predictions by Adversarial Information Calibration

    Authors: Lei Sha, Oana-Maria Camburu, Thomas Lukasiewicz

    Abstract: Explaining the predictions of AI models is paramount in safety-critical applications, such as in legal or medical domains. One form of explanation for a prediction is an extractive rationale, i.e., a subset of features of an instance that lead the model to give its prediction on that instance. For example, the subphrase "he stole the mobile phone" can be an extractive rationale for the predictio…

    Submitted 14 January, 2023; originally announced January 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2012.08884

    Journal ref: Artificial Intelligence, Volume 315, February 2023

  32. arXiv:2212.04656  [pdf, other]

    cs.LG

    Robust Graph Representation Learning via Predictive Coding

    Authors: Billy Byiringiro, Tommaso Salvatori, Thomas Lukasiewicz

    Abstract: Predictive coding is a message-passing framework initially developed to model information processing in the brain, and now also topic of research in machine learning due to some interesting properties. One of such properties is the natural ability of generative models to learn robust representations thanks to their peculiar credit assignment rule, that allows neural activities to converge to a sol…

    Submitted 8 December, 2022; originally announced December 2022.

    Comments: 27 Pages, 31 Figures

  33. arXiv:2212.00720  [pdf, other]

    cs.NE cs.AI cs.LG

    A Stable, Fast, and Fully Automatic Learning Algorithm for Predictive Coding Networks

    Authors: Tommaso Salvatori, Yuhang Song, Yordan Yordanov, Beren Millidge, Zhenghua Xu, Lei Sha, Cornelius Emde, Rafal Bogacz, Thomas Lukasiewicz

    Abstract: Predictive coding networks are neuroscience-inspired models with roots in both Bayesian statistics and neuroscience. Training such models, however, is quite inefficient and unstable. In this work, we show how by simply changing the temporal scheduling of the update rule for the synaptic weights leads to an algorithm that is much more efficient and stable than the original one, and has theoretical…

    Submitted 7 February, 2024; v1 submitted 15 November, 2022; originally announced December 2022.

    Comments: Change of title and abstract, that now reflect the version accepted for publication. One co-author also added, that performed the additional experiments

  34. arXiv:2211.07289  [pdf, other]

    cs.CV cs.CL cs.LG

    Learning to Model Multimodal Semantic Alignment for Story Visualization

    Authors: Bowen Li, Thomas Lukasiewicz

    Abstract: Story visualization aims to generate a sequence of images to narrate each sentence in a multi-sentence story, where the images should be realistic and keep global consistency across dynamic scenes and characters. Current works face the problem of semantic misalignment because of their fixed architecture and diversity of input modalities. To address this problem, we explore the semantic alignment b…

    Submitted 14 November, 2022; originally announced November 2022.

    Comments: EMNLP 2022

  35. arXiv:2211.03481  [pdf, other]

    cs.LG cs.NE

    Predictive Coding beyond Gaussian Distributions

    Authors: Luca Pinchetti, Tommaso Salvatori, Yordan Yordanov, Beren Millidge, Yuhang Song, Thomas Lukasiewicz

    Abstract: A large amount of recent research has the far-reaching goal of finding training methods for deep neural networks that can serve as alternatives to backpropagation (BP). A prominent example is predictive coding (PC), which is a neuroscience-inspired method that performs inference on hierarchical Gaussian generative models. These methods, however, fail to keep up with modern neural networks, as they…

    Submitted 7 November, 2022; originally announced November 2022.

  36. arXiv:2210.13729  [pdf, other]

    cs.AI cs.CL cs.CV

    Hybrid Reinforced Medical Report Generation with M-Linear Attention and Repetition Penalty

    Authors: Wenting Xu, Zhenghua Xu, Junyang Chen, Chang Qi, Thomas Lukasiewicz

    Abstract: To reduce doctors' workload, deep-learning-based automatic medical report generation has recently attracted more and more research efforts, where deep convolutional neural networks (CNNs) are employed to encode the input images, and recurrent neural networks (RNNs) are used to decode the visual features into medical reports automatically. However, these state-of-the-art methods mainly suffer from…

    Submitted 14 October, 2022; originally announced October 2022.

    Comments: This paper is currently under peer-review in IEEE TNNLS

  37. arXiv:2210.03985  [pdf, other]

    cs.CL

    Bird-Eye Transformers for Text Generation Models

    Authors: Lei Sha, Yuhang Song, Yordan Yordanov, Tommaso Salvatori, Thomas Lukasiewicz

    Abstract: Transformers have become an indispensable module for text generation models since their great success in machine translation. Previous works attribute the success of transformers to the query-key-value dot-product attention, which provides a robust inductive bias by the fully connected token graphs. However, we found that self-attention has a severe limitation. When predicting the (i+1)-th token,…

    Submitted 8 October, 2022; originally announced October 2022.

  38. arXiv:2210.01597  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO

    ROAD-R: The Autonomous Driving Dataset with Logical Requirements

    Authors: Eleonora Giunchiglia, Mihaela Cătălina Stoian, Salman Khan, Fabio Cuzzolin, Thomas Lukasiewicz

    Abstract: Neural networks have proven to be very powerful at computer vision tasks. However, they often exhibit unexpected behaviours, violating known requirements expressing background knowledge. This calls for models (i) able to learn from the requirements, and (ii) guaranteed to be compliant with the requirements themselves. Unfortunately, the development of such models is hampered by the lack of dataset…

    Submitted 5 October, 2022; v1 submitted 4 October, 2022; originally announced October 2022.

  39. arXiv:2209.08335  [pdf, ps, other]

    cs.LG

    Efficient Deep Clustering of Human Activities and How to Improve Evaluation

    Authors: Louis Mahon, Thomas Lukasiewicz

    Abstract: There has been much recent research on human activity recognition (HAR), due to the proliferation of wearable sensors in watches and phones, and the advances of deep learning methods, which avoid the need to manually extract features from raw sensor signals. A significant disadvantage of deep learning applied to HAR is the need for manually labelled training data, which is especially difficu…

    Submitted 17 September, 2022; originally announced September 2022.

  40. arXiv:2209.03793  [pdf, other]

    cs.CV cs.LG

    Lightweight Long-Range Generative Adversarial Networks

    Authors: Bowen Li, Thomas Lukasiewicz

    Abstract: In this paper, we introduce novel lightweight generative adversarial networks, which can effectively capture long-range dependencies in the image generation process, and produce high-quality results with a much simpler architecture. To achieve this, we first introduce a long-range module, allowing the network to dynamically adjust the number of focused sampling pixels and to also augment sampling…

    Submitted 8 September, 2022; originally announced September 2022.

  41. arXiv:2208.07022  [pdf, other]

    cs.CV cs.CL cs.LG

    Memory-Driven Text-to-Image Generation

    Authors: Bowen Li, Philip H. S. Torr, Thomas Lukasiewicz

    Abstract: We introduce a memory-driven semi-parametric approach to text-to-image generation, which is based on both parametric and non-parametric techniques. The non-parametric component is a memory bank of image features constructed from a training set of images. The parametric component is a generative adversarial network. Given a new text description at inference time, the memory bank is used to selectiv…

    Submitted 15 August, 2022; originally announced August 2022.

  42. arXiv:2208.02341  [pdf, other]

    cs.CV cs.CL cs.LG

    Word-Level Fine-Grained Story Visualization

    Authors: Bowen Li, Thomas Lukasiewicz

    Abstract: Story visualization aims to generate a sequence of images to narrate each sentence in a multi-sentence story with a global consistency across dynamic scenes and characters. Current works still struggle with output images' quality and consistency, and rely on additional semantic information or auxiliary captioning networks. To address these challenges, we first introduce a new sentence representati…

    Submitted 22 September, 2022; v1 submitted 3 August, 2022; originally announced August 2022.

    Comments: ECCV 2022

  43. arXiv:2207.12316  [pdf, other

    cs.NE cs.AI cs.LG

    A Theoretical Framework for Inference and Learning in Predictive Coding Networks

    Authors: Beren Millidge, Yuhang Song, Tommaso Salvatori, Thomas Lukasiewicz, Rafal Bogacz

    Abstract: Predictive coding (PC) is an influential theory in computational neuroscience, which argues that the cortex forms unsupervised world models by implementing a hierarchical process of prediction error minimization. PC networks (PCNs) are trained in two phases. First, neural activities are updated to optimize the network's response to external stimuli. Second, synaptic weights are updated to consolid…

    Submitted 3 August, 2022; v1 submitted 21 July, 2022; originally announced July 2022.

    Comments: 21/07/22 initial upload (finally); 03/08/22 revisions

  44. arXiv:2207.11683  [pdf, other

    eess.IV cs.CV cs.LG

    PCA: Semi-supervised Segmentation with Patch Confidence Adversarial Training

    Authors: Zihang Xu, Zhenghua Xu, Shuo Zhang, Thomas Lukasiewicz

    Abstract: Deep learning based semi-supervised learning (SSL) methods have achieved strong performance in medical image segmentation, which can alleviate doctors' expensive annotation by utilizing a large amount of unlabeled data. Unlike most existing semi-supervised learning methods, adversarial training based methods distinguish samples from different sources by learning the data distribution of the segmen…

    Submitted 24 July, 2022; originally announced July 2022.

  45. arXiv:2207.04343  [pdf, other

    cs.CV cs.AI cs.CL

    Explaining Chest X-ray Pathologies in Natural Language

    Authors: Maxime Kayser, Cornelius Emde, Oana-Maria Camburu, Guy Parsons, Bartlomiej Papiez, Thomas Lukasiewicz

    Abstract: Most deep learning algorithms lack explanations for their predictions, which limits their deployment in clinical practice. Approaches to improve explainability, especially in medical imaging, have often been shown to convey limited information, be overly reassuring, or lack robustness. In this work, we introduce the task of generating natural language explanations (NLEs) to justify predictions mad…

    Submitted 9 July, 2022; originally announced July 2022.

    Journal ref: MICCAI 2022

  46. arXiv:2207.01066  [pdf, other

    cs.LG cs.CV

    NP-Match: When Neural Processes meet Semi-Supervised Learning

    Authors: Jianfeng Wang, Thomas Lukasiewicz, Daniela Massiceti, Xiaolin Hu, Vladimir Pavlovic, Alexandros Neophytou

    Abstract: Semi-supervised learning (SSL) has been widely explored in recent years, and it is an effective way of leveraging unlabeled data to reduce the reliance on labeled data. In this work, we adjust neural processes (NPs) to the semi-supervised image classification task, resulting in a new method named NP-Match. NP-Match is suited to this task for two reasons. Firstly, NP-Match implicitly compares data…

    Submitted 3 July, 2022; originally announced July 2022.

    Comments: To appear at ICML 2022. The source codes are at https://github.com/Jianf-Wang/NP-Match

  47. arXiv:2206.09293  [pdf, other

    cs.CV

    Rethinking Bayesian Deep Learning Methods for Semi-Supervised Volumetric Medical Image Segmentation

    Authors: Jianfeng Wang, Thomas Lukasiewicz

    Abstract: Recently, several Bayesian deep learning methods have been proposed for semi-supervised medical image segmentation. Although they have achieved promising results on medical benchmarks, some problems are still existing. Firstly, their overall architectures belong to the discriminative models, and hence, in the early stage of training, they only use labeled data for training, which might make them o…

    Submitted 18 June, 2022; originally announced June 2022.

    Comments: To appear at CVPR 2022, and the supplementary material can be found at the official site. The source codes are at https://github.com/Jianf-Wang/GBDL

  48. arXiv:2206.02629  [pdf, other

    cs.LG cs.AI q-bio.NC

    Backpropagation at the Infinitesimal Inference Limit of Energy-Based Models: Unifying Predictive Coding, Equilibrium Propagation, and Contrastive Hebbian Learning

    Authors: Beren Millidge, Yuhang Song, Tommaso Salvatori, Thomas Lukasiewicz, Rafal Bogacz

    Abstract: How the brain performs credit assignment is a fundamental unsolved problem in neuroscience. Many 'biologically plausible' algorithms have been proposed, which compute gradients that approximate those computed by backpropagation (BP), and which operate in ways that more closely satisfy the constraints imposed by neural circuitry. Many such algorithms utilize the framework of energy-based models (EB…

    Submitted 3 August, 2022; v1 submitted 31 May, 2022; originally announced June 2022.

    Comments: 31/05/22 initial upload; 22/06/22 change corresponding author; 03/08/22 revisions

  49. arXiv:2205.07234  [pdf, other

    cs.LG cs.AI

    Clinical outcome prediction under hypothetical interventions -- a representation learning framework for counterfactual reasoning

    Authors: Yikuan Li, Mohammad Mamouei, Shishir Rao, Abdelaali Hassaine, Dexter Canoy, Thomas Lukasiewicz, Kazem Rahimi, Gholamreza Salimi-Khorshidi

    Abstract: Most machine learning (ML) models are developed for prediction only; offering no option for causal interpretation of their predictions or parameters/properties. This can hamper the health systems' ability to employ ML models in clinical decision-making processes, where the need and desire for predicting outcomes under hypothetical investigations (i.e., counterfactual reasoning/explanation) is high…

    Submitted 15 May, 2022; originally announced May 2022.

  50. Beyond Distributional Hypothesis: Let Language Models Learn Meaning-Text Correspondence

    Authors: Myeongjun Jang, Frank Mtumbuka, Thomas Lukasiewicz

    Abstract: The logical negation property (LNP), which implies generating different predictions for semantically opposite inputs, is an important property that a trustworthy language model must satisfy. However, much recent evidence shows that large-size pre-trained language models (PLMs) do not satisfy this property. In this paper, we perform experiments using probing tasks to assess PLM's LNP understanding.…

    Submitted 8 May, 2022; originally announced May 2022.

    Comments: Accepted in the Findings of NAACL 2022