
Showing 1–39 of 39 results for author: Schramowski, P

Searching in archive cs.
  1. arXiv:2407.03391  [pdf, other]

    cs.CR cs.AI cs.CL

    Soft Begging: Modular and Efficient Shielding of LLMs against Prompt Injection and Jailbreaking based on Prompt Tuning

    Authors: Simon Ostermann, Kevin Baum, Christoph Endres, Julia Masloh, Patrick Schramowski

    Abstract: Prompt injection (both direct and indirect) and jailbreaking are now recognized as significant issues for large language models (LLMs), particularly due to their potential for harm in application-integrated contexts. This extended abstract explores a novel approach to protecting LLMs from such attacks, termed "soft begging." This method involves training soft prompts to counteract the effects of c…

    Submitted 3 July, 2024; originally announced July 2024.

  2. arXiv:2406.19223  [pdf, other]

    cs.CL cs.AI cs.LG

    T-FREE: Tokenizer-Free Generative LLMs via Sparse Representations for Memory-Efficient Embeddings

    Authors: Björn Deiseroth, Manuel Brack, Patrick Schramowski, Kristian Kersting, Samuel Weinbach

    Abstract: Tokenizers are crucial for encoding information in Large Language Models, but their development has recently stagnated, and they contain inherent weaknesses. Major limitations include computational overhead, ineffective vocabulary use, and unnecessarily large embedding and head layers. Additionally, their performance is biased towards a reference corpus, leading to reduced effectiveness for underr…

    Submitted 27 June, 2024; originally announced June 2024.

  3. arXiv:2406.05113  [pdf, other]

    cs.CV cs.AI cs.LG

    LLavaGuard: VLM-based Safeguards for Vision Dataset Curation and Safety Assessment

    Authors: Lukas Helff, Felix Friedrich, Manuel Brack, Kristian Kersting, Patrick Schramowski

    Abstract: We introduce LlavaGuard, a family of VLM-based safeguard models, offering a versatile framework for evaluating the safety compliance of visual content. Specifically, we designed LlavaGuard for dataset annotation and generative model safeguarding. To this end, we collected and annotated a high-quality visual dataset incorporating a broad safety taxonomy, which we use to tune VLMs on context-aware s…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Project page at https://ml-research.github.io/human-centered-genai/projects/llavaguard/index.html

  4. arXiv:2404.12241  [pdf, other]

    cs.CL cs.AI

    Introducing v0.5 of the AI Safety Benchmark from MLCommons

    Authors: Bertie Vidgen, Adarsh Agrawal, Ahmed M. Ahmed, Victor Akinwande, Namir Al-Nuaimi, Najla Alfaraj, Elie Alhajjar, Lora Aroyo, Trupti Bavalatti, Max Bartolo, Borhane Blili-Hamelin, Kurt Bollacker, Rishi Bomassani, Marisa Ferrara Boston, Siméon Campos, Kal Chakra, Canyu Chen, Cody Coleman, Zacharie Delpierre Coudert, Leon Derczynski, Debojyoti Dutta, Ian Eisenberg, James Ezick, Heather Frase, Brian Fuller, et al. (75 additional authors not shown)

    Abstract: This paper introduces v0.5 of the AI Safety Benchmark, which has been created by the MLCommons AI Safety Working Group. The AI Safety Benchmark has been designed to assess the safety risks of AI systems that use chat-tuned language models. We introduce a principled approach to specifying and constructing the benchmark, which for v0.5 covers only a single use case (an adult chatting to a general-pu…

    Submitted 13 May, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

  5. arXiv:2404.08676  [pdf, other]

    cs.CL cs.CY cs.LG

    ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming

    Authors: Simone Tedeschi, Felix Friedrich, Patrick Schramowski, Kristian Kersting, Roberto Navigli, Huu Nguyen, Bo Li

    Abstract: When building Large Language Models (LLMs), it is paramount to bear safety in mind and protect them with guardrails. Indeed, LLMs should never generate content promoting or normalizing harmful, illegal, or unethical behavior that may contribute to harm to individuals or society. This principle applies to both normal and adversarial use. In response, we introduce ALERT, a large-scale benchmark to a…

    Submitted 24 June, 2024; v1 submitted 6 April, 2024; originally announced April 2024.

    Comments: 17 pages, preprint

    MSC Class: I.2

  6. arXiv:2402.14123  [pdf, other]

    cs.LG cs.AI cs.CV

    DeiSAM: Segment Anything with Deictic Prompting

    Authors: Hikaru Shindo, Manuel Brack, Gopika Sudhakaran, Devendra Singh Dhami, Patrick Schramowski, Kristian Kersting

    Abstract: Large-scale, pre-trained neural networks have demonstrated strong capabilities in various tasks, including zero-shot image segmentation. To identify concrete objects in complex scenes, humans instinctively rely on deictic descriptions in natural language, i.e., referring to something depending on the context such as "The object that is on the desk and behind the cup.". However, deep learning appro…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: Preprint

  7. arXiv:2401.16092  [pdf, other]

    cs.CL cs.CY cs.LG

    Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You

    Authors: Felix Friedrich, Katharina Hämmerl, Patrick Schramowski, Manuel Brack, Jindrich Libovicky, Kristian Kersting, Alexander Fraser

    Abstract: Text-to-image generation models have recently achieved astonishing results in image quality, flexibility, and text alignment, and are consequently employed in a fast-growing number of applications. Through improvements in multilingual abilities, a larger community now has access to this technology. However, our results show that multilingual models suffer from significant gender biases just as mon…

    Submitted 15 May, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

  8. arXiv:2311.16711  [pdf, other]

    cs.CV cs.AI cs.HC cs.LG

    LEDITS++: Limitless Image Editing using Text-to-Image Models

    Authors: Manuel Brack, Felix Friedrich, Katharina Kornmeier, Linoy Tsaban, Patrick Schramowski, Kristian Kersting, Apolinário Passos

    Abstract: Text-to-image diffusion models have recently received increasing interest for their astonishing ability to produce high-fidelity images from solely text inputs. Subsequent research efforts aim to exploit and apply their capabilities to real image editing. However, existing image-to-image methods are often inefficient, imprecise, and of limited versatility. They either require time-consuming finetu…

    Submitted 25 June, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). The project page is available at https://leditsplusplus-project.static.hf.space

  9. arXiv:2311.01544  [pdf, other]

    cs.CL cs.LG

    Divergent Token Metrics: Measuring degradation to prune away LLM components -- and optimize quantization

    Authors: Björn Deiseroth, Max Meuer, Nikolas Gritsch, Constantin Eichenberg, Patrick Schramowski, Matthias Aßenmacher, Kristian Kersting

    Abstract: Large Language Models (LLMs) have reshaped natural language processing with their impressive capabilities. However, their ever-increasing size has raised concerns about their effective deployment and the need for LLM compression. This study introduces the Divergent Token Metrics (DTMs), a novel approach to assessing compressed LLMs, addressing the limitations of traditional perplexity or accuracy…

    Submitted 3 April, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

  10. arXiv:2309.11575  [pdf, other]

    cs.CV cs.AI cs.LG

    Distilling Adversarial Prompts from Safety Benchmarks: Report for the Adversarial Nibbler Challenge

    Authors: Manuel Brack, Patrick Schramowski, Kristian Kersting

    Abstract: Text-conditioned image generation models have recently achieved astonishing image quality and alignment results. Consequently, they are employed in a fast-growing number of applications. Since they are highly data-driven, relying on billion-sized datasets randomly scraped from the web, they also produce unsafe content. As a contribution to the Adversarial Nibbler challenge, we distill a large set…

    Submitted 20 September, 2023; originally announced September 2023.

  11. arXiv:2305.18398  [pdf, other]

    cs.CV cs.AI cs.LG

    Mitigating Inappropriateness in Image Generation: Can there be Value in Reflecting the World's Ugliness?

    Authors: Manuel Brack, Felix Friedrich, Patrick Schramowski, Kristian Kersting

    Abstract: Text-conditioned image generation models have recently achieved astonishing results in image quality and text alignment and are consequently employed in a fast-growing number of applications. Since they are highly data-driven, relying on billion-sized datasets randomly scraped from the web, they also reproduce inappropriate human behavior. Specifically, we demonstrate inappropriate degeneration on…

    Submitted 28 May, 2023; originally announced May 2023.

  12. arXiv:2305.15296  [pdf, other]

    cs.CV cs.AI cs.LG

    MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation

    Authors: Marco Bellagente, Manuel Brack, Hannah Teufel, Felix Friedrich, Björn Deiseroth, Constantin Eichenberg, Andrew Dai, Robert Baldock, Souradeep Nanda, Koen Oostermeijer, Andres Felipe Cruz-Salinas, Patrick Schramowski, Kristian Kersting, Samuel Weinbach

    Abstract: The recent popularity of text-to-image diffusion models (DM) can largely be attributed to the intuitive interface they provide to users. The intended generation can be expressed in natural language, with the model producing faithful interpretations of text prompts. However, expressing complex or nuanced ideas in text alone can be difficult. To ease image generation, we propose MultiFusion that all…

    Submitted 20 December, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Proceedings of Advances in Neural Information Processing Systems: Annual Conference on Neural Information Processing Systems (NeurIPS)

  13. arXiv:2303.09289  [pdf, other]

    cs.LG cs.CR cs.CV

    Class Attribute Inference Attacks: Inferring Sensitive Class Information by Diffusion-Based Attribute Manipulations

    Authors: Lukas Struppek, Dominik Hintersdorf, Felix Friedrich, Manuel Brack, Patrick Schramowski, Kristian Kersting

    Abstract: Neural network-based image classifiers are powerful tools for computer vision tasks, but they inadvertently reveal sensitive attribute information about their classes, raising concerns about their privacy. To investigate this privacy leakage, we introduce the first Class Attribute Inference Attack (CAIA), which leverages recent advances in text-to-image synthesis to infer sensitive attributes of i…

    Submitted 13 June, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

    Comments: 46 pages, 37 figures, 5 tables

  14. arXiv:2302.10893  [pdf, other]

    cs.LG cs.AI cs.CV cs.CY cs.HC

    Fair Diffusion: Instructing Text-to-Image Generation Models on Fairness

    Authors: Felix Friedrich, Manuel Brack, Lukas Struppek, Dominik Hintersdorf, Patrick Schramowski, Sasha Luccioni, Kristian Kersting

    Abstract: Generative AI models have recently achieved astonishing results in quality and are consequently employed in a fast-growing number of applications. However, since they are highly data-driven, relying on billion-sized datasets randomly scraped from the internet, they also suffer from degenerated and biased human behavior, as we demonstrate. In fact, they may even reinforce such biases. To not only u…

    Submitted 17 July, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

  15. arXiv:2301.12247  [pdf, other]

    cs.CV cs.AI cs.LG

    SEGA: Instructing Text-to-Image Models using Semantic Guidance

    Authors: Manuel Brack, Felix Friedrich, Dominik Hintersdorf, Lukas Struppek, Patrick Schramowski, Kristian Kersting

    Abstract: Text-to-image diffusion models have recently received a lot of interest for their astonishing ability to produce high-fidelity images from text only. However, achieving one-shot generation that aligns with the user's intent is nearly impossible, yet small changes to the input prompt often result in very different images. This leaves the user with little semantic control. To put the user in control…

    Submitted 2 November, 2023; v1 submitted 28 January, 2023; originally announced January 2023.

    Comments: arXiv admin note: text overlap with arXiv:2212.06013. Proceedings of the Advances in Neural Information Processing Systems: Annual Conference on Neural Information Processing Systems (NeurIPS)

  16. arXiv:2301.08110  [pdf, other]

    cs.LG cs.AI

    AtMan: Understanding Transformer Predictions Through Memory Efficient Attention Manipulation

    Authors: Björn Deiseroth, Mayukh Deb, Samuel Weinbach, Manuel Brack, Patrick Schramowski, Kristian Kersting

    Abstract: Generative transformer models have become increasingly complex, with large numbers of parameters and the ability to process multiple input modalities. Current methods for explaining their predictions are resource-intensive. Most crucially, they require prohibitively large amounts of extra memory, since they rely on backpropagation which allocates almost twice as much GPU memory as the forward pass…

    Submitted 5 November, 2023; v1 submitted 19 January, 2023; originally announced January 2023.

  17. arXiv:2212.06013  [pdf, other]

    cs.CV cs.AI cs.LG

    The Stable Artist: Steering Semantics in Diffusion Latent Space

    Authors: Manuel Brack, Patrick Schramowski, Felix Friedrich, Dominik Hintersdorf, Kristian Kersting

    Abstract: Large, text-conditioned generative diffusion models have recently gained a lot of attention for their impressive performance in generating high-fidelity images from text alone. However, achieving high-quality results is almost unfeasible in a one-shot fashion. On the contrary, text-guided image generation involves the user making many slight changes to inputs in order to iteratively carve out the…

    Submitted 31 May, 2023; v1 submitted 12 December, 2022; originally announced December 2022.

    Comments: This is a report of preliminary results. A full version of the paper is available at: arXiv:2301.12247

  18. arXiv:2211.07733  [pdf, other]

    cs.CL

    Speaking Multiple Languages Affects the Moral Bias of Language Models

    Authors: Katharina Hämmerl, Björn Deiseroth, Patrick Schramowski, Jindřich Libovický, Constantin A. Rothkopf, Alexander Fraser, Kristian Kersting

    Abstract: Pre-trained multilingual language models (PMLMs) are commonly used when dealing with data from multiple languages and cross-lingual transfer. However, PMLMs are trained on varying amounts of data for each language. In practice this means their performance is often much better on English than many other languages. We explore to what extent this also applies to moral norms. Do the models capture mor…

    Submitted 1 June, 2023; v1 submitted 14 November, 2022; originally announced November 2022.

    Comments: To appear in ACL Findings 2023

  19. arXiv:2211.05105  [pdf, other]

    cs.CV cs.AI cs.LG

    Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models

    Authors: Patrick Schramowski, Manuel Brack, Björn Deiseroth, Kristian Kersting

    Abstract: Text-conditioned image generation models have recently achieved astonishing results in image quality and text alignment and are consequently employed in a fast-growing number of applications. Since they are highly data-driven, relying on billion-sized datasets randomly scraped from the internet, they also suffer, as we demonstrate, from degenerated and biased human behavior. In turn, they may even…

    Submitted 26 April, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

    Comments: Proceedings of the 22nd IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023

  20. arXiv:2210.10332  [pdf, other]

    cs.CL cs.AI cs.HC

    Revision Transformers: Instructing Language Models to Change their Values

    Authors: Felix Friedrich, Wolfgang Stammer, Patrick Schramowski, Kristian Kersting

    Abstract: Current transformer language models (LM) are large-scale models with billions of parameters. They have been shown to provide high performances on a variety of tasks but are also prone to shortcut learning and bias. Addressing such incorrect model behavior via parameter adjustments is very costly. This is particularly problematic for updating dynamic concepts, such as moral values, which vary cultu…

    Submitted 25 July, 2023; v1 submitted 19 October, 2022; originally announced October 2022.

  21. arXiv:2210.08402  [pdf, other]

    cs.CV cs.AI cs.LG

    LAION-5B: An open large-scale dataset for training next generation image-text models

    Authors: Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wightman, Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, Patrick Schramowski, Srivatsa Kundurthy, Katherine Crowson, Ludwig Schmidt, Robert Kaczmarczyk, Jenia Jitsev

    Abstract: Groundbreaking language-vision architectures like CLIP and DALL-E proved the utility of training on large amounts of noisy image-text data, without relying on expensive accurate labels used in standard vision unimodal supervised learning. The resulting models showed capabilities of strong text-guided image generation and transfer to downstream tasks, while performing remarkably at zero-shot classi…

    Submitted 15 October, 2022; originally announced October 2022.

    Comments: 36th Conference on Neural Information Processing Systems (NeurIPS 2022), Track on Datasets and Benchmarks. OpenReview: https://openreview.net/forum?id=M3Y74vmsMcY

  22. arXiv:2209.08891  [pdf, other]

    cs.CV cs.AI cs.CY cs.LG

    Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis

    Authors: Lukas Struppek, Dominik Hintersdorf, Felix Friedrich, Manuel Brack, Patrick Schramowski, Kristian Kersting

    Abstract: Models for text-to-image synthesis, such as DALL-E 2 and Stable Diffusion, have recently drawn a lot of interest from academia and the general public. These models are capable of producing high-quality images that depict a variety of concepts and styles when conditioned on textual descriptions. However, these models adopt cultural characteristics associated with specific Unicode scripts from their…

    Submitted 9 January, 2024; v1 submitted 19 September, 2022; originally announced September 2022.

    Comments: Published in the Journal of Artificial Intelligence Research (JAIR)

    Journal ref: Journal of Artificial Intelligence Research (JAIR), Vol. 78 (2023)

  23. arXiv:2209.07341  [pdf, other]

    cs.LG cs.CR cs.CV

    Does CLIP Know My Face?

    Authors: Dominik Hintersdorf, Lukas Struppek, Manuel Brack, Felix Friedrich, Patrick Schramowski, Kristian Kersting

    Abstract: With the rise of deep learning in various applications, privacy concerns around the protection of training data have become a critical area of research. Whereas prior studies have focused on privacy risks in single-modal models, we introduce a novel method to assess privacy for multi-modal models, specifically vision-language models like CLIP. The proposed Identity Inference Attack (IDIA) reveals w…

    Submitted 30 May, 2023; v1 submitted 15 September, 2022; originally announced September 2022.

    Comments: 15 pages, 6 figures

  24. arXiv:2208.13518  [pdf, other]

    cs.AI cs.CL cs.CV cs.LO cs.SC

    LogicRank: Logic Induced Reranking for Generative Text-to-Image Systems

    Authors: Björn Deiseroth, Patrick Schramowski, Hikaru Shindo, Devendra Singh Dhami, Kristian Kersting

    Abstract: Text-to-image models have recently achieved remarkable success with seemingly accurate samples in photo-realistic quality. However as state-of-the-art language models still struggle evaluating precise statements consistently, so do language model based image generation processes. In this work we showcase problems of state-of-the-art text-to-image models like DALL-E with generating accurate samples…

    Submitted 29 August, 2022; originally announced August 2022.

  25. arXiv:2208.08241  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV cs.HC

    ILLUME: Rationalizing Vision-Language Models through Human Interactions

    Authors: Manuel Brack, Patrick Schramowski, Björn Deiseroth, Kristian Kersting

    Abstract: Bootstrapping from pre-trained language models has been proven to be an efficient approach for building vision-language models (VLM) for tasks such as image captioning or visual question answering. However, outputs of these models rarely align with user's rationales for specific answers. In order to improve this alignment and reinforce commonsense reasons, we propose a tuning paradigm based on hum…

    Submitted 31 May, 2023; v1 submitted 17 August, 2022; originally announced August 2022.

    Comments: Proceedings of the 40th International Conference on Machine Learning (ICML), 2023

  26. arXiv:2203.09904  [pdf, ps, other]

    cs.CL

    Do Multilingual Language Models Capture Differing Moral Norms?

    Authors: Katharina Hämmerl, Björn Deiseroth, Patrick Schramowski, Jindřich Libovický, Alexander Fraser, Kristian Kersting

    Abstract: Massively multilingual sentence representations are trained on large corpora of uncurated data, with a very imbalanced proportion of languages included in the training. This may cause the models to grasp cultural values including moral judgments from the high-resource languages and impose them on the low-resource languages. The lack of data in certain languages can also lead to developing random a…

    Submitted 18 March, 2022; originally announced March 2022.

  27. arXiv:2203.03668  [pdf, other]

    cs.LG cs.AI cs.HC

    A Typology for Exploring the Mitigation of Shortcut Behavior

    Authors: Felix Friedrich, Wolfgang Stammer, Patrick Schramowski, Kristian Kersting

    Abstract: As machine learning models become increasingly larger, trained weakly supervised on large, possibly uncurated data sets, it becomes increasingly important to establish mechanisms for inspecting, interacting, and revising models to mitigate learning shortcuts and guarantee their learned knowledge is aligned with human knowledge. The recently proposed XIL framework was developed for this purpose, an…

    Submitted 14 March, 2024; v1 submitted 4 March, 2022; originally announced March 2022.

  28. arXiv:2202.06675  [pdf, other]

    cs.AI cs.CV cs.CY

    Can Machines Help Us Answering Question 16 in Datasheets, and In Turn Reflecting on Inappropriate Content?

    Authors: Patrick Schramowski, Christopher Tauchmann, Kristian Kersting

    Abstract: Large datasets underlying much of current machine learning raise serious issues concerning inappropriate content such as offensive, insulting, threatening, or might otherwise cause anxiety. This calls for increased dataset documentation, e.g., using datasheets. They, among other topics, encourage to reflect on the composition of the datasets. So far, this documentation, however, is done manually a…

    Submitted 14 July, 2022; v1 submitted 14 February, 2022; originally announced February 2022.

    Comments: arXiv admin note: text overlap with arXiv:2110.04222

  29. arXiv:2112.02290  [pdf, other]

    cs.CV cs.LG

    Interactive Disentanglement: Learning Concepts by Interacting with their Prototype Representations

    Authors: Wolfgang Stammer, Marius Memmel, Patrick Schramowski, Kristian Kersting

    Abstract: Learning visual concepts from raw images without strong supervision is a challenging task. In this work, we show the advantages of prototype representations for understanding and revising the latent space of neural concept learners. For this purpose, we introduce interactive Concept Swapping Networks (iCSNs), a novel framework for learning concept-grounded representations via weak supervision and…

    Submitted 29 March, 2022; v1 submitted 4 December, 2021; originally announced December 2021.

    Comments: To be published in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022

  30. arXiv:2110.04222  [pdf, other]

    cs.CV cs.AI

    Inferring Offensiveness In Images From Natural Language Supervision

    Authors: Patrick Schramowski, Kristian Kersting

    Abstract: Probing or fine-tuning (large-scale) pre-trained models results in state-of-the-art performance for many NLP tasks and, more recently, even for computer vision tasks when combined with image data. Unfortunately, these approaches also entail severe risks. In particular, large image datasets automatically scraped from the web may contain derogatory terms as categories and offensive images, and may a…

    Submitted 8 October, 2021; originally announced October 2021.

  31. arXiv:2110.02058  [pdf, other]

    cs.CL cs.AI cs.LG

    Interactively Providing Explanations for Transformer Language Models

    Authors: Felix Friedrich, Patrick Schramowski, Christopher Tauchmann, Kristian Kersting

    Abstract: Transformer language models are state of the art in a multitude of NLP tasks. Despite these successes, their opaqueness remains problematic. Recent methods aiming to provide interpretability and explainability to black-box models primarily focus on post-hoc explanations of (sometimes spurious) input-output correlations. Instead, we emphasize using prototype networks directly incorporated into the…

    Submitted 11 March, 2022; v1 submitted 2 September, 2021; originally announced October 2021.

  32. arXiv:2103.11790  [pdf, other]

    cs.CL cs.CY

    Large Pre-trained Language Models Contain Human-like Biases of What is Right and Wrong to Do

    Authors: Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf, Kristian Kersting

    Abstract: Artificial writing is permeating our lives due to recent advances in large-scale, transformer-based language models (LMs) such as BERT, its variants, GPT-2/3, and others. Using them as pre-trained models and fine-tuning them for specific tasks, researchers have extended state of the art for many NLP tasks and shown that they capture not only linguistic knowledge but also retain general knowledge i…

    Submitted 14 February, 2022; v1 submitted 8 March, 2021; originally announced March 2021.

  33. arXiv:2102.09407  [pdf, other]

    cs.LG

    Adaptive Rational Activations to Boost Deep Reinforcement Learning

    Authors: Quentin Delfosse, Patrick Schramowski, Martin Mundt, Alejandro Molina, Kristian Kersting

    Abstract: Latest insights from biology show that intelligence not only emerges from the connections between neurons but that individual neurons shoulder more computational responsibility than previously anticipated. This perspective should be critical in the context of constantly changing distinct reinforcement learning environments, yet current approaches still primarily employ static activation functions.…

    Submitted 16 March, 2024; v1 submitted 18 February, 2021; originally announced February 2021.

    Comments: Main paper: 9 pages, References: 4 pages, Appendix: 11 pages. Main paper: 5 figures, Appendix: 6 figures, 6 tables. Rational Activation Functions repository: https://github.com/k4ntz/activation-functions Rational Reinforcement Learning: https://github.com/ml-research/rational_rl

  34. arXiv:2011.12854  [pdf, other]

    cs.LG cs.AI

    Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting with their Explanations

    Authors: Wolfgang Stammer, Patrick Schramowski, Kristian Kersting

    Abstract: Most explanation methods in deep learning map importance estimates for a model's prediction back to the original input space. These "visual" explanations are often insufficient, as the model's actual concept remains elusive. Moreover, without insights into the model's semantic concept, it is difficult -- if not impossible -- to intervene on the model's behavior via its explanations, called Explana…

    Submitted 21 June, 2021; v1 submitted 25 November, 2020; originally announced November 2020.

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, p. 3619-3629

  35. Alfie: An Interactive Robot with a Moral Compass

    Authors: Cigdem Turan, Patrick Schramowski, Constantin Rothkopf, Kristian Kersting

    Abstract: This work introduces Alfie, an interactive robot that is capable of answering moral (deontological) questions of a user. The interaction of Alfie is designed in a way in which the user can offer an alternative answer when the user disagrees with the given answer so that Alfie can learn from its interactions. Alfie's answers are based on a sentence embedding model that uses state-of-the-art languag…

    Submitted 11 September, 2020; originally announced September 2020.

  36. arXiv:2001.05371  [pdf, other]

    cs.LG cs.AI stat.ML

    Making deep neural networks right for the right scientific reasons by interacting with their explanations

    Authors: Patrick Schramowski, Wolfgang Stammer, Stefano Teso, Anna Brugger, Xiaoting Shao, Hans-Georg Luigs, Anne-Katrin Mahlein, Kristian Kersting

    Abstract: Deep neural networks have shown excellent performances in many real-world applications. Unfortunately, they may show "Clever Hans"-like behavior -- making use of confounding factors within datasets -- to achieve high performance. In this work, we introduce the novel learning setting of "explanatory interactive learning" (XIL) and illustrate its benefits on a plant phenotyping research task. XIL ad…

    Submitted 5 March, 2024; v1 submitted 15 January, 2020; originally announced January 2020.

    Comments: arXiv admin note: text overlap with arXiv:1805.08578

  37. arXiv:1912.05238  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    BERT has a Moral Compass: Improvements of ethical and moral values of machines

    Authors: Patrick Schramowski, Cigdem Turan, Sophie Jentzsch, Constantin Rothkopf, Kristian Kersting

    Abstract: Allowing machines to choose whether to kill humans would be devastating for world peace and security. But how do we equip machines with the ability to learn ethical or even moral choices? Jentzsch et al.(2019) showed that applying machine learning to human texts can extract deontological ethical reasoning about "right" and "wrong" conduct by calculating a moral bias score on a sentence level using…

    Submitted 11 December, 2019; originally announced December 2019.

  38. arXiv:1907.06732  [pdf, other]

    cs.LG cs.NE

    Padé Activation Units: End-to-end Learning of Flexible Activation Functions in Deep Networks

    Authors: Alejandro Molina, Patrick Schramowski, Kristian Kersting

    Abstract: The performance of deep network learning strongly depends on the choice of the non-linear activation function associated with each neuron. However, deciding on the best activation is non-trivial, and the choice depends on the architecture, hyper-parameters, and even on the dataset. Typically these activations are fixed by hand before training. Here, we demonstrate how to eliminate the reliance on…

    Submitted 4 February, 2020; v1 submitted 15 July, 2019; originally announced July 2019.

    Comments: 17 Pages, 8 Figures

  39. arXiv:1803.04300  [pdf, other]

    cs.LG stat.ML

    Neural Conditional Gradients

    Authors: Patrick Schramowski, Christian Bauckhage, Kristian Kersting

    Abstract: The move from hand-designed to learned optimizers in machine learning has been quite successful for gradient-based and -free optimizers. When facing a constrained problem, however, maintaining feasibility typically requires a projection step, which might be computationally expensive and not differentiable. We show how the design of projection-free convex optimization algorithms can be cast as a le…

    Submitted 30 July, 2018; v1 submitted 12 March, 2018; originally announced March 2018.

    Comments: arXiv admin note: text overlap with arXiv:1610.05120 by other authors