Nov 28, 2023 · We show an adversary can extract gigabytes of training data from open-source language models like Pythia or GPT-Neo, semi-open models like LLaMA or Falcon ...
Nov 28, 2023 · We have just released a paper that allows us to extract several megabytes of ChatGPT's training data for about two hundred dollars. (Language models ...
Nov 29, 2023 · Prompt-based training data extraction techniques rely on the ability of LLMs to generate text that is similar to the text they were trained on. By carefully ...
Dec 3, 2023 · [D] Paper Analysis - Scalable Extraction of Training Data from (Production) Language Models (Video Walkthrough) ... Researchers were able to get giant amounts of ...
Aug 11, 2023 · We begin our paper by extracting training images from large, pre-trained, high-resolution diffusion models. 4.1 Defining Image Memorization. The literature on ...
Nov 29, 2023 · The point of the paper is that LLMs memorize an insane amount of training data and, with some massaging, can be made to output it verbatim.
Aug 29, 2023 · In this paper, the authors come up with and investigate the efficacy of a training data extraction attack on large language models. To mitigate the potential ...
Dec 27, 2023 · The researchers successfully demonstrated that substantial amounts of training data can be extracted from language models. They found that public, semi-private, ...
Nov 29, 2023 · The best part is it preserves the copyright notices from the training data. So we know that the model was obviously trained on copyrighted data the legal ...