Peter Hase
2020 – today
- 2024
- [b1] Peter Hase: Interpretable and Controllable Language Models. University of North Carolina, Chapel Hill, USA, 2024
- [j2] Prateek Yadav, Peter Hase, Mohit Bansal: INSPIRE: Incorporating Diverse Feature Preferences in Recourse. Trans. Mach. Learn. Res. 2024 (2024)
- [c16] Peter Hase, Mohit Bansal, Peter Clark, Sarah Wiegreffe: The Unreasonable Effectiveness of Easy Training Data for Hard Tasks. ACL (1) 2024: 7002-7024
- [c15] Vaidehi Patil, Peter Hase, Mohit Bansal: Can Sensitive Information Be Deleted From LLMs? Objectives for Defending Against Extraction Attacks. ICLR 2024
- [i26] Peter Hase, Mohit Bansal, Peter Clark, Sarah Wiegreffe: The Unreasonable Effectiveness of Easy Training Data for Hard Tasks. CoRR abs/2401.06751 (2024)
- [i25] Sijia Liu, Yuanshun Yao, Jinghan Jia, Stephen Casper, Nathalie Baracaldo, Peter Hase, Xiaojun Xu, Yuguang Yao, Hang Li, Kush R. Varshney, Mohit Bansal, Sanmi Koyejo, Yang Liu: Rethinking Machine Unlearning for Large Language Models. CoRR abs/2402.08787 (2024)
- [i24] Usman Anwar, Abulhair Saparov, Javier Rando, Daniel Paleka, Miles Turpin, Peter Hase, Ekdeep Singh Lubana, Erik Jenner, Stephen Casper, Oliver Sourbut, Benjamin L. Edelman, Zhaowei Zhang, Mario Günther, Anton Korinek, José Hernández-Orallo, Lewis Hammond, Eric J. Bigelow, Alexander Pan, Lauro Langosco, Tomasz Korbak, Heidi Zhang, Ruiqi Zhong, Seán Ó hÉigeartaigh, Gabriel Recchia, Giulio Corsi, Alan Chan, Markus Anderljung, Lilian Edwards, Yoshua Bengio, Danqi Chen, Samuel Albanie, Tegan Maharaj, Jakob N. Foerster, Florian Tramèr, He He, Atoosa Kasirzadeh, Yejin Choi, David Krueger: Foundational Challenges in Assuring Alignment and Safety of Large Language Models. CoRR abs/2404.09932 (2024)
- [i23] Elias Stengel-Eskin, Peter Hase, Mohit Bansal: LACIE: Listener-Aware Finetuning for Confidence Calibration in Large Language Models. CoRR abs/2405.21028 (2024)
- [i22] Thomas Hofweber, Peter Hase, Elias Stengel-Eskin, Mohit Bansal: Are language models rational? The case of coherence norms and belief revision. CoRR abs/2406.03442 (2024)
- [i21] Peter Hase, Thomas Hofweber, Xiang Zhou, Elias Stengel-Eskin, Mohit Bansal: Fundamental Problems With Model Editing: How Should Rational Belief Revision Work in LLMs? CoRR abs/2406.19354 (2024)
- [i20] Swarnadeep Saha, Archiki Prasad, Justin Chih-Yao Chen, Peter Hase, Elias Stengel-Eskin, Mohit Bansal: System-1.x: Learning to Balance Fast and Slow Planning with Language Models. CoRR abs/2407.14414 (2024)
- [i19] Elias Stengel-Eskin, Peter Hase, Mohit Bansal: Teaching Models to Balance Resisting and Accepting Persuasion. CoRR abs/2410.14596 (2024)
- 2023
- [j1] Stephen Casper, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jérémy Scheurer, Javier Rando, Rachel Freedman, Tomasz Korbak, David Lindner, Pedro Freire, Tony Tong Wang, Samuel Marks, Charbel-Raphaël Ségerie, Micah Carroll, Andi Peng, Phillip J. K. Christoffersen, Mehul Damani, Stewart Slocum, Usman Anwar, Anand Siththaranjan, Max Nadeau, Eric J. Michaud, Jacob Pfau, Dmitrii Krasheninnikov, Xin Chen, Lauro Langosco, Peter Hase, Erdem Biyik, Anca D. Dragan, David Krueger, Dorsa Sadigh, Dylan Hadfield-Menell: Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback. Trans. Mach. Learn. Res. 2023 (2023)
- [c14] Peter Hase, Mona T. Diab, Asli Celikyilmaz, Xian Li, Zornitsa Kozareva, Veselin Stoyanov, Mohit Bansal, Srinivasan Iyer: Methods for Measuring, Updating, and Visualizing Factual Beliefs in Language Models. EACL 2023: 2706-2723
- [c13] Archiki Prasad, Peter Hase, Xiang Zhou, Mohit Bansal: GrIPS: Gradient-free, Edit-based Instruction Search for Prompting Large Language Models. EACL 2023: 3827-3846
- [c12] Swarnadeep Saha, Shiyue Zhang, Peter Hase, Mohit Bansal: Summarization Programs: Interpretable Abstractive Summarization with Neural Modular Trees. ICLR 2023
- [c11] Peter Hase, Mohit Bansal, Been Kim, Asma Ghandeharioun: Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models. NeurIPS 2023
- [c10] Swarnadeep Saha, Peter Hase, Mohit Bansal: Can Language Models Teach? Teacher Explanations Improve Student Performance via Personalization. NeurIPS 2023
- [c9] Zhuofan Ying, Peter Hase, Mohit Bansal: Adaptive Contextual Perception: How To Generalize To New Backgrounds and Ambiguous Objects. NeurIPS 2023
- [i18] Peter Hase, Mohit Bansal, Been Kim, Asma Ghandeharioun: Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models. CoRR abs/2301.04213 (2023)
- [i17] Zhuofan Ying, Peter Hase, Mohit Bansal: Adaptive Contextual Perception: How to Generalize to New Backgrounds and Ambiguous Objects. CoRR abs/2306.05963 (2023)
- [i16] Swarnadeep Saha, Peter Hase, Mohit Bansal: Can Language Models Teach Weaker Agents? Teacher Explanations Improve Students via Theory of Mind. CoRR abs/2306.09299 (2023)
- [i15] Stephen Casper, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jérémy Scheurer, Javier Rando, Rachel Freedman, Tomasz Korbak, David Lindner, Pedro Freire, Tony Tong Wang, Samuel Marks, Charbel-Raphaël Ségerie, Micah Carroll, Andi Peng, Phillip J. K. Christoffersen, Mehul Damani, Stewart Slocum, Usman Anwar, Anand Siththaranjan, Max Nadeau, Eric J. Michaud, Jacob Pfau, Dmitrii Krasheninnikov, Xin Chen, Lauro Langosco, Peter Hase, Erdem Biyik, Anca D. Dragan, David Krueger, Dorsa Sadigh, Dylan Hadfield-Menell: Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback. CoRR abs/2307.15217 (2023)
- [i14] Vaidehi Patil, Peter Hase, Mohit Bansal: Can Sensitive Information Be Deleted From LLMs? Objectives for Defending Against Extraction Attacks. CoRR abs/2309.17410 (2023)
- 2022
- [c8] Swarnadeep Saha, Peter Hase, Nazneen Rajani, Mohit Bansal: Are Hard Examples also Harder to Explain? A Study with Human and Model-Generated Explanations. EMNLP 2022: 2121-2131
- [c7] Zhuofan Ying, Peter Hase, Mohit Bansal: VisFIS: Visual Feature Importance Supervision with Right-for-the-Right-Reason Objectives. NeurIPS 2022
- [i13] Archiki Prasad, Peter Hase, Xiang Zhou, Mohit Bansal: GrIPS: Gradient-free, Edit-based Instruction Search for Prompting Large Language Models. CoRR abs/2203.07281 (2022)
- [i12] Zhuofan Ying, Peter Hase, Mohit Bansal: VisFIS: Visual Feature Importance Supervision with Right-for-the-Right-Reason Objectives. CoRR abs/2206.11212 (2022)
- [i11] Swarnadeep Saha, Shiyue Zhang, Peter Hase, Mohit Bansal: Summarization Programs: Interpretable Abstractive Summarization with Neural Modular Trees. CoRR abs/2209.10492 (2022)
- [i10] Swarnadeep Saha, Peter Hase, Nazneen Rajani, Mohit Bansal: Are Hard Examples also Harder to Explain? A Study with Human and Model-Generated Explanations. CoRR abs/2211.07517 (2022)
- 2021
- [c6] Han Guo, Nazneen Rajani, Peter Hase, Mohit Bansal, Caiming Xiong: FastIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging. EMNLP (1) 2021: 10333-10350
- [c5] Peter Hase, Harry Xie, Mohit Bansal: The Out-of-Distribution Problem in Explainability and Search Methods for Feature Importance Explanations. NeurIPS 2021: 3650-3666
- [i9] Peter Hase, Mohit Bansal: When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data. CoRR abs/2102.02201 (2021)
- [i8] Peter Hase, Harry Xie, Mohit Bansal: Search Methods for Sufficient, Socially-Aligned Feature Importance Explanations with In-Distribution Counterfactuals. CoRR abs/2106.00786 (2021)
- [i7] Prateek Yadav, Peter Hase, Mohit Bansal: Low-Cost Algorithmic Recourse for Users With Uncertain Cost Functions. CoRR abs/2111.01235 (2021)
- [i6] Peter Hase, Mona T. Diab, Asli Celikyilmaz, Xian Li, Zornitsa Kozareva, Veselin Stoyanov, Mohit Bansal, Srinivasan Iyer: Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs. CoRR abs/2111.13654 (2021)
- 2020
- [c4] Peter Hase, Mohit Bansal: Evaluating Explainable AI: Which Algorithmic Explanations Help Users Predict Model Behavior? ACL 2020: 5540-5552
- [c3] Peter Hase, Shiyue Zhang, Harry Xie, Mohit Bansal: Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language? EMNLP (Findings) 2020: 4351-4367
- [i5] Peter Hase, Mohit Bansal: Evaluating Explainable AI: Which Algorithmic Explanations Help Users Predict Model Behavior? CoRR abs/2005.01831 (2020)
- [i4] Peter Hase, Shiyue Zhang, Harry Xie, Mohit Bansal: Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language? CoRR abs/2010.04119 (2020)
- [i3] Han Guo, Nazneen Fatema Rajani, Peter Hase, Mohit Bansal, Caiming Xiong: FastIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging. CoRR abs/2012.15781 (2020)
2010 – 2019
- 2019
- [c2] Peter Hase, Chaofan Chen, Oscar Li, Cynthia Rudin: Interpretable Image Recognition with Hierarchical Prototypes. HCOMP 2019: 32-40
- [i2] Peter Hase, Chaofan Chen, Oscar Li, Cynthia Rudin: Interpretable Image Recognition with Hierarchical Prototypes. CoRR abs/1906.10651 (2019)
- 2018
- [i1] John Benhart, Tianlin Duan, Peter Hase, Liuyi Zhu, Cynthia Rudin: Shall I Compare Thee to a Machine-Written Sonnet? An Approach to Algorithmic Sonnet Generation. CoRR abs/1811.05067 (2018)
1990 – 1999
- 1997
- [c1] Holger Husemann, Jörg Petersen, Christian Kanty, Hans-Dieter Kochs, Peter Hase: An User Adaptive Navigation Metaphor to Connect and Rate the Coherence of Terms and Complex Objects. Hypertext 1997: 214-215