Pinned
-
async_rlhf (Public)
Code and Configs for Asynchronous RLHF: Faster and More Efficient RL for Language Models
Python 12
-
elastic-reset (Public)
Code and Experiments for "Language Model Alignment with Elastic Reset" (NeurIPS 2023)
Python 5
-
vwxyzjn/summarize_from_feedback_details (Public)
-
emergent-compete (Public)
Code for Emergent Communication under Competition (AAMAS 2021)
-
huggingface/trl (Public)
Train transformer language models with reinforcement learning.
-
lecture-notes (Public)
LaTeX lecture notes for CS/ML courses at the University of Waterloo and Université de Montréal