PhD student in Computer Science
McGill University
Canada
- Mila
- Montreal, Canada
- https://saminyeasar.github.io/
- @YeasarArnob
Highlights
- Pro
Pinned
- unpaired_rlhf (Public, forked from sahandrez/unpaired_rlhf): Reinforcement Learning from Human Feedback (RLHF) with Unpaired Preferences. Python.
- Offline-Reinforcement-Learning-Algorithms (Public): PyTorch implementation of offline reinforcement learning algorithms.
- Off_Policy_Adversarial_Inverse_Reinforcement_Learning (Public): Implementation of off-policy adversarial inverse reinforcement learning.
- PyTorch-implementation-DICE-algorithms (Public): PyTorch implementation of DICE algorithms.