Simon Ramstedt
AI researcher. Previously a PhD student at Mila and McGill. I did my undergrad at TU Darmstadt and my Master's at the University of Montreal, and spent summers at Microsoft Research and ElementAI.

Publications

  • Reinforcement Learning with Random Delays Simon Ramstedt, Yann Bouteiller, G. Beltrame, Christopher Pal & Jonathan Binas ICLR 2021 [arXiv:2010.02966] A low-bias, low-variance value estimator for environments with random action and observation delays. The estimator is used and evaluated with our new Delay-Correcting Actor-Critic algorithm.
  • Real-Time Reinforcement Learning Simon Ramstedt & Christopher Pal NeurIPS 2019 [arXiv:1911.04448] A new framework for Reinforcement Learning in which states and actions evolve simultaneously. It acknowledges that action selection takes time. We introduce the Real-Time Actor-Critic algorithm.

Projects

  • Robin VLM 2023 [github.com/cerc-aai/robin] Robin is a software suite to train vision-language models. We released data, training code and weights for an open VLM based on Mistral-7B and SigLIP, implemented using PyTorch and DeepSpeed and trained on eight A100 GPUs on the HessianAI computing cluster.
  • Uniton 2021 [github.com/rmst/uniton][Demo Video] Uniton is an asynchronous RPC framework for the Unity game engine and Python, with the goal of instrumentalizing Unity and making it more useful for non-game applications.
  • RTRL 2019 [github.com/rmst/rtrl] Code accompanying our Real-Time Reinforcement Learning paper, with implementations of Real-Time Actor-Critic and Soft Actor-Critic in Python and PyTorch.
  • Avenue 2017-2019 [github.com/elementai/avenue] A fast, easy-to-use, high-fidelity car simulator based on the Unity game engine.
  • DDPG 2016 [github.com/rmst/ddpg] The first open reproduction of the Deep Deterministic Policy Gradient algorithm by Lillicrap et al. (2015), implemented both in MATLAB with manually coded gradient computation and in the (at the time) brand-new TensorFlow automatic differentiation framework.