default search action

combined dblp search
author search
venue search
publication search

ask others

Bernardo Ávila Pires

Bernardo A. Pires

> Home > Persons

Person information

Refine list

refinements active!

zoomed in on ?? of ?? records

view refined list in

export refined list as

showing all ?? records

2020 – today

see FAQ

What is the meaning of the colors in the publication lists?

2024
[c14]
- view
  - electronic edition @ openreview.net (open access)
  - no references & citations available
- export record
  dblp key:
  - conf/icml/CalandrielloGMR24
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/CalandrielloGMR24
Daniele Calandriello, Zhaohan Daniel Guo, Rémi Munos, Mark Rowland, Yunhao Tang, Bernardo Ávila Pires, Pierre Harvey Richemond, Charline Le Lan, Michal Valko, Tianqi Liu, Rishabh Joshi, Zeyu Zheng, Bilal Piot:
Human Alignment of Large Language Models through Online Preference Optimisation. ICML 2024
[c13]
- view
  - electronic edition @ openreview.net (open access)
  - no references & citations available
- export record
  dblp key:
  - conf/icml/TangGZCMRRVPP24
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/TangGZCMRRVPP24
Yunhao Tang, Zhaohan Daniel Guo, Zeyu Zheng, Daniele Calandriello, Rémi Munos, Mark Rowland, Pierre Harvey Richemond, Michal Valko, Bernardo Ávila Pires, Bilal Piot:
Generalized Preference Optimization: A Unified Approach to Offline Alignment. ICML 2024
[i20]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2402-05749
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2402-05749
Yunhao Tang, Zhaohan Daniel Guo, Zeyu Zheng, Daniele Calandriello, Rémi Munos, Mark Rowland, Pierre Harvey Richemond, Michal Valko, Bernardo Ávila Pires, Bilal Piot:
Generalized Preference Optimization: A Unified Approach to Offline Alignment. CoRR abs/2402.05749 (2024)
[i19]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2402-05766
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2402-05766
Yunhao Tang, Mark Rowland, Rémi Munos, Bernardo Ávila Pires, Will Dabney:
Off-policy Distributional Q(λ): Distributional RL without Importance Sampling. CoRR abs/2402.05766 (2024)
[i18]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2403-08635
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2403-08635
Daniele Calandriello, Daniel Guo, Rémi Munos, Mark Rowland, Yunhao Tang, Bernardo Ávila Pires, Pierre Harvey Richemond, Charline Le Lan, Michal Valko, Tianqi Liu, Rishabh Joshi, Zeyu Zheng, Bilal Piot:
Human Alignment of Large Language Models through Online Preference Optimisation. CoRR abs/2403.08635 (2024)
[i17]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2405-08448
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2405-08448
Yunhao Tang, Zhaohan Daniel Guo, Zeyu Zheng, Daniele Calandriello, Yuan Cao, Eugene Tarassov, Rémi Munos, Bernardo Ávila Pires, Michal Valko, Yong Cheng, Will Dabney:
Understanding the performance gap between online and offline alignment algorithms. CoRR abs/2405.08448 (2024)
[i16]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2405-19107
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2405-19107
Pierre Harvey Richemond, Yunhao Tang, Daniel Guo, Daniele Calandriello, Mohammad Gheshlaghi Azar, Rafael Rafailov, Bernardo Ávila Pires, Eugene Tarassov, Lucas Spangher, Will Ellsworth, Aliaksei Severyn, Jonathan Mallinson, Lior Shani, Gil Shamir, Rishabh Joshi, Tianqi Liu, Rémi Munos, Bilal Piot:
Offline Regularised Reinforcement Learning for Large Language Models Alignment. CoRR abs/2405.19107 (2024)
[i15]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2406-02035
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2406-02035
Khimya Khetarpal, Zhaohan Daniel Guo, Bernardo Ávila Pires, Yunhao Tang, Clare Lyle, Mark Rowland, Nicolas Heess, Diana Borsa, Arthur Guez, Will Dabney:
A Unifying Framework for Action-Conditional Self-Predictive Reinforcement Learning. CoRR abs/2406.02035 (2024)
2023
[c12]
- view
  - electronic edition @ mlr.press (open access)
  - no references & citations available
- export record
  dblp key:
  - conf/icml/LyleZNPPD23
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/LyleZNPPD23
Clare Lyle, Zeyu Zheng, Evgenii Nikishin, Bernardo Ávila Pires, Razvan Pascanu, Will Dabney:
Understanding Plasticity in Neural Networks. ICML 2023: 23190-23211
[c11]
- view
  - electronic edition @ mlr.press (open access)
  - no references & citations available
- export record
  dblp key:
  - conf/icml/TangGRPCMRALL0T23
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/TangGRPCMRALL0T23
Yunhao Tang, Zhaohan Daniel Guo, Pierre Harvey Richemond, Bernardo Ávila Pires, Yash Chandak, Rémi Munos, Mark Rowland, Mohammad Gheshlaghi Azar, Charline Le Lan, Clare Lyle, András György, Shantanu Thakoor, Will Dabney, Bilal Piot, Daniele Calandriello, Michal Valko:
Understanding Self-Predictive Learning for Reinforcement Learning. ICML 2023: 33632-33656
[c10]
- view
  - electronic edition @ mlr.press (open access)
  - no references & citations available
- export record
  dblp key:
  - conf/icml/TangKRHMPV23
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/TangKRHMPV23
Yunhao Tang, Tadashi Kozuno, Mark Rowland, Anna Harutyunyan, Rémi Munos, Bernardo Ávila Pires, Michal Valko:
DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm. ICML 2023: 33657-33673
[i14]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2302-14451
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2302-14451
Bernardo Ávila Pires, Feryal M. P. Behbahani, Hubert Soyer, Kyriacos Nikiforou, Thomas Keck, Satinder Singh:
Hierarchical Reinforcement Learning in Complex 3D Environments. CoRR abs/2302.14451 (2023)
[i13]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2303-01486
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2303-01486
Clare Lyle, Zeyu Zheng, Evgenii Nikishin, Bernardo Ávila Pires, Razvan Pascanu, Will Dabney:
Understanding plasticity in neural networks. CoRR abs/2303.01486 (2023)
[i12]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2305-18501
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-18501
Yunhao Tang, Tadashi Kozuno, Mark Rowland, Anna Harutyunyan, Rémi Munos, Bernardo Ávila Pires, Michal Valko:
DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm. CoRR abs/2305.18501 (2023)
2022
[c9]
- view
  - electronic edition @ nips.cc (open access)
  - no references & citations available
- export record
  dblp key:
  - conf/nips/GuoTPPATSCGTVMA22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/GuoTPPATSCGTVMA22
Zhaohan Guo, Shantanu Thakoor, Miruna Pislar, Bernardo Ávila Pires, Florent Altché, Corentin Tallec, Alaa Saade, Daniele Calandriello, Jean-Bastien Grill, Yunhao Tang, Michal Valko, Rémi Munos, Mohammad Gheshlaghi Azar, Bilal Piot:
BYOL-Explore: Exploration by Bootstrapped Prediction. NeurIPS 2022
[c8]
- view
  - electronic edition @ nips.cc (open access)
  - no references & citations available
- export record
  dblp key:
  - conf/nips/TangMRPDB22
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/TangMRPDB22
Yunhao Tang, Rémi Munos, Mark Rowland, Bernardo Ávila Pires, Will Dabney, Marc G. Bellemare:
The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning. NeurIPS 2022
[i11]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2206-08332
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2206-08332
Zhaohan Daniel Guo, Shantanu Thakoor, Miruna Pislar, Bernardo Ávila Pires, Florent Altché, Corentin Tallec, Alaa Saade, Daniele Calandriello, Jean-Bastien Grill, Yunhao Tang, Michal Valko, Rémi Munos, Mohammad Gheshlaghi Azar, Bilal Piot:
BYOL-Explore: Exploration by Bootstrapped Prediction. CoRR abs/2206.08332 (2022)
[i10]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2207-07570
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2207-07570
Yunhao Tang, Mark Rowland, Rémi Munos, Bernardo Ávila Pires, Will Dabney, Marc G. Bellemare:
The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning. CoRR abs/2207.07570 (2022)
[i9]
- view
  - electronic edition via DOI (open access)
  - references & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2212-03319
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2212-03319
Yunhao Tang, Zhaohan Daniel Guo, Pierre Harvey Richemond, Bernardo Ávila Pires, Yash Chandak, Rémi Munos, Mark Rowland, Mohammad Gheshlaghi Azar, Charline Le Lan, Clare Lyle, András György, Shantanu Thakoor, Will Dabney, Bilal Piot, Daniele Calandriello, Michal Valko:
Understanding Self-Predictive Learning for Reinforcement Learning. CoRR abs/2212.03319 (2022)
2021
[i8]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/abs-2101-02055
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2101-02055
Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Alaa Saade, Shantanu Thakoor, Bilal Piot, Bernardo Ávila Pires, Michal Valko, Thomas Mesnard, Tor Lattimore, Rémi Munos:
Geometric Entropic Exploration. CoRR abs/2101.02055 (2021)
[i7]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/abs-2102-02274
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2102-02274
Pol Moreno, Edward Hughes, Kevin R. McKee, Bernardo Ávila Pires, Théophane Weber:
Neural Recursive Belief States in Multi-Agent Reinforcement Learning. CoRR abs/2102.02274 (2021)
2020
[c7]
- view
  - electronic edition @ mlr.press (open access)
  - no references & citations available
- export record
  dblp key:
  - conf/icml/GuoPPGAMA20
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/GuoPPGAMA20
Zhaohan Daniel Guo, Bernardo Ávila Pires, Bilal Piot, Jean-Bastien Grill, Florent Altché, Rémi Munos, Mohammad Gheshlaghi Azar:
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning. ICML 2020: 3875-3886
[c6]
- view
  - electronic edition @ neurips.cc (open access)
  - no references & citations available
- export record
  dblp key:
  - conf/nips/GrillSATRBDPGAP20
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/GrillSATRBDPGAP20
Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Ávila Pires, Zhaohan Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, Michal Valko:
Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning. NeurIPS 2020
[i6]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/abs-2004-14646
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2004-14646
Zhaohan Daniel Guo, Bernardo Ávila Pires, Bilal Piot, Jean-Bastien Grill, Florent Altché, Rémi Munos, Mohammad Gheshlaghi Azar:
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning. CoRR abs/2004.14646 (2020)
[i5]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/abs-2006-07733
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2006-07733
Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Ávila Pires, Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, Michal Valko:
Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning. CoRR abs/2006.07733 (2020)

2010 – 2019

see FAQ

What is the meaning of the colors in the publication lists?

2019
[i4]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/abs-1902-07685
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1902-07685
Mohammad Gheshlaghi Azar, Bilal Piot, Bernardo A. Pires, Jean-Bastien Grill, Florent Altché, Rémi Munos:
World Discovery Models. CoRR abs/1902.07685 (2019)
2018
[i3]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/abs-1811-06407
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1811-06407
Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Bernardo A. Pires, Toby Pohlen, Rémi Munos:
Neural Predictive Belief Representations. CoRR abs/1811.06407 (2018)
2016
[c5]
- view
  - electronic edition @ mlr.press (open access)
  - no references & citations available
- export record
  dblp key:
  - conf/colt/Pires16
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/colt/Pires16
Bernardo Ávila Pires:
Policy Error Bounds for Model-Based Reinforcement Learning with Factored Linear Models. COLT 2016: 121-151
[i2]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/PiresS16
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/PiresS16
Bernardo Ávila Pires, Csaba Szepesvári:
Policy Error Bounds for Model-Based Reinforcement Learning with Factored Linear Models. CoRR abs/1602.06346 (2016)
[i1]
- view
  - electronic edition @ arxiv.org (open access)
  - references & citations
- export record
  dblp key:
  - journals/corr/PiresS16a
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/PiresS16a
Bernardo Ávila Pires, Csaba Szepesvári:
Multiclass Classification Calibration Functions. CoRR abs/1609.06385 (2016)
2015
[c4]
- view
  - electronic edition @ aaai.org (archived)
  - no references & citations available
- export record
  dblp key:
  - conf/aaai/PiresS15
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/aaai/PiresS15
Bernardo Ávila Pires, Csaba Szepesvári:
Pathological Effects of Variance on Classification-Based Policy Iteration. AAAI Workshop: Learning for General Competency in Video Games 2015
2014
[c3]
- view
  authority control:
- export record
  dblp key:
  - conf/adprl/YaoSPZ14
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/adprl/YaoSPZ14
Hengshuai Yao, Csaba Szepesvári, Bernardo Ávila Pires, Xinhua Zhang:
Pseudo-MDPs and factored linear action models. ADPRL 2014: 1-9
2013
[c2]
- view
  - electronic edition @ mlr.press (open access)
  - no references & citations available
- export record
  dblp key:
  - conf/icml/PiresSG13
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/PiresSG13
Bernardo Ávila Pires, Csaba Szepesvári, Mohammad Ghavamzadeh:
Cost-sensitive Multiclass Classification Risk Bounds. ICML (3) 2013: 1391-1399
2012
[c1]
- view
  - electronic edition @ icml.cc (open access)
  - no references & citations available
- export record
  dblp key:
  - conf/icml/PiresS12
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/PiresS12
Bernardo Ávila Pires, Csaba Szepesvári:
Statistical linear estimation with penalized estimators: an application to reinforcement learning. ICML 2012

Coauthor Index

see FAQ

manage site settings

To protect your privacy, all features that rely on external API calls from your browser are turned off by default. You need to opt-in for them to become active. All settings here will be stored as cookies with your web browser. For more information see our F.A.Q.