Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to main content

Showing 1–17 of 17 results for author: Antonoglou, I

.
  1. arXiv:2403.05530  [pdf, other

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love , et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February… ▽ More

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  2. arXiv:2312.11805  [pdf, other

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee , et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr… ▽ More

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  3. arXiv:2106.04615  [pdf, other

    cs.LG cs.AI stat.ML

    Vector Quantized Models for Planning

    Authors: Sherjil Ozair, Yazhe Li, Ali Razavi, Ioannis Antonoglou, Aäron van den Oord, Oriol Vinyals

    Abstract: Recent developments in the field of model-based RL have proven successful in a range of environments, especially ones where planning is essential. However, such successes have been limited to deterministic fully-observed environments. We present a new approach that handles stochastic and partially-observable environments. Our key insight is to use discrete autoencoders to capture the multiple poss… ▽ More

    Submitted 10 June, 2021; v1 submitted 8 June, 2021; originally announced June 2021.

    Comments: ICML 2021

  4. arXiv:2104.06303  [pdf, other

    cs.LG

    Learning and Planning in Complex Action Spaces

    Authors: Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Mohammadamin Barekatain, Simon Schmitt, David Silver

    Abstract: Many important real-world problems have action spaces that are high-dimensional, continuous or both, making full enumeration of all possible actions infeasible. Instead, only small subsets of actions can be sampled for the purpose of policy evaluation and improvement. In this paper, we propose a general framework to reason in a principled way about policy evaluation and improvement over such sampl… ▽ More

    Submitted 13 April, 2021; originally announced April 2021.

  5. arXiv:2104.06294  [pdf, other

    cs.LG

    Online and Offline Reinforcement Learning by Planning with a Learned Model

    Authors: Julian Schrittwieser, Thomas Hubert, Amol Mandhane, Mohammadamin Barekatain, Ioannis Antonoglou, David Silver

    Abstract: Learning efficiently from small amounts of data has long been the focus of model-based reinforcement learning, both for the online case when interacting with the environment and the offline case when learning from a fixed dataset. However, to date no single unified algorithm could demonstrate state-of-the-art results in both settings. In this work, we describe the Reanalyse algorithm which uses mo… ▽ More

    Submitted 13 April, 2021; originally announced April 2021.

  6. arXiv:2104.05336  [pdf, other

    cs.CL cs.LG

    Machine Translation Decoding beyond Beam Search

    Authors: Rémi Leblond, Jean-Baptiste Alayrac, Laurent Sifre, Miruna Pislar, Jean-Baptiste Lespiau, Ioannis Antonoglou, Karen Simonyan, Oriol Vinyals

    Abstract: Beam search is the go-to method for decoding auto-regressive machine translation models. While it yields consistent improvements in terms of BLEU, it is only concerned with finding outputs with high model likelihood, and is thus agnostic to whatever end metric or score practitioners care about. Our aim is to establish whether beam search can be replaced by a more powerful metric-driven search tech… ▽ More

    Submitted 12 April, 2021; originally announced April 2021.

    Comments: 23 pages

  7. arXiv:2007.12509  [pdf, other

    cs.LG stat.ML

    Monte-Carlo Tree Search as Regularized Policy Optimization

    Authors: Jean-Bastien Grill, Florent Altché, Yunhao Tang, Thomas Hubert, Michal Valko, Ioannis Antonoglou, Rémi Munos

    Abstract: The combination of Monte-Carlo tree search (MCTS) with deep reinforcement learning has led to significant advances in artificial intelligence. However, AlphaZero, the current state-of-the-art MCTS algorithm, still relies on handcrafted heuristics that are only partially understood. In this paper, we show that AlphaZero's search heuristics, along with other common ones such as UCT, are an approxima… ▽ More

    Submitted 24 July, 2020; originally announced July 2020.

    Comments: Accepted to International Conference on Machine Learning (ICML), 2020

  8. arXiv:2002.02836  [pdf, other

    cs.LG cs.AI stat.ML

    Causally Correct Partial Models for Reinforcement Learning

    Authors: Danilo J. Rezende, Ivo Danihelka, George Papamakarios, Nan Rosemary Ke, Ray Jiang, Theophane Weber, Karol Gregor, Hamza Merzic, Fabio Viola, Jane Wang, Jovana Mitrovic, Frederic Besse, Ioannis Antonoglou, Lars Buesing

    Abstract: In reinforcement learning, we can learn a model of future observations and rewards, and use it to plan the agent's next actions. However, jointly modeling future observations can be computationally expensive or even intractable if the observations are high-dimensional (e.g. images). For this reason, previous works have considered partial models, which model only part of the observation. In this pa… ▽ More

    Submitted 7 February, 2020; originally announced February 2020.

  9. Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model

    Authors: Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, Timothy Lillicrap, David Silver

    Abstract: Constructing agents with planning capabilities has long been one of the main challenges in the pursuit of artificial intelligence. Tree-based planning methods have enjoyed huge success in challenging domains, such as chess and Go, where a perfect simulator is available. However, in real-world problems the dynamics governing the environment are often complex and unknown. In this work we present the… ▽ More

    Submitted 21 February, 2020; v1 submitted 19 November, 2019; originally announced November 2019.

  10. arXiv:1812.06855  [pdf, other

    cs.LG cs.AI stat.ML

    Bayesian Optimization in AlphaGo

    Authors: Yutian Chen, Aja Huang, Ziyu Wang, Ioannis Antonoglou, Julian Schrittwieser, David Silver, Nando de Freitas

    Abstract: During the development of AlphaGo, its many hyper-parameters were tuned with Bayesian optimization multiple times. This automatic tuning process resulted in substantial improvements in playing strength. For example, prior to the match with Lee Sedol, we tuned the latest AlphaGo agent and this improved its win-rate from 50% to 66.5% in self-play games. This tuned version was deployed in the final m… ▽ More

    Submitted 17 December, 2018; originally announced December 2018.

  11. arXiv:1802.04697  [pdf, other

    cs.AI cs.LG stat.ML

    Learning to Search with MCTSnets

    Authors: Arthur Guez, Théophane Weber, Ioannis Antonoglou, Karen Simonyan, Oriol Vinyals, Daan Wierstra, Rémi Munos, David Silver

    Abstract: Planning problems are among the most important and well-studied problems in artificial intelligence. They are most typically solved by tree search algorithms that simulate ahead into the future, evaluate future states, and back-up those evaluations to the root of a search tree. Among these algorithms, Monte-Carlo tree search (MCTS) is one of the most general, powerful and widely used. A typical im… ▽ More

    Submitted 17 July, 2018; v1 submitted 13 February, 2018; originally announced February 2018.

    Comments: ICML 2018 (camera-ready version)

  12. arXiv:1712.01815  [pdf, other

    cs.AI cs.LG

    Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

    Authors: David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis

    Abstract: The game of chess is the most widely-studied domain in the history of artificial intelligence. The strongest programs are based on a combination of sophisticated search techniques, domain-specific adaptations, and handcrafted evaluation functions that have been refined by human experts over several decades. In contrast, the AlphaGo Zero program recently achieved superhuman performance in the game… ▽ More

    Submitted 5 December, 2017; originally announced December 2017.

  13. arXiv:1611.05740  [pdf, other

    cs.AI

    Fast Non-Parametric Tests of Relative Dependency and Similarity

    Authors: Wacha Bounliphone, Eugene Belilovsky, Arthur Tenenhaus, Ioannis Antonoglou, Arthur Gretton, Matthew B. Blashcko

    Abstract: We introduce two novel non-parametric statistical hypothesis tests. The first test, called the relative test of dependency, enables us to determine whether one source variable is significantly more dependent on a first target variable or a second. Dependence is measured via the Hilbert-Schmidt Independence Criterion (HSIC). The second test, called the relative test of similarity, is use to determi… ▽ More

    Submitted 17 November, 2016; originally announced November 2016.

  14. arXiv:1511.05952  [pdf, other

    cs.LG

    Prioritized Experience Replay

    Authors: Tom Schaul, John Quan, Ioannis Antonoglou, David Silver

    Abstract: Experience replay lets online reinforcement learning agents remember and reuse experiences from the past. In prior work, experience transitions were uniformly sampled from a replay memory. However, this approach simply replays transitions at the same frequency that they were originally experienced, regardless of their significance. In this paper we develop a framework for prioritizing experience,… ▽ More

    Submitted 25 February, 2016; v1 submitted 18 November, 2015; originally announced November 2015.

    Comments: Published at ICLR 2016

  15. arXiv:1511.04581  [pdf, other

    stat.ML cs.LG

    A Test of Relative Similarity For Model Selection in Generative Models

    Authors: Wacha Bounliphone, Eugene Belilovsky, Matthew B. Blaschko, Ioannis Antonoglou, Arthur Gretton

    Abstract: Probabilistic generative models provide a powerful framework for representing data that avoids the expense of manual annotation typically needed by discriminative approaches. Model selection in this generative setting can be challenging, however, particularly when likelihoods are not easily accessible. To address this issue, we introduce a statistical test of relative similarity, which is used to… ▽ More

    Submitted 15 February, 2016; v1 submitted 14 November, 2015; originally announced November 2015.

    Comments: International Conference on Learning Representations 2016

  16. arXiv:1312.6055  [pdf, other

    cs.LG

    Unit Tests for Stochastic Optimization

    Authors: Tom Schaul, Ioannis Antonoglou, David Silver

    Abstract: Optimization by stochastic gradient descent is an important component of many large-scale machine learning algorithms. A wide variety of such optimization algorithms have been devised; however, it is unclear whether these algorithms are robust and widely applicable across many different optimization landscapes. In this paper we develop a collection of unit tests for stochastic optimization. Each u… ▽ More

    Submitted 25 February, 2014; v1 submitted 20 December, 2013; originally announced December 2013.

    Comments: Final submission to ICLR 2014 (revised according to reviews, additional results added)

  17. arXiv:1312.5602  [pdf, other

    cs.LG

    Playing Atari with Deep Reinforcement Learning

    Authors: Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller

    Abstract: We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning. The model is a convolutional neural network, trained with a variant of Q-learning, whose input is raw pixels and whose output is a value function estimating future rewards. We apply our method to seven Atari 2600 games from the Arcade Learning E… ▽ More

    Submitted 19 December, 2013; originally announced December 2013.

    Comments: NIPS Deep Learning Workshop 2013