Search | arXiv e-print repository

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love , et al. (1092 additional authors not shown)

Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February… ▽ More In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k). Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content. △ Less

Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

arXiv:2305.17274 [pdf, other]

doi 10.1038/s42254-023-00655-3

How to verify the precision of density-functional-theory implementations via reproducible and universal workflows

Authors: Emanuele Bosoni, Louis Beal, Marnik Bercx, Peter Blaha, Stefan Blügel, Jens Bröder, Martin Callsen, Stefaan Cottenier, Augustin Degomme, Vladimir Dikan, Kristjan Eimre, Espen Flage-Larsen, Marco Fornari, Alberto Garcia, Luigi Genovese, Matteo Giantomassi, Sebastiaan P. Huber, Henning Janssen, Georg Kastlunger, Matthias Krack, Georg Kresse, Thomas D. Kühne, Kurt Lejaeghere, Georg K. H. Madsen, Martijn Marsman , et al. (20 additional authors not shown)

Abstract: In the past decades many density-functional theory methods and codes adopting periodic boundary conditions have been developed and are now extensively used in condensed matter physics and materials science research. Only in 2016, however, their precision (i.e., to which extent properties computed with different codes agree among each other) was systematically assessed on elemental crystals: a firs… ▽ More In the past decades many density-functional theory methods and codes adopting periodic boundary conditions have been developed and are now extensively used in condensed matter physics and materials science research. Only in 2016, however, their precision (i.e., to which extent properties computed with different codes agree among each other) was systematically assessed on elemental crystals: a first crucial step to evaluate the reliability of such computations. We discuss here general recommendations for verification studies aiming at further testing precision and transferability of density-functional-theory computational approaches and codes. We illustrate such recommendations using a greatly expanded protocol covering the whole periodic table from Z=1 to 96 and characterizing 10 prototypical cubic compounds for each element: 4 unaries and 6 oxides, spanning a wide range of coordination numbers and oxidation states. The primary outcome is a reference dataset of 960 equations of state cross-checked between two all-electron codes, then used to verify and improve nine pseudopotential-based approaches. Such effort is facilitated by deploying AiiDA common workflows that perform automatic input parameter selection, provide identical input/output interfaces across codes, and ensure full reproducibility. Finally, we discuss the extent to which the current results for total energies can be reused for different goals (e.g., obtaining formation energies). △ Less

Submitted 26 May, 2023; originally announced May 2023.

Comments: Main text: 23 pages, 4 figures. Supplementary: 68 pages. Nature Review Physics 2023

Journal ref: Nat. Rev. Phys. 6, 45 (2024)

arXiv:2105.05063 [pdf, ps, other]

doi 10.1038/s41524-021-00594-6

Common workflows for computing material properties using different quantum engines

Authors: Sebastiaan P. Huber, Emanuele Bosoni, Marnik Bercx, Jens Bröder, Augustin Degomme, Vladimir Dikan, Kristjan Eimre, Espen Flage-Larsen, Alberto Garcia, Luigi Genovese, Dominik Gresch, Conrad Johnston, Guido Petretto, Samuel Poncé, Gian-Marco Rignanese, Christopher J. Sewell, Berend Smit, Vasily Tseplyaev, Martin Uhrin, Daniel Wortmann, Aliaksandr V. Yakutovich, Austin Zadoks, Pezhman Zarabadi-Poor, Bonan Zhu, Nicola Marzari , et al. (1 additional authors not shown)

Abstract: The prediction of material properties through electronic-structure simulations based on density-functional theory has become routinely common, thanks, in part, to the steady increase in the number and robustness of available simulation packages. This plurality of codes and methods aiming to solve similar problems is both a boon and a burden. While providing great opportunities for cross-verificati… ▽ More The prediction of material properties through electronic-structure simulations based on density-functional theory has become routinely common, thanks, in part, to the steady increase in the number and robustness of available simulation packages. This plurality of codes and methods aiming to solve similar problems is both a boon and a burden. While providing great opportunities for cross-verification, these packages adopt different methods, algorithms, and paradigms, making it challenging to choose, master, and efficiently use any one for a given task. Leveraging recent advances in managing reproducible scientific workflows, we demonstrate how developing common interfaces for workflows that automatically compute material properties can tackle the challenge mentioned above, greatly simplifying interoperability and cross-verification. We introduce design rules for reproducible and reusable code-agnostic workflow interfaces to compute well-defined material properties, which we implement for eleven different quantum engines and use to compute three different material properties. Each implementation encodes carefully selected simulation parameters and workflow logic, making the implementer's expertise of the quantum engine directly available to non-experts. Full provenance and reproducibility of the workflows is guaranteed through the use of the AiiDA infrastructure. All workflows are made available as open-source and come pre-installed with the Quantum Mobile virtual machine, making their use straightforward. △ Less

Submitted 11 May, 2021; originally announced May 2021.

Journal ref: npj Comput Mater 7, 136 (2021)

Showing 1–3 of 3 results for author: Bröder, J