-
Efficient Mixed-Precision Matrix Factorization of the Inverse Overlap Matrix in Electronic Structure Calculations with AI-Hardware and GPUs
Authors:
Adela Habib,
Joshua Finkelstein,
Anders M. N. Niklasson
Abstract:
In recent years, a new kind of accelerated hardware has gained popularity in the Artificial Intelligence (AI) and Machine Learning (ML) communities that enables extremely high-performance tensor contractions in reduced precision for deep neural network calculations. In this article, we exploit Nvidia Tensor cores, a prototypical example of such AI/ML hardware, to develop a mixed precision approach for computing a dense matrix factorization of the inverse overlap matrix in electronic structure theory, $S^{-1}$. This factorization of $S^{-1}$, written as $ZZ^T=S^{-1}$, is used to transform the general matrix eigenvalue problem into a standard matrix eigenvalue problem. Here we present a mixed precision iterative refinement algorithm where $Z$ is given recursively using matrix-matrix multiplications and can be computed with high performance on Tensor cores. To understand the performance and accuracy of Tensor cores, comparisons are made to GPU-only implementations in single and double precision. Additionally, we propose a non-parametric stopping criterion that is robust in the face of lower precision floating point operations. The algorithm is particularly useful when we have a good initial guess for $Z$, for example, from previous time steps in quantum-mechanical molecular dynamics simulations or from a previous iteration in a geometry optimization.
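The recursive refinement of $Z$ described in this abstract can be illustrated with a small NumPy sketch. This is not the authors' mixed precision algorithm: it runs entirely in double precision, uses a simple norm-based tolerance instead of the proposed non-parametric stopping criterion, and the function name `refine_inverse_factor` and the polynomial order are assumptions chosen for illustration. It only shows how a guess $Z$ can be improved using matrix-matrix multiplications alone, by expanding $(I-\delta)^{-1/2}$ with $\delta = I - Z^T S Z$.

```python
import numpy as np

def refine_inverse_factor(S, Z0, tol=1e-12, max_iter=50):
    """Refine Z so that Z^T S Z = I, i.e. Z Z^T = S^{-1} (hypothetical helper)."""
    Z = Z0.copy()
    I = np.eye(S.shape[0])
    for _ in range(max_iter):
        # Residual of the congruence transform; zero when Z is exact.
        delta = I - Z.T @ S @ Z
        if np.linalg.norm(delta, "fro") < tol:
            break
        # Truncated series for (I - delta)^(-1/2): I + delta/2 + 3/8 delta^2.
        Z = Z @ (I + 0.5 * delta + 0.375 * (delta @ delta))
    return Z

# Small, well-conditioned test overlap matrix close to the identity,
# so that the identity is a good enough initial guess.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
S = np.eye(6) + 0.02 * (A + A.T)

Z = refine_inverse_factor(S, np.eye(6))
residual = np.linalg.norm(Z @ Z.T - np.linalg.inv(S))
```

The iteration converges only when the initial residual is small (norm of $\delta$ below one), which matches the abstract's remark that the approach is most useful when a good initial guess is available.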
Submitted 29 April, 2024;
originally announced April 2024.
-
A Methodology to Generate Crystal-based Molecular Structures for Atomistic Simulations
Authors:
Christian F. A. Negre,
Andrew Alvarado,
Himanshu Singh,
Joshua Finkelstein,
Enrique Martinez,
Romain Perriot
Abstract:
We propose a systematic method to construct crystal-based molecular structures often needed as input for computational chemistry studies. These structures include crystal "slabs" with periodic boundary conditions (PBCs) and non-periodic solids such as Wulff structures. We also introduce a method to build crystal slabs with orthogonal PBC vectors. These methods are integrated into our code, Los Alamos Crystal Cut (LCC), which is open source and thus fully available to the community. Examples showing the use of these methods are given throughout the manuscript.
Submitted 11 October, 2022; v1 submitted 3 October, 2022;
originally announced October 2022.
-
Bringing discrete-time Langevin splitting methods into agreement with thermodynamics
Authors:
Joshua Finkelstein,
Chungho Cheng,
Giacomo Fiorin,
Benjamin Seibold,
Niels Grønbech-Jensen
Abstract:
In light of the recently published complete set of statistically correct Grønbech-Jensen (GJ) methods for discrete-time thermodynamics, we revise a differential operator splitting method for the Langevin equation in order to comply with the basic GJ thermodynamic sampling features, namely the Boltzmann distribution and Einstein diffusion, in linear systems. This revision, which is based on the introduction of time scaling along with flexibility of a discrete-time velocity attenuation parameter, provides a direct link between the ABO splitting formalism and the GJ methods. This link leads to the conclusion that any GJ method has at least weak second-order accuracy in the applied time step. It further helps identify a novel half-step velocity, which simultaneously produces both correct kinetic statistics and correct transport measures for any of the statistically sound GJ methods. Explicit algorithmic expressions are given for the integration of the new half-step velocity into the GJ set of methods. Numerical simulations, including quantum-based molecular dynamics (QMD) using the QMD suite LATTE, highlight the discussed properties of the algorithms and demonstrate the direct application of robust, time-step-independent stochastic integrators to quantum-based molecular dynamics.
Submitted 21 October, 2021; v1 submitted 7 August, 2021;
originally announced August 2021.
-
Mixed Precision Fermi-Operator Expansion on Tensor Cores From a Machine Learning Perspective
Authors:
Joshua Finkelstein,
Justin Smith,
Susan M. Mniszewski,
Kipton Barros,
Christian F. A. Negre,
Emanuel H. Rubensson,
Anders M. N. Niklasson
Abstract:
We present a second-order recursive Fermi-operator expansion scheme using mixed precision floating point operations to perform electronic structure calculations using tensor core units. A performance of over 100 teraFLOPs is achieved for half-precision floating point operations on Nvidia's A100 tensor core units. The second-order recursive Fermi-operator scheme is formulated in terms of a generalized, differentiable deep neural network structure, which solves the quantum mechanical electronic structure problem. We demonstrate how this network can be accelerated by optimizing the weight and bias values to substantially reduce the number of layers required for convergence. We also show how this machine learning approach can be used to optimize the coefficients of the recursive Fermi-operator expansion to accurately represent fractional occupation numbers of the electronic states at finite temperatures.
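For orientation, a second-order recursive Fermi-operator expansion can be sketched in a few lines of NumPy. The sketch below is the plain zero-temperature scheme in double precision, not the mixed precision, network-based formulation of the paper: there are no optimized weights or biases and no fractional occupations. The function name `sp2_density_matrix`, the use of exact eigenvalues for the spectral bounds, and the trace-based branch selection are assumptions made for illustration.

```python
import numpy as np

def sp2_density_matrix(H, n_occ, max_layers=60, tol=1e-12):
    """Zero-temperature density matrix via repeated X^2 / (2X - X^2) maps."""
    n = H.shape[0]
    # Spectral bounds; exact eigenvalues are a shortcut for this small example.
    eigs = np.linalg.eigvalsh(H)
    e_min, e_max = eigs[0], eigs[-1]
    # Map the spectrum into [0, 1] with the lowest states near 1.
    X = (e_max * np.eye(n) - H) / (e_max - e_min)
    for _ in range(max_layers):
        X2 = X @ X
        tr_X, tr_X2 = np.trace(X), np.trace(X2)
        # Stop once X is numerically idempotent: trace(X - X^2) ~ 0.
        if abs(tr_X - tr_X2) < tol:
            break
        # Pick the branch whose trace lands closer to the occupation number.
        if abs(tr_X2 - n_occ) < abs(2.0 * tr_X - tr_X2 - n_occ):
            X = X2
        else:
            X = 2.0 * X - X2
    return X

# Test Hamiltonian with a known spectrum and a clear gap after 3 states.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))
levels = np.array([-3.0, -2.0, -1.0, 1.0, 2.0, 3.0, 4.0, 5.0])
H = Q @ np.diag(levels) @ Q.T
H = 0.5 * (H + H.T)  # remove rounding asymmetry

D = sp2_density_matrix(H, n_occ=3)
P_exact = Q[:, :3] @ Q[:, :3].T  # projector onto the 3 lowest eigenvectors
```

Each pass of the loop costs one matrix-matrix multiplication, which is the kind of operation the abstract reports accelerating on tensor core units.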
Submitted 16 January, 2021;
originally announced January 2021.
-
The Challenge of Stochastic Størmer-Verlet Thermostats Generating Correct Statistics
Authors:
Joshua Finkelstein,
Chungho Cheng,
Giacomo Fiorin,
Benjamin Seibold,
Niels Grønbech-Jensen
Abstract:
In light of the recently developed complete GJ set of single random variable stochastic, discrete-time Størmer-Verlet algorithms for statistically accurate simulations of Langevin equations, we investigate two outstanding questions: 1) Are there any algorithmic or statistical benefits from including multiple random variables per time step, and 2) are there objective reasons for using one or more methods from the available set of statistically correct algorithms? To address the first question, we assume a general form for the discrete-time equations with two random variables and then follow the systematic, brute-force GJ methodology by enforcing correct thermodynamics in linear systems. It is concluded that correct configurational Boltzmann sampling of a particle in a harmonic potential implies correct configurational free-particle diffusion, and that these requirements can only be accomplished if the two random variables per time step are identical. We consequently submit that the GJ set represents all possible stochastic Størmer-Verlet methods that can reproduce time-step-independent statistics of linear systems. The second question is thus addressed within the GJ set. Based in part on numerical simulations of complex molecular systems, and in part on analytic scaling of time, we analyze the apparent difference in stability between different methods. We attribute this difference to the inherent time scaling in each method, and suggest that this scaling may lead to inconsistencies in the interpretation of dynamical and statistical simulation results. We therefore suggest that the method with the least inherent time scaling, the GJ-I/GJF-2GJ method, be preferred for statistical applications where spurious rescaling of time is undesirable.
Submitted 19 June, 2020;
originally announced June 2020.
-
Pressure-Induced Large Volume Collapse, Plane-to-Chain, Insulator to Metal Transition in CaMn$_2$Bi$_2$
Authors:
Xin Gui,
Gregory J. Finkelstein,
Keyu Chen,
Tommy Yong,
Przemyslaw Dera,
Jinguang Cheng,
Weiwei Xie
Abstract:
An in-situ high-pressure single-crystal X-ray diffraction study reveals that the quantum material CaMn$_2$Bi$_2$ undergoes a unique plane-to-chain structural transition between 2 and 3 GPa, accompanied by a large volume collapse. CaMn$_2$Bi$_2$ displays a new structure type above 2.3 GPa, with the puckered Mn honeycomb lattice of the trigonal ambient-pressure structure converting to one-dimensional (1D) zigzag chains in the high-pressure monoclinic structure. Single-crystal measurements reveal that the pressure-induced structural transformation is accompanied by a dramatic two-order-of-magnitude drop in resistivity; although the ambient-pressure phase displays semiconducting behavior at low temperatures, metallic temperature-dependent resistivity is observed for the high-pressure phase, as, surprisingly, are two resistivity anomalies with opposite pressure dependences. Based on the electronic structure calculations, we hypothesize that the newly emerged electronic state under high pressure is associated with a Fermi surface instability of the quasi-1D Mn chains, while we infer that the other is a magnetic transition. Assessment of the total energies for hypothetical magnetic structures for high-pressure CaMn$_2$Bi$_2$ indicates that ferrimagnetism is thermodynamically favored.
Submitted 16 July, 2019;
originally announced July 2019.
-
Comparison of Modern Langevin Integrators for Simulations of Coarse-Grained Polymer Melts
Authors:
Joshua Finkelstein,
Giacomo Fiorin,
Benjamin Seibold
Abstract:
For a wide range of phenomena, current computational ability does not always allow for fully atomistic simulations of high-dimensional molecular systems to reach time scales of interest. Coarse-graining (CG) is an established approach to alleviate the impact of computational limits while retaining the same algorithms used in atomistic simulations. It is important to understand how algorithms such as Langevin integrators perform on non-trivial CG molecular systems, and in particular how large an integration time step can be used without introducing unacceptable amounts of error into averaged quantities of interest. To investigate this, we examined three different Langevin integrators on a CG polymer melt: the recently developed BAOAB method by Leimkuhler and Matthews; the Grønbech-Jensen and Farago method, or G-JF; and the frequently used Brünger-Brooks-Karplus integrator, also known as BBK. We compute and analyze key statistical properties for each. Our results indicate that the three integrators perform similarly when using a small friction parameter; however, outside of this regime the use of large integration steps produces significant deviations from the predicted diffusivity and steady-state distributions for all integration methods examined with the exception of G-JF.
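As a point of reference for the G-JF method named in this abstract, the sketch below applies the commonly cited form of its update equations to independent one-dimensional harmonic oscillators rather than to a CG polymer melt. The function name `gjf_step`, the unit parameters, and the test system are assumptions for illustration only; none of them come from the paper. The example uses a deliberately large time step and then measures the configurational variance, which should stay close to $k_B T / k$.

```python
import numpy as np

def gjf_step(x, v, f, force, dt, m, alpha, kT, rng):
    """One G-JF Langevin step (commonly cited form; hypothetical helper)."""
    c = alpha * dt / (2.0 * m)
    a = (1.0 - c) / (1.0 + c)   # velocity attenuation factor
    b = 1.0 / (1.0 + c)
    # One Gaussian noise term per step, variance 2 * alpha * kT * dt.
    beta = rng.normal(0.0, np.sqrt(2.0 * alpha * kT * dt), size=x.shape)
    x_new = x + b * dt * v + b * dt**2 / (2.0 * m) * f + b * dt / (2.0 * m) * beta
    f_new = force(x_new)
    v_new = a * v + dt / (2.0 * m) * (a * f + f_new) + b / m * beta
    return x_new, v_new, f_new

# Assumed test system: 2000 independent harmonic oscillators, reduced units.
k_spring, m, alpha, kT, dt = 1.0, 1.0, 1.0, 1.0, 1.0
force = lambda x: -k_spring * x

rng = np.random.default_rng(2)
x = np.zeros(2000)
v = np.zeros(2000)
f = force(x)

n_steps, burn_in = 5000, 500
sum_x2, n_samples = 0.0, 0
for step in range(n_steps):
    x, v, f = gjf_step(x, v, f, force, dt, m, alpha, kT, rng)
    if step >= burn_in:
        sum_x2 += np.sum(x**2)
        n_samples += x.size

var_x = sum_x2 / n_samples  # compare with kT / k_spring = 1.0
```

With `dt = 1.0` the step is half the stability limit of this oscillator, so agreement of `var_x` with the Boltzmann value illustrates the kind of time-step robustness the abstract attributes to G-JF.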
Submitted 31 March, 2019;
originally announced April 2019.