Quantum computing has demonstrated the potential to revolutionize our understanding of nuclear, atomic, and molecular structure by obtaining forefront solutions in non-relativistic quantum many-body theory. In this work, we show that quantum computing can be used to solve for the structure of hadrons, governed by strongly-interacting relativistic quantum field theory. Following our previous work on light unflavored mesons as a relativistic bound-state problem within the nonperturbative Hamiltonian formalism, we present numerical calculations on simulated quantum devices using the basis light-front quantization (BLFQ) approach. We implement and compare the variational quantum eigensolver (VQE) and the subspace-search variational quantum eigensolver (SSVQE) to find the low-lying mass spectrum of the light meson system and its corresponding light-front wave functions (LFWFs) via quantum simulations. Based on these LFWFs, we then evaluate the meson decay constants and parton distribu...
Quantum computing has demonstrated the potential to revolutionize our understanding of nuclear, atomic, and molecular structure by obtaining forefront solutions in non-relativistic quantum many-body theory. In this work, we show that quantum computing can be used to solve for the structure of hadrons, governed by strongly-interacting relativistic quantum field theory. Following our previous work on light unflavored mesons as a relativistic bound-state problem within the nonperturbative Hamiltonian formalism, we present numerical calculations on simulated quantum devices using the basis light-front quantization (BLFQ) approach. We implement and compare the variational quantum eigensolver (VQE) and the subspace-search variational quantum eigensolver (SSVQE) to find the low-lying mass spectrum of the light meson system and its corresponding light-front wave functions (LFWFs) via various quantum backends. Based on these LFWFs, we evaluate the meson decay constants and parton distribu...
The Quantum Fourier Transform (QFT) grants competitive advantages, especially in resource usage and circuit approximation, for performing arithmetic operations on quantum computers, and offers a potential route towards a numerical quantum-computational paradigm. In this paper, we utilize efficient techniques to implement QFT-based integer addition and multiplication. These operations are fundamental to various quantum applications including Shor’s algorithm, weighted sum optimization problems in data processing and machine learning, and quantum algorithms requiring inner products. We carry out performance evaluations of these implementations based on IBM’s superconducting qubit architecture using different compatible noise models. We isolate the sensitivity of the component quantum circuits to both one-/two-qubit gate error rates and the number of the arithmetic operands’ superposed integer states. We analyze performance, and identify the most effective approximation depths for qu...
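As a rough illustration of the QFT-based addition this abstract refers to, the sketch below simulates the idea with dense NumPy matrices: transform the target register into the Fourier basis, apply a phase that depends on the other operand, and transform back. It is not the authors' circuit and includes no gate decomposition, approximation depth, or noise model; the names `qft_matrix` and `qft_add` are hypothetical and chosen only for this example.

```python
import numpy as np


def qft_matrix(dim):
    """Unitary DFT matrix with entries exp(2*pi*i*j*k/dim) / sqrt(dim)."""
    idx = np.arange(dim)
    return np.exp(2j * np.pi * np.outer(idx, idx) / dim) / np.sqrt(dim)


def qft_add(state, n_qubits):
    """Map |a>|b> to |a>|(a + b) mod 2**n_qubits> on a two-register state.

    `state` is a (dim, dim) complex array where state[a, b] is the
    amplitude of |a>|b> and dim = 2**n_qubits. Illustrative sketch only.
    """
    dim = 2 ** n_qubits
    F = qft_matrix(dim)
    idx = np.arange(dim)
    # Step 1: QFT on the target register (second axis).
    out = state @ F.T
    # Step 2: phase exp(2*pi*i*a*k/dim) on |a>|k>. On hardware this would be
    # a ladder of controlled phase rotations; here it is one diagonal factor.
    out = out * np.exp(2j * np.pi * np.outer(idx, idx) / dim)
    # Step 3: inverse QFT on the target register.
    return out @ F.conj()


if __name__ == "__main__":
    # Basis-state example on 3-qubit registers: (5 + 6) mod 8 = 3.
    psi = np.zeros((8, 8), dtype=complex)
    psi[5, 6] = 1.0
    result = qft_add(psi, 3)
    a, total = np.unravel_index(np.argmax(np.abs(result)), result.shape)
    print(f"a = {a}, (a + b) mod 8 = {total}")
```

Because the phase step is diagonal, the same routine acts correctly on superposed operands: each basis pair |a>|b> in the input is mapped to |a>|(a + b) mod 2^n> with its amplitude unchanged.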
a series of events dealing with logics, algebras, advanced computation techniques, specialized programming languages, and tools for distributed computation. Mainly, the event targeted those aspects supporting context-oriented systems, adaptive systems, service computing, patterns and content-oriented features, temporal and ubiquitous aspects, and many facets of computational benchmarking. The conference had the following tracks:
• Advanced computation techniques
• Tools for distributed computation
Similar to the previous edition, this event attracted excellent contributions and active participation from all over the world. We were very pleased to receive top quality contributions.
HPC-Bench is a general-purpose tool to optimize the benchmarking workflow for high performance computing (HPC) to aid in the efficient evaluation of performance using multiple applications on an HPC machine with only a "click of a button". HPC-Bench allows multiple applications written in different languages, multiple parallel versions, and multiple numbers of processes/threads to be evaluated. Performance results are put into a database, which is then queried for the desired performance data, and then the R statistical software package is used to generate the desired graphs and tables. The use of HPC-Bench is illustrated with complex applications that were run on the National Energy Research Scientific Computing Center's (NERSC) Edison Cray XC30 HPC computer.
Not only will architectures (multi-core, GPU and FPGA accelerators) become more heterogeneous than they already are; tools also need diversification due to diverse application scenarios:
• Most abstract level: Expert systems need ontology programming and appropriate tools
• Less abstract level: SaaS (Software as a Service) needs virtualization and appropriate tools for developing service and client
• More hardware-oriented level: Tools have to support optimization more than now. Automatic transfer of code sequences into optimized structures. Maybe autotuning is an answer for that sophisticated task
• Stronger hardware-oriented level (HPC applications): The performance to achieve is everything and therefore adequate tools are necessary
• Hardware architecture level: Future nanotechnology requires tools that support resiliency on different levels (analogue, digital and system level)
The Fortran 2018 standard defines syntax and semantics to allow a parallel application to recover from failed images (processes) during execution. This poster presents work to extend the GFortran compiler front end and OpenCoarrays library to support fault tolerant teams of images, enabling use of collective routines after an image failure.
Disciplines: Computer Sciences | Programming Languages and Compilers
This poster is available at Iowa State University Digital Repository: https://lib.dr.iastate.edu/cs_conf/49
Example: Fault-Tolerant Parallel Monte Carlo Pi Calculation
    ...
    do sample = 1, NSAMPLES
      call random_number(x); call random_number(y)
      if (hypot(x, y) <= 1) n = n + 1
    end do
    if (this_image() == 1) n_copy = n
    do
      form team (1, team, stat=status)
      ! simulate image failure
      fail = size(failed_images()) < NFAIL &
             .and. this_image() == num_images()
      change team (team, stat=status)
        if (fail) fail image
        ! result undefined if image failure during
        ! co_sum(); use copy of n on image 1...
2018 IEEE International Conference on Cluster Computing (CLUSTER), 2018
Powerful high performance computing systems of the future are expected to have higher failure rates than current systems. As a result, HPC applications running on such future systems are more likely to encounter a system failure than on today's machines. Application fault tolerance is therefore becoming more important to avoid the costly waste of resources associated with rerunning failed applications. The MPI 3.1 standard does not address the issue of MPI process failures. Checkpoint/restart is commonly used to add fault tolerance to MPI applications. However, there can be complicated issues impacting an MPI application's ability to correctly and efficiently write checkpoint files, particularly if Fortran I/O statements are used. Moreover, it may be inefficient to restart a large number of MPI processes from a checkpoint. Several MPI fault tolerance libraries, such as ULFM, are being developed to enable MPI programs to recover from MPI process failures. This can circumvent much of the...
epiSNP is a program for identifying pairwise single nucleotide polymorphism (SNP) interactions (epistasis) that affect quantitative traits in genome-wide association studies (GWAS). A parallel MPI version (EPISNPmpi) was created in 2008 to address this computationally expensive analysis on data sets with many quantitative traits and markers. However, the explosion in genome sequencing will lead to the creation of large-scale data sets that will overwhelm EPISNPmpi's ability to compute results in a reasonable amount of time. Thus, epiSNP was rewritten to efficiently handle these large data sets. This was accomplished by performing serial optimizations, improving MPI load balancing, and introducing parallel OpenMP directives to further enhance load balancing and allow execution on the Intel Xeon Phi coprocessor (MIC). These additions resulted in new scalable versions of epiSNP using MPI, MPI+OpenMP, and MPI+OpenMP with one or two MICs. For a large 774,660 SNP data set with 1,634 individuals, the runtime on 126 nodes of TACC's Stampede Supercomputer was 10.61 minutes without MICs, and 5.13 minutes with 2 MICs. This translated to speedups over EPISNPmpi of 17X without MICs, and 36X with 2 MICs.
Papers by Glenn Luecke