User interfaces represent a doorknob to computer applications and directly relate to the usability of a given application. The desktop WIMP interface has been developed and refined over the past 15 years and works well for the desktop environment. The question is: what does one do in the case of today's three-dimensional immersive environments? Does one translate the desktop tools to the 3D space, ...
Journal of Parallel and Distributed Computing, 2015
ABSTRACT: As high performance computing (HPC) continues to grow in scale and complexity, energy becomes a critical constraint in the race to exascale computing. The days of “performance at all cost” are coming to an end. While performance is still a major objective, future HPC will have to deliver the desired performance under an energy constraint. Among various power management methods, power capping is a widely used approach. Unfortunately, the impact of power capping on system performance, user jobs, and power-performance efficiency is not well studied, owing to the many interfering factors imposed by system workload and configurations. To fully understand power management in extreme-scale systems with a fixed power budget, we introduce a power-performance modeling tool named PuPPET (Power Performance PETri net). Unlike traditional performance modeling approaches such as analytical methods or trace-based simulators, we explore a new approach, colored Petri nets, for the design of PuPPET. PuPPET is fast and extensible for navigating through different configurations. More importantly, it can scale to hundreds of thousands of processor cores while providing high levels of modeling accuracy. We validate PuPPET using system traces (i.e., workload logs and power data) collected from the production 48-rack IBM Blue Gene/Q supercomputer at Argonne National Laboratory. Our trace-based validation demonstrates that PuPPET is capable of modeling the dynamic execution of parallel jobs on the machine and provides an accurate approximation of energy consumption. In addition, we present two case studies of using PuPPET to study power-performance tradeoffs on petascale systems.
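The abstract describes PuPPET only at a high level; the Petri-net model itself is not reproduced here. As a loose, hypothetical illustration of the power-capping idea the paper studies, the sketch below simulates dispatching a FIFO job queue under a fixed power budget, starting a job only while the estimated draw of the running jobs stays under the cap. The names (`Job`, `schedule_under_cap`, the kW figures) are illustrative assumptions, not part of PuPPET.

```python
# Hypothetical sketch of power capping on a job queue (not PuPPET itself):
# a job is dispatched only if the estimated power of all running jobs
# stays under the system-wide cap. Assumes every job individually fits
# under the cap.
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    est_power_kw: float   # estimated power draw while running
    runtime_s: int        # wall-clock runtime

def schedule_under_cap(queue, power_cap_kw):
    """Greedy FIFO dispatch under a fixed power budget (illustrative only)."""
    running, time_now, events = [], 0, []
    while queue or running:
        # retire jobs that have finished by now
        running = [(j, end) for j, end in running if end > time_now]
        used = sum(j.est_power_kw for j, _ in running)
        # dispatch as many queued jobs as the remaining budget allows
        while queue and used + queue[0].est_power_kw <= power_cap_kw:
            job = queue.pop(0)
            running.append((job, time_now + job.runtime_s))
            used += job.est_power_kw
            events.append((time_now, job.name, used))
        time_now += 1
    return events

jobs = [Job("A", 40.0, 3), Job("B", 35.0, 2), Job("C", 30.0, 1)]
for t, name, used in schedule_under_cap(jobs, power_cap_kw=80.0):
    print(f"t={t}s start {name}, total draw {used} kW")
```

A colored Petri net would instead encode jobs, racks, and the power budget as places, tokens, and transitions; this greedy loop only conveys what a cap constrains, not how PuPPET models it.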
Proceedings of the 2011 companion on High Performance Computing Networking, Storage and Analysis Companion - SC '11 Companion, 2011
ABSTRACT: Accurately modeling many physical and biological systems requires simulating at multiple scales. This results in large and heterogeneous data sets on vastly differing scales, both physical and temporal. Here we look specifically at blood flow in a patient-specific cerebrovasculature with a brain aneurysm, and analyze the interaction of platelets with the arterial walls that leads to thrombus formation.
Advanced collaborative environments are among the most important tools for interacting with colleagues distributed around the world. However, heterogeneous characteristics such as network transfer rates, computational abilities, and hierarchical systems make the seamless integration of distributed resources a challenge. This paper proposes the design of two network services, the Collaborative Environment Network Service Architecture (CENSA) and Infrastructure (CENSI), that embed network services into various systems intelligently and elastically and support seamless advanced collaborative environments. We present a multilayered model for their current utilization and future development. We describe various network services and discuss some open issues.
We describe the use of the CAVE virtual reality visualization environment as an aid to the design of accelerator magnets. We have modeled an elliptical multipole wiggler magnet being designed for use at the Advanced Photon Source at Argonne National Laboratory. The CAVE environment allows us to explore and interact with the 3-D visualization of the magnet. Capabilities include changing the number of periods of the magnet displayed, changing the icons used for displaying the magnetic field, and changing the current in the electromagnet and observing the effect on the magnetic field and particle beam trajectory through the field. This work was supported by the Office of Scientific Computing and the Office of Basic Energy Science, U.S. Department of Energy, under Contract W-31-109-Eng-38. 1 Introduction: Electromagnetic field analysis and design are more difficult in three dimensions than in two, ...
One of the major problems in three-dimensional (3-D) electromagnetic field computation is visualizing the calculated field. Virtual reality techniques can be used as an aid to this process by providing multiple viewpoints, allowing immersion within the field, and taking advantage of the human ability to process 3-D spatial information. In this paper we present an example of 3-D electromagnetic field visualization in the CAVE virtual-reality environment. 1 Introduction: Electromagnetic field analysis and design are more difficult in three dimensions than in two. Not only is the mathematics more complex (multiple-valued scalar potentials, gauge conditions on vector potentials), but so are the visualization aspects. The complexity arises from the greater amount of data (more mesh points, more field components per mesh point) and the desire to view the computational mesh and electromagnetic field calculations together. Scientific visualization of three-dimensional (3-D) vector fields, ...
2013 IEEE 27th International Symposium on Parallel and Distributed Processing, 2013
ABSTRACT: The Argonne Leadership Computing Facility (ALCF) is home to Mira, a 10 PF Blue Gene/Q (BG/Q) system. The BG/Q system is the third generation of the Blue Gene architecture from IBM and, like its predecessors, combines system-on-chip technology with a proprietary interconnect (5-D torus). Each compute node has 16 augmented PowerPC A2 processor cores with support for simultaneous multithreading, 4-wide double-precision SIMD, and different data prefetching mechanisms. Mira offers several new opportunities for tuning and scaling scientific applications. This paper discusses our early experience with a subset of micro-benchmarks, MPI benchmarks, and a variety of science and engineering applications running at the ALCF. Both performance and power are studied, and results on BG/Q are compared with those on its predecessor, BG/P. Several lessons gleaned from tuning applications on the BG/Q architecture for better performance and scalability are shared.
Most tools and algorithms for high-throughput bioinformatics data analysis are CPU-intensive, requiring high-end computing resources such as the TeraGrid. To facilitate the use of TeraGrid resources by the Life Science community for computing and data management, we have developed an integrated cyber computational environment named the Open Life Science Gateway (OLSGW). OLSGW hides the complexity of accessing distributed TeraGrid resources from biologists and bioinformaticians, and provides them with a user-friendly, web-service-based interface. Since the user interface of OLSGW has a look and feel similar to the bioinformatics web environments that have been available to the Life Science community for years, biologists can easily use this gateway to execute their analysis programs and compose computational workflow scripts without extensive knowledge of Grid technology. This paper describes the service-oriented framework of OLSGW and how it integrates a group of bio-informati...
The Futures Lab group at Argonne National Laboratory and the University of Chicago is designing, building, and evaluating a new type of interactive computing environment that deeply couples the concepts of direct manipulation found in virtual reality with the richness and variety of interactive devices found in ubiquitous computing. This environment provides the interactivity and collaboration support of teleimmersive environments with the flexibility and availability of desktop collaboration tools. We call these environments ActiveSpaces. An ActiveSpace is a physical domain that has been augmented with multiscale multiscreen displays, environment-specific and device-specific sensors, body and object trackers, human-input and instrument-input interfaces, streaming audio and video capture devices, and force feedback devices, and has then been connected to other such spaces via the Grid. 1 Toward the Evolution of ActiveSpaces: The Futures Lab group at Argonne National Labora...
Immersive projection displays have played an important role in enabling large-format virtual reality systems such as the CAVE and CAVE-like devices and the various immersive desks and desktop-like displays. However, these devices have so far played a minor role in advancing the sense of immersion for conferencing systems. The Access Grid project, led by Argonne, is exploring the use of large-scale projection-based systems as the basis for building room-oriented collaboration and semi-immersive visualization systems. We believe these multiprojector systems will become common infrastructure in the future, largely because of their value for enabling group-to-group collaboration in an environment that can also support large-format projector-based visualization. Creating a strong sense of immersion is an important goal for future collaboration technologies. Immersion in conferencing applications implies that users can rely on natural sight and audio cues to facilitate interactions with p...
Mean Time Between Failures (MTBF), now measured in days or hours, is expected to drop to minutes on exascale machines. The advancement of resilience technologies depends greatly on a deeper understanding of faults arising from hardware and software components. This understanding has the potential to help us build better fault tolerance technologies. For instance, it has been shown that combining checkpointing with failure prediction leads to longer checkpoint intervals, which in turn leads to fewer total checkpoints. In this paper we present a new approach for fault detection based on the Void Search (VS) algorithm. VS is used primarily in astrophysics for finding areas of space that have a very low density of galaxies. We evaluate our algorithm using real environmental logs from the Mira Blue Gene/Q supercomputer at Argonne National Laboratory. Our experiments show that our approach can detect almost all faults (i.e., sensitivity close to 1) with a low false positive rate (i.e., spec...
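The paper's Void Search algorithm is not given in this abstract. As a rough analogue of the underlying idea (locating low-density regions, as with galaxy voids), the hypothetical sketch below flags histogram bins of a sensor-reading stream whose sample density falls well below the average. The function name, bin count, and threshold are assumptions made for illustration only and are not taken from the paper.

```python
# Toy illustration of the "void" idea (low-density regions), not the paper's
# actual Void Search implementation: flag bins of a 1-D sensor-reading stream
# whose sample density is far below the global average density.
import numpy as np

def find_voids(values, bins=50, threshold=0.2):
    """Return (low, high) edges of bins whose count is below threshold * mean count."""
    counts, edges = np.histogram(values, bins=bins)
    mean_count = counts.mean()
    return [(edges[i], edges[i + 1])
            for i, c in enumerate(counts)
            if c < threshold * mean_count]

rng = np.random.default_rng(0)
readings = np.concatenate([rng.normal(40, 2, 5000),   # dense, normal operating range
                           rng.normal(70, 1, 50)])    # sparse outlier region
for lo, hi in find_voids(readings):
    print(f"low-density region: {lo:.1f} .. {hi:.1f}")
```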
2014 43rd International Conference on Parallel Processing Workshops, 2014
ABSTRACT: In-situ analysis has been proposed as a promising solution to glean faster insights and reduce the amount of data sent to storage. A critical challenge here is that the reduced dataset typically resides on a subset of the nodes and needs to be written out to storage. Data coupling in multiphysics codes also exhibits a sparse data movement pattern, wherein data movement occurs among a subset of nodes. We evaluate the performance of data movement for sparse data patterns and propose mechanisms to improve it. Our mechanisms introduce intermediate nodes to implement multiple-path data transfer on top of the default routing algorithms and utilize topology-aware data aggregation to avoid shared bottleneck links. The efficacy of our solutions is evaluated through microbenchmarks and application benchmarks on the IBM Blue Gene/Q system, scaling up to 131,072 compute cores. The results show that our algorithms achieve up to a 2x improvement in achievable throughput compared with the default mechanisms.
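The abstract names the mechanisms (intermediate staging nodes, multiple paths, topology-aware aggregation) without giving code. The sketch below is a hypothetical, greatly simplified illustration of the first idea only: splitting one transfer into chunks staged through several intermediate nodes rather than a single direct path. The node names and round-robin chunking policy are invented for the example and are not the paper's BG/Q routing scheme.

```python
# Illustrative sketch (not the paper's BG/Q implementation): split one large
# source->destination transfer into chunks and stage them through several
# intermediate nodes so the chunks do not all share one bottleneck link.
def multipath_transfer(data, intermediates):
    """Round-robin chunks over intermediate nodes; returns per-node assignments."""
    chunk_size = max(1, len(data) // len(intermediates))
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    paths = {node: [] for node in intermediates}
    for i, chunk in enumerate(chunks):
        via = intermediates[i % len(intermediates)]   # pick a staging node
        paths[via].append(chunk)                      # "send" this chunk via that node
    return paths

payload = bytes(range(256)) * 4          # 1 KiB of dummy data
routes = multipath_transfer(payload, intermediates=["node12", "node47", "node88"])
for node, chunks in routes.items():
    print(node, "carries", sum(len(c) for c in chunks), "bytes")
```

A topology-aware version would choose the staging nodes from knowledge of the torus links each path crosses; this sketch deliberately leaves that part out.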
The overarching goals of the ASC Flash Center are to (1) build a new-generation community code for astrophysical simulations, especially those involving astrophysical thermonuclear flashes on compact stars; (2) do outstanding science on the problems of X-ray flashes, novae, and Type Ia supernovae; and (3) educate and train students in computational science. The Flash Center has achieved the following ...
ABSTRACT: High-performance and energy-efficient data management applications are a necessity for HPC systems because of the extreme scale of data produced by the high-fidelity scientific simulations that these systems support. Data layout in memory strongly affects performance. For better performance, most simulations interleave variables in memory during their calculation phase but deinterleave the data for subsequent storage and analysis. As a result, efficient data deinterleaving is critical; yet common deinterleaving methods provide inefficient throughput and energy performance. To address this problem, we propose a deinterleaving method that is high performance, energy efficient, and generic to any data type. To the best of our knowledge, this is the first deinterleaving method that (1) exploits data cache prefetching, (2) reduces memory accesses, and (3) optimizes the use of complete cache line writes. When evaluated against conventional deinterleaving methods on 105 STREAM standard micro-benchmarks, our method always improved throughput and throughput/watt on multi-core systems. In the best case, our deinterleaving method improved throughput up to 26.2x and throughput/watt up to 7.8x.
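The proposed prefetch- and cache-line-aware method is not reproduced in the abstract. The sketch below only illustrates what deinterleaving means: converting an array-of-structs layout (variables interleaved per grid cell) into separate per-variable arrays, here with numpy. It is a functional reference under that assumption, not the optimized implementation evaluated in the paper.

```python
# Minimal sketch of deinterleaving (array-of-structs -> struct-of-arrays).
# This is NOT the paper's prefetch/cache-line-optimized method; it only shows
# the transformation itself, using a reshape plus per-column copies.
import numpy as np

def deinterleave(buffer, num_vars):
    """Split a flat buffer [v0, v1, ..., v0, v1, ...] into num_vars contiguous arrays."""
    records = buffer.reshape(-1, num_vars)   # one row per record (grid cell)
    return [np.ascontiguousarray(records[:, k]) for k in range(num_vars)]

# Example: pressure, temperature, density interleaved per grid cell.
interleaved = np.arange(12, dtype=np.float64)   # 4 cells x 3 variables
pressure, temperature, density = deinterleave(interleaved, num_vars=3)
print(pressure)      # values 0, 3, 6, 9
print(temperature)   # values 1, 4, 7, 10
print(density)       # values 2, 5, 8, 11
```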