Maintaining Trust in Reduction: Preserving the Accuracy of Quantities of Interest for Lossy Compression

Gong, Qian; Liang, Xin; Whitney, Ben; Choi, Jong Youl; Chen, Jieyang; Wan, Lipeng; Ethier, Stéphane; Ku, Seung-Hoe; Churchill, R. Michael; Chang, C. -S.; Ainsworth, Mark; Tugluk, Ozan; Munson, Todd; Pugmire, David; Archibald, Richard; Klasky, Scott

doi:10.1007/978-3-030-96498-6_2

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1512))

Included in the following conference series:

Smoky Mountains Computational Sciences and Engineering Conference

Abstract

As the growth of data sizes continues to outpace computational resources, there is a pressing need for data reduction techniques that can significantly reduce the amount of data and quantify the error incurred in compression. Compressing scientific data presents many challenges for reduction techniques, since the data is often defined on non-uniform or unstructured meshes, comes from a high-dimensional space, and has many Quantities of Interest (QoIs) that need to be preserved. To illustrate these challenges, we focus on data from a large-scale fusion code, XGC. XGC uses a Particle-In-Cell (PIC) technique that generates hundreds of petabytes (PB) of data a day, from thousands of timesteps. XGC uses an unstructured mesh and needs to compute many QoIs from the raw data, f.

One critical aspect of the reduction is that we need to ensure that QoIs derived from the data (density, temperature, flux-surface-averaged momenta, etc.) maintain relatively high accuracy. We show that by compressing XGC data on the high-dimensional, non-uniform grid on which the data is defined, and by adaptively quantizing the decomposed coefficients based on the characteristics of the QoIs, the compression ratios obtained at various error tolerances using a multilevel compressor (MGARD) increase more than tenfold. We then show how to mathematically guarantee that the accuracy of the QoIs computed from the reduced f is preserved during compression. Using the mathematical QoI error control theory of MGARD, the error in the XGC density can be kept under a user-specified tolerance over 1000 timesteps of simulation, whereas traditional error control on the data to be reduced does not guarantee the accuracy of the QoIs.
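The gap between error control on the data and error control on a QoI can be illustrated with a minimal sketch. It assumes the QoI is a linear functional of f, Q(f) = Σ wᵢ fᵢ (for example, a density computed as a weighted sum over velocity space), and uses only the elementary bound |Q(f) − Q(f̃)| ≤ ε Σ|wᵢ|, which holds when every |fᵢ − f̃ᵢ| ≤ ε. This is not the MGARD QoI theory, which the abstract does not detail; the weights, the data, and the random perturbation standing in for a lossy compressor are all hypothetical.

```python
import random


def linear_qoi(weights, f):
    """Evaluate a linear QoI, Q(f) = sum_i w_i * f_i."""
    return sum(w * x for w, x in zip(weights, f))


def data_tolerance_for(weights, qoi_tol):
    """Pointwise tolerance on f that keeps the linear QoI error within qoi_tol.

    Follows from |Q(f) - Q(g)| <= max_i |f_i - g_i| * sum_i |w_i|.
    """
    return qoi_tol / sum(abs(w) for w in weights)


rng = random.Random(0)
n = 1000
weights = [rng.uniform(0.5, 1.5) for _ in range(n)]  # hypothetical quadrature weights
f = [rng.uniform(0.0, 1.0) for _ in range(n)]        # hypothetical samples of f

qoi_tol = 1e-3
data_tol = data_tolerance_for(weights, qoi_tol)

# Stand-in for a lossy compressor (an assumption): perturb each value
# by at most data_tol, the tolerance derived from the QoI.
f_reduced = [x + rng.uniform(-data_tol, data_tol) for x in f]
qoi_error = abs(linear_qoi(weights, f) - linear_qoi(weights, f_reduced))

# Naive alternative: apply the QoI tolerance directly to the data.
# Every point is within qoi_tol, but the errors add up in the QoI.
f_naive = [x + qoi_tol for x in f]
naive_error = abs(linear_qoi(weights, f) - linear_qoi(weights, f_naive))

print(f"derived data tolerance: {data_tol:.3e}")
print(f"QoI error with derived tolerance: {qoi_error:.3e}")
print(f"QoI error with naive tolerance:   {naive_error:.3e}")
```

In this sketch the tolerance derived from the QoI keeps the QoI error below `qoi_tol`, while reusing `qoi_tol` as a pointwise bound on the data lets the QoI error grow with the sum of the weights.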

Acknowledgement

This research was supported by the ECP CODAR, Sirius-2, and RAPIDS-2 projects through the Advanced Scientific Computing Research (ASCR) program of the Department of Energy, and the LDRD project through the DRD program of Oak Ridge National Laboratory.

Author information

Authors and Affiliations

Oak Ridge National Laboratory, Oak Ridge, TN, 37830, USA
Qian Gong, Ben Whitney, Jong Youl Choi, Jieyang Chen, Lipeng Wan, David Pugmire, Richard Archibald & Scott Klasky
Missouri University of Science and Technology, Rolla, MO, 65409, USA
Xin Liang
Princeton Plasma Physics Laboratory, Princeton, NJ, 08540, USA
Stéphane Ethier, Seung-Hoe Ku, R. Michael Churchill & C. -S. Chang
Brown University, Providence, RI, 02912, USA
Mark Ainsworth & Ozan Tugluk
Argonne National Laboratory, Lemont, IL, 60439, USA
Todd Munson

Corresponding author

Correspondence to Qian Gong.

Editor information

Editors and Affiliations

Oak Ridge National Laboratory, Oak Ridge, TN, USA
Jeffrey Nichols
Oak Ridge National Laboratory, Oak Ridge, TN, USA
Arthur ‘Barney’ Maccabe
Oak Ridge National Laboratory, Oak Ridge, TN, USA
James Nutaro
Oak Ridge National Laboratory, Oak Ridge, TN, USA
Swaroop Pophale
Oak Ridge National Laboratory, Oak Ridge, TN, USA
Pravallika Devineni
Oak Ridge National Laboratory, Oak Ridge, TN, USA
Theresa Ahearn
Oak Ridge National Laboratory, Oak Ridge, TN, USA
Becky Verastegui

About this paper

Cite this paper

Gong, Q. et al. (2022). Maintaining Trust in Reduction: Preserving the Accuracy of Quantities of Interest for Lossy Compression. In: Nichols, J., et al. Driving Scientific and Engineering Discoveries Through the Integration of Experiment, Big Data, and Modeling and Simulation. SMC 2021. Communications in Computer and Information Science, vol 1512. Springer, Cham. https://doi.org/10.1007/978-3-030-96498-6_2

DOI: https://doi.org/10.1007/978-3-030-96498-6_2
Published: 10 March 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-96497-9
Online ISBN: 978-3-030-96498-6
eBook Packages: Computer Science, Computer Science (R0)
