research-article
Open access

Verifying spatial properties of array computations

Published: 12 October 2017

Abstract

Array computations are at the core of numerical modelling and computational science applications. However, low-level manipulation of array indices is a source of program error. Many practitioners are aware of the need to ensure program correctness, yet very few of the techniques from the programming research community are applied by scientists. We aim to change that by providing targeted lightweight verification techniques for scientific code. We focus on the all-too-common mistake of array offset errors, a generalisation of off-by-one errors. First, we report on a code analysis study of eleven real-world computational science code bases, identifying common idioms of array usage and their spatial properties. This provides much-needed data on array programming idioms common in scientific code. From this data, we designed a lightweight declarative specification language capturing the majority of array access patterns via a small set of combinators. We detail a semantic model, and the design and implementation of a verification tool for our specification language, which both checks and infers specifications. We evaluate our tool on our corpus of scientific code. Using the inference mode, we found roughly 87,000 targets for specification across roughly 1.1 million lines of code, showing that the vast majority of array computations read from arrays in a pattern with a simple, regular, static shape. We also studied the commit logs of one of our corpus packages, finding past bug fixes for which our specification system distinguishes the change and thus could have been applied to detect such bugs.
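
To make the idea of an array offset error concrete, here is a minimal sketch in Python. It is not the paper's specification language or tool: the names `centered`, `forward`, `infer_offsets`, and `check` are hypothetical, and the sketch assumes a one-dimensional array whose reads are described only by their offsets relative to the loop index.

```python
# Illustrative sketch only: hypothetical helpers, not the paper's syntax or tool.
# A "specification" here is just the set of relative offsets a loop body is
# expected to read from an array, e.g. a(i-1), a(i), a(i+1) -> {-1, 0, 1}.

def centered(depth):
    """Hypothetical combinator: offsets -depth..depth around the loop index."""
    return set(range(-depth, depth + 1))

def forward(depth):
    """Hypothetical combinator: offsets 0..depth ahead of the loop index."""
    return set(range(0, depth + 1))

def infer_offsets(accesses):
    """Infer the access pattern as the set of distinct relative offsets read."""
    return set(accesses)

def check(spec, accesses):
    """Check that the inferred access pattern matches the declared spec exactly."""
    return infer_offsets(accesses) == spec

# Intended three-point average: b(i) = (a(i-1) + a(i) + a(i+1)) / 3
intended = [-1, 0, 1]

# Offset error: a(i+1) was mistyped as a(i+2). The loop may still run and
# produce plausible numbers, so the mistake is easy to miss by inspection.
buggy = [-1, 0, 2]

ok_intended = check(centered(1), intended)
ok_buggy = check(centered(1), buggy)
```

In this sketch `ok_intended` is true and `ok_buggy` is false: the declared shape distinguishes the two loop bodies even though both are valid programs, which is the kind of distinction the abstract describes for past bug fixes.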

Supplementary Material

Auxiliary Archive (oopsla17-oopsla85-aux.zip)

Published In

Proceedings of the ACM on Programming Languages, Volume 1, Issue OOPSLA
October 2017
1786 pages
EISSN:2475-1421
DOI:10.1145/3152284
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. arrays
  2. scientific computing
  3. static-analysis
  4. stencils
  5. verification

Qualifiers

  • Research-article
