research-article

Liszt: a domain specific language for building portable mesh-based PDE solvers

Authors:

Zachary DeVito,

Francisco Palacios,

Stephen Oakley,

Montserrat Medina,

Mike Barrientos,

Karthik Duraisamy,

Pat Hanrahan

SC '11: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis

Article No.: 9, Pages 1 - 12

https://doi.org/10.1145/2063384.2063396

Published: 12 November 2011

Abstract

Heterogeneous computers with processors and accelerators are becoming widespread in scientific computing. However, it is difficult to program hybrid architectures and there is no commonly accepted programming model. Ideally, applications should be written in a way that is portable to many platforms, but providing this portability for general programs is a hard problem.

By restricting the class of programs considered, we can make this portability feasible. We present Liszt, a domain-specific language for constructing mesh-based PDE solvers. We introduce language statements for interacting with an unstructured mesh, and storing data at its elements. Program analysis of these statements enables our compiler to expose the parallelism, locality, and synchronization of Liszt programs. Using this analysis, we generate applications for multiple platforms: a cluster, an SMP, and a GPU. This approach allows Liszt applications to perform within 12% of hand-written C++, scale to large clusters, and experience order-of-magnitude speedups on GPUs.
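To make the idea of storing data at mesh elements and exposing parallelism concrete, here is a minimal Python sketch. It is not Liszt syntax and not the paper's implementation: the toy mesh, the field names (`temperature`, `flux`), and the helpers `color_edges` and `edge_kernel` are all hypothetical. It assumes one possible strategy, greedy edge coloring, for grouping loop iterations so that no two edges in the same group write to the same vertex.

```python
from collections import defaultdict

# Hypothetical toy mesh: each edge is a pair of vertex ids (not Liszt syntax).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
num_vertices = 4

# Fields: one value stored per mesh element.
temperature = [10.0, 20.0, 30.0, 40.0]   # per-vertex field (read only)
flux = [0.0] * num_vertices              # per-vertex field (accumulated)


def color_edges(edge_list):
    """Greedily give each edge the smallest color not already used by
    another edge touching one of the same vertices."""
    used = defaultdict(set)   # vertex id -> colors of edges touching it
    colors = []
    for a, b in edge_list:
        c = 0
        while c in used[a] or c in used[b]:
            c += 1
        colors.append(c)
        used[a].add(c)
        used[b].add(c)
    return colors


def edge_kernel(edge_id):
    """Body of a 'for all edges' loop: read both endpoints, scatter to both."""
    a, b = edges[edge_id]
    d = temperature[b] - temperature[a]
    flux[a] += d
    flux[b] -= d


colors = color_edges(edges)
batches = defaultdict(list)
for edge_id, c in enumerate(colors):
    batches[c].append(edge_id)

# Edges in one batch touch disjoint vertices, so a parallel backend could
# run them concurrently; here the batches simply run one after another.
for c in sorted(batches):
    for edge_id in batches[c]:
        edge_kernel(edge_id)
```

Because the loop body only touches the two vertices of its own edge, the set of elements each iteration reads and writes can be derived from the mesh topology alone. That is the kind of information a restricted language can hand to a compiler, whereas a general-purpose program gives no such guarantee.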

Index Terms

1. Applied computing
  1. Physical sciences and engineering
    1. Aerospace
    2. Engineering
2. Software and its engineering
  1. Software notations and tools
    1. Compilers
      1. Source code generation
    2. Context specific languages
      1. Specialized application languages

Published In

SC '11: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis

November 2011

866 pages

ISBN:9781450307710

DOI:10.1145/2063384

Conference Chair: Scott Lathrop (University of Chicago)

Program Chairs: Jim Costa (Sandia National Laboratories), William Kramer (National Center for Supercomputing Applications)

Copyright © 2011 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGARCH: ACM Special Interest Group on Computer Architecture
IEEE-CS: Computer Society

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SC '11: International Conference for High Performance Computing, Networking, Storage and Analysis

November 12 - 18, 2011

Seattle, Washington

Acceptance Rates

SC '11 Paper Acceptance Rate 74 of 352 submissions, 21%;

Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Bibliometrics

Total Citations: 175

Total Downloads: 879

Downloads (Last 12 months): 42

Downloads (Last 6 weeks): 9

Reflects downloads up to 12 Jan 2025
