Legion: expressing locality and independence with logical regions

Authors:

Sean Treichler,

Elliott Slaughter,

Alex Aiken

SC '12: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis

Article No.: 66, Pages 1 - 11

Published: 10 November 2012

Abstract

Modern parallel architectures have both heterogeneous processors and deep, complex memory hierarchies. We present Legion, a programming model and runtime system for achieving high performance on these machines. Legion is organized around logical regions, which express both locality and independence of program data, and tasks, functions that perform computations on regions. We describe a runtime system that dynamically extracts parallelism from Legion programs, using a distributed, parallel scheduling algorithm that identifies both independent tasks and nested parallelism. Legion also enables explicit, programmer-controlled movement of data through the memory hierarchy and placement of tasks based on locality information via a novel mapping interface. We evaluate our Legion implementation on three applications: fluid flow on a regular grid, a three-level AMR code solving a heat diffusion equation, and a circuit simulation.
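To make the idea of extracting parallelism from region usage more concrete, the following is a minimal, single-process sketch in Python. It is not Legion's API and not the scheduling algorithm from the paper. It assumes that each task declares which elements it touches and whether it reads or writes them, and it treats two tasks as independent when their regions are disjoint or when both only read. All names here (`Privilege`, `RegionUse`, `Task`, `independent`, `schedule`) are hypothetical and chosen for illustration only; the actual runtime is distributed and considerably richer.

```python
from dataclasses import dataclass
from enum import Enum


class Privilege(Enum):
    READ = "read"
    WRITE = "write"


@dataclass(frozen=True)
class RegionUse:
    # Hypothetical stand-in for a logical region: the set of element indices touched.
    region: frozenset
    privilege: Privilege


@dataclass(frozen=True)
class Task:
    name: str
    uses: tuple  # tuple of RegionUse


def independent(a, b):
    """Two tasks are independent if every pair of uses is disjoint or read/read."""
    for ua in a.uses:
        for ub in b.uses:
            overlap = ua.region & ub.region
            both_read = (ua.privilege is Privilege.READ
                         and ub.privilege is Privilege.READ)
            if overlap and not both_read:
                return False
    return True


def schedule(tasks):
    """Place each task, in program order, in the earliest wave that follows
    every earlier task it conflicts with. Tasks in one wave may run in parallel."""
    waves = []
    level = {}
    for i, task in enumerate(tasks):
        lvl = 0
        for j in range(i):
            if not independent(tasks[j], task):
                lvl = max(lvl, level[j] + 1)
        level[i] = lvl
        if lvl == len(waves):
            waves.append([])
        waves[lvl].append(task.name)
    return waves


# Assumed example: an 8-cell grid split into two disjoint halves.
left = frozenset(range(0, 4))
right = frozenset(range(4, 8))
whole = left | right

tasks = [
    Task("update_left", (RegionUse(left, Privilege.WRITE),)),
    Task("update_right", (RegionUse(right, Privilege.WRITE),)),
    Task("sum_all", (RegionUse(whole, Privilege.READ),)),
    Task("max_all", (RegionUse(whole, Privilege.READ),)),
]

waves = schedule(tasks)
print(waves)
```

In this sketch the two update tasks land in the same wave because their regions are disjoint, and the two read-only tasks share the following wave because they overlap the updated data but do not conflict with each other.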

Cited By

Feng G, Xie J, Dong D, Lu Y (2024). UNR: Unified Notifiable RMA Library for HPC. Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, pp. 1-15. DOI: 10.1109/SC41406.2024.00111. Online publication date: 17-Nov-2024
https://dl.acm.org/doi/10.1109/SC41406.2024.00111
S. F. X. Teixeira T, Henzinger A, Yadav R, Aiken A, Mohror K, Arnold D, Badia R (2023). Automated Mapping of Task-Based Programs onto Distributed and Heterogeneous Machines. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-13. DOI: 10.1145/3581784.3607079. Online publication date: 12-Nov-2023
https://dl.acm.org/doi/10.1145/3581784.3607079
Shiina S, Taura K, Mohror K, Arnold D, Badia R (2023). Itoyori: Reconciling Global Address Space and Global Fork-Join Task Parallelism. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-15. DOI: 10.1145/3581784.3607049. Online publication date: 12-Nov-2023
https://dl.acm.org/doi/10.1145/3581784.3607049

Published In

SC '12: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis

November 2012

1161 pages

ISBN:9781467308045

General Chair:
Jeffrey K. Hollingsworth
University of Maryland

Publisher

IEEE Computer Society Press

Washington, DC, United States

Publication History

Published: 10 November 2012

Qualifiers

Research-article

Conference

SC '12

Sponsor:

SIGHPC
SIGARCH
IEEE-CS

SC '12: International Conference for High Performance Computing, Networking, Storage and Analysis

November 10 - 16, 2012

Salt Lake City, Utah

Acceptance Rates

SC '12 Paper Acceptance Rate 100 of 461 submissions, 22%;

Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Bibliometrics

Total Citations: 122
Total Downloads: 533
Downloads (Last 12 months): 19
Downloads (Last 6 weeks): 3

Reflects downloads up to 13 Nov 2024

Citations

Cited By

Bauer M, Slaughter E, Treichler S, Lee W, Garland M, Aiken A, Dehnavi M, Kulkarni M, Krishnamoorthy S (2023). Visibility Algorithms for Dynamic Dependence Analysis and Distributed Coherence. Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, pp. 218-231. DOI: 10.1145/3572848.3577515. Online publication date: 25-Feb-2023
https://dl.acm.org/doi/10.1145/3572848.3577515
Caheny P, Alvarez L, Casas M, Moreto M, Wolf F, Shende S, Culhane C, Alam S, Jagode H (2022). TD-NUCA. Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, pp. 1-15. DOI: 10.5555/3571885.3571991. Online publication date: 13-Nov-2022
https://dl.acm.org/doi/10.5555/3571885.3571991
Ramos E, White S, Bhosale A, Kale L (2022). Runtime Techniques for Automatic Process Virtualization. Workshop Proceedings of the 51st International Conference on Parallel Processing, pp. 1-10. DOI: 10.1145/3547276.3548522. Online publication date: 29-Aug-2022
https://dl.acm.org/doi/10.1145/3547276.3548522
Holmen J, Sahasrabudhe D, Berzins M, Robinson T (2022). Porting uintah to heterogeneous systems. Proceedings of the Platform for Advanced Scientific Computing Conference, pp. 1-10. DOI: 10.1145/3539781.3539794. Online publication date: 27-Jun-2022
https://dl.acm.org/doi/10.1145/3539781.3539794
Yadav R, Aiken A, Kjolstad F, Jhala R, Dillig I (2022). DISTAL: the distributed tensor algebra compiler. Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, pp. 286-300. DOI: 10.1145/3519939.3523437. Online publication date: 9-Jun-2022
https://dl.acm.org/doi/10.1145/3519939.3523437
Awar N, Jain K, Rossbach C, Gligoric M (2021). Programming and execution models for parallel bounded exhaustive testing. Proceedings of the ACM on Programming Languages, 5:OOPSLA, pp. 1-28. DOI: 10.1145/3485543. Online publication date: 15-Oct-2021
https://dl.acm.org/doi/10.1145/3485543
Cartier H, Dinan J, Larkins D (2021). Optimizing Work Stealing Communication with Structured Atomic Operations. Proceedings of the 50th International Conference on Parallel Processing, pp. 1-10. DOI: 10.1145/3472456.3472522. Online publication date: 9-Aug-2021
https://dl.acm.org/doi/10.1145/3472456.3472522
