The high performance Fortran handbook
January 1994
Publisher:
  • MIT Press
  • 55 Hayward St.
  • Cambridge
  • MA
  • United States
ISBN: 978-0-262-61094-0
Published: 02 January 1994
Pages: 329
Abstract

No abstract available.

Cited By

  1. Nakao M, Murai H, Iwashita H, Boku T and Sato M (2019). Implementation and evaluation of the HPC challenge benchmark in the XcalableMP PGAS language, International Journal of High Performance Computing Applications, 33:1, (110-123), Online publication date: 1-Jan-2019.
  2. Nakao M, Murai H, Boku T and Sato M Linkage of XcalableMP and Python languages for high productivity on HPC cluster system Proceedings of Workshops of HPC Asia, (39-47)
  3. Kaler T, Hasenplaugh W, Schardl T and Leiserson C (2016). Executing Dynamic Data-Graph Computations Deterministically Using Chromatic Scheduling, ACM Transactions on Parallel Computing, 3:1, (1-31), Online publication date: 28-Jun-2016.
  4. Madarkar J, Chand V, Reddy P and Arora H (2016). A Novel Algorithmic Approach for an Automatic Data Placement for NUMA Based Design, Procedia Computer Science, 78:C, (276-283), Online publication date: 1-Mar-2016.
  5. De Wael M, Marr S, De Fraine B, Van Cutsem T and De Meuter W (2015). Partitioned Global Address Space Languages, ACM Computing Surveys, 47:4, (1-27), Online publication date: 21-Jul-2015.
  6. Cabezas J, Vilanova L, Gelado I, Jablin T, Navarro N and Hwu W Automatic Parallelization of Kernels in Shared-Memory Multi-GPU Nodes Proceedings of the 29th ACM on International Conference on Supercomputing, (3-13)
  7. Nakao M, Murai H, Shimosaka T, Tabuchi A, Hanawa T, Kodama Y, Boku T and Sato M XcalableACC Proceedings of the First Workshop on Accelerator Programming using Directives, (27-36)
  8. Kaler T, Hasenplaugh W, Schardl T and Leiserson C Executing dynamic data-graph computations deterministically using chromatic scheduling Proceedings of the 26th ACM symposium on Parallelism in algorithms and architectures, (154-165)
  9. Hiranandani S, Kennedy K, Mellor-Crummey J and Sethi A Compilation techniques for block-cyclic distributions ACM International Conference on Supercomputing 25th Anniversary Volume, (205-216)
  10. Stone A and Mills Strout M Abstractions to separate concerns in semi-regular grids Proceedings of the 27th international ACM conference on International conference on supercomputing, (3-12)
  11. Nakao M, Lee J, Boku T and Sato M Productivity and Performance of Global-View Programming with XcalableMP PGAS Language Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012), (402-409)
  12. Smith A and Kulkarni P Localizing globals and statics to make C programs thread-safe Proceedings of the 14th international conference on Compilers, architectures and synthesis for embedded systems, (205-214)
  13. Delorimier M, Kapre N, Mehta N and Dehon A (2011). Spatial hardware implementation for sparse graph algorithms in GraphStep, ACM Transactions on Autonomous and Adaptive Systems (TAAS), 6:3, (1-20), Online publication date: 1-Sep-2011.
  14. Hari P, McCabe J, Banafato J, Henry M, Ko K, Koukoumidis E, Kremer U, Martonosi M and Peh L Adaptive spatiotemporal node selection in dynamic networks Proceedings of the 19th international conference on Parallel architectures and compilation techniques, (227-236)
  15. Yang L, Yu L, Tang J, Wang L, Zhao J and Li X Enabling multi-core based monitoring and fault tolerance in C++/Java Proceedings of the 3rd International Workshop on Multicore Software Engineering, (32-39)
  16. Hong H and Lee H APJava Proceedings of the 2009 International Conference on Hybrid Information Technology, (530-536)
  17. Frigo M, Halpern P, Leiserson C and Lewin-Berlin S Reducers and other Cilk++ hyperobjects Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures, (79-90)
  18. Broquedis F, Furmento N, Goglin B, Namyst R and Wacrenier P Dynamic Task and Data Placement over NUMA Architectures Proceedings of the 5th International Workshop on OpenMP: Evolving OpenMP in an Age of Extreme Parallelism, (79-92)
  19. Kulkarni M, Carribault P, Pingali K, Ramanarayanan G, Walter B, Bala K and Chew L Scheduling strategies for optimistic parallel execution of irregular programs Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures, (217-228)
  20. Zea N, Sartori J and Kumar R (2008). Servo, ACM SIGARCH Computer Architecture News, 36:2, (28-37), Online publication date: 1-May-2008.
  21. Qin J and Fahringer T Advanced data flow support for scientific grid workflow applications Proceedings of the 2007 ACM/IEEE conference on Supercomputing, (1-12)
  22. Woo Son S, Chen G, Ozturk O, Kandemir M and Choudhary A (2007). Compiler-Directed Energy Optimization for Parallel Disk Based Systems, IEEE Transactions on Parallel and Distributed Systems, 18:9, (1241-1257), Online publication date: 1-Sep-2007.
  23. Travinin Bliss N and Kepner J (2007). pMatlab Parallel Matlab Library, International Journal of High Performance Computing Applications, 21:3, (336-359), Online publication date: 1-Aug-2007.
  24. Huang C and Kale L Charisma Proceedings of the 16th international symposium on High performance distributed computing, (75-84)
  25. Sussman A (2006). Building complex coupled physical simulations on the grid with InterComm, Engineering with Computers, 22:3-4, (311-323), Online publication date: 1-Dec-2006.
  26. Chavarría-Miranda D, Nieplocha J and Tipparaju V Topology-aware tile mapping for clusters of SMPs Proceedings of the 3rd conference on Computing frontiers, (383-392)
  27. Kohl J, Wilde T and Bernholdt D (2006). Cumulvs, International Journal of High Performance Computing Applications, 20:2, (255-285), Online publication date: 1-May-2006.
  28. Huang C, Lee C and Kalé L Support for adaptivity in ARMCI using migratable objects Proceedings of the 20th international conference on Parallel and distributed processing, (383-383)
  29. Basumallik A and Eigenmann R Towards automatic translation of OpenMP to MPI Proceedings of the 19th annual international conference on Supercomputing, (189-198)
  30. Yang S, Butt A, Hu Y and Midkiff S Trust but verify Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming, (196-205)
  31. Grelck C (2005). Shared memory multiprocessor support for functional array processing in SAC, Journal of Functional Programming, 15:3, (353-401), Online publication date: 1-May-2005.
  32. Darte A and Huard G (2018). New Complexity Results on Array Contraction and Related Problems, Journal of VLSI Signal Processing Systems, 40:1, (35-55), Online publication date: 1-May-2005.
  33. Thomas A and Olukotun K An Application Analysis Framework For Polymorphic Chip Multiprocessors Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01
  34. Lee J and Sussman A High Performance Communication between Parallel Programs Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 4 - Volume 05
  35. Bertrand F, Bramley R, Sussman A, Bernholdt D, Kohl J, Larson J and Damevski K Data Redistribution and Remote Method Invocation in Parallel Component Architectures Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01
  36. Rauber T and Runger G A Data-Re-Distribution Library for Multi-Processor Task Programming Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 8 - Volume 09
  37. Kandemir M, Chen G, Li F and Demirkiran I Using data replication to reduce communication energy on chip multiprocessors Proceedings of the 2005 Asia and South Pacific Design Automation Conference, (769-772)
  38. Ozturk O, Kandemir M, Chen G, Irwin M and Karakoy M Customized on-chip memories for embedded chip multiprocessors Proceedings of the 2005 Asia and South Pacific Design Automation Conference, (743-748)
  39. Dotsenko Y, Coarfa C and Mellor-Crummey J A Multi-Platform Co-Array Fortran Compiler Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques, (29-40)
  40. Kandemir M, Ozturk O and Karakoy M Dynamic on-chip memory management for chip multiprocessors Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems, (14-23)
  41. Chang R, Chuang T and Lee J (2018). Support and optimization for parallel sparse programs with array intrinsics of Fortran 90, Parallel Computing, 30:4, (527-550), Online publication date: 1-Apr-2004.
  42. Hwang G (2018). An efficient algorithm for communication set generation of data parallel programs with block-cyclic distribution, Parallel Computing, 30:4, (473-501), Online publication date: 1-Apr-2004.
  43. Guo M, Pan Y and Liu Z (2019). Symbolic Communication Set Generation for Irregular Parallel Applications, The Journal of Supercomputing, 25:3, (199-214), Online publication date: 1-Jul-2003.
  44. Díaz M, Rubio B, Soler E and Troya J (2003). Domain interaction patterns to coordinate HPF tasks, Parallel Computing, 29:7, (925-951), Online publication date: 1-Jul-2003.
  45. Hwang G, Chen C, Lee J and Dz-Ching Ju R (2019). Segmented Alignment, The Journal of Supercomputing, 25:1, (17-41), Online publication date: 1-May-2003.
  46. Addison C, Ren Y and van Waveren M (2018). OpenMP issues arising in the development of parallel BLAS and LAPACK libraries, Scientific Programming, 11:2, (95-104), Online publication date: 1-Apr-2003.
  47. Dongarra J, Foster I, Fox G, Gropp W, Kennedy K, Torczon L and White A References Sourcebook of parallel computing, (729-789)
  48. Padua D and Hoeflinger J Supercomputers Encyclopedia of Computer Science, (1710-1718)
  49. Quinn M, Miller R, Miller R and Quinn M Parallel processing Encyclopedia of Computer Science, (1349-1365)
  50. Chatterjee S, Lebeck A, Patnala P and Thottethodi M (2002). Recursive Array Layouts and Fast Matrix Multiplication, IEEE Transactions on Parallel and Distributed Systems, 13:11, (1105-1123), Online publication date: 1-Nov-2002.
  51. Wallcraft A (2019). A Comparison of Co-Array Fortran and OpenMP Fortran for SPMD Programming, The Journal of Supercomputing, 22:3, (231-250), Online publication date: 1-Jul-2002.
  52. Zoppetti G, Agrawal G and Kumar R Compiler and Runtime Support for Irregular Reductions on a Multithreaded Architecture Proceedings of the 16th International Parallel and Distributed Processing Symposium
  53. Grelck C Implementing the NAS Benchmark MG in SAC Proceedings of the 16th International Parallel and Distributed Processing Symposium
  54. Díaz M, Rubio B, Soler E and Troya J (2002). A Border-based Coordination Language for Integrating Task and Data Parallelism, Journal of Parallel and Distributed Computing, 62:4, (715-740), Online publication date: 1-Apr-2002.
  55. Joisha P and Banerjee P (2001). The Efficient Computation of Ownership Sets in HPF, IEEE Transactions on Parallel and Distributed Systems, 12:8, (769-788), Online publication date: 1-Aug-2001.
  56. Das R, Hwang Y, Saltz J and Sussman A Runtime and compiler support for irregular computations Compiler optimizations for scalable parallel systems, (751-778)
  57. Ramanujam J Integer lattice based methods for local address generation for block-cyclic distributions Compiler optimizations for scalable parallel systems, (597-645)
  58. Adve V and Mellor-Crummey J Advanced code generation for high performance Fortran Compiler optimizations for scalable parallel systems, (553-596)
  59. Palermo D, Hodges E and Banerjee P Compiler optimization of dynamic data distributions for distributed-memory multicomputers Compiler optimizations for scalable parallel systems, (445-484)
  60. Kennedy K and Koelbel C High performance Fortran 2.0 Compiler optimizations for scalable parallel systems, (3-43)
  61. Garcia J, Ayguadé E and Labarta J (2001). A Framework for Integrating Data Alignment, Distribution, and Redistribution in Distributed Memory Multiprocessors, IEEE Transactions on Parallel and Distributed Systems, 12:4, (416-431), Online publication date: 1-Apr-2001.
  62. Diaz M, Rubio B, Soler E and Troya J DIP Proceedings of the 2001 ACM symposium on Applied computing, (148-150)
  63. Chang R, Chuang T and Lee J (2019). Parallel Sparse Supports for Array Intrinsic Functions of Fortran 90, The Journal of Supercomputing, 18:3, (305-339), Online publication date: 1-Mar-2001.
  64. Kandemir M, Choudhary A, Banerjee P, Ramanujam J and Shenoy N (2000). Minimizing Data and Synchronization Costs in One-Way Communication, IEEE Transactions on Parallel and Distributed Systems, 11:12, (1232-1251), Online publication date: 1-Dec-2000.
  65. Shih K, Sheu J and Chang C (2019). Efficient Address Generation for Affine Subscripts in Data-Parallel Programs, The Journal of Supercomputing, 17:2, (205-227), Online publication date: 1-Sep-2000.
  66. Bräunl T (2000). Parallaxis-III, IEEE Transactions on Software Engineering, 26:3, (227-243), Online publication date: 1-Mar-2000.
  67. Lain A, Chakrabarti D and Banerjee P (2000). Compiler and Run-Time Support for Exploiting Regularity within Irregular Applications, IEEE Transactions on Parallel and Distributed Systems, 11:2, (119-135), Online publication date: 1-Feb-2000.
  68. Park N, Prasanna V and Raghavendra C (1999). Efficient Algorithms for Block-Cyclic Array Redistribution Between Processor Sets, IEEE Transactions on Parallel and Distributed Systems, 10:12, (1217-1240), Online publication date: 1-Dec-1999.
  69. Petitet A and Dongarra J (1999). Algorithmic Redistribution Methods for Block-Cyclic Decompositions, IEEE Transactions on Parallel and Distributed Systems, 10:12, (1201-1216), Online publication date: 1-Dec-1999.
  70. Kandemir M, Banerjee P, Choudhary A, Ramanujam J and Shenoy N (1999). A global communication optimization technique based on data-flow analysis and linear algebra, ACM Transactions on Programming Languages and Systems (TOPLAS), 21:6, (1251-1297), Online publication date: 1-Nov-1999.
  71. Scherer A, Lu H, Gross T and Zwaenepoel W (1999). Transparent adaptive parallelism on NOWs using OpenMP, ACM SIGPLAN Notices, 34:8, (96-106), Online publication date: 1-Aug-1999.
  72. McCurdy C and Mellor-Crummey J (1999). An evaluation of computing paradigms for N-body simulations on distributed memory architectures, ACM SIGPLAN Notices, 34:8, (25-36), Online publication date: 1-Aug-1999.
  73. Chatterjee S, Jain V, Lebeck A, Mundhra S and Thottethodi M Nonlinear array layouts for hierarchical memory systems Proceedings of the 13th international conference on Supercomputing, (444-453)
  74. Chatterjee S, Lebeck A, Patnala P and Thottethodi M Recursive array layouts and fast parallel matrix multiplication Proceedings of the eleventh annual ACM symposium on Parallel algorithms and architectures, (222-231)
  75. Mache J, Lo V, Livingston M and Garg S The impact of spatial layout of jobs on parallel I/O performance Proceedings of the sixth workshop on I/O in parallel and distributed systems, (45-56)
  76. Peyton Jones S, Reid A, Henderson F, Hoare T and Marlow S (1999). A semantics for imprecise exceptions, ACM SIGPLAN Notices, 34:5, (25-36), Online publication date: 1-May-1999.
  77. Peyton Jones S, Reid A, Henderson F, Hoare T and Marlow S A semantics for imprecise exceptions Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation, (25-36)
  78. Scherer A, Lu H, Gross T and Zwaenepoel W Transparent adaptive parallelism on NOWs using OpenMP Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming, (96-106)
  79. McCurdy C and Mellor-Crummey J An evaluation of computing paradigms for N-body simulations on distributed memory architectures Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming, (25-36)
  80. MacBeth M, McGuigan K and Hatcher P Executing Java threads in parallel in a distributed-memory environment Proceedings of the 1998 conference of the Centre for Advanced Studies on Collaborative research
  81. Adve V, Jin G, Mellor-Crummey J and Yi Q High performance Fortran compilation techniques for parallelizing scientific codes Proceedings of the 1998 ACM/IEEE conference on Supercomputing, (1-23)
  82. Park N, Prasanna V and Raghavendra C Efficient algorithms for block-cyclic array redistribution between processor sets Proceedings of the 1998 ACM/IEEE conference on Supercomputing, (1-13)
  83. Abdelrahman T and Wong T (1998). Compiler Support for Array Distribution on NUMA Shared Memory Multiprocessors, The Journal of Supercomputing, 12:4, (349-371), Online publication date: 1-Oct-1998.
  84. Kohl J and Papadopoulos P Efficient and flexible fault tolerance and migration of scientific simulations using CUMULVS Proceedings of the SIGMETRICS symposium on Parallel and distributed tools, (60-71)
  85. Müller M, Warschko T and Tichy W Prefetching on the Cray-T3E Proceedings of the 12th international conference on Supercomputing, (361-368)
  86. Coddington P and Ko S Techniques for empirical testing of parallel random number generators Proceedings of the 12th international conference on Supercomputing, (282-288)
  87. Kennedy K and Kremer U (1998). Automatic data layout for distributed-memory machines, ACM Transactions on Programming Languages and Systems (TOPLAS), 20:4, (869-916), Online publication date: 1-Jul-1998.
  88. Agrawal G (1998). Interprocedural Partial Redundancy Elimination With Application to Distributed Memory Compilation, IEEE Transactions on Parallel and Distributed Systems, 9:7, (609-625), Online publication date: 1-Jul-1998.
  89. Adve V and Mellor-Crummey J (1998). Using integer sets for data-parallel program analysis and optimization, ACM SIGPLAN Notices, 33:5, (186-198), Online publication date: 1-May-1998.
  90. Adve V and Mellor-Crummey J Using integer sets for data-parallel program analysis and optimization Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation, (186-198)
  91. Moreira J and Midkiff S (1998). Fortran 90 in CSE, IEEE Computational Science & Engineering, 5:2, (39-49), Online publication date: 1-Apr-1998.
  92. Desprez F, Randriamaro C, Dongarra J, Petitet A and Robert Y (1998). Scheduling Block-Cyclic Array Redistribution, IEEE Transactions on Parallel and Distributed Systems, 9:2, (192-205), Online publication date: 1-Feb-1998.
  93. Kandemir M, Choudhary A, Ramanujam J and Kandaswamy M A unified compiler algorithm for optimizing locality, parallelism and communication in out-of-core computations Proceedings of the fifth workshop on I/O in parallel and distributed systems, (79-92)
  94. Nucciarone J, Özyörük Y and Long L New life in dusty decks Proceedings of the 1997 ACM/IEEE conference on Supercomputing, (1-19)
  95. Kennedy K, Bender C, Connolly J, Hennessy J, Vernon M and Smarr L (1997). A nationwide parallel computing environment, Communications of the ACM, 40:11, (62-72), Online publication date: 1-Nov-1997.
  96. Ramaswamy S, Sapatnekar S and Banerjee P (1997). A Framework for Exploiting Task and Data Parallelism on Distributed Memory Multicomputers, IEEE Transactions on Parallel and Distributed Systems, 8:11, (1098-1116), Online publication date: 1-Nov-1997.
  97. Nieplocha J and Harrison R (2019). Shared Memory Programming in Metacomputing Environments, The Journal of Supercomputing, 11:2, (119-136), Online publication date: 1-Oct-1997.
  98. Geist G, Kohl J and Papadopoulos P (1997). Cumulvs, International Journal of High Performance Computing Applications, 11:3, (224-235), Online publication date: 1-Sep-1997.
  99. Lim Y, Park N and Prasanna V Efficient Algorithms for Multi-dimensional Block-Cyclic Redistribution of Arrays Proceedings of the international Conference on Parallel Processing, (234-241)
  100. Lee J, Ho D and Chuang Y Data Distribution Analysis and Optimization for Pointer-Based Distributed Programs Proceedings of the international Conference on Parallel Processing, (56-63)
  101. Tandri S and Abdelrahman T Automatic Partitioning of Data and Computations on Scalable Shared Memory Multiprocessors Proceedings of the international Conference on Parallel Processing, (64-73)
  102. Lee P (1997). Efficient Algorithms for Data Distribution on Distributed Memory Parallel Computers, IEEE Transactions on Parallel and Distributed Systems, 8:8, (825-839), Online publication date: 1-Aug-1997.
  103. Ancourt C, Barthou D, Guettier C, Irigoin F, Jeannet B, Jourdan J and Mattioli J Automatic data mapping of signal processing applications Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures and Processors
  104. Coelho F (1997). Compiling dynamic mappings with array copies, ACM SIGPLAN Notices, 32:7, (168-179), Online publication date: 1-Jul-1997.
  105. Subhlok J and Yang B (1997). A new model for integrated nested task and data parallel programming, ACM SIGPLAN Notices, 32:7, (1-12), Online publication date: 1-Jul-1997.
  106. Coelho F Compiling dynamic mappings with array copies Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming, (168-179)
  107. Subhlok J and Yang B A new model for integrated nested task and data parallel programming Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming, (1-12)
  108. Moreira J and Naik V (2019). Dynamic resource management on distributed systems using reconfigurable applications, IBM Journal of Research and Development, 41:3, (303-330), Online publication date: 1-May-1997.
  109. Chandra R, Chen D, Cox R, Maydan D, Nedeljkovic N and Anderson J (1997). Data distribution support on distributed shared memory multiprocessors, ACM SIGPLAN Notices, 32:5, (334-345), Online publication date: 1-May-1997.
  110. Chandra R, Chen D, Cox R, Maydan D, Nedeljkovic N and Anderson J Data distribution support on distributed shared memory multiprocessors Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation, (334-345)
  111. Koo M, Park S, Yook H and Park M A transformation method to reduce loop overhead in HPF compiler Proceedings of the High-Performance Computing on the Information Superhighway, HPC-Asia '97
  112. Mak K and Chan R Parallel Implementation of 2-Dimensional Toeplitz Solver on MasPar with Applications to Image Restoration Proceedings of the High-Performance Computing on the Information Superhighway, HPC-Asia '97
  113. Melin E, Raffin B, Rebeuf X and Virot B SCL-chan Proceedings of the 1997 Workshop on High-Level Programming Models and Supportive Environments (HIPS '97)
  114. Kandemir M, Bordawekar R and Choudhary A Data Access Reorganizations in Compiling Out-of-Core Data Parallel Programs on Distributed Memory Machines Proceedings of the 11th International Symposium on Parallel Processing
  115. Edjlali G, Sussman A and Saltz J Interoperability of Data Parallel Runtime Libraries Proceedings of the 11th International Symposium on Parallel Processing, (451-459)
  116. Keleher P and Tseng C Enhancing Software DSM for Compiler-Parallelized Applications Proceedings of the 11th International Symposium on Parallel Processing, (490-499)
  117. Gupta M On Privatization of Variables for Data-Parallel Execution Proceedings of the 11th International Symposium on Parallel Processing, (533-541)
  118. Koo M, Park S, Yook H and Park M A New Transformation Method to Generate Optimized DO Loop from FORALL Construct Proceedings of the 2nd AIZU International Symposium on Parallel Algorithms / Architecture Synthesis
  119. Gillett R and Kaufmann R (1997). Using the Memory Channel Network, IEEE Micro, 17:1, (19-25), Online publication date: 1-Jan-1997.
  120. Akarsu E, Dincer K, Haupt T and Fox G Particle-in-cell simulation codes in High Performance Fortran Proceedings of the 1996 ACM/IEEE conference on Supercomputing, (38-es)
  121. Foster I, Kohr D, Krishnaiyer R and Choudhary A Double standards Proceedings of the 1996 ACM/IEEE conference on Supercomputing, (36-es)
  122. Blackford L, Choi J, Cleary A, Petitet A, Whaley R, Demmel J, Dhillon I, Stanley K, Dongarra J, Hammarling S, Henry G and Walker D ScaLAPACK Proceedings of the 1996 ACM/IEEE conference on Supercomputing, (5-es)
  123. Kaushik S, Huang C and Sadayappan P (2019). Efficient Index Set Generation for Compiling HPF Array Statements on Distributed-Memory Machines, Journal of Parallel and Distributed Computing, 38:2, (237-247), Online publication date: 1-Nov-1996.
  124. Hall M, Hiranandani S, Kennedy K and Tseng C (2019). Interprocedural Compilation of Fortran D, Journal of Parallel and Distributed Computing, 38:2, (114-129), Online publication date: 1-Nov-1996.
  125. Wilson G and Bal H (1996). Using the Cowichan Problems to Assess the Usability of Orca, IEEE Parallel & Distributed Technology: Systems & Technology, 4:3, (36-44), Online publication date: 1-Sep-1996.
  126. Goldstein S, Schauser K and Culler D (2019). Lazy Threads, Journal of Parallel and Distributed Computing, 37:1, (5-20), Online publication date: 25-Aug-1996.
  127. Dincer K, Fox G and Hawick K High Performance Fortran and Possible Extensions to Support Conjugate Gradient Algorithms Proceedings of the 5th IEEE International Symposium on High Performance Distributed Computing
  128. Fang N Engineering Parallel Algorithms Proceedings of the 5th IEEE International Symposium on High Performance Distributed Computing
  129. Dongarra J, Otto S, Snir M and Walker D (1996). A message passing standard for MPP and workstations, Communications of the ACM, 39:7, (84-90), Online publication date: 1-Jul-1996.
  130. Thakur R, Choudhary A and Ramanujam J (1996). Efficient Algorithms for Array Redistribution, IEEE Transactions on Parallel and Distributed Systems, 7:6, (587-594), Online publication date: 1-Jun-1996.
  131. Toledo S Performance Prediction with Benchmaps Proceedings of the 10th International Parallel Processing Symposium, (479-485)
  132. Saini S NAS Experiences of Porting CM Fortran Codes to HPF on IBM SP2 and SGI Power Challenge Proceedings of the 10th International Parallel Processing Symposium, (878-880)
  133. McMahon J and Teitelbaum K Space-Time Adaptive Processing on the Mesh Synchronous Processor Proceedings of the 10th International Parallel Processing Symposium, (734-740)
  134. Bae S and Ranka S PACK/UNPACK on Coarse-Grained Distributed Memory Parallel Machines Proceedings of the 10th International Parallel Processing Symposium, (320-324)
  135. Kaufmann R and Reddin T Digital's clusters and scientific parallel applications Proceedings of the 41st IEEE International Computer Conference
  136. Amza C, Cox A, Dwarkadas S, Keleher P, Lu H, Rajamony R, Yu W and Zwaenepoel W (1996). TreadMarks, Computer, 29:2, (18-28), Online publication date: 1-Feb-1996.
  137. Hood R The p2d2 project Proceedings of the SIGMETRICS symposium on Parallel and distributed tools, (127-136)
  138. Acharya A Eliminating redundant barrier synchronizations in rule-based programs Proceedings of the 10th international conference on Supercomputing, (325-332)
  139. Ranganathan M, Acharya A, Edjlali G, Sussman A and Saltz J Runtime coupling of data-parallel programs Proceedings of the 10th international conference on Supercomputing, (229-236)
  140. Krishnan S and Kale L Automating parallel runtime optimizations using post-mortem analysis Proceedings of the 10th international conference on Supercomputing, (221-228)
  141. Lain A and Banerjee P Compiler support for hybrid irregular accesses on multicomputers Proceedings of the 10th international conference on Supercomputing, (1-9)
  142. Gupta M, Midkiff S, Schonberg E, Seshadri V, Shields D, Wang K, Ching W and Ngo T An HPF compiler for the IBM SP2 Proceedings of the 1995 ACM/IEEE conference on Supercomputing, (71-es)
  143. Adve V, Mellor-Crummey J, Anderson M, Wang J, Reed D and Kennedy K An integrated compilation and performance analysis environment for data parallel programs Proceedings of the 1995 ACM/IEEE conference on Supercomputing, (50-es)
  144. Agrawal G and Saltz J Interprocedural compilation of irregular applications for distributed memory machines Proceedings of the 1995 ACM/IEEE conference on Supercomputing, (48-es)
  145. Banerjee P, Chandy J, Gupta M, Hodges IV E, Holm J, Lain A, Palermo D, Ramaswamy S and Su E (1995). The Paradigm Compiler for Distributed-Memory Multicomputers, Computer, 28:10, (37-47), Online publication date: 1-Oct-1995.
  146. Kennedy K, Nedeljkovic N and Sethi A (1995). A linear-time algorithm for computing the memory access sequence in data-parallel programs, ACM SIGPLAN Notices, 30:8, (102-111), Online publication date: 1-Aug-1995.
  147. Kennedy K, Nedeljkovic N and Sethi A A linear-time algorithm for computing the memory access sequence in data-parallel programs Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming, (102-111)
  148. Su E, Lain A, Ramaswamy S, Palermo D, Hodges E and Banerjee P Advanced compilation techniques in the PARADIGM compiler for distributed-memory multicomputers Proceedings of the 9th international conference on Supercomputing, (424-433)
  149. Bergmark D Optimization and parallelization of a commodity trade model for the IBM SP1/2, using parallel programming tools Proceedings of the 9th international conference on Supercomputing, (227-236)
  150. Kennedy K, Nedeljkovic N and Sethi A Efficient address generation for block-cyclic distributions Proceedings of the 9th international conference on Supercomputing, (180-184)
  151. Müller A and Rühl R Extending high performance Fortran for the support of unstructured computations Proceedings of the 9th international conference on Supercomputing, (127-136)
  152. Agrawal G, Sussman A and Saltz J (1995). An Integrated Runtime and Compile-Time Approach for Parallelizing Structured and Block Structured Applications, IEEE Transactions on Parallel and Distributed Systems, 6:7, (747-754), Online publication date: 1-Jul-1995.
  153. Das R, Wu J, Saltz J, Berryman H and Hiranandani S (1995). Distributed Memory Compiler Design For Sparse Problems, IEEE Transactions on Computers, 44:6, (737-753), Online publication date: 1-Jun-1995.
  154. Merlin J and Hey A (2018). An Introduction to High Performance Fortran, Scientific Programming, 4:2, (87-113), Online publication date: 1-Apr-1995.
  155. Chatterjee S, Gilbert J, Long F, Schreiber R and Teng S (2019). Generating Local Addresses and Communication Sets for Data-Parallel Programs, Journal of Parallel and Distributed Computing, 26:1, (72-84), Online publication date: 1-Apr-1995.
  156. Ponnusamy R, Hwang Y, Das R, Saltz J, Choudhary A and Fox G (1995). Supporting Irregular Distributions Using Data-Parallel Languages, IEEE Parallel & Distributed Technology: Systems & Technology, 3:1, (12-24), Online publication date: 1-Mar-1995.
  157. Cheng D and Hood R A portable debugger for parallel and distributed programs Proceedings of the 1994 ACM/IEEE conference on Supercomputing, (723-732)
  158. Sharma S, Ponnusamy R, Moon B, Hwang Y, Das R and Saltz J Run-time and compile-time support for adaptive irregular problems Proceedings of the 1994 ACM/IEEE conference on Supercomputing, (97-106)
  159. Ching W and Katz A An experimental APL compiler for a distributed memory parallel machine Proceedings of the 1994 ACM/IEEE conference on Supercomputing, (59-68)
  160. Adve V, Carle A, Granston E, Hiranandani S, Kennedy K, Koelbel C, Kremer U, Mellor-Crummey J, Warren S and Tseng C (1994). Requirements for Data-Parallel Programming Environments, IEEE Parallel & Distributed Technology: Systems & Technology, 2:3, (48-58), Online publication date: 1-Sep-1994.
  161. Foster I (1994). Task Parallelism and High-Performance Languages, IEEE Parallel & Distributed Technology: Systems & Technology, 2:3, (27-36), Online publication date: 1-Sep-1994.
  162. Bixby R, Kennedy K and Kremer U Automatic Data Layout Using 0-1 Integer Programming Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques, (111-122)
  163. von Hanxleden R and Kennedy K GIVE-N-TAKE—a balanced code placement framework Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation, (107-120)
  164. Hiranandani S, Kennedy K, Mellor-Crummey J and Sethi A Compilation techniques for block-cyclic distributions Proceedings of the 8th international conference on Supercomputing, (392-403)
  165. von Hanxleden R and Kennedy K (2019). GIVE-N-TAKE—a balanced code placement framework, ACM SIGPLAN Notices, 29:6, (107-120), Online publication date: 1-Jun-1994.
  166. Choudhary A, Koelbel C and Zosel M High performance Fortran Proceedings of the 1993 ACM/IEEE conference on Supercomputing, (610-613)
Contributors
  • Rice University
  • Hewlett-Packard Inc.
  • Intel Corporation
  • Oracle Corporation
  • Lawrence Livermore National Laboratory

Reviews

Festus Gail Gray

The High Performance Fortran (HPF) language specifications developed by the High Performance Fortran Forum provide a portable extension to Fortran 90 for writing data parallel applications. This handbook functions as a user's guide to HPF. It provides clear explanations of both the purpose and the function of the new constructs, accompanied by numerous examples to illustrate both basic concepts and subtle effects. The handbook includes a review of new features of Fortran 90, concentrating on those that have an impact on HPF. Other chapters describe features for mapping data to parallel processors, specifying data parallel operations, and interfacing HPF programs to other programming languages. The book serves its purpose well, and I recommend it to those who want an understanding of the features of HPF. Well-chosen examples illustrate performance tradeoffs that result from various methods of data mapping and the complex relationships among data partitions and communications requirements. Many useful suggestions accompany the descriptions of each feature. The value of the whole HPF effort is questionable. To quote the authors, “Fortran was originally developed for serial machines with linear memory architectures.” This fact will effectively prevent FORTRAN from ever becoming a viable language for parallel machines and in fact from ever becoming a modern procedural language.
