Cited By
- Nakao M, Murai H, Iwashita H, Boku T and Sato M (2019). Implementation and evaluation of the HPC challenge benchmark in the XcalableMP PGAS language, International Journal of High Performance Computing Applications, 33:1, (110-123), Online publication date: 1-Jan-2019.
- Nakao M, Murai H, Boku T and Sato M Linkage of XcalableMP and Python languages for high productivity on HPC cluster system Proceedings of Workshops of HPC Asia, (39-47)
- Kaler T, Hasenplaugh W, Schardl T and Leiserson C (2016). Executing Dynamic Data-Graph Computations Deterministically Using Chromatic Scheduling, ACM Transactions on Parallel Computing, 3:1, (1-31), Online publication date: 28-Jun-2016.
- Madarkar J, Chand V, Reddy P and Arora H (2016). A Novel Algorithmic Approach for an Automatic Data Placement for NUMA Based Design, Procedia Computer Science, 78:C, (276-283), Online publication date: 1-Mar-2016.
- De Wael M, Marr S, De Fraine B, Van Cutsem T and De Meuter W (2015). Partitioned Global Address Space Languages, ACM Computing Surveys, 47:4, (1-27), Online publication date: 21-Jul-2015.
- Cabezas J, Vilanova L, Gelado I, Jablin T, Navarro N and Hwu W Automatic Parallelization of Kernels in Shared-Memory Multi-GPU Nodes Proceedings of the 29th ACM on International Conference on Supercomputing, (3-13)
- Nakao M, Murai H, Shimosaka T, Tabuchi A, Hanawa T, Kodama Y, Boku T and Sato M XcalableACC Proceedings of the First Workshop on Accelerator Programming using Directives, (27-36)
- Kaler T, Hasenplaugh W, Schardl T and Leiserson C Executing dynamic data-graph computations deterministically using chromatic scheduling Proceedings of the 26th ACM symposium on Parallelism in algorithms and architectures, (154-165)
- Hiranandani S, Kennedy K, Mellor-Crummey J and Sethi A Compilation techniques for block-cyclic distributions ACM International Conference on Supercomputing 25th Anniversary Volume, (205-216)
- Stone A and Mills Strout M Abstractions to separate concerns in semi-regular grids Proceedings of the 27th international ACM conference on International conference on supercomputing, (3-12)
- Nakao M, Lee J, Boku T and Sato M Productivity and Performance of Global-View Programming with XcalableMP PGAS Language Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012), (402-409)
- Smith A and Kulkarni P Localizing globals and statics to make C programs thread-safe Proceedings of the 14th international conference on Compilers, architectures and synthesis for embedded systems, (205-214)
- Delorimier M, Kapre N, Mehta N and Dehon A (2011). Spatial hardware implementation for sparse graph algorithms in GraphStep, ACM Transactions on Autonomous and Adaptive Systems (TAAS), 6:3, (1-20), Online publication date: 1-Sep-2011.
- Hari P, McCabe J, Banafato J, Henry M, Ko K, Koukoumidis E, Kremer U, Martonosi M and Peh L Adaptive spatiotemporal node selection in dynamic networks Proceedings of the 19th international conference on Parallel architectures and compilation techniques, (227-236)
- Yang L, Yu L, Tang J, Wang L, Zhao J and Li X Enabling multi-core based monitoring and fault tolerance in C++/Java Proceedings of the 3rd International Workshop on Multicore Software Engineering, (32-39)
- Hong H and Lee H APJava Proceedings of the 2009 International Conference on Hybrid Information Technology, (530-536)
- Frigo M, Halpern P, Leiserson C and Lewin-Berlin S Reducers and other Cilk++ hyperobjects Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures, (79-90)
- Broquedis F, Furmento N, Goglin B, Namyst R and Wacrenier P Dynamic Task and Data Placement over NUMA Architectures Proceedings of the 5th International Workshop on OpenMP: Evolving OpenMP in an Age of Extreme Parallelism, (79-92)
- Kulkarni M, Carribault P, Pingali K, Ramanarayanan G, Walter B, Bala K and Chew L Scheduling strategies for optimistic parallel execution of irregular programs Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures, (217-228)
- Zea N, Sartori J and Kumar R (2008). Servo, ACM SIGARCH Computer Architecture News, 36:2, (28-37), Online publication date: 1-May-2008.
- Qin J and Fahringer T Advanced data flow support for scientific grid workflow applications Proceedings of the 2007 ACM/IEEE conference on Supercomputing, (1-12)
- Woo Son S, Chen G, Ozturk O, Kandemir M and Choudhary A (2007). Compiler-Directed Energy Optimization for Parallel Disk Based Systems, IEEE Transactions on Parallel and Distributed Systems, 18:9, (1241-1257), Online publication date: 1-Sep-2007.
- Travinin Bliss N and Kepner J (2007). pMatlab Parallel Matlab Library, International Journal of High Performance Computing Applications, 21:3, (336-359), Online publication date: 1-Aug-2007.
- Huang C and Kale L Charisma Proceedings of the 16th international symposium on High performance distributed computing, (75-84)
- Sussman A (2006). Building complex coupled physical simulations on the grid with InterComm, Engineering with Computers, 22:3-4, (311-323), Online publication date: 1-Dec-2006.
- Chavarría-Miranda D, Nieplocha J and Tipparaju V Topology-aware tile mapping for clusters of SMPs Proceedings of the 3rd conference on Computing frontiers, (383-392)
- Kohl J, Wilde T and Bernholdt D (2006). Cumulvs, International Journal of High Performance Computing Applications, 20:2, (255-285), Online publication date: 1-May-2006.
- Huang C, Lee C and Kalé L Support for adaptivity in ARMCI using migratable objects Proceedings of the 20th international conference on Parallel and distributed processing, (383-383)
- Basumallik A and Eigenmann R Towards automatic translation of OpenMP to MPI Proceedings of the 19th annual international conference on Supercomputing, (189-198)
- Yang S, Butt A, Hu Y and Midkiff S Trust but verify Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming, (196-205)
- Grelck C (2005). Shared memory multiprocessor support for functional array processing in SAC, Journal of Functional Programming, 15:3, (353-401), Online publication date: 1-May-2005.
- Darte A and Huard G (2005). New Complexity Results on Array Contraction and Related Problems, Journal of VLSI Signal Processing Systems, 40:1, (35-55), Online publication date: 1-May-2005.
- Thomas A and Olukotun K An Application Analysis Framework For Polymorphic Chip Multiprocessors Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01
- Lee J and Sussman A High Performance Communication between Parallel Programs Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 4 - Volume 05
- Bertrand F, Bramley R, Sussman A, Bernholdt D, Kohl J, Larson J and Damevski K Data Redistribution and Remote Method Invocation in Parallel Component Architectures Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01
- Rauber T and Rünger G A Data-Re-Distribution Library for Multi-Processor Task Programming Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 8 - Volume 09
- Kandemir M, Chen G, Li F and Demirkiran I Using data replication to reduce communication energy on chip multiprocessors Proceedings of the 2005 Asia and South Pacific Design Automation Conference, (769-772)
- Ozturk O, Kandemir M, Chen G, Irwin M and Karakoy M Customized on-chip memories for embedded chip multiprocessors Proceedings of the 2005 Asia and South Pacific Design Automation Conference, (743-748)
- Dotsenko Y, Coarfa C and Mellor-Crummey J A Multi-Platform Co-Array Fortran Compiler Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques, (29-40)
- Kandemir M, Ozturk O and Karakoy M Dynamic on-chip memory management for chip multiprocessors Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems, (14-23)
- Chang R, Chuang T and Lee J (2004). Support and optimization for parallel sparse programs with array intrinsics of Fortran 90, Parallel Computing, 30:4, (527-550), Online publication date: 1-Apr-2004.
- Hwang G (2004). An efficient algorithm for communication set generation of data parallel programs with block-cyclic distribution, Parallel Computing, 30:4, (473-501), Online publication date: 1-Apr-2004.
- Guo M, Pan Y and Liu Z (2003). Symbolic Communication Set Generation for Irregular Parallel Applications, The Journal of Supercomputing, 25:3, (199-214), Online publication date: 1-Jul-2003.
- Díaz M, Rubio B, Soler E and Troya J (2003). Domain interaction patterns to coordinate HPF tasks, Parallel Computing, 29:7, (925-951), Online publication date: 1-Jul-2003.
- Hwang G, Chen C, Lee J and Dz-Ching Ju R (2003). Segmented Alignment, The Journal of Supercomputing, 25:1, (17-41), Online publication date: 1-May-2003.
- Addison C, Ren Y and van Waveren M (2003). OpenMP issues arising in the development of parallel BLAS and LAPACK libraries, Scientific Programming, 11:2, (95-104), Online publication date: 1-Apr-2003.
- Dongarra J, Foster I, Fox G, Gropp W, Kennedy K, Torczon L and White A References Sourcebook of parallel computing, (729-789)
- Padua D and Hoeflinger J Supercomputers Encyclopedia of Computer Science, (1710-1718)
- Quinn M and Miller R Parallel processing Encyclopedia of Computer Science, (1349-1365)
- Chatterjee S, Lebeck A, Patnala P and Thottethodi M (2002). Recursive Array Layouts and Fast Matrix Multiplication, IEEE Transactions on Parallel and Distributed Systems, 13:11, (1105-1123), Online publication date: 1-Nov-2002.
- Wallcraft A (2002). A Comparison of Co-Array Fortran and OpenMP Fortran for SPMD Programming, The Journal of Supercomputing, 22:3, (231-250), Online publication date: 1-Jul-2002.
- Zoppetti G, Agrawal G and Kumar R Compiler and Runtime Support for Irregular Reductions on a Multithreaded Architecture Proceedings of the 16th International Parallel and Distributed Processing Symposium
- Grelck C Implementing the NAS Benchmark MG in SAC Proceedings of the 16th International Parallel and Distributed Processing Symposium
- Díaz M, Rubio B, Soler E and Troya J (2002). A Border-based Coordination Language for Integrating Task and Data Parallelism, Journal of Parallel and Distributed Computing, 62:4, (715-740), Online publication date: 1-Apr-2002.
- Joisha P and Banerjee P (2001). The Efficient Computation of Ownership Sets in HPF, IEEE Transactions on Parallel and Distributed Systems, 12:8, (769-788), Online publication date: 1-Aug-2001.
- Das R, Hwang Y, Saltz J and Sussman A Runtime and compiler support for irregular computations Compiler optimizations for scalable parallel systems, (751-778)
- Ramanujam J Integer lattice based methods for local address generation for block-cyclic distributions Compiler optimizations for scalable parallel systems, (597-645)
- Adve V and Mellor-Crummey J Advanced code generation for high performance Fortran Compiler optimizations for scalable parallel systems, (553-596)
- Palermo D, Hodges E and Banerjee P Compiler optimization of dynamic data distributions for distributed-memory multicomputers Compiler optimizations for scalable parallel systems, (445-484)
- Kennedy K and Koelbel C High performance Fortran 2.0 Compiler optimizations for scalable parallel systems, (3-43)
- Garcia J, Ayguadé E and Labarta J (2001). A Framework for Integrating Data Alignment, Distribution, and Redistribution in Distributed Memory Multiprocessors, IEEE Transactions on Parallel and Distributed Systems, 12:4, (416-431), Online publication date: 1-Apr-2001.
- Díaz M, Rubio B, Soler E and Troya J DIP Proceedings of the 2001 ACM symposium on Applied computing, (148-150)
- Chang R, Chuang T and Lee J (2001). Parallel Sparse Supports for Array Intrinsic Functions of Fortran 90, The Journal of Supercomputing, 18:3, (305-339), Online publication date: 1-Mar-2001.
- Kandemir M, Choudhary A, Banerjee P, Ramanujam J and Shenoy N (2000). Minimizing Data and Synchronization Costs in One-Way Communication, IEEE Transactions on Parallel and Distributed Systems, 11:12, (1232-1251), Online publication date: 1-Dec-2000.
- Shih K, Sheu J and Chang C (2000). Efficient Address Generation for Affine Subscripts in Data-Parallel Programs, The Journal of Supercomputing, 17:2, (205-227), Online publication date: 1-Sep-2000.
- Bräunl T (2000). Parallaxis-III, IEEE Transactions on Software Engineering, 26:3, (227-243), Online publication date: 1-Mar-2000.
- Lain A, Chakrabarti D and Banerjee P (2000). Compiler and Run-Time Support for Exploiting Regularity within Irregular Applications, IEEE Transactions on Parallel and Distributed Systems, 11:2, (119-135), Online publication date: 1-Feb-2000.
- Park N, Prasanna V and Raghavendra C (1999). Efficient Algorithms for Block-Cyclic Array Redistribution Between Processor Sets, IEEE Transactions on Parallel and Distributed Systems, 10:12, (1217-1240), Online publication date: 1-Dec-1999.
- Petitet A and Dongarra J (1999). Algorithmic Redistribution Methods for Block-Cyclic Decompositions, IEEE Transactions on Parallel and Distributed Systems, 10:12, (1201-1216), Online publication date: 1-Dec-1999.
- Kandemir M, Banerjee P, Choudhary A, Ramanujam J and Shenoy N (1999). A global communication optimization technique based on data-flow analysis and linear algebra, ACM Transactions on Programming Languages and Systems (TOPLAS), 21:6, (1251-1297), Online publication date: 1-Nov-1999.
- Scherer A, Lu H, Gross T and Zwaenepoel W (1999). Transparent adaptive parallelism on NOWs using OpenMP, ACM SIGPLAN Notices, 34:8, (96-106), Online publication date: 1-Aug-1999.
- McCurdy C and Mellor-Crummey J (1999). An evaluation of computing paradigms for N-body simulations on distributed memory architectures, ACM SIGPLAN Notices, 34:8, (25-36), Online publication date: 1-Aug-1999.
- Chatterjee S, Jain V, Lebeck A, Mundhra S and Thottethodi M Nonlinear array layouts for hierarchical memory systems Proceedings of the 13th international conference on Supercomputing, (444-453)
- Chatterjee S, Lebeck A, Patnala P and Thottethodi M Recursive array layouts and fast parallel matrix multiplication Proceedings of the eleventh annual ACM symposium on Parallel algorithms and architectures, (222-231)
- Mache J, Lo V, Livingston M and Garg S The impact of spatial layout of jobs on parallel I/O performance Proceedings of the sixth workshop on I/O in parallel and distributed systems, (45-56)
- Peyton Jones S, Reid A, Henderson F, Hoare T and Marlow S (1999). A semantics for imprecise exceptions, ACM SIGPLAN Notices, 34:5, (25-36), Online publication date: 1-May-1999.
- Peyton Jones S, Reid A, Henderson F, Hoare T and Marlow S A semantics for imprecise exceptions Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation, (25-36)
- Scherer A, Lu H, Gross T and Zwaenepoel W Transparent adaptive parallelism on NOWs using OpenMP Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming, (96-106)
- McCurdy C and Mellor-Crummey J An evaluation of computing paradigms for N-body simulations on distributed memory architectures Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming, (25-36)
- MacBeth M, McGuigan K and Hatcher P Executing Java threads in parallel in a distributed-memory environment Proceedings of the 1998 conference of the Centre for Advanced Studies on Collaborative research
- Adve V, Jin G, Mellor-Crummey J and Yi Q High performance Fortran compilation techniques for parallelizing scientific codes Proceedings of the 1998 ACM/IEEE conference on Supercomputing, (1-23)
- Park N, Prasanna V and Raghavendra C Efficient algorithms for block-cyclic array redistribution between processor sets Proceedings of the 1998 ACM/IEEE conference on Supercomputing, (1-13)
- Abdelrahman T and Wong T (1998). Compiler Support for Array Distribution on NUMA Shared Memory Multiprocessors, The Journal of Supercomputing, 12:4, (349-371), Online publication date: 1-Oct-1998.
- Kohl J and Papadopoulos P Efficient and flexible fault tolerance and migration of scientific simulations using CUMULVS Proceedings of the SIGMETRICS symposium on Parallel and distributed tools, (60-71)
- Müller M, Warschko T and Tichy W Prefetching on the Cray-T3E Proceedings of the 12th international conference on Supercomputing, (361-368)
- Coddington P and Ko S Techniques for empirical testing of parallel random number generators Proceedings of the 12th international conference on Supercomputing, (282-288)
- Kennedy K and Kremer U (1998). Automatic data layout for distributed-memory machines, ACM Transactions on Programming Languages and Systems (TOPLAS), 20:4, (869-916), Online publication date: 1-Jul-1998.
- Agrawal G (1998). Interprocedural Partial Redundancy Elimination With Application to Distributed Memory Compilation, IEEE Transactions on Parallel and Distributed Systems, 9:7, (609-625), Online publication date: 1-Jul-1998.
- Adve V and Mellor-Crummey J (1998). Using integer sets for data-parallel program analysis and optimization, ACM SIGPLAN Notices, 33:5, (186-198), Online publication date: 1-May-1998.
- Adve V and Mellor-Crummey J Using integer sets for data-parallel program analysis and optimization Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation, (186-198)
- Moreira J and Midkiff S (1998). Fortran 90 in CSE, IEEE Computational Science & Engineering, 5:2, (39-49), Online publication date: 1-Apr-1998.
- Desprez F, Randriamaro C, Dongarra J, Petitet A and Robert Y (1998). Scheduling Block-Cyclic Array Redistribution, IEEE Transactions on Parallel and Distributed Systems, 9:2, (192-205), Online publication date: 1-Feb-1998.
- Kandemir M, Choudhary A, Ramanujam J and Kandaswamy M A unified compiler algorithm for optimizing locality, parallelism and communication in out-of-core computations Proceedings of the fifth workshop on I/O in parallel and distributed systems, (79-92)
- Nucciarone J, Özyörük Y and Long L New life in dusty decks Proceedings of the 1997 ACM/IEEE conference on Supercomputing, (1-19)
- Kennedy K, Bender C, Connolly J, Hennessy J, Vernon M and Smarr L (1997). A nationwide parallel computing environment, Communications of the ACM, 40:11, (62-72), Online publication date: 1-Nov-1997.
- Ramaswamy S, Sapatnekar S and Banerjee P (1997). A Framework for Exploiting Task and Data Parallelism on Distributed Memory Multicomputers, IEEE Transactions on Parallel and Distributed Systems, 8:11, (1098-1116), Online publication date: 1-Nov-1997.
- Nieplocha J and Harrison R (1997). Shared Memory Programming in Metacomputing Environments, The Journal of Supercomputing, 11:2, (119-136), Online publication date: 1-Oct-1997.
- Geist G, Kohl J and Papadopoulos P (1997). Cumulvs, International Journal of High Performance Computing Applications, 11:3, (224-235), Online publication date: 1-Sep-1997.
- Lim Y, Park N and Prasanna V Efficient Algorithms for Multi-dimensional Block-Cyclic Redistribution of Arrays Proceedings of the international Conference on Parallel Processing, (234-241)
- Lee J, Ho D and Chuang Y Data Distribution Analysis and Optimization for Pointer-Based Distributed Programs Proceedings of the international Conference on Parallel Processing, (56-63)
- Tandri S and Abdelrahman T Automatic Partitioning of Data and Computations on Scalable Shared Memory Multiprocessors Proceedings of the international Conference on Parallel Processing, (64-73)
- Lee P (1997). Efficient Algorithms for Data Distribution on Distributed Memory Parallel Computers, IEEE Transactions on Parallel and Distributed Systems, 8:8, (825-839), Online publication date: 1-Aug-1997.
- Ancourt C, Barthou D, Guettier C, Irigoin F, Jeannet B, Jourdan J and Mattioli J Automatic data mapping of signal processing applications Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures and Processors
- Coelho F (1997). Compiling dynamic mappings with array copies, ACM SIGPLAN Notices, 32:7, (168-179), Online publication date: 1-Jul-1997.
- Subhlok J and Yang B (1997). A new model for integrated nested task and data parallel programming, ACM SIGPLAN Notices, 32:7, (1-12), Online publication date: 1-Jul-1997.
- Coelho F Compiling dynamic mappings with array copies Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming, (168-179)
- Subhlok J and Yang B A new model for integrated nested task and data parallel programming Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming, (1-12)
- Moreira J and Naik V (1997). Dynamic resource management on distributed systems using reconfigurable applications, IBM Journal of Research and Development, 41:3, (303-330), Online publication date: 1-May-1997.
- Chandra R, Chen D, Cox R, Maydan D, Nedeljkovic N and Anderson J (1997). Data distribution support on distributed shared memory multiprocessors, ACM SIGPLAN Notices, 32:5, (334-345), Online publication date: 1-May-1997.
- Chandra R, Chen D, Cox R, Maydan D, Nedeljkovic N and Anderson J Data distribution support on distributed shared memory multiprocessors Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation, (334-345)
- Koo M, Park S, Yook H and Park M A transformation method to reduce loop overhead in HPF compiler Proceedings of the High-Performance Computing on the Information Superhighway, HPC-Asia '97
- Mak K and Chan R Parallel Implementation of 2-Dimensional Toeplitz Solver on MasPar with Applications to Image Restoration Proceedings of the High-Performance Computing on the Information Superhighway, HPC-Asia '97
- Melin E, Raffin B, Rebeuf X and Virot B SCL-chan Proceedings of the 1997 Workshop on High-Level Programming Models and Supportive Environments (HIPS '97)
- Kandemir M, Bordawekar R and Choudhary A Data Access Reorganizations in Compiling Out-of-Core Data Parallel Programs on Distributed Memory Machines Proceedings of the 11th International Symposium on Parallel Processing
- Edjlali G, Sussman A and Saltz J Interoperability of Data Parallel Runtime Libraries Proceedings of the 11th International Symposium on Parallel Processing, (451-459)
- Keleher P and Tseng C Enhancing Software DSM for Compiler-Parallelized Applications Proceedings of the 11th International Symposium on Parallel Processing, (490-499)
- Gupta M On Privatization of Variables for Data-Parallel Execution Proceedings of the 11th International Symposium on Parallel Processing, (533-541)
- Koo M, Park S, Yook H and Park M A New Transformation Method to Generate Optimized DO Loop from FORALL Construct Proceedings of the 2nd AIZU International Symposium on Parallel Algorithms / Architecture Synthesis
- Gillett R and Kaufmann R (1997). Using the Memory Channel Network, IEEE Micro, 17:1, (19-25), Online publication date: 1-Jan-1997.
- Akarsu E, Dincer K, Haupt T and Fox G Particle-in-cell simulation codes in High Performance Fortran Proceedings of the 1996 ACM/IEEE conference on Supercomputing, (38-es)
- Foster I, Kohr D, Krishnaiyer R and Choudhary A Double standards Proceedings of the 1996 ACM/IEEE conference on Supercomputing, (36-es)
- Blackford L, Choi J, Cleary A, Petitet A, Whaley R, Demmel J, Dhillon I, Stanley K, Dongarra J, Hammarling S, Henry G and Walker D ScaLAPACK Proceedings of the 1996 ACM/IEEE conference on Supercomputing, (5-es)
- Kaushik S, Huang C and Sadayappan P (1996). Efficient Index Set Generation for Compiling HPF Array Statements on Distributed-Memory Machines, Journal of Parallel and Distributed Computing, 38:2, (237-247), Online publication date: 1-Nov-1996.
- Hall M, Hiranandani S, Kennedy K and Tseng C (1996). Interprocedural Compilation of Fortran D, Journal of Parallel and Distributed Computing, 38:2, (114-129), Online publication date: 1-Nov-1996.
- Wilson G and Bal H (1996). Using the Cowichan Problems to Assess the Usability of Orca, IEEE Parallel & Distributed Technology: Systems & Technology, 4:3, (36-44), Online publication date: 1-Sep-1996.
- Goldstein S, Schauser K and Culler D (1996). Lazy Threads, Journal of Parallel and Distributed Computing, 37:1, (5-20), Online publication date: 25-Aug-1996.
- Dincer K, Fox G and Hawick K High Performance Fortran and Possible Extensions to Support Conjugate Gradient Algorithms Proceedings of the 5th IEEE International Symposium on High Performance Distributed Computing
- Fang N Engineering Parallel Algorithms Proceedings of the 5th IEEE International Symposium on High Performance Distributed Computing
- Dongarra J, Otto S, Snir M and Walker D (1996). A message passing standard for MPP and workstations, Communications of the ACM, 39:7, (84-90), Online publication date: 1-Jul-1996.
- Thakur R, Choudhary A and Ramanujam J (1996). Efficient Algorithms for Array Redistribution, IEEE Transactions on Parallel and Distributed Systems, 7:6, (587-594), Online publication date: 1-Jun-1996.
- Toledo S Performance Prediction with Benchmaps Proceedings of the 10th International Parallel Processing Symposium, (479-485)
- Saini S NAS Experiences of Porting CM Fortran Codes to HPF on IBM SP2 and SGI Power Challenge Proceedings of the 10th International Parallel Processing Symposium, (878-880)
- McMahon J and Teitelbaum K Space-Time Adaptive Processing on the Mesh Synchronous Processor Proceedings of the 10th International Parallel Processing Symposium, (734-740)
- Bae S and Ranka S PACK/UNPACK on Coarse-Grained Distributed Memory Parallel Machines Proceedings of the 10th International Parallel Processing Symposium, (320-324)
- Kaufmann R and Reddin T Digital's clusters and scientific parallel applications Proceedings of the 41st IEEE International Computer Conference
- Amza C, Cox A, Dwarkadas S, Keleher P, Lu H, Rajamony R, Yu W and Zwaenepoel W (1996). TreadMarks, Computer, 29:2, (18-28), Online publication date: 1-Feb-1996.
- Hood R The p2d2 project Proceedings of the SIGMETRICS symposium on Parallel and distributed tools, (127-136)
- Acharya A Eliminating redundant barrier synchronizations in rule-based programs Proceedings of the 10th international conference on Supercomputing, (325-332)
- Ranganathan M, Acharya A, Edjlali G, Sussman A and Saltz J Runtime coupling of data-parallel programs Proceedings of the 10th international conference on Supercomputing, (229-236)
- Krishnan S and Kale L Automating parallel runtime optimizations using post-mortem analysis Proceedings of the 10th international conference on Supercomputing, (221-228)
- Lain A and Banerjee P Compiler support for hybrid irregular accesses on multicomputers Proceedings of the 10th international conference on Supercomputing, (1-9)
- Gupta M, Midkiff S, Schonberg E, Seshadri V, Shields D, Wang K, Ching W and Ngo T An HPF compiler for the IBM SP2 Proceedings of the 1995 ACM/IEEE conference on Supercomputing, (71-es)
- Adve V, Mellor-Crummey J, Anderson M, Wang J, Reed D and Kennedy K An integrated compilation and performance analysis environment for data parallel programs Proceedings of the 1995 ACM/IEEE conference on Supercomputing, (50-es)
- Agrawal G and Saltz J Interprocedural compilation of irregular applications for distributed memory machines Proceedings of the 1995 ACM/IEEE conference on Supercomputing, (48-es)
- Banerjee P, Chandy J, Gupta M, Hodges IV E, Holm J, Lain A, Palermo D, Ramaswamy S and Su E (1995). The Paradigm Compiler for Distributed-Memory Multicomputers, Computer, 28:10, (37-47), Online publication date: 1-Oct-1995.
- Kennedy K, Nedeljkovic N and Sethi A (1995). A linear-time algorithm for computing the memory access sequence in data-parallel programs, ACM SIGPLAN Notices, 30:8, (102-111), Online publication date: 1-Aug-1995.
- Kennedy K, Nedeljkovic N and Sethi A A linear-time algorithm for computing the memory access sequence in data-parallel programs Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming, (102-111)
- Su E, Lain A, Ramaswamy S, Palermo D, Hodges E and Banerjee P Advanced compilation techniques in the PARADIGM compiler for distributed-memory multicomputers Proceedings of the 9th international conference on Supercomputing, (424-433)
- Bergmark D Optimization and parallelization of a commodity trade model for the IBM SP1/2, using parallel programming tools Proceedings of the 9th international conference on Supercomputing, (227-236)
- Kennedy K, Nedeljkovic N and Sethi A Efficient address generation for block-cyclic distributions Proceedings of the 9th international conference on Supercomputing, (180-184)
- Müller A and Rühl R Extending high performance Fortran for the support of unstructured computations Proceedings of the 9th international conference on Supercomputing, (127-136)
- Agrawal G, Sussman A and Saltz J (1995). An Integrated Runtime and Compile-Time Approach for Parallelizing Structured and Block Structured Applications, IEEE Transactions on Parallel and Distributed Systems, 6:7, (747-754), Online publication date: 1-Jul-1995.
- Das R, Wu J, Saltz J, Berryman H and Hiranandani S (1995). Distributed Memory Compiler Design For Sparse Problems, IEEE Transactions on Computers, 44:6, (737-753), Online publication date: 1-Jun-1995.
- Merlin J and Hey A (1995). An Introduction to High Performance Fortran, Scientific Programming, 4:2, (87-113), Online publication date: 1-Apr-1995.
- Chatterjee S, Gilbert J, Long F, Schreiber R and Teng S (1995). Generating Local Addresses and Communication Sets for Data-Parallel Programs, Journal of Parallel and Distributed Computing, 26:1, (72-84), Online publication date: 1-Apr-1995.
- Ponnusamy R, Hwang Y, Das R, Saltz J, Choudhary A and Fox G (1995). Supporting Irregular Distributions Using Data-Parallel Languages, IEEE Parallel & Distributed Technology: Systems & Technology, 3:1, (12-24), Online publication date: 1-Mar-1995.
- Cheng D and Hood R A portable debugger for parallel and distributed programs Proceedings of the 1994 ACM/IEEE conference on Supercomputing, (723-732)
- Sharma S, Ponnusamy R, Moon B, Hwang Y, Das R and Saltz J Run-time and compile-time support for adaptive irregular problems Proceedings of the 1994 ACM/IEEE conference on Supercomputing, (97-106)
- Ching W and Katz A An experimental APL compiler for a distributed memory parallel machine Proceedings of the 1994 ACM/IEEE conference on Supercomputing, (59-68)
- Adve V, Carle A, Granston E, Hiranandani S, Kennedy K, Koelbel C, Kremer U, Mellor-Crummey J, Warren S and Tseng C (1994). Requirements for Data-Parallel Programming Environments, IEEE Parallel & Distributed Technology: Systems & Technology, 2:3, (48-58), Online publication date: 1-Sep-1994.
- Foster I (1994). Task Parallelism and High-Performance Languages, IEEE Parallel & Distributed Technology: Systems & Technology, 2:3, (27-36), Online publication date: 1-Sep-1994.
- Bixby R, Kennedy K and Kremer U Automatic Data Layout Using 0-1 Integer Programming Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques, (111-122)
- von Hanxleden R and Kennedy K GIVE-N-TAKE—a balanced code placement framework Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation, (107-120)
- Hiranandani S, Kennedy K, Mellor-Crummey J and Sethi A Compilation techniques for block-cyclic distributions Proceedings of the 8th international conference on Supercomputing, (392-403)
- von Hanxleden R and Kennedy K (1994). GIVE-N-TAKE—a balanced code placement framework, ACM SIGPLAN Notices, 29:6, (107-120), Online publication date: 1-Jun-1994.
- Choudhary A, Koelbel C and Zosel M High performance Fortran Proceedings of the 1993 ACM/IEEE conference on Supercomputing, (610-613)
Index Terms
- The high performance Fortran handbook
Recommendations
High Performance Fortran: Language Specification (PART II)
Special issue: high performance Fortran language specification, part 2 (PART II)
Fortran Forum is reprinting this High Performance Fortran Language Specification over several issues. Sections 1-4 of the Specification appeared as Fortran Forum (12:4), December 1993. The current issue is devoted to Sections 5-7. Remaining ...