- article, December 2014
Design patterns percolating to parallel programming framework implementation
International Journal of Parallel Programming (IJPP), Volume 42, Issue 6, Pages 1012–1031, https://doi.org/10.1007/s10766-013-0273-6
Structured parallel programming is recognised as a viable and effective means of tackling parallel programming problems. Recently, a set of simple and powerful parallel building blocks ($$\mathsf{RISC\text{-}pb^2l}$$) has been proposed to support modelling and implementation of ...
- article, April 2006
A platform-independent distributed runtime for standard multithreaded Java
International Journal of Parallel Programming (IJPP), Volume 34, Issue 2, Pages 113–142, https://doi.org/10.1007/s10766-006-0007-0
JavaSplit is a portable runtime environment for distributed execution of standard multithreaded Java programs. It gains augmented computational power and increased memory capacity by distributing the threads and objects of an application among the ...
- article, February 2004
Alias analysis in Java with reference-set representation for high-performance computing
International Journal of Parallel Programming (IJPP), Volume 32, Issue 1, Pages 39–76, https://doi.org/10.1023/B:IJPP.0000015564.82048.f3
In this paper, a flow-sensitive, context-insensitive alias analysis in Java is proposed. It is more efficient and precise than previous analyses for C++, and it does not negatively affect the safety of aliased references. To this end, we first present a ...
- article, August 2003
Restructuring computations for temporal data cache locality
International Journal of Parallel Programming (IJPP), Volume 31, Issue 4, Pages 305–338, https://doi.org/10.1023/A:1024556711058
Data access costs contribute significantly to the execution time of applications with complex data structures. As the latency of memory accesses becomes high relative to processor cycle times, application performance is increasingly limited by memory ...
- article, June 2002
Control Flow Regeneration for Software Pipelined Loops with Conditions
International Journal of Parallel Programming (IJPP), Volume 30, Issue 3, Pages 149–179, https://doi.org/10.1023/A:1015453520790
A new intermediate representation for software pipelined loops with conditions is proposed in the paper. The representation allows separation of operations from different paths and their conditional, as well as speculative scheduling, including ...
- article, October 2000
Data Dependence Analysis of Assembly Code
International Journal of Parallel Programming (IJPP), Volume 28, Issue 5, Pages 431–467, https://doi.org/10.1023/A:1007588710878
Determination of data dependences is a task typically performed with high-level language source code in today's optimizing and parallelizing compilers. Very little work has been done in the field of data dependence analysis on assembly language code, ...
- article, October 2000
Loop Shifting for Loop Compaction
International Journal of Parallel Programming (IJPP), Volume 28, Issue 5, Pages 499–534, https://doi.org/10.1023/A:1007506711786
The idea of decomposed software pipelining is to decouple the software pipelining problem into a cyclic scheduling problem without resource constraints and an acyclic scheduling problem with resource constraints. In terms of loop transformation and code ...
- article, June 1999
Nonsingular Data Transformations: Definition, Validity, and Applications
International Journal of Parallel Programming (IJPP), Volume 27, Issue 3, Pages 131–159, https://doi.org/10.1023/A:1018744411700
This paper describes a unifying framework for nonsingular data transformations. It shows that a wide class of existing transformations may be expressed in this framework, allowing compound transformations to be performed in one step. Validity conditions ...
- article, December 1998
Reuse-Driven Tiling for Improving Data Locality
This paper applies unimodular transformations and tiling to improve data locality of a loop nest. Due to data dependences and reuse information, not all dimensions of the iteration space will and can be tiled. By using cones to represent data ...
- article, August 1998
Combining Loop Transformations Considering Caches and Scheduling
The performance of modern microprocessors is greatly affected by cache behavior, instruction scheduling, register allocation and loop overhead. High-level loop transformations such as fission, fusion, tiling, interchanging and outer loop unrolling (e.g.,...
- article, August 1998
Meld Scheduling: A Technique for Relaxing Scheduling Constraints
Meld scheduling melds the schedules of neighboring scheduling regions to respect latencies of operations issued in one region but completing after control transfers to the other. In contrast, conventional schedulers ignore latency constraints from other ...
- article, April 1998
Quantitative Evaluation of Register Pressure on Software Pipelined Loops
International Journal of Parallel Programming (IJPP), Volume 26, Issue 2, Pages 121–142, https://doi.org/10.1023/A:1018743102645
Software Pipelining is a loop scheduling technique that extracts loop parallelism by overlapping the execution of several consecutive iterations. One of the drawbacks of software pipelining is its high register requirements, which increase with the ...
- article, December 1996
Connection Analysis: A Practical Interprocedural Heap Analysis for C
This paper presents a practical heap analysis technique, connection analysis, that can be used to disambiguate heap accesses in C programs. The technique is designed for analyzing programs that allocate many disjoint objects in the heap such as ...
- article, August 1996
A Study of the EARTH-MANNA Multithreaded System
Multithreaded architectures have been proposed for future multiprocessor systems. However, some open issues remain. Can multithreading be supported in a multiprocessor so that it can tolerate synchronization and communication latencies, with little ...
- article, August 1996
A Partitioning-Independent Paradigm for Nested Data Parallelism
A generalization of the data parallel model has been proposed by Blelloch which permits the nesting of data parallel operators to specify parallel computation across nested and irregular data structures. In this paper we consider the costs of supporting ...
- article, April 1996
Minimizing Register Requirements of a Modulo Schedule via Optimum Stage Scheduling
Modulo scheduling is an efficient technique for exploiting instruction level parallelism in a variety of loops, resulting in high performance code but increased register requirements. We present an approach that schedules the loop operations for minimum ...
- article, April 1996
Hardware-Based Profiling: An Effective Technique for Profile-Driven Optimization
Profile-based optimization can be used for instruction scheduling, loop scheduling, data preloading, function in-lining, and instruction cache performance enhancement. However, these techniques have not been embraced by software vendors because programs ...