SPARK: A Parallelizing Approach to the High-Level Synthesis of Digital Circuits
... DIGITAL CIRCUITS SUMIT GUPTA Center for Embedded Computer Systems University of California San Diego and Irvine, USA RAJESH K. GUPTA Department of Computer Science and Engineering University of California San Diego, USA ...
International Journal of Parallel Programming, 1995
Multiple-functional-unit architectures allow one to boost performance by simultaneously executing many operations, but technological constraints limit the achievable register-file I/O bandwidth and prevent one from fully exploiting the benefits of a large number of units. Dividing the register set into multiple banks can improve the overall I/O bandwidth but yields a nonhomogeneous register space onto which variables must be allocated subject to register-file-port constraints. We propose a hypergraph-based paradigm for modeling competition among variables for port allocation on multiple-register-file VLIW architectures; by coloring such a hypergraph, we can identify legal allocations of variables to register banks and produce executable code.
Horizontally microprogrammable CPUs belong to a class of machines having statically schedulable parallel instruction execution (SPIE machines). Several experiments have shown that within basic blocks, real code only gives a potential speed-up factor of 2 or 3 when compacted for SPIE machines, even in the presence of unlimited hardware. In this paper, similar experiments are described. However, these measure the
Proceedings of the 28th conference on ACM/IEEE design automation conference - DAC '91, 1991
A new local and incremental Tree Height Reduction (THR) technique for parallelization of application programs is presented. Although THR was introduced many years ago, it has not been widely used in HLS scheduling systems. The two main reasons for that were the inability of most systems to compact beyond basic blocks of the program, thus limiting the strength of THR and
The IEEE Computer Society's 12th Annual International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems, 2004. (MASCOTS 2004). Proceedings., 2004
Trade-offs between code selection, register allocation, and instruction scheduling are inherently interdependent, especially when compiling for fine-grain parallel architectures. However, the conventional approach to compiling for such machines arbitrarily separates these phases so that decisions made during any one phase place unnecessary constraints on the remaining phases. Mutation Scheduling attempts to solve this problem by combining code selection, register allocation, and instruction...
International Journal of Parallel Programming, 1995
In this paper we extend Percolation Scheduling (PS) to navigate through a hierarchical version of the Control Flow Graph (CFG) representation of a VLIW program. This extension retains the completeness of PS by allowing the “normal” PS transformations to be applied incrementally between adjacent instructions but also enables nonincremental code motions across arbitrarily large single-entry/single-exit regions of code in constant time. Such nonincremental transformations eliminate the useless code explosions that would otherwise be caused by using incremental transformations to move operations through regions containing multiple control paths and, in conjunction with the hierarchical representation of the CFG, provide a framework for trading off useful code explosions for increases in parallelism. Simulation results comparing nonincremental with incremental PS are presented.
1993 International Conference on Parallel Processing - ICPP'93 Vol2, 1993
Percolation Scheduling (PS) is a system for performing parallelizing transformations for the VLIW and superscalar computation models. PS has various useful properties, such as completeness with respect to local transformations, and appears to be an effective means of exploiting instruction-level parallelism. However, compilers based on PS typically suffer from inefficiencies caused by the incremental application of PS transformations and significant code explosion. In this paper we
Papers by Alex Nicolau