- Article, March 2004
A Compiler Scheme for Reusing Intermediate Computation Results
Recent research has shown that programs often exhibit value locality. Such locality occurs when a code segment, although executed repeatedly in the program, takes only a small number of different values as input and, naturally, generates a small number of ...
- Article, March 2004
Optimizing Translation Out of SSA Using Renaming Constraints
Static Single Assignment form is an intermediate representation that uses φ instructions to merge values at each confluent point of the control flow graph. φ instructions are not machine instructions and must be renamed back to move instructions when translating ...
- Article, March 2004
Exposing Memory Access Regularities Using Object-Relative Memory Profiling
Memory profiling is the process of characterizing a program's memory behavior by observing and recording its response to specific input sets. Relevant aspects of the program's memory behavior may then be used to guide memory optimizations in an ...
- Article, March 2004
LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
This paper describes LLVM (Low Level Virtual Machine), a compiler framework designed to support transparent, lifelong program analysis and transformation for arbitrary programs, by providing high-level information to compiler transformations at compile-time, ...
- Article, March 2004
VHC: Quickly Building an Optimizer for Complex Embedded Architectures
To meet the high demand for powerful embedded processors, VLIW architectures are increasingly complex (e.g., multiple clusters), and moreover, they now run increasingly sophisticated control-intensive applications. As a result, developing architecture-...
- Article, March 2004
Exploring the Performance Potential of Itanium® Processors with ILP-based Scheduling
HP and Intel's Itanium Processor Family (IPF) is considered as one of the most challenging processor architectures to generate code for. During global instruction scheduling, the compiler must balance the use of strongly interdependent techniques like code ...
- Article, March 2004
SYZYGY - A Framework for Scalable Cross-Module IPO
Sungdo Moon, Xinliang D. Li, Robert Hundt, Dhruva R. Chakrabarti, Luis A. Lozano, Uma Srinivasan, Shin-Ming Liu
Performing analysis across module boundaries for an entire program is important for exploiting several runtime performance opportunities. However, due to scalability problems in existing full-program analysis frameworks, such performance opportunities are ...
- Article, March 2004
Compiler Optimization of Memory-Resident Value Communication Between Speculative Threads
Efficient inter-thread value communication is essential for improving performance in Thread-Level Speculation (TLS). Although several mechanisms for improving value communication using hardware support have been proposed, there is relatively little work ...
- Article, March 2004
Ispike: A Post-link Optimizer for the Intel® Itanium® Architecture
Ispike is a post-link optimizer developed for the Intel® Itanium Processor Family (IPF) processors. The IPF architecture poses both opportunities and challenges to post-link optimizations. IPF offers a rich set of performance counters to collect detailed profile ...
- Article, March 2004
Exploring Code Cache Eviction Granularities in Dynamic Optimization Systems
Dynamic optimization systems store optimized or translated code in a software-managed code cache in order to maximize reuse of transformed code. Code caches store superblocks that are not fixed in size, may contain links to other superblocks, and carry a ...
- Article, March 2004
A Dynamically Tuned Sorting Library
Empirical search is a strategy used during the installation of library generators such as ATLAS, FFTW, and SPIRAL to identify the algorithm or the version of an algorithm that delivers the best performance. In the past, empirical search has been ...
- Article, March 2004
The Accuracy of Initial Prediction in Two-Phase Dynamic Binary Translators
Dynamic binary translators use a two-phase approach to identify and optimize frequently executed code dynamically. In the first step (profiling phase), blocks of code are interpreted or quickly translated to collect execution frequency information for the ...
- Article, March 2004
Targeted Path Profiling: Lower Overhead Path Profiling for Staged Dynamic Optimization Systems
In this paper, we present a technique for reducing the overhead of collecting path profiles in the context of a dynamic optimizer. The key idea to our approach, called Targeted Path Profiling (TPP), is to use an edge profile to simplify the collection of a ...
- Article, March 2004
Extending Path Profiling across Loop Backedges and Procedure Boundaries
Since their introduction, path profiles have been used to guide the application of aggressive code optimizations and performing instruction scheduling. However, for optimization and scheduling, it is often desirable to obtain frequency counts of paths that ...
- Article, March 2004
Code Generation for Single-Dimension Software Pipelining of Multi-Dimensional Loops
Traditionally, software pipelining is applied either to the innermost loop of a given loop nest or from the innermost loop to the outer loops. In a companion paper, we proposed a scheduling method, called Single-dimension Software Pipelining (SSP), to ...
- Article, March 2004
Specialized Dynamic Optimizations for High-Performance Energy-Efficient Microarchitecture
We study several major characteristics of dynamic optimization within the PARROT power-aware, trace-cache-based microarchitectural framework. We investigate the benefit of providing optimizations which although tightly coupled with the microarchitecture in ...