DOI: 10.5555/1636712
PACT '09: Proceedings of the 2009 18th International Conference on Parallel Architectures and Compilation Techniques
2009 Proceeding
Publisher:
  • IEEE Computer Society
  • 1730 Massachusetts Ave., NW Washington, DC
  • United States
Conference:
September 12 - 16, 2009
ISBN:
978-0-7695-3771-9
Published:
12 September 2009
Abstract

No abstract available.

Article
Cover Art
Article
Adaptive Locks: Combining Transactions and Locks for Efficient Concurrency

Transactional memory is being advanced as an alternative to traditional lock-based synchronization for concurrent programming. Transactional memory simplifies the programming model and maximizes concurrency. At the same time, transactions can suffer ...

Article
Anaphase: A Fine-Grain Thread Decomposition Scheme for Speculative Multithreading

Industry is moving towards multi-core designs as we have hit the memory and power walls. Multi-core designs are very effective to exploit thread-level parallelism (TLP) but do not provide benefits when executing serial code (applications with low TLP, ...

Article
Characterizing the TLB Behavior of Emerging Parallel Workloads on Chip Multiprocessors

Translation Lookaside Buffers (TLBs) are a staple in modern computer systems and have a significant impact on overall system performance. Numerous prior studies have addressed TLB designs to lower access times and miss rates; these, however, have been ...

Article
Interprocedural Load Elimination for Dynamic Optimization of Parallel Programs

Load elimination is a classical compiler transformation that is increasing in importance for multi-core and many-core architectures. The effect of the transformation is to replace a memory access, such as a read of an object field or an array element, ...

Article
Quantifying the Potential of Program Analysis Peripherals

Tools such as multi-threaded data race detectors, memory bounds checkers, dynamic type analyzers, data flight recorders, and various performance profilers are becoming increasingly vital aids to software developers. Rather than performing all the ...

Article
Algorithmic Skeletons within an Embedded Domain Specific Language for the CELL Processor

Efficiently using the hardware capabilities of the Cell processor, a heterogeneous chip multiprocessor that uses several levels of parallelism to deliver high performance, and being able to reuse legacy code are real challenges for application ...

Article
A Task-Centric Memory Model for Scalable Accelerator Architectures

This paper presents a task-centric memory model for 1000-core compute accelerators. Visual computing applications are emerging as an important class of workloads that can exploit 1000-core processors. In these workloads, we observe data sharing and ...

Article
SHIP: Scalable Hierarchical Power Control for Large-Scale Data Centers

In today's data centers, precisely controlling server power consumption is an essential way to avoid system failures caused by power capacity overload or overheating due to increasingly high server density. While various power control strategies have ...

Article
Exploring Phase Change Memory and 3D Die-Stacking for Power/Thermal Friendly, Fast and Durable Memory Architectures

Emerging three-dimensional (3D) integration technology allows for the direct placement of DRAM on top of a microprocessor, significantly reducing the wire-delay between the two and thereby alleviating memory latency and bandwidth constraints. However, ...

Article
Core-Selectability in Chip Multiprocessors

The centralized structures necessary for the extraction of instruction-level parallelism (ILP) are consuming progressively smaller portions of the total die area of chip multiprocessors (CMP). The reason for this is that scaling these structures does ...

Article
Chainsaw: Using Binary Matching for Relative Instruction Mix Comparison

With advances in hardware, instruction set architectures are undergoing continual evolution. As a result, compilers are under constant pressure to adapt and take full advantage of available features. However, current techniques for evaluating relative ...

Article
tm_db: A Generic Debugging Library for Transactional Programs

Transactional Memory (TM) has received a lot of attention as a programming API for concurrent programs on emerging multicore architectures. If the transactional programming model is to realize its promise of simplifying the problem of writing correct and ...

Article
StealthTest: Low Overhead Online Software Testing Using Transactional Memory

Software testing is hard. The emergence of multicore architectures and the proliferation of bug-prone multithreaded software makes testing even harder. To this end, researchers have proposed methods to continue testing software after deployment, e.g., in ...

Article
CPROB: Checkpoint Processing with Opportunistic Minimal Recovery

CPR (Checkpoint Processing and Recovery) is a physical register management scheme that supports a larger instruction window and higher average IPC than conventional ROB-style register management. It does so by restricting mis-speculation recovery to ...

Article
Architecture Support for Improving Bulk Memory Copying and Initialization Performance

Bulk memory copying and initialization is one of the most ubiquitous operations performed in current computer systems by both user applications and Operating Systems. While many current systems rely on a loop of loads and stores, there are proposals to ...

Article
Oblivious Routing in On-Chip Bandwidth-Adaptive Networks

Oblivious routing can be implemented on simple router hardware, but network performance suffers when routes become congested. Adaptive routing attempts to avoid hot spots by re-routing flows, but requires more complex hardware to determine and configure ...

Article
Exploiting Parallelism with Dependence-Aware Scheduling

It is well known that a large fraction of applications cannot be parallelized at compile time due to unpredictable data dependences such as indirect memory accesses and/or memory accesses guarded by data-dependent conditional statements. A significant ...

Article
ITCA: Inter-task Conflict-Aware CPU Accounting for CMPs

Chip-MultiProcessor (CMP) architectures are becoming more and more popular as an alternative to the traditional processors that only extract instruction-level parallelism from an application. CMPs introduce complexities when accounting CPU utilization. ...

Article
Flextream: Adaptive Compilation of Streaming Applications for Heterogeneous Architectures

Increasing demand for performance and efficiency has driven the computer industry toward multicore systems. These systems have become the industry standard in almost all segments of the computer market from high-end servers to handheld devices. In order ...

Article
DDCache: Decoupled and Delegable Cache Data and Metadata

In order to harness the full compute power of many-core processors, future designs must focus on effective utilization of on-chip cache and bandwidth resources. In this paper, we address the dual goals of (1) reducing on-chip communication overheads and ...

Article
Zero-Value Caches: Cancelling Loads that Return Zero

The speed gap between processor and memory continues to limit performance. To address this problem, we explore the potential of eliminating Zero Loads — loads accessing memory locations that contain the value “zero” — to improve performance and energy ...

Article
Soft-OLP: Improving Hardware Cache Performance through Software-Controlled Object-Level Partitioning

Performance degradation of memory-intensive programs caused by the LRU policy's inability to handle weak-locality data accesses in the last level cache is increasingly serious for two reasons. First, the last-level cache remains in the CPU's critical ...

Acceptance Rates

Overall Acceptance Rate 121 of 471 submissions, 26%
Year       Submitted  Accepted  Rate
PACT '16   119        31        26%
PACT '14   144        54        38%
PACT '13   208        36        17%
Overall    471        121       26%