Keyword: compiler analysis : Search

Article

Annotation of Compiler Attributes for MPI Functions

Recent Advances in the Message Passing Interface, Pages 21–35, https://doi.org/10.1007/978-3-031-73370-3_2

Abstract

This paper explores the use of LLVM IR function and parameter attributes to enhance compiler optimizations for code that uses MPI. As MPI is usually used as a dynamically linked library, the compiler is not able to automatically infer certain ...

research-article

Open Access

APT-GET: profile-guided timely software prefetching

EuroSys '22: Proceedings of the Seventeenth European Conference on Computer Systems, Pages 747–764, https://doi.org/10.1145/3492321.3519583

Prefetching, which predicts future memory accesses and preloads them from main memory, is a widely adopted technique to overcome the processor-memory performance gap. Unfortunately, hardware prefetchers implemented in today's processors cannot identify ...

research-article

Public Access

Compiler assisted hybrid implicit and explicit GPU memory management under unified address space

SC '19: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, Article No.: 51, Pages 1–16, https://doi.org/10.1145/3295500.3356141

To improve programmability and productivity, recent GPUs adopt a virtual memory address space shared with CPUs (e.g., NVIDIA's unified memory). Unified memory migrates the data management burden from programmers to system software and hardware, and ...

research-article

Compiler-Assisted and Profiling-Based Analysis for Fast and Efficient STT-MRAM On-Chip Cache Design

ACM Transactions on Design Automation of Electronic Systems (TODAES), Volume 24, Issue 4, Article No.: 41, Pages 1–25, https://doi.org/10.1145/3321693

Spin Transfer Torque Magnetic Random Access Memory (STT-MRAM) is a promising candidate for large on-chip memories as a zero-leakage, high-density and non-volatile alternative to the present SRAM technology. Since memories are the dominating component of ...

research-article

vSensor: leveraging fixed-workload snippets of programs for performance variance detection

PPoPP '18: Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Pages 124–136, https://doi.org/10.1145/3178487.3178497

Performance variance becomes increasingly challenging on current large-scale HPC systems. Even using a fixed number of computing nodes, the execution time of several runs can vary significantly. Many parallel programs executing on supercomputers suffer ...

Also Published in:

ACM SIGPLAN Notices: Volume 53 Issue 1

Article

Discovery and exploitation of general reductions: a constraint based approach

CGO '17: Proceedings of the 2017 International Symposium on Code Generation and Optimization, Pages 269–280

Discovering and exploiting scalar reductions in programs has been studied for many years. The discovery of more complex reduction operations has, however, received less attention. Such reductions contain compile-time unknown parameters, indirect memory ...

research-article

IPAS: intelligent protection against silent output corruption in scientific applications

CGO '16: Proceedings of the 2016 International Symposium on Code Generation and Optimization, Pages 227–238, https://doi.org/10.1145/2854038.2854059

This paper presents IPAS, an instruction duplication technique that protects scientific applications from silent data corruption (SDC) in their output. The motivation for IPAS is that, due to natural error masking, only a subset of SDC errors actually ...

research-article

ExaSAT: An exascale co-design tool for performance modeling

International Journal of High Performance Computing Applications (SAGE-HPCA), Volume 29, Issue 2, Pages 209–232, https://doi.org/10.1177/1094342014568690

One of the emerging challenges to designing HPC systems is understanding and projecting the requirements of exascale applications. In order to determine the performance consequences of different hardware designs, analytic models are essential because ...

Article

PWCET: Power-Aware Worst Case Execution Time Analysis

ICPPW '14: Proceedings of the 2014 43rd International Conference on Parallel Processing Workshops, Pages 439–447, https://doi.org/10.1109/ICPPW.2014.64

Worst case execution time (WCET) analysis is used to verify that real-time tasks on systems can be executed without violating any timing constraints. Power consumption is not considered in most of the WCET research work. However, real-time embedded ...

research-article

LORAIN: a step closer to the PDES 'holy grail'

SIGSIM PADS '14: Proceedings of the 2nd ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, Pages 3–14, https://doi.org/10.1145/2601381.2601397

Automatic parallelization of models has been the "Holy Grail" of the PDES community for the last 20 years. In this paper we present LORAIN -- Low Overhead Runtime Assisted Instruction Negation -- a tool capable of automatic emission of a reverse event ...

Article

Exact dependence analysis for increased communication overlap

EuroMPI'12: Proceedings of the 19th European conference on Recent Advances in the Message Passing Interface, Pages 89–99, https://doi.org/10.1007/978-3-642-33518-1_14

MPI programs are often challenged to scale up to several million cores. In doing so, the programmer tunes every aspect of the application code. However, for large applications, this is often not practical and expensive tracing tools and post-mortem ...

Article

From serial loops to parallel execution on distributed systems

Euro-Par'12: Proceedings of the 18th international conference on Parallel Processing, Pages 246–257, https://doi.org/10.1007/978-3-642-32820-6_25

Programmability and performance portability are two major challenges in today's dynamic environment. Algorithm designers targeting efficient algorithms should focus on designing high-level algorithms exhibiting maximum parallelism, while relying on ...

research-article

Energy-efficient hardware data prefetching

IEEE Transactions on Very Large Scale Integration (VLSI) Systems (ITVL), Volume 19, Issue 2, Pages 250–263, https://doi.org/10.1109/TVLSI.2009.2032916

Extensive research has been done in prefetching techniques that hide memory latency in microprocessors leading to performance improvements. However, the energy aspect of prefetching is relatively unknown. While aggressive prefetching techniques often ...

research-article

Shoestring: probabilistic soft error reliability on the cheap

ASPLOS XV: Proceedings of the fifteenth International Conference on Architectural support for programming languages and operating systems, Pages 385–396, https://doi.org/10.1145/1736020.1736063

Aggressive technology scaling provides designers with an ever increasing budget of cheaper and faster transistors. Unfortunately, this trend is accompanied by a decline in individual device reliability as transistors become increasingly susceptible to ...

Also Published in:

ACM SIGARCH Computer Architecture News: Volume 38 Issue 1

ACM SIGPLAN Notices: Volume 45 Issue 3

Article

Communication-Sensitive Static Dataflow for Parallel Message Passing Applications

Greg Bronevetsky

CGO '09: Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization, Pages 1–12, https://doi.org/10.1109/CGO.2009.32

Message passing is a very popular style of parallel programming, used in a wide variety of applications and supported by many APIs, such as BSD sockets, MPI and PVM. Its importance has motivated significant amounts of research on optimization and ...

poster

Exploiting global optimizations for OpenMP programs in the OpenUH compiler

PPoPP '09: Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming, Pages 289–290, https://doi.org/10.1145/1504176.1504219

The advent of new parallel architectures has increased the need for parallel optimizing compilers to assist developers in creating efficient code. OpenUH is a state-of-the-art optimizing compiler, but it only performs a limited set of optimizations for ...

Also Published in:

ACM SIGPLAN Notices: Volume 44 Issue 4

research-article

Optimizing irregular shared-memory applications for clusters

ICS '08: Proceedings of the 22nd annual international conference on Supercomputing, Pages 256–265, https://doi.org/10.1145/1375527.1375566

Irregular applications pose challenges in optimizing communication, due to the difficulty of analyzing irregular data accesses accurately and efficiently. This challenge is especially big when translating irregular shared-memory applications to message-...

article

DRDU: A data reuse analysis technique for efficient scratch-pad memory management

ACM Transactions on Design Automation of Electronic Systems (TODAES), Volume 12, Issue 2, Pages 15–es, https://doi.org/10.1145/1230800.1230807

In multimedia and other streaming applications, a significant portion of energy is spent on data transfers. Exploiting data reuse opportunities in the application, we can reduce this energy by making copies of frequently used data in a small local ...

article

Runtime characterisation of irregular accesses applied to parallelisation of irregular reductions

International Journal of Computational Science and Engineering (IJCSE), Volume 1, Issue 1, Pages 1–14, https://doi.org/10.1504/IJCSE.2005.008906

Irregular reduction operations are the core of many large scientific and engineering applications. There are, in the literature, different methods to solve these operations in parallel. In this paper we discuss a new technique which improves performance ...

Article

Analytical computation of Ehrhart polynomials: enabling more compiler analyses and optimizations

CASES '04: Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems, Pages 248–258, https://doi.org/10.1145/1023833.1023868

Many optimization techniques, including several targeted specifically at embedded systems, depend on the ability to calculate the number of elements that satisfy certain conditions. If these conditions can be represented by linear constraints, then such ...
