Keyword: profiler : Search

22 Results for: Keyword: profiler

Searched The ACM Guide to Computing Literature (3,723,146 records)

Showing 1 - 20 of 22 Results

research-article
Public Access
April 2023
DrGPU: A Top-Down Profiler for GPU Applications
ICPE '23: Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering, April 2023, Pages 43–53, https://doi.org/10.1145/3578244.3583736

GPUs have become common in HPC systems to accelerate scientific computing and machine learning applications. Efficiently mapping these applications to rapid evolutions of GPU architectures for high performance is a well-known challenge. Various ...
Metrics
Total Citations: 4
Total Downloads: 402
Last 12 Months: 308
Last 6 weeks: 74
research-article
Open Access
October 2022
qprof: A gprof-Inspired Quantum Profiler
ACM Transactions on Quantum Computing (TQC), Volume 4, Issue 1, Article No.: 4, Pages 1–28, https://doi.org/10.1145/3529398
We introduce qprof, a new and extensible quantum program profiler able to generate profiling reports of quantum circuits written using various quantum computing frameworks. We describe the internal structure and working of qprof and provide practical ...
Metrics
Total Citations: 4
Total Downloads: 961
Last 12 Months: 403
Last 6 weeks: 39
research-article
Public Access
June 2021
NumaPerf: predictive NUMA profiling
- Xin Zhao,
- Jin Zhou,
- Hui Guan,
- Wei Wang,
- Xu Liu,
- Tongping Liu
ICS '21: Proceedings of the 35th ACM International Conference on Supercomputing, June 2021, Pages 52–62, https://doi.org/10.1145/3447818.3460361

It is extremely challenging to achieve optimal performance of parallel applications on a NUMA architecture, which necessitates the assistance of profiling tools. However, existing NUMA-profiling tools share some similar shortcomings, such as portability, ...
Metrics
Total Citations: 7
Total Downloads: 389
Last 12 Months: 103
Last 6 weeks: 13
demonstration
October 2020
A profiler for the matching process of henshin
MODELS '20: Proceedings of the 23rd ACM/IEEE International Conference on Model Driven Engineering Languages and Systems: Companion Proceedings, October 2020, Article No.: 3, Pages 1–5, https://doi.org/10.1145/3417990.3422000

Model transformations are essential operations in Model-Driven Software Engineering (MDSE). Due to the increasing size and complexity of software systems developed with the help of MDSE, the input models for transformations are also getting bigger. In ...
Metrics
Total Citations: 3
Total Downloads: 102
Last 12 Months: 22
Last 6 weeks: 3
research-article
Public Access
June 2020
Tools for top-down performance analysis of GPU-accelerated applications
ICS '20: Proceedings of the 34th ACM International Conference on Supercomputing, June 2020, Article No.: 26, Pages 1–12, https://doi.org/10.1145/3392717.3392752

This paper describes extensions to Rice University's HPCToolkit performance tools to support measurement and analysis of GPU-accelerated applications. To help developers understand the performance of accelerated applications as a whole, HPCToolkit's ...
Metrics
Total Citations: 13
Total Downloads: 1,247
Last 12 Months: 547
Last 6 weeks: 123
short-paper
April 2020
GAPP: A Fast Profiler for Detecting Serialization Bottlenecks in Parallel Linux Applications
- Reena Nair,
- Tony Field
ICPE '20: Proceedings of the ACM/SPEC International Conference on Performance Engineering, April 2020, Pages 257–264, https://doi.org/10.1145/3358960.3379136

We present a parallel profiling tool, GAPP, that identifies serialization bottlenecks in parallel Linux applications arising from load imbalance or contention for shared resources. It works by tracing kernel context switch events using kernel probes ...
Metrics
Total Citations: 0
Total Downloads: 231
Last 12 Months: 21
Last 6 weeks: 3
poster
February 2020
A tool for top-down performance analysis of GPU-accelerated applications
PPoPP '20: Proceedings of the 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, February 2020, Pages 415–416, https://doi.org/10.1145/3332466.3374534

To support performance measurement and analysis of GPU-accelerated applications, we extended the HPCToolkit performance tools with several novel features. To support efficient monitoring of accelerated applications, HPCToolkit employs a new wait-free ...
Metrics
Total Citations: 5
Total Downloads: 268
Last 12 Months: 10
Last 6 weeks: 0
research-article
December 2019
Profiling Dynamic Data Access Patterns with Controlled Overhead and Quality
Middleware '19: Proceedings of the 20th International Middleware Conference Industrial Track, December 2019, Pages 1–7, https://doi.org/10.1145/3366626.3368125

Modern workloads tend to have huge working sets and low locality. Despite this trend, the capacity of DRAM has not been increased enough to accommodate such huge working sets. Therefore, memory management mechanisms optimized for such modern workloads ...
Metrics
Total Citations: 10
Total Downloads: 673
Last 12 Months: 101
Last 6 weeks: 16
research-article
September 2019
Profiling Halide DSL with CPU Performance Events for Schedule Optimization
SBLP '19: Proceedings of the XXIII Brazilian Symposium on Programming Languages, September 2019, Pages 38–45, https://doi.org/10.1145/3355378.3355381

Halide is a domain-specific language (DSL) for image processing that enforces a separation of the algorithm and the execution schedule, allowing the generation of specialized code for distinct computer architectures by rewriting only the execution ...
Metrics
Total Citations: 1
Total Downloads: 71
Last 12 Months: 2
Last 6 weeks: 0
research-article
November 2015
DAGViz: a DAG visualization tool for analyzing task-parallel program traces
VPA '15: Proceedings of the 2nd Workshop on Visual Performance Analysis, November 2015, Article No.: 3, Pages 1–8, https://doi.org/10.1145/2835238.2835241

In task-based parallel programming, programmers can expose logical parallelism of their programs by creating fine-grained tasks at arbitrary places in their code. All other burdens in the parallel execution of these tasks such as thread management, task ...
Metrics
Total Citations: 15
Total Downloads: 305
Last 12 Months: 31
Last 6 weeks: 6
research-article
August 2015
JITProf: pinpointing JIT-unfriendly JavaScript code
ESEC/FSE 2015: Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering, August 2015, Pages 357–368, https://doi.org/10.1145/2786805.2786831

Most modern JavaScript engines use just-in-time (JIT) compilation to translate parts of JavaScript code into efficient machine code at runtime. Despite the overall success of JIT compilers, programmers may still write code that uses the dynamic ...
Metrics
Total Citations: 41
Total Downloads: 489
Last 12 Months: 46
Last 6 weeks: 3
research-article
February 2014
A tool to analyze the performance of multithreaded programs on NUMA architectures
- Xu Liu,
- John Mellor-Crummey
PPoPP '14: Proceedings of the 19th ACM SIGPLAN symposium on Principles and practice of parallel programming, February 2014, Pages 259–272, https://doi.org/10.1145/2555243.2555271

Almost all of today's microprocessors contain memory controllers and directly attach to memory. Modern multiprocessor systems support non-uniform memory access (NUMA): it is faster for a microprocessor to access memory that is directly attached than it ...
Also Published in:
ACM SIGPLAN Notices: Volume 49 Issue 8, August 2014
Metrics
Total Citations: 60
Total Downloads: 1,272
Last 12 Months: 62
Last 6 weeks: 6
Article
October 2013
High-Level GPU Multi-purpose Profiler
- Marian-Cristian Rotariu,
- Elena Apostol
3PGCIC '13: Proceedings of the 2013 Eighth International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, October 2013, Pages 549–553, https://doi.org/10.1109/3PGCIC.2013.94

The graphics processing units (GPUs) have become an integral part of today's computing systems. They have risen and evolved over the last years, becoming a platform for parallel computation with a large number of scalar processors and abundant memory ...
Metrics
Total Citations: 0
research-article
September 2012
Visualizing transactional memory
PACT '12: Proceedings of the 21st international conference on Parallel architectures and compilation techniques, September 2012, Pages 159–170, https://doi.org/10.1145/2370816.2370842

This paper presents TMProf, a transactional memory (TM) profiler, based on three visualization principles. These principles are (i) the precise graphical representation of transaction interactions including cross-correlated information and source code, (...
Metrics
Total Citations: 9
Total Downloads: 308
Last 12 Months: 3
Last 6 weeks: 2
research-article
February 2012
The profiling method in multicore processor for effective performance improvement
ICUIMC '12: Proceedings of the 6th International Conference on Ubiquitous Information Management and Communication, February 2012, Article No.: 82, Pages 1–4, https://doi.org/10.1145/2184751.2184848

Today, multi-core processors are being used widely in mobile environments in addition to the existing PC-based environment. In order to use a multi-core processor efficiently, parallel programming skills are required. However, incorrect parallelization ...
Metrics
Total Citations: 0
Total Downloads: 219
Last 12 Months: 2
Last 6 weeks: 0
poster
February 2011
Kremlin: like gprof, but for parallelization
PPoPP '11: Proceedings of the 16th ACM symposium on Principles and practice of parallel programming, February 2011, Pages 293–294, https://doi.org/10.1145/1941553.1941595

This paper overviews Kremlin, a software profiling tool designed to assist the parallelization of serial programs. Kremlin accepts a serial source code, profiles it, and provides a list of regions that should be considered in parallelization. Unlike a ...
Also Published in:
ACM SIGPLAN Notices: Volume 46 Issue 8, August 2011
Metrics
Total Citations: 15
Total Downloads: 268
Last 12 Months: 8
Last 6 weeks: 2
Article
September 2010
Generated Cycle-Accurate Profiler for C Language
DSD '10: Proceedings of the 2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools, September 2010, Pages 263–268, https://doi.org/10.1109/DSD.2010.39

Application-specific instruction set processors used in embedded systems are highly optimized for a given task. On this type of processors runs a specific application. Therefore, the designer should have a tool which helps him or her in the task of ...
Metrics
Total Citations: 0
poster
October 2009
The observer effect of profiling on dynamic Java optimizations
OOPSLA '09: Proceedings of the 24th ACM SIGPLAN conference companion on Object oriented programming systems languages and applications, October 2009, Pages 757–758, https://doi.org/10.1145/1639950.1640000

We show that the bytecode injection approach used in common Java profilers, such as HPROF and JProfiler, disables some program optimizations that are performed when the same program is running without a profiler. This behavior is present in both the ...
Metrics
Total Citations: 1
Total Downloads: 212
Last 12 Months: 9
Last 6 weeks: 0
research-article
October 2008
Profiler and compiler assisted adaptive I/O prefetching for shared storage caches
PACT '08: Proceedings of the 17th international conference on Parallel architectures and compilation techniques, October 2008, Pages 112–121, https://doi.org/10.1145/1454115.1454133

I/O prefetching has been employed in the past as one of the mechanisms to hide large disk latencies. However, I/O prefetching in parallel applications is problematic when multiple CPUs share the same set of disks due to the possibility that prefetches ...
Metrics
Total Citations: 11
Total Downloads: 300
Last 12 Months: 6
Last 6 weeks: 3
Article
April 1997
Knowledge discovery from users Web-page navigation
RIDE '97: Proceedings of the 7th International Workshop on Research Issues in Data Engineering (RIDE '97) High Performance Database Management for Large-Scale Applications, April 1997, Page 20

The authors propose to detect users' navigation paths to the advantage of Web site owners. First, they explain the design and implementation of a profiler which captures a client's selected links and page order, accurate page viewing time and cache ...
Metrics
Total Citations: 55