- research-article, January 2018
Implicit Data-Parallelism in Kahn Process Networks: Bridging the MacQueen Gap
PARMA-DITAM '18: Proceedings of the 9th Workshop and 7th Workshop on Parallel Programming and RunTime Management Techniques for Manycore Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms
Pages 20–25
https://doi.org/10.1145/3183767.3183790
Modern embedded systems are rapidly increasing their complexity, both in terms of numbers of cores, as well as heterogeneity. To generate efficient code for these systems, it is common to leverage formal models of computation. Among these, the dataflow ...
- research-article, January 2018
Automatic OpenCL Code Generation from LLVM-IR using Polyhedral Optimization
PARMA-DITAM '18: Proceedings of the 9th Workshop and 7th Workshop on Parallel Programming and RunTime Management Techniques for Manycore Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms
Pages 45–50
https://doi.org/10.1145/3183767.3183779
Nowadays, developers can implement applications using OpenCL for all kinds of architectures, like CPUs, GPUs and FPGAs. In this work, we propose a source-to-source compiler that can transform C/C++ source code to optimized OpenCL kernel and host code. ...
- research-article, January 2018
Enabling Run-Time Managed Distributed Mobile Computing
PARMA-DITAM '18: Proceedings of the 9th Workshop and 7th Workshop on Parallel Programming and RunTime Management Techniques for Manycore Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms
Pages 39–44
https://doi.org/10.1145/3183767.3183778
The increasing pervasiveness of mobile devices combined with their replacement rate, led us to deal with the disposal of an increasing amount of still working electronic devices. This work proposes an approach to mitigate this problem by extending the ...
- research-article, January 2018
Impact of Vectorization Over 16-bit Data-Types on GPUs
PARMA-DITAM '18: Proceedings of the 9th Workshop and 7th Workshop on Parallel Programming and RunTime Management Techniques for Manycore Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms
Pages 32–38
https://doi.org/10.1145/3183767.3183777
Since the introduction of Single Instruction Multiple Thread (SIMT) GPU architectures, vectorization has seldom been recommended. However, for efficient use of 8-bit and 16-bit data types, vector types are necessary even on these GPUs. When only integer ...
- research-article, January 2018
Aspect-Driven Mixed-Precision Tuning Targeting GPUs
- Ricardo Nobre,
- Luís Reis,
- João Bispo,
- Tiago Carvalho,
- João M.P. Cardoso,
- Stefano Cherubin,
- Giovanni Agosta
PARMA-DITAM '18: Proceedings of the 9th Workshop and 7th Workshop on Parallel Programming and RunTime Management Techniques for Manycore Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms
Pages 26–31
https://doi.org/10.1145/3183767.3183776
Writing mixed-precision kernels allows to achieve higher throughput together with outputs whose precision remain within given limits. The recent introduction of native half-precision arithmetic capabilities in several GPUs, such as NVIDIA P100 and AMD ...
- research-article, January 2018
Managing Heterogeneous Resources in HPC Systems
- Giovanni Agosta,
- William Fornaciari,
- Giuseppe Massari,
- Anna Pupykina,
- Federico Reghenzani,
- Michele Zanella
PARMA-DITAM '18: Proceedings of the 9th Workshop and 7th Workshop on Parallel Programming and RunTime Management Techniques for Manycore Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms
Pages 7–12
https://doi.org/10.1145/3183767.3183769
To sustain performance while facing always tighter power and energy envelopes, High Performance Computing (HPC) is increasingly leveraging heterogeneous architectures. This poses new challenges: to efficiently exploit the available resources, both in ...