The International Workshop for OpenCL (IWOCL, which is pronounced "eye-wok-ul") was conceived in a meeting between Simon McIntosh-Smith and Ben Bergen at the Los Alamos National Laboratory on May 8th 2012. McIntosh-Smith and Bergen lamented that there were no organized workshops or meetings for the rapidly growing OpenCL community. After testing this idea with colleagues over the next few months, they decided to create the kind of OpenCL conference they wanted to go to themselves, and thus IWOCL was born.
Automatic Test Case Reduction for OpenCL
We report on an extension to the C-Reduce tool, for automatic reduction of C test cases, to handle OpenCL kernels. This enables an automated method for detecting bugs in OpenCL compilers, by generating large random kernels using the CLsmith generator, ...
The Hitchhiker's Guide to Cross-Platform OpenCL Application Development
One of the benefits of programming in OpenCL is platform portability. That is, an OpenCL program that follows the OpenCL specification should, in principle, execute reliably on any platform that supports OpenCL. To assess the current state of OpenCL ...
OpenCL-Based Mobile GPGPU Benchmarking: Methods and Challenges
Benchmarking general-purpose computing on graphics processing unit (GPGPU) aims to profile and compare performance across different devices. Due to the low-level nature of most GPGPU APIs, GPGPU benchmarks are also useful for architectural exploration ...
OpenCL Compiler Tools for FPGAs
Compiling OpenCL kernels to FPGAs presents a new set of usability challenges. Many OpenCL users are not hardware experts but are creating state-of-the-art hardware with the help of OpenCL compilers for FPGAs. To get great performance the compiler has to ...
Optimizing OpenCL applications on Xilinx FPGA
In this presentation we focus on current Xilinx FPGA (Field-Programmable Gate Array) platforms with the SDAccel OpenCL environment. FPGAs have the unique feature of a reconfigurable architecture, in contrast to CPUs, GPUs or DSPs, which have a fixed ...
VisionCPP: A SYCL-based Computer Vision Framework
Using computer vision techniques for system-on-chip (SoC) technologies raises performance portability and stringent memory and communication issues for vision applications. Although high-level libraries like OpenCV abstract both the system-level and ...
clSPARSE: A Vendor-Optimized Open-Source Sparse BLAS Library
Sparse linear algebra is a cornerstone of modern computational science. These algorithms ignore the zero-valued entries found in many domains in order to work on much larger problems at much faster rates than dense algorithms. Nonetheless, optimizing ...
OpenCL caffe: Accelerating and enabling a cross platform machine learning framework
Deep neural networks (DNN) achieved a significant breakthrough in vision recognition in 2012 and quickly became the leading machine learning algorithm in Big Data based large scale object recognition applications. The successful deployment of DNN based ...
Intel® Threading Building Block (Intel® TBB) flow graph as a software infrastructure layer for OpenCL™-based computations
Modern computing systems are becoming heterogeneous with a variety of programmable units: CPU, GPU, FPGA, domain-specific accelerators, etc. OpenCL™ API is a cross-platform programming model for a wide range of computing devices, but using the language ...
Optimizing convolutional neural networks on embedded platforms with OpenCL
We invite the community to collaboratively design and optimize convolutional neural networks to meet the performance, accuracy and cost requirements for deployment on a range of form factors -- from sensors to self-driving cars.
GPU Daemon: Road to zero cost submission
In this paper we present a novel approach of utilizing new features of OpenCL 2.0: Fine-Grained SVM and device-side enqueue that allow completely new usage models and application paradigms. We present the idea of a GPU (Graphics Processing Unit) daemon ...
OpenCL™ FFT Optimizations for Intel® Processor Graphics
In this paper, we explore a number of OpenCL™ optimization strategies and show the pros and cons relative to clFFT, the leading OpenCL Fast Fourier Transform (FFT) library. We implemented a 1D, multi-kernel, mixed-radix Cooley-Tukey power of two ...
The OpenCL Library Ecosystem: Current Status and Future Perspectives
OpenCL as an open standard for parallel programming of heterogeneous systems seems to be an attractive choice for software library implementations. Indeed, iwocl.org lists 83 OpenCL-enabled libraries as of February 12, 2016, suggesting a healthy ...
hiCL: an OpenCL abstraction layer for scientific computing, application to depth imaging on GPU and APU
Hardware accelerators (HWAs), such as Graphics Processing Units (GPUs) have proven their potential to boost scientific applications performance and have been widely embraced by academia and industry. The OpenCL programming model ensures portability on ...
Boost.Compute: A parallel computing library for C++ based on OpenCL
Boost.Compute is a powerful C++ header-only template library for parallel computing based on OpenCL. It has a layered architecture and acts both as a thin C++ wrapper over the OpenCL API and as a feature-rich interface to high-level constructs that ...
C++ for OpenCL Workshop, IWOCL 2016
OpenCL™ is an open, royalty-free standard for heterogeneous parallel programming. As the number of OpenCL™ platforms increases, requests for better programmability and adoption of modern C++ paradigms are growing. The C++ language is ...
Extending Paralldroid for the Automatic Generation of OpenCL Code
The evolution of many of today's ubiquitous technologies has been possible due to the System on Chip (SoC) technologies. This evolution has triggered an increase of the computing power of hand-held devices, that comes from heterogeneous architectures ...
OpenCL meets Open Source Streaming Analytics
OpenCL is leveraged to build a flexible, scalable streaming analytics platform using FPGA. The end to end solution can demonstrate over a 2x price/performance advantage over the software baseline. All this can be customized by an application developer ...
Towards Interactive Visual Exploration of Parallel Programs using a Domain-Specific Language
The use of GPUs and the massively parallel computing paradigm have become wide-spread. We describe a framework for the interactive visualization and visual analysis of the run-time behavior of massively parallel programs, especially OpenCL kernels. This ...
Benchmarking, autotuning and crowdtuning OpenCL programs using the Collective Knowledge framework
Autotuning is a popular technique to ensure performance portability for important algorithms such as BLAS, FFT and DNN across the ever evolving software and hardware stack. Unfortunately, when performed on a single machine, autotuning can explore only a ...
Runtime comparison solving Gray-Scott equation on different OpenCL devices
An example of a reaction-diffusion equation with chaotic solutions. You can expect patterns to emerge from chaos. A uniform discretization in space and periodic boundary conditions allow the Fast Fourier Transform to be used, so that when coupled with ...
C++ Classes and Templates for OpenCL Kernels with PATOS
We present PATOS, a Clang-based source-to-source compiler to extend the OpenCL kernel language with C++ classes and template types for classes and functions. The generated code is standard-conforming OpenCL-C which is usable with unmodified OpenCL ...