Hardware architectural support for control systems and sensor processing
The field of modern control theory and the systems used to implement these controls have shown rapid development over the last 50 years. It was often the case that those developing control algorithms could assume the computing medium was solely ...
Multicore-based vector coprocessor sharing for performance and energy gains
For most applications that use a dedicated vector coprocessor, its resources are underutilized because of a lack of sustained data parallelism, which often results from vector-length variations in dynamic environments. The motivation ...
A systematic approach for optimized bypass configurations for application-specific embedded processors
The diversity of today's mobile applications requires embedded processor cores with high resource efficiency; that is, the devices should provide high performance at low area requirements and power consumption. The fine-grained parallelism ...
Custom architecture for multicore audio beamforming systems
The audio beamforming (BF) technique uses microphone arrays to extract acoustic sources recorded in a noisy environment. In this article, we propose a new approach for rapid development of multicore BF systems. A review of the literature reveals that ...
Design-space exploration and runtime resource management for multicores
Application-specific multicore architectures are usually designed by using a configurable platform in which a set of parameters can be tuned to find the best trade-off in terms of the selected figures of merit (such as energy, delay, and area). This ...
Memory performance estimation of CUDA programs
CUDA has successfully popularized GPU computing, and GPGPU applications are now used in various embedded systems. The CUDA programming model provides a simple interface to program on GPUs, but tuning GPGPU applications for high performance is still ...
Parallel architectures for the kNN classifier -- design of soft IP cores and FPGA implementations
We designed a variety of k-nearest-neighbor parallel architectures for FPGAs in the form of parameterizable soft IP cores. We show that they can be used to solve large classification problems with thousands of training vectors, or thousands of vector ...
Automatic synthesis of physical system differential equation models to a custom network of general processing elements on FPGAs
Fast execution of physical system models has various uses, such as simulating physical phenomena or real-time testing of medical equipment. Physical system models commonly consist of thousands of differential equations. Solving such equations using ...
LegUp: An open-source high-level synthesis tool for FPGA-based processor/accelerator systems
- Andrew Canis
- Jongsok Choi
- Mark Aldham
- Victor Zhang
- Ahmed Kammoona
- Tomasz Czajkowski
- Stephen D. Brown
- Jason H. Anderson
It is generally accepted that a custom hardware implementation of a set of computations will provide superior speed and energy efficiency relative to a software implementation. However, the cost and difficulty of hardware design is often prohibitive, ...
Efficient compilation of CUDA kernels for high-performance computing on FPGAs
The rise of multicore architectures across all computing domains has opened the door to heterogeneous multiprocessors, where processors of different compute characteristics can be combined to effectively boost the performance per watt of different ...