DOI: 10.1145/3120895
HEART '17: Proceedings of the 8th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies
ACM 2017 Proceeding
Publisher:
  • Association for Computing Machinery
  • New York
  • NY
  • United States
Conference:
HEART2017: The 8th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies, Bochum, Germany, June 7-9, 2017
ISBN:
978-1-4503-5316-8
Published:
07 June 2017
In-Cooperation:
Ruhr-Universität Bochum

Abstract

No abstract available.

SESSION: Architecture & System I
research-article
An FPGA NIC Based Hardware Caching for Blockchain
Article No.: 1, Pages 1–6, https://doi.org/10.1145/3120895.3120897

Blockchain, a core technology of cryptocurrency, is attracting attention these days. Blockchain is a fault-tolerant distributed ledger that does not need an administrator. We call a transfer of a digital asset a "transaction". We need to hold ...

research-article
High Speed Performance Estimation of Embedded Hard-core Processors in FPGA-based SoCs
Article No.: 2, Pages 1–6, https://doi.org/10.1145/3120895.3120902

The embedded hard-core processors beside the traditional FPGA fabric in FPGA-based System-on-Chip (SoC) devices make them an attractive alternative for realizing the software portions of the application while using the FPGA fabric for hardware ...

research-article
A Time-Division Multiplexing Ising Machine on FPGAs
Article No.: 3, Pages 1–6, https://doi.org/10.1145/3120895.3120905

Annealing machines based on the Ising model, which can solve combinatorial optimization problems, are an emerging solution to overcome the performance limit of the von Neumann architecture. However, it is difficult to solve practical combinatorial optimization ...

research-article
An Adaptive Demotion Policy for High-Associativity Caches
Article No.: 4, Pages 1–6, https://doi.org/10.1145/3120895.3120906

Although the Least Recently Used (LRU) policy is known as a simple but high-performance cache replacement policy, high-associativity caches hardly adopt the LRU policy because of an increase in the hardware overheads. The Re-Reference Interval ...

SESSION: Design Methodology & Tools I
research-article
Towards Flexible Automatic Generation of Graph Processing Gateware
Article No.: 5, Pages 1–6, https://doi.org/10.1145/3120895.3120896

FPGAs have been demonstrated as promising platforms to accelerate graph processing applications at scale with superior energy-efficiency. However, programming FPGAs is significantly more challenging than similar software solutions. To address this ...

research-article
Dataflow based Near Data Computing Achieves Excellent Energy Efficiency
Article No.: 6, Pages 1–6, https://doi.org/10.1145/3120895.3120900

The emergence of 3D-DRAM has rekindled interest in near data computing (NDC) research. This article introduces dataflow processing in memory (DFPIM) which melds near data computing, dataflow architecture, coarse-grained reconfigurable logic (CGRL), and ...

research-article
DTP: Enabling Exhaustive Exploration of FPGA Temporal Partitions for Streaming HPC Applications
Article No.: 7, Pages 1–11, https://doi.org/10.1145/3120895.3120901

Reconfigurable computing systems show great promise for accelerating streaming HPC applications because of their low power consumption and high performance. However, mapping an HPC application to a reconfigurable system is a challenging task. The ...

research-article
Hardware Acceleration with Multi-Threading of Java-Based High Level Synthesis Tool
Article No.: 8, Pages 1–6, https://doi.org/10.1145/3120895.3120912

In this research, we attempt to speed up the computational fluid dynamics (CFD) and the convolutional neural network (CNN) using JavaRock-Thrash thread function of the high-level synthesis tool with an FPGA. In the two-dimensional heat equation, by ...

research-article
Performance Evaluation of PEACH3: Field-Programmable Gate Array Switch for Tightly Coupled Accelerators
Article No.: 9, Pages 1–6, https://doi.org/10.1145/3120895.3120911

An FPGA switching hub for tightly coupled accelerators (TCA) architecture called PEACH3 (PCI-Express Adaptive Communication Hub ver. 3) is evaluated and its communication speed is analyzed. PEACH3 connects a number of GPUs directly through PCI express ...

SESSION: Application I
research-article
Accelerated Embedded AKAZE Feature Detection Algorithm on FPGA
Article No.: 10, Pages 1–6, https://doi.org/10.1145/3120895.3120898

Feature detection is a major operation in various computer vision systems. The KAZE algorithm and its improved version, Accelerated-KAZE (AKAZE), are considered as the first algorithms to detect features by building a scale space using nonlinear ...

research-article
Reducing the Cost of Removing Border Artefacts in Fourier Transforms
Article No.: 11, Pages 1–6, https://doi.org/10.1145/3120895.3120899

Many image processing algorithms are implemented in a combination of spatial and frequency domains. The fast Fourier transform (FFT) is the workhorse of such algorithms. One limitation of the FFT is artefacts that result from the implicit periodicity ...

research-article
A porting and optimization of search for neighbour-particle in MPS method for GPU by using OpenACC
Article No.: 12, Pages 1–6, https://doi.org/10.1145/3120895.3120903

Moving Particle Semi-implicit (MPS) method is a particle method used in fields such as computational fluid dynamics. It is classified as a particle method. Target fluids and objects are divided up into particles, and each particle interacts with its ...

research-article
Open Access
Acceleration of Publish/Subscribe Messaging in ROS-compliant FPGA Component
Article No.: 13, Pages 1–6, https://doi.org/10.1145/3120895.3120904

Intelligent robots demand complex information processing such as SLAM (Simultaneous Localization and Mapping) and DNN (Deep Neural Network). FPGA (Field Programmable Gate Array) is expected to accelerate these applications with high energy efficiency. ...

SESSION: Architecture & Applications
research-article
HW/SW Co-design of an IEEE 802.11a/g Receiver on Xilinx Zynq SoC using High-Level Synthesis
Article No.: 15, Pages 1–6, https://doi.org/10.1145/3120895.3120908

This paper presents an implementation of an Orthogonal Frequency-Division Multiplexing (OFDM) receiver using the high-level synthesis tool, from Xilinx called Software Defined System-on-Chip (SDSoC). The Zynq SoCs containing an ARM processor besides a ...

research-article
FPGA-based Stream Computing for High-Performance N-Body Simulation using Floating-Point DSP Blocks
Article No.: 16, Pages 1–6, https://doi.org/10.1145/3120895.3120909

Recent advancement of FPGAs allows high-performance and low-power computing by constructing deeply-pipelined custom hardware using floating-point DSP blocks. In this paper, we present a stream-computing architecture and design for FPGA-based high-...

research-article
Neural Network Training Acceleration with PSO Algorithm on a GPU Using OpenCL
Article No.: 17, Pages 1–6, https://doi.org/10.1145/3120895.3120910

Neural networks and deep learning currently provide the promising solutions to many practical problems. One of the difficulties in building neural network models is the training process that requires to find an optimal solution for the network weights. ...

research-article
FPGA based ASIC Emulator with High Speed Optical Serial Links
Article No.: 18, Pages 1–6, https://doi.org/10.1145/3120895.3120913

We propose a multiple-FPGA system using the high-speed optical serial interfaces built into recent FPGAs and construct an ASIC emulator. Although a conventional system that uses parallel connections is limited in bandwidth by the number of I/Os, the proposed system has ...

POSTER SESSION: Poster Session I
research-article
A Case for Remote GPUs over 10GbE Network for VR Applications
Article No.: 19, Pages 1–6, https://doi.org/10.1145/3120895.3120914

VR technology that enables users to experience environments made by computer similar to real environments has become popular. In VR technology, computation cost of graphic processing is high and thus it requires a high-end GPU because high-quality ...

research-article
Acceleration of the aggregation process in a Hall-thruster simulation using Intel FPGA SDK for OpenCL
Article No.: 20, Pages 1–6, https://doi.org/10.1145/3120895.3120915

The Full Particle-In-Cell (Full-PIC) method is a numerical simulation technique used in the research and development of Hall-thrusters which are a type of electric propulsion engines. It treats ions, neutrons, and electrons as particles and is highly ...

research-article
FPGA Accelerated NoC-Simulation: A Case Study on the Intel Xeon Phi Ringbus Topology
Article No.: 21, Pages 1–6, https://doi.org/10.1145/3120895.3120916

Complex signal processing algorithms targeted on architectures with increasingly high numbers of parallel processing units require high performance core-interconnections (i.e., low latencies, high throughput, no pinch-offs or bottlenecks). Therefore, ...

research-article
Performance Evaluation of a CPU-FPGA Hybrid Cluster Platform Prototype
Article No.: 22, Pages 1–6, https://doi.org/10.1145/3120895.3120917

research-article
High-level Synthesis based on Parallel Design Patterns using a Functional Language
Article No.: 23, Pages 1–6, https://doi.org/10.1145/3120895.3120918

Logic-circuit integration of a field-programmable gate array (FPGA) has grown considerably with improvements in semiconductor technology. High-level synthesis (HLS) is now widely used to implement complex FPGA applications to increase design efficiency, ...

POSTER SESSION: Poster Session II
research-article
High-Performance Hardware Accelerators for Solving Ordinary Differential Equations
Article No.: 24, Pages 1–6, https://doi.org/10.1145/3120895.3120919

Ordinary Differential Equations (ODEs) are widely used in many high-performance computing applications. However, contemporary processors generally provide limited throughput for these kinds of calculations. A high-performance hardware accelerator has ...

research-article
Access Network Generation for Efficient Debugging of FPGAs
Article No.: 25, Pages 1–6, https://doi.org/10.1145/3120895.3120920

The inclusion of access networks in modern FPGAs can provide a large number of use cases notably in debugging. Using access networks can eliminate the need for frequent synthesis during the debugging phase, which results in saving debugging time and ...

research-article
Probabilistic Strategies Based on Staged LSH for Speedup of Audio Fingerprint Searching with Ten Million Scale Database
Article No.: 26, Pages 1–6, https://doi.org/10.1145/3120895.3120921

We are developing and improving algorithms to identify audio fingerprints (AFP) in a network router. Staged Locality Sensitive Hashing (LSH) is one of them and nearly as fast as 1Gbps of prevalent network routers. In this paper, we propose two ...

research-article
HLS Compilation for CPU Interlays
Article No.: 27, Pages 1–6, https://doi.org/10.1145/3120895.3120922

The idea of coupling reconfigurable fabrics with general-purpose processors has been extensively studied during the last couple of decades. Custom instructions targeting those reconfigurable fabrics had to be handcrafted because tools capable of high ...

Acceptance Rates

Overall Acceptance Rate 22 of 50 submissions, 44%
Year        Submitted   Accepted   Rate
HEART '22   21          10         48%
HEART '19   29          12         41%
Overall     50          22         44%