A low energy adaptive motion estimation hardware for H.264 multiview video coding
Multiview video coding (MVC) is the process of efficiently compressing stereo (two views) or multiview video signals. The improved compression efficiency achieved by H.264 MVC comes with a significant increase in computational complexity. Temporal ...
Hardware implementation and validation of a traffic road sign detection and identification system
The reconfigurability and parallel computing capability of field programmable gate array (FPGA) devices are heavily exploited in real-time digital image and video processing applications. In this field, real-time traffic road sign detection systems present ...
Flexible architectures for retinal blood vessel segmentation in high-resolution fundus images
Blood vessel segmentation from high-resolution fundus images is a necessary step in several retinal pathologies detection. Automatic blood vessel segmentation is a computing-intensive task, which raises the need for acceleration with hardware ...
Three-level pipelined multi-resolution integer motion estimation engine with optimized reference data sharing search for AVS
Integer motion estimation (IME), a key component of a video encoder, removes temporal redundancies by searching for the best integer motion vectors for the dynamic partition blocks in a macro-block (MB). Huge memory bandwidth requirements and ...
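The search for a best integer motion vector described in this abstract can be illustrated with a plain exhaustive block-matching sketch. This is not the three-level pipelined multi-resolution engine of the paper: it is an assumed textbook full search using the sum of absolute differences (SAD) as the cost, and the names `sad`, `full_search`, `search_range`, and the list-of-lists frame layout are hypothetical choices made for the illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by the candidate vector (dx, dy)."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def full_search(cur, ref, bx, by, n, search_range):
    """Try every integer displacement within +/- search_range and return
    (best_cost, dx, dy). Candidates that fall outside `ref` are skipped."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            inside = (
                0 <= by + dy and by + dy + n <= height
                and 0 <= bx + dx and bx + dx + n <= width
            )
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            # Keep the first candidate with the lowest cost (assumed tie rule).
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

Every candidate reads an n x n window of the reference frame, and neighbouring candidates overlap heavily. That repeated reading is where the memory bandwidth pressure mentioned in the abstract comes from, and it is the kind of cost that reference data sharing is meant to reduce.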
Hardware implementation-oriented fast intra-coding based on downsampling information for HEVC
This paper proposes a downsampling information-based intra-coding scheme that consists of two parts: a preprocessing stage and a fast intra-coding stage. Three downsampling information-based fast decision algorithms are proposed in the fast intra-coding stage. ...
Optimizing memory bandwidth exploitation for OpenVX applications on embedded many-core accelerators
In recent years, image processing has been a key application area for mobile and embedded computing platforms. In this context, many-core accelerators are a viable solution to efficiently execute highly parallel kernels. However, architectural ...
Accelerated image factorization based on improved NMF algorithm
Non-negative matrix factorization (NMF) is widely used in feature extraction and dimension reduction fields. Essentially, it is an optimization problem to determine two non-negative low rank matrices $W_{m \times k}$ and $H_{k \times n}$ for ...
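The optimization problem named in this abstract can be sketched with the classic multiplicative update rules of Lee and Seung, which are a standard baseline and not the improved algorithm the paper proposes. The sketch assumes the goal is to approximate a given non-negative matrix `V` (m x n) by the product of `W` (m x k) and `H` (k x n) under the Frobenius norm. The names `V`, `nmf`, `frob_error`, `iters`, `eps`, and `seed` are hypothetical, and pure-Python lists are used only to keep the example dependency-free.

```python
import random


def matmul(A, B):
    """Plain list-of-lists matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(row) for row in zip(*A)]


def frob_error(V, W, H):
    """Frobenius norm of V - W H."""
    WH = matmul(W, H)
    return sum(
        (v - x) ** 2 for v_row, x_row in zip(V, WH) for v, x in zip(v_row, x_row)
    ) ** 0.5


def nmf(V, k, iters=200, eps=1e-9, seed=0):
    """Multiplicative-update NMF (assumed baseline): returns non-negative W, H."""
    rng = random.Random(seed)
    m, n = len(V), len(V[0])
    # Strictly positive random start so that no entry is stuck at zero.
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise.
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise.
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H
```

Because each update only multiplies by a ratio of non-negative terms, `W` and `H` stay non-negative without an explicit projection step. The small `eps` is an assumed guard against division by zero.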
Boundary correlation-based intracoding for SHVC algorithm and its efficient VLSI architecture
Scalable high-efficiency video coding (SHVC) can provide variable video quality according to terminal devices. However, the computational complexity of SHVC is increased by the new techniques it adds on top of high-efficiency video coding (HEVC). In this ...
Real-time hardware-software embedded vision system for ITS smart camera implemented in Zynq SoC
The article demonstrates the usefulness of heterogeneous System on Chip (SoC) devices in smart cameras used in intelligent transportation systems (ITS). In a compact, energy efficient system the following exemplary algorithms were implemented: vehicle ...
Heterogeneous SoC-based acceleration of MPEG-7 compliance image retrieval process
- Romina Molina,
- Julio Dondo Gazzano,
- Fernando Rincon,
- Veronica Gil-Costa,
- Jesus Barba,
- Ricardo Petrino,
- Juan Carlos Lopez
With the growing amount of multimedia content over the internet and broadcast systems, mechanisms for efficient information organization, manipulation and transmission are becoming indispensable. Optimization of the multimedia search and retrieval ...
Parallel Light Speed Labeling: an efficient connected component algorithm for labeling and analysis on multi-core processors
In the last decade, many papers have been published presenting sequential connected component labeling (CCL) algorithms. As modern processors are multi-core and moving toward many cores, the design of a CCL algorithm should address parallelism and multithreading. ...
A linked list run-length-based single-pass connected component analysis for real-time embedded hardware
Conventional connected component analysis (CCA) algorithms perform slowly in real-time embedded applications because they need multiple passes to resolve label equivalences. As this fundamental task becomes crucial for stream processing, single-pass ...
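The multiple passes that this abstract attributes to conventional algorithms can be seen in a minimal two-pass labeling sketch. It is an assumed textbook baseline (4-connectivity, union-find equivalence table), not the linked list run-length single-pass method of the paper, and the name `label_components` and the list-of-lists binary image layout are hypothetical.

```python
def label_components(img):
    """Two-pass connected component labeling with 4-connectivity.
    `img` is a list of rows of 0/1 values. Returns (labels, count)."""
    height, width = len(img), len(img[0])
    labels = [[0] * width for _ in range(height)]
    parent = [0]  # index 0 is a placeholder for background

    def find(a):
        # Follow parent links to the root, halving the path on the way.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # Pass 1: provisional labels, recording equivalences as they appear.
    for y in range(height):
        for x in range(width):
            if not img[y][x]:
                continue
            up = labels[y - 1][x] if y > 0 else 0
            left = labels[y][x - 1] if x > 0 else 0
            if up == 0 and left == 0:
                parent.append(len(parent))  # new label is its own root
                labels[y][x] = len(parent) - 1
            else:
                labels[y][x] = min(l for l in (up, left) if l)
                if up and left:
                    union(up, left)

    # Pass 2: replace each provisional label by a compact final label.
    remap = {}
    for y in range(height):
        for x in range(width):
            if labels[y][x]:
                root = find(labels[y][x])
                labels[y][x] = remap.setdefault(root, len(remap) + 1)
    return labels, len(remap)
```

The second pass needs the whole label image to be stored and read again after the first pass has finished. That buffering is awkward for stream processing, and it is the step that single-pass designs aim to remove.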