
ARC 2014: Towards a Fast FPGA Implementation of a Heap-Based Priority Queue for Image Coding Using a Parallel Index-Aware Tree

Published: 06 November 2015
  • Abstract

    Embedded image processing systems such as smartphones and digital cameras have tight limits on storage, computation power, network connectivity, and battery usage. These limitations make efficient image coding important. In this article, we present a novel heap-based priority queue structure employed by an Adaptive Scanning of Wavelet Data (ASWD) scheme targeting an embedded platform. ASWD is a context modeling block, implemented via priority queues in a wavelet-based image coder, that reorganizes the wavelet coefficients into locally stationary sequences. The architecture we propose makes efficient, adaptive use of the FPGA’s on-chip dual-port memories. An index-aware system linked to each element in the queue makes the location of every queue element traceable in the heap, as the ASWD algorithm requires. Moreover, the use of 4-port memories together with intelligent data concatenation of queue elements yields cost-effective, enhanced memory access. The memory ports are adaptively assigned to different units during different processing phases so that each phase gets the memory access it needs. The architectural innovations can also be exploited in other applications that require efficient hardware implementations of a generic priority queue, or in classical sorting applications that sort into the index. We designed and validated the hardware on an Altera Stratix IV FPGA as an IP accelerator in a Nios II processor-based System on Chip. We show that our architecture at 150MHz can provide a 45X speedup compared to an embedded ARM Cortex-A9 processor at 666MHz, targeting a throughput of 10MB/s.
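    The index-aware idea in the abstract, where each queue element stays traceable inside the heap, can be illustrated in software. The following is a minimal Python sketch, not the authors' FPGA architecture: the class name `IndexedMaxHeap`, its methods, and the choice of a max-heap are assumptions made for illustration only. A position map is updated on every swap, so an element can be located and re-prioritized without searching the heap.

```python
class IndexedMaxHeap:
    """Binary max-heap that records where each element id currently sits.

    Illustrative software analogue only; all names here are hypothetical.
    """

    def __init__(self):
        self.heap = []  # list of (priority, elem_id) pairs in heap order
        self.pos = {}   # elem_id -> current index in self.heap

    def _swap(self, i, j):
        # Swap two heap slots and keep the position map in sync.
        self.heap[i], self.heap[j] = self.heap[j], self.heap[i]
        self.pos[self.heap[i][1]] = i
        self.pos[self.heap[j][1]] = j

    def _sift_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self.heap[i][0] <= self.heap[parent][0]:
                break
            self._swap(i, parent)
            i = parent

    def _sift_down(self, i):
        n = len(self.heap)
        while True:
            largest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and self.heap[child][0] > self.heap[largest][0]:
                    largest = child
            if largest == i:
                break
            self._swap(i, largest)
            i = largest

    def push(self, elem_id, priority):
        if elem_id in self.pos:
            raise KeyError(f"{elem_id!r} is already queued")
        self.heap.append((priority, elem_id))
        self.pos[elem_id] = len(self.heap) - 1
        self._sift_up(len(self.heap) - 1)

    def update(self, elem_id, priority):
        # The position map gives the slot directly, with no heap search.
        i = self.pos[elem_id]
        old_priority = self.heap[i][0]
        self.heap[i] = (priority, elem_id)
        if priority > old_priority:
            self._sift_up(i)
        else:
            self._sift_down(i)

    def pop(self):
        # Remove and return the (elem_id, priority) with the highest priority.
        top_priority, top_id = self.heap[0]
        last = self.heap.pop()
        del self.pos[top_id]
        if self.heap:
            self.heap[0] = last
            self.pos[last[1]] = 0
            self._sift_down(0)
        return top_id, top_priority


queue = IndexedMaxHeap()
queue.push("a", 3)
queue.push("b", 7)
queue.push("c", 5)
queue.update("a", 9)   # "a" is found through the position map
first = queue.pop()
```

    In this sketch the position map plays the role that the abstract assigns to the index-aware system. How the hardware stores and updates that information across its memory ports is not described in the abstract and is not modeled here.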



        Published In

        ACM Transactions on Reconfigurable Technology and Systems  Volume 9, Issue 1
        Special Section on the 2014 International Symposium on Applied Reconfigurable Computing
        November 2015
        121 pages
        ISSN:1936-7406
        EISSN:1936-7414
        DOI:10.1145/2839314
        • Editor:
        • Steve Wilton

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 06 November 2015
        Accepted: 01 April 2015
        Revised: 01 March 2015
        Received: 01 July 2014
        Published in TRETS Volume 9, Issue 1


        Author Tags

        1. FPGA
        2. Image compression
        3. adaptive scanning
        4. embedded system
        5. heapsort
        6. priority queue
        7. system-on-chip

        Qualifiers

        • Research-article
        • Research
        • Refereed

        Funding Sources

        • Fonds Européen de Développement Regional (FEDER/FUI)
