Reconfigurable hardware solution to parallel prefix computation

The Journal of Supercomputing

Abstract

This paper presents the design and implementation of efficient, reconfigurable parallel prefix computation hardware on field-programmable gate arrays (FPGAs). The design is based on a pipelined dataflow algorithm, with control logic added to reconfigure the system for an arbitrary degree of parallelism. The system receives multiple input streams of elements in parallel and produces multiple output streams in parallel, and the degree of parallelism can be controlled explicitly at run time. The time complexity of the design is O(d + (N − d)/d), where d is the degree of parallelism and N is the stream size. When the stream size is sufficiently larger than the initial trigger time of the pipeline (d), the time complexity becomes O(N/d). Unlike the prefix computation circuits found in the literature, the design scales to different problem sizes, including data of unknown size. The design is modular, built around a finite state machine, and was implemented and tested on the target FPGA devices Xilinx Spartan2S XC2S300EFT256-6Q and XC2S600EFG676-6.
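To make the complexity claim concrete, the following is a minimal software sketch, not the paper's hardware design. It assumes that each pipeline step consumes d stream elements and emits d prefix results, and that the fill latency is exactly d steps. The names `blocked_prefix` and `modeled_steps` are hypothetical, and the constant factors in the step model are illustrative assumptions, since the abstract only gives the asymptotic bound.

```python
from itertools import accumulate, islice
from operator import add


def blocked_prefix(stream, d, op=add):
    """Software model of a d-way prefix computation over a stream.

    Each loop iteration stands in for one pipeline step that takes up to
    d elements and emits their prefix results. The stream may be any
    iterable, so its length does not need to be known in advance.
    Returns (prefix_results, number_of_steps).
    """
    it = iter(stream)
    carry = None          # running prefix of everything consumed so far
    out = []
    steps = 0
    while True:
        block = list(islice(it, d))
        if not block:
            break
        # Prefix results local to this block of at most d elements.
        local = list(accumulate(block, op))
        # Combine with the carry; carry stays on the left so that
        # associative but non-commutative operators remain correct.
        if carry is not None:
            local = [op(carry, x) for x in local]
        out.extend(local)
        carry = local[-1]
        steps += 1
    return out, steps


def modeled_steps(n, d):
    """Assumed step model: d steps to fill the pipeline, then
    ceil((n - d) / d) further steps, mirroring O(d + (N - d)/d)."""
    remaining = max(0, -(-(n - d) // d))  # ceiling division, clamped at 0
    return d + remaining


if __name__ == "__main__":
    results, steps = blocked_prefix(range(1, 9), d=4)
    print(results, steps)
    print(modeled_steps(1024, 4))
```

Under this model, when N is much larger than d the fill term becomes negligible and the step count approaches N/d, which matches the simplified bound stated in the abstract.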

Author information

Correspondence to Jin Hwan Park.

About this article

Cite this article

Park, J.H., Dai, H.K. Reconfigurable hardware solution to parallel prefix computation. J Supercomput 43, 43–58 (2008). https://doi.org/10.1007/s11227-007-0137-1
