INTERNATIONAL JOURNAL OF SCIENTIFIC PROGRESS AND RESEARCH (IJSPR)
Issue 174, Volume 74, Number 01, August 2020
ISSN: 2349-4689

High Speed and Area Efficient VLSI Architecture for Radix-4 Complex Booth Multiplier

Shalu Turker (1), Prof. Amrita Khera (2)
(1) Research Scholar, Department of Electronics and Communication, Trinity Institute of Technology & Research, Bhopal
(2) Assistant Professor, Department of Electronics and Communication, Trinity Institute of Technology & Research, Bhopal

Abstract: The main objective of this research paper is to design an architecture for a radix-4 complex multiplier that rectifies the problems of the existing method and improves speed by using common Boolean logic (CBL). The multiplier algorithm is normally used for higher bit-length applications, while an ordinary multiplier is suitable for lower-order bits. These two methods are combined to produce a high-speed multiplier for higher bit-length applications. The problem of the existing architecture is reduced by removing bits from the remainders. The proposed algorithm is implemented in Xilinx software for the Virtex-7 device family.

Keywords: Vedic Multiplier, Complex Multiplier, Common Boolean Logic Adder, Xilinx Software.

I. INTRODUCTION

Multipliers are widely used with the constant growth of computer applications. Multiplication is a fundamental arithmetic operation, and the performance of computer applications depends largely on the performance of multiplication. The major factors in the design or implementation of multipliers are chip area and speed of multiplication [1]. There is high demand for high-speed multiplication that requires less hardware. Various algorithms and architectures have been proposed to design high-speed and low-power multipliers.
A multiplier using the modified Booth algorithm is designed to exploit the smaller area of the Booth algorithm, which results from fewer partial products, faster accumulation of partial products, and lower power consumption in the partial product addition. Booth's original algorithm performs the encoding serially. The modified Booth algorithm performs the encoding in parallel and is used to design a fast multiplier. Ripple carry adders and carry look-ahead adders are employed for high-speed accumulation [2].

Higher-radix recoding needs odd multiples of the multiplicand. Generating each of these odd multiples requires a two-term addition or subtraction, that is, additional carry-propagate additions. However, the advantage of a high radix is that the number of partial products is further reduced. For instance, for radix-16 and n-bit operands, about n/4 partial products are generated. Although less popular than radix-4, there are industrial instances of radix-8 [3] and radix-16 multipliers [4] in microprocessor implementations. The choice of these radices is related to the area/delay/power of pipelined multipliers (or fused multiply-add units, as in the case of an Intel Itanium microprocessor [5]), for balancing delay between stages and/or reducing the number of pipelining flip-flops.

Today, adders are highly energy-delay optimized, while partial product reduction trees suffer increasingly serious problems related to complex wiring and glitching due to unbalanced signal paths. Optimal pipelining is, in fact, key in current and future multiplier (or multiply-add) units: 1) the latency of the pipelined unit is very important, even for throughput-oriented applications, as it impacts the energy of the whole core [5]; and 2) the placement of the pipelining flip-flops should at the same time minimize total power, due to the number of flip-flops required and the signal propagation paths.
Earlier techniques [6] address two's complement radix-4 Booth multipliers, thus leaving open the extension to higher radices and to unsigned multiplication (unsigned integer arithmetic, or mantissa times mantissa in a floating-point unit). Unsigned multiplication can produce a positive carry out during recoding, which leads to one additional row, increasing the maximum height of the partial product array by one row, not just in one but in several columns. For all these reasons, the techniques in [6] need to be extended.

The Booth algorithm, invented by A. D. Booth, forms the basis of signed-number multiplication algorithms that are simple to implement at the hardware level and that have the potential to speed up signed multiplication considerably. Booth's algorithm is based on recoding the multiplier, y, to a recoded value, z, leaving the multiplicand, x, unchanged. In Booth recoding, each digit of the multiplier can assume negative as well as positive and zero values. A special notation, called signed digit (SD) encoding, expresses these signed digits. In SD encoding, +1 and 0 are written as 1 and 0, while -1 is written as 1 with an overbar [7].

Among the Vedic sutras, just two are applicable to multiplication. They are the UrdhavaTriyakbhyam sutra (literally "vertically and crosswise") and the Nikhilam sutra (literally "all from 9 and last from 10"). UrdhavaTriyakbhyam is a general method for multiplication. The logic behind the UrdhavaTriyakbhyam sutra is very similar to that of the conventional array multiplier. Here the binary implementation of this algorithm is derived from the same logic used for decimal numbers. The binary implementation of the Nikhilam sutra is not yet efficient. The multiplier with the common Boolean logic adder can be compared with the conventional method, which is built from a Vedic multiplier, XOR gates and half adders.
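The Booth recoding described in this section can be illustrated with a minimal Python sketch. It assumes an even bit width n and a two's complement multiplier y, and it uses the textbook radix-4 digit rule d = -2*y[i+1] + y[i] + y[i-1] with y[-1] = 0. The function names `booth_radix4_digits` and `booth_radix4_multiply` are illustrative only and do not come from the paper; the sketch models the arithmetic, not the encoder circuit of the proposed design.

```python
def booth_radix4_digits(y: int, n: int) -> list[int]:
    """Recode an n-bit two's complement multiplier into radix-4 signed digits.

    Returns n/2 digits in {-2, -1, 0, 1, 2}, least significant digit first.
    Assumption: n is even.
    """
    if n <= 0 or n % 2:
        raise ValueError("n must be a positive even bit width")
    y &= (1 << n) - 1              # view y as an n-bit two's complement pattern
    digits = []
    prev = 0                       # implicit bit y[-1] = 0
    for i in range(0, n, 2):
        b0 = (y >> i) & 1          # y[i]
        b1 = (y >> (i + 1)) & 1    # y[i+1], which overlaps into the next group
        digits.append(-2 * b1 + b0 + prev)
        prev = b1
    return digits


def booth_radix4_multiply(x: int, y: int, n: int) -> int:
    """Multiply x by the n-bit signed multiplier y using n/2 partial products."""
    product = 0
    for k, d in enumerate(booth_radix4_digits(y, n)):
        product += d * x * (4 ** k)   # each digit selects 0, +-x or +-2x, weighted by 4^k
    return product
```

For y = 45 and n = 8 the recoded digits are [1, -1, -1, 1], that is 1 - 4 - 16 + 64 = 45, so four partial products take the place of the eight needed by a radix-2 scheme.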
The proposed method gives less path delay and less area. The input sequence of the conventional method is considerably longer than that of the proposed method, while the proposed method has less propagation delay. Area and propagation delay can be reduced with the help of the common Boolean logic adder. This adder is designed like a ripple carry adder.

II. COMPLEX MULTIPLIER

Suppose two numbers are complex:

A = Ar + jAi (1)
B = Br + jBi (2)

The product of A and B is then

P = A × B (3)
P = Ar × Br − Ai × Bi + j ( Ar × Bi + Ai × Br ) (4)
Pr = Ar × Br − Ai × Bi (5)
Pi = Ar × Bi + Ai × Br (6)

where Pr and Pi represent the real and imaginary parts of the output of the complex multiplier. Ar and Ai represent the real and imaginary parts of the first input of the complex multiplier, and Br and Bi represent the real and imaginary parts of the second input. The complex multiplier built from four Vedic multipliers is shown in Figure 1. The four Vedic multipliers can be reduced to three as follows:

Pr = Ar × Br − Ai × Bi = Ar ( Br + Bi ) − Bi ( Ar + Ai ) (7)
Pi = Ar × Bi + Ai × Br = Ar ( Br + Bi ) + Br ( Ai − Ar ) (8)

Figure 2: Block Diagram of Complex Multiplier for three Vedic Multiplier

III. PARTITION MULTIPLIER USING CBL

In our proposed method the high-speed Vedic multiplier is replaced by the partition multiplier, which is claimed to provide better speed and less propagation delay. Here four 4-bit multipliers, M0, M1, N0 and N1, are used to perform 8-bit multiplication. The result is obtained by adding all the partial products formed by cross multiplication. The LSBs of the first multiplier, M0 (3-0), are added to the LSBs N0 (3-0) to form the output t1. The output t1 is then padded with n/2 zero bits.
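Two of the ideas above lend themselves to a quick numerical check in Python: the three-multiplication form of equations (7) and (8), and the assembly of an 8-bit product from 4-bit sub-products. This is a sketch under stated assumptions. The function names are hypothetical, and `partition_mul8` assumes the usual split of each unsigned operand into a low and a high nibble; the paper's exact wiring of the intermediate outputs t1 and t2 cannot be recovered from the text, so this is not a model of that circuit.

```python
def complex_mul_four(ar, ai, br, bi):
    """Direct form, equations (5) and (6): four real multiplications."""
    return ar * br - ai * bi, ar * bi + ai * br


def complex_mul_three(ar, ai, br, bi):
    """Rearranged form, equations (7) and (8): three real multiplications."""
    shared = ar * (br + bi)          # product common to (7) and (8)
    pr = shared - bi * (ar + ai)     # equation (7)
    pi = shared + br * (ai - ar)     # equation (8)
    return pr, pi


def partition_mul8(m, n):
    """Unsigned 8x8 multiply built from four 4x4 sub-products.

    Assumption: each operand is split into a low nibble (bits 3-0)
    and a high nibble (bits 7-4).
    """
    if not (0 <= m < 256 and 0 <= n < 256):
        raise ValueError("operands must be unsigned 8-bit values")
    m0, m1 = m & 0xF, m >> 4
    n0, n1 = n & 0xF, n >> 4
    low = m0 * n0                    # weight 2^0
    mid = m1 * n0 + m0 * n1          # cross terms, weight 2^4
    high = m1 * n1                   # weight 2^8
    return low + (mid << 4) + (high << 8)
```

For example, `complex_mul_three(3, 4, 5, -6)` gives (39, 2), the same as the four-multiplier form. The saving of one multiplier is paid for with three extra additions or subtractions, namely Br + Bi, Ar + Ai and Ai − Ar.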
The remaining bits of the first multiplier, M0 (7-4), are added in series with the four LSBs of the second multiplier, N0 (3-0), to form 8 bits, which are in turn added, with (n/4)-bit zero padding, to t2 to give the final output (15-0).

Figure 1: Block Diagram of Complex Multiplier for four Vedic Multiplier

Common Boolean Logic Adder

An area-efficient, low-power, half-adder-based carry select adder (CSLA) using common Boolean logic is designed to improve overall system performance in terms of area and power compared with other existing designs. A half adder generates the partial sum for cin=0, and common Boolean logic (CBL) computes the partial sum for cin=1.

This architecture removes the duplicated adder cells of the conventional CSLA, which saves gate count and achieves low power. Analysis of the truth table of a single-bit full adder shows that generating the output sum and carry signals for cin=0 requires only one XOR gate and one AND gate respectively, and that the output sum signal for cin=1 is simply the inverse of the sum for cin=0. The general structure of the common Boolean logic adder is shown in Figure 3.

Figure 3: Block Diagram of CBL Adder

IV. RADIX-4 ALGORITHM

To further decrease the number of partial products, algorithms with a higher radix are used. In the radix-4 algorithm the multiplier bits are grouped so that each group consists of 3 bits, as given in Table 1. Each group overlaps the previous one: the MSB of one group is the LSB of the next group, which is completed by the next two bits. The number of groups depends on the number of multiplier bits. This algorithm reduces the number of partial product rows to be accumulated from n in the radix-2 algorithm to n/2 in the radix-4 algorithm. The grouping of multiplier bits for 8-bit multiplication is shown in Figure 4.

Figure 4: Grouping of multiplier bits in Radix-4 Booth algorithm

For an 8-bit multiplier, the radix-4 Booth algorithm forms four groups. Compared with the radix-2 Booth algorithm, the number of partial products is halved, because the radix-2 algorithm produces eight partial products for an 8-bit multiplier. The truth table and the respective operations are given in Table 1. Similarly, when the radix-8 Booth algorithm is applied to an 8-bit multiplier, each group consists of four bits and three groups are formed. For 8x8 multiplication, radix-4 uses four stages to compute the final product and the radix-8 Booth algorithm uses three. In this work, the radix-4 Booth algorithm is used for 8x8 multiplication because of the number of components used in the radix-4 encoding style.

Table 1: Truth Table for Radix-4 Booth algorithm

V. SIMULATION ANALYSIS

The designs are simulated using the Xilinx 14.1i VHDL tool. In this paper we concentrate on propagation delay, which must be low for good performance of a digital circuit. Table II gives the number of slices, the number of LUTs and the delay obtained for the complex Vedic multiplier using the common Boolean logic adder and for the previous algorithm. The results show that the complex Vedic multiplier using the common Boolean logic adder outperforms the previous algorithm in Xilinx software.

Figure 5: View Technology Schematic of Complex Multiplier using Radix-4 Booth Multiplier

Figure 6: RTL View of Complex Multiplier using Radix-4 Booth Multiplier

Figure 7: Output Binary Waveform of Radix-4 Complex Multiplier using CBL Adder

Figure 8: Output Decimal Waveform of Radix-4 Complex Multiplier using CBL Adder

The output waveforms of the complex multiplier using the CBL adder are shown in Figure 7 and Figure 8 respectively.

Table II: Comparison Result for 32-bit Radix-4 Complex Multiplier for CBL Adder

Design                                                 Number of LUTs   Number of IOBs   Delay
Previous Complex Multiplier [2]                        10416            256              25.979 ns
Complex Radix-4 Multiplier using Ripple Carry Adder    10642            256              26.927 ns
Complex Radix-4 Multiplier using CBL Adder             9892             256              24.006 ns

From the analysis of the results, the complex multiplier using the CBL adder gives superior performance compared with the previous algorithm on the Virtex-7 device family: it uses 9892 LUTs against 10416 (about 5% fewer) and has a delay of 24.006 ns against 25.979 ns (about 7.6% lower).

Figure 9: Bar graph of the 32-bit Radix-4 Complex Vedic multiplier for LUTs

Figure 10: Bar graph of the 32-bit Radix-4 Complex Vedic multiplier for Delay

Figure 9 and Figure 10 illustrate the performance of the complex multiplier (CM) using the CBL adder discussed in this work in terms of number of LUTs and delay. From these graphs it can be inferred that the CM using the CBL adder gives the best performance compared with the previous algorithm.

VI. CONCLUSION

In this paper the design of a CBL adder, a partition multiplier, a radix-4 multiplier and a complex multiplier is presented. The implementation results show that the complex multiplier has a lower delay than the previous design. The architecture of a 32x32-bit modified radix-4 Booth encoder multiplier has been designed.

REFERENCES

[1] D. Kalaiyarasi and M. Saraswathi, “Design of an Efficient High Speed Radix-4 Booth Multiplier for both Signed and Unsigned Numbers”, 4th International Conference on Advances in Electrical, Electronics, Information, Communication and Bio-Informatics (AEEICB), IEEE 2018.
[2] Prof. S. B. Patil, Miss. Pritam H. Langade, “Design of Improved Systolic Array Multiplier and Its Implementation on FPGA”, International Journal of Engineering Research and General Science, Volume 3, Issue 6, November-December, 2015.
[3] Elisardo Antelo, Paolo Montuschi and Alberto Nannarelli, “Improved 64-bit Radix-16 Booth Multiplier Based on Partial Product Array Height Reduction”, IEEE Transactions on Circuits and Systems I: Regular Papers, Vol. 64, No. 2, February 2017.
[4] Kavita and Jasbir Kaur, “Design and Implementation of an Efficient Modified Booth Multiplier using VHDL”, Special Issue: Proceedings of 2nd International Conference on Emerging Trends in Engineering and Management, ICETEM 2013.
[5] Shiann-Rong Kuang, Jiun-Ping Wang and Cang-Yuan Guo, “Modified Booth Multipliers With a Regular Partial Product Array”, IEEE Transactions on Circuits and Systems II: Express Briefs, Vol. 56, No. 5, May 2009.
[6] S. Vassiliadis, E. Schwarz, and D. Hanrahan, “A general proof for overlapped multiple-bit scanning multiplications,” IEEE Trans. Comput., vol. 38, no. 2, pp. 172–183, Feb. 1989.
[7] D. Dobberpuhl et al., “A 200-MHz 64-b dual-issue CMOS microprocessor,” IEEE J. Solid-State Circuits, vol. 27, no. 11, pp. 1555–1567, Nov. 1992.
[8] E. M. Schwarz, R. M. A. III, and L. J. Sigal, “A radix-8 CMOS S/390 multiplier,” in Proc. 13th IEEE Symp. Comput. Arithmetic (ARITH), Jul. 1997, pp. 2–9.
[9] J. Clouser et al., “A 600-MHz superscalar floating-point processor,” IEEE J. Solid-State Circuits, vol. 34, no. 7, pp. 1026–1029, Jul. 1999.
[10] S. Oberman, “Floating point division and square root algorithms and implementation in the AMD-K7 microprocessor,” in Proc. 14th IEEE Symp. Comput. Arithmetic (ARITH), Apr. 1999, pp. 106–115.
[11] R. Senthinathan et al., “A 650-MHz, IA-32 microprocessor with enhanced data streaming for graphics and video,” IEEE J. Solid-State Circuits, vol. 34, no. 11, pp. 1454–1465, Nov. 1999.