9 .Efficient Design For Fixed Width Adder
Abstract
Conventionally, a fixed-width adder-tree (AT) design is obtained from the full-width AT
design by employing direct truncation or post-truncation. In direct truncation, one lower-order bit of
each adder output of the full-width AT is truncated; in post-truncation, the p
lower-order bits of the final-stage adder output are truncated, where p = ⌈log2 N⌉ and N is the
input-vector size. Neither of these methods provides an efficient design. In this paper, a
novel scheme is presented to obtain a fixed-width AT design using truncated input. A bias
estimation formula based on a probabilistic approach is presented to compensate for the truncation
error. The proposed fixed-width AT design for input-vector sizes 8 and 16 offers
(37%,23%,22%) and (51%,30%,27%) area-delay product (ADP) saving for word-length sizes
(8,12,16), respectively, and calculates the output with almost the same accuracy as the
post-truncated fixed-width AT, which has the highest accuracy among the existing fixed-width ATs.
Further, we observed that a Walsh-Hadamard transform based on the proposed fixed-width AT
design reconstructs higher-texture images with higher peak signal to noise ratio (PSNR) and
moderate-texture images with almost the same PSNR compared to those obtained using the
existing AT designs. Besides, the proposed design creates an additional opportunity to
optimize other blocks that appear upstream of the AT in a complex design.
CHAPTER 1
INTRODUCTION
Low-power, area-efficient and high-performance computing systems are increasingly used in
portable and mobile devices. For such applications, digital signal processing (DSP)
algorithms are implemented in fixed-point VLSI systems. An adder-tree (AT) is commonly used in
parallel designs of inner-product computation and matrix-vector multiplication. Multiplier
design also involves a shift-adder-tree (SAT) for accumulation of partial product bits. Word-
length growth is a common problem encountered when multiplication and addition are
performed in fixed-point arithmetic.
The shape of the bit-matrix of the SAT is different from that of the AT. Consequently, word length
grows in a different order in the SAT and the AT. Besides, a few other bits are also added in the
SAT to take care of negative partial products of the multiplier. Specific designs have been
suggested for efficient realization of fixed-width multipliers with less truncation error [1].
However, the scheme used in fixed-width multipliers is not appropriate for developing a
fixed-width AT design because the bit-matrix has a different shape. The full-width AT (FL-AT) design
produces a (w + p)-bit output for every N-point input-vector, where w is the input word-length
and p = ⌈log2 N⌉. For the same size input-vector, the fixed-width AT (FX-AT) design produces a
w-bit output. Conventionally, the FX-AT design is obtained from the FL-AT design by employing
direct truncation or post-truncation. In direct truncation (DT), one lower-order bit of each
adder output of the FL-AT is truncated, and in post-truncation (PT), the p lower-order bits of
the final adder output of the FL-AT are truncated. In recent years, several schemes have been
suggested for approximate computation of addition using the ripple carry adder (RCA) to save
critical path delay (CPD) and area [2]–[5]. A bio-inspired lower-part OR adder based on
approximate logic is proposed in [2]. Four different types of approximate adder designs are
proposed in [3]. An approximate 2-bit adder is proposed in [5] for approximate computation of
the triple multiplicand without carry propagation. These approximate designs can be used to
implement an RCA with less delay and area at some loss of accuracy. The approximate RCA design
can be used to obtain a fixed-width AT by employing post-truncation. However, the approximate
fixed-width AT (APX-FX-AT) does not offer an area-delay efficient design.
Bit-level optimization of the FL-AT for multiple constant multiplication (MCM) is proposed to
take advantage of the shifting operation [6]. An efficient FL-AT design is proposed in [7] using
the approximate adder of [3] for imprecise realization of a Gaussian filter for image processing
applications. We find that the optimized AT of [6] is specific to MCM-based designs and none
of the existing designs discusses the issues related to fixed-width implementation of the AT. It is
observed that the direct-truncation and post-truncation methods do not provide an efficient
FX-AT design. A different approach for developing an efficient FX-AT design is necessary
and is currently missing in the literature. An efficient FX-AT design would certainly help to
improve the efficiency of dedicated VLSI systems implementing complex DSP algorithms.
In this research, we propose a scheme to develop an efficient FX-AT design with truncated
input. The use of truncated input in the FX-AT offers a twofold advantage: (1) it saves area and delay
within the FX-AT because the adder width is reduced by p bits, and (2) it creates scope to
optimize other computing blocks that appear upstream of the AT in a complex design.
However, the use of truncated input introduces a large amount of error in the FX-AT output,
which needs to be compensated with an appropriate bias. The main contributions of the research are:
Use of truncated input in fixed-width AT design.
Formula to estimate the bias for error compensation.
2.1 Adders
Adders are used in many applications [11], [12]. It is generally recognized that most of the time
required by adders is due to carry propagation, so reducing the propagation time is the
focus of today's techniques. Different binary adder schemes have their own characteristics, such
as area and energy dissipation. No single adder scheme is the best for every condition, so it is
important to choose one for a specific context with its specific requirements and constraints. Because
this thesis work does not focus on analysing the delay of different adders, only the function
of some commonly used adders is given here.
The number zero is identified as positive and therefore has a 0 sign bit and a magnitude of all
0s. The range of positive integers that may be represented with n bits is therefore from 0 to 2^(n−1) − 1.
Any larger number would require more bits.
2.1.2 Fixed Time Type
The fixed time type is the most commonly implemented adder scheme. Its characteristic is that no
signal indicates when the addition is completed. Therefore, the worst-case delay must be
considered.
2.1.3 Variable Time Type
Contrary to fixed time type adder scheme, the variable time type adders have a completion
signal so that the result of the addition can be used as soon as the completion signal is
asserted.
The sum output of a full adder at position i, as shown in Figure 1, is given by:
$S_i = X_i \oplus Y_i \oplus C_i$
In this expression, C_i must be generated by the full adder at the lower position i − 1.
t_c is the delay from the inputs of the full adder to its carry output and t_s is the delay from the
inputs to the sum output. The worst-case delay is given by
$T_{CRA} = (n - 1)\,t_c + \max(t_c, t_s)$
This adder is slow for large n. Its main advantage is the simplicity of its cells
and of the connections among them.
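The ripple-carry behaviour described above can be illustrated with a small bit-level model. This is only a sketch: the function names (full_adder, ripple_carry_add, worst_case_delay) are hypothetical, the carry expression is the standard full-adder carry, and the delay values in the usage note are arbitrary units chosen for illustration.

```python
def full_adder(x, y, c):
    """One-bit full adder: returns (sum bit, carry-out bit)."""
    s = x ^ y ^ c                      # S_i = X_i xor Y_i xor C_i
    c_out = (x & y) | (c & (x ^ y))    # standard full-adder carry-out
    return s, c_out


def ripple_carry_add(x, y, n, c_in=0):
    """Add two n-bit unsigned integers cell by cell.

    Returns (n-bit sum, final carry-out). The carry of cell i feeds
    cell i + 1, which is what makes the worst-case delay grow with n.
    """
    carry = c_in
    total = 0
    for i in range(n):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry


def worst_case_delay(n, t_c, t_s):
    """Worst-case delay (n - 1) * t_c + max(t_c, t_s) of an n-bit ripple-carry adder."""
    return (n - 1) * t_c + max(t_c, t_s)
```

For example, ripple_carry_add(0b1011, 0b0110, 4) returns (1, 1), that is, 11 + 6 = 17 = 16 + 1, and worst_case_delay(8, 2, 3) evaluates to 17 time units.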
The carry bit C_{i+1} generated when adding two bits X_i and Y_i is '1' when the function G_i is '1',
or when C_i is '1' and the function P_i is '1' simultaneously. In the first case, the carry bit is
activated by the local conditions (the values of X_i and Y_i). In the second, the carry bit is
received from the less significant elementary addition and is propagated further to the more
significant elementary addition depending on the function P_i. Therefore, the carry-out bit
corresponding to a pair of bits X_i and Y_i is computed according to the equation:
$C_{i+1} = G_i + P_i C_i$
Hence, the carry signals can be computed from the carry-in, generate and propagate signals (with C_0 = C_in):
$C_1 = G_0 + P_0 C_{in}$
$C_2 = G_1 + P_1 G_0 + P_1 P_0 C_{in}$
$C_3 = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_{in}$
$C_4 = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 C_{in}$
Figure 2 can help us understand the carry out signal computation procedure more clearly.
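The expanded carry equations can be checked with a short behavioural model. The sketch below assumes the usual definitions G_i = X_i · Y_i and P_i = X_i ⊕ Y_i, which are not restated in this section; the function names are hypothetical.

```python
def generate_propagate(x, y, n):
    """Per-bit generate and propagate signals (assumed G = X and Y, P = X xor Y)."""
    g = [((x >> i) & (y >> i)) & 1 for i in range(n)]
    p = [((x >> i) ^ (y >> i)) & 1 for i in range(n)]
    return g, p


def lookahead_carries(g, p, c_in):
    """Carries C_1..C_n, each evaluated as a flat sum of products.

    C_{i+1} = G_i + P_i G_{i-1} + ... + P_i...P_1 G_0 + P_i...P_0 C_in,
    so no carry depends on a previously computed carry.
    """
    carries = []
    for i in range(len(g)):
        c = 0
        prod = 1                      # running product P_i P_{i-1} ... P_{j+1}
        for j in range(i, -1, -1):
            c |= prod & g[j]
            prod &= p[j]
        c |= prod & c_in              # term P_i ... P_0 C_in
        carries.append(c)
    return carries


def cla_add(x, y, n, c_in=0):
    """n-bit addition using the look-ahead carries; returns (sum, carry-out)."""
    g, p = generate_propagate(x, y, n)
    c = [c_in] + lookahead_carries(g, p, c_in)
    s = sum((p[i] ^ c[i]) << i for i in range(n))   # S_i = P_i xor C_i
    return s, c[n]
```

With n = 4, the four carries returned by lookahead_carries correspond term by term to the expressions for C_1 to C_4 given above.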
The advantage of the carry-lookahead adder becomes clear if the n-bit input vector is divided
into groups of m bits and the groups are connected like a ripple-carry adder. The worst-case delay should
be:
This worst-case delay is less than that of the ripple-carry adder because t_group is smaller than m·t_c.
Hence the carry-lookahead adder is faster than the ripple-carry adder.
The carry-in signal is considered as an input of the CSA, and the carry-out signal is
considered as an output of the CSA. Figure 5 shows how n carry-save adders are arranged to
add three n-bit numbers x, y and z into two numbers c and s.
Figure 6 shows the CSA computation flow and Table 1 shows how the CSA works (based on
binary numbers).
The computation can be divided into two steps: first we compute S and C using a CSA, then
we use a CPA to compute the total sum. From this example, we can see that the carry signal
and the sum signal can be computed independently to give only two n-bit numbers. A CPA is
used for the last step, and carry propagation exists only in that step.
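The two-step computation can be sketched as follows. This is a minimal illustration with hypothetical function names; the carry vector is shifted left by one position so that it carries its proper weight, and the final carry-propagate addition is modelled with ordinary integer addition.

```python
def carry_save_add(x, y, z):
    """Compress three operands into a sum vector s and a carry vector c.

    Every bit position is handled independently, so no carry propagates
    inside this step.
    """
    s = x ^ y ^ z
    c = ((x & y) | (y & z) | (x & z)) << 1   # carries have twice the weight
    return s, c


def add_three(x, y, z):
    """Step 1: CSA compression. Step 2: a single carry-propagate addition."""
    s, c = carry_save_add(x, y, z)
    return s + c                             # models the CPA
```

For x = 5, y = 3 and z = 6, carry_save_add returns s = 0 and c = 14, and the CPA step gives 14 = 5 + 3 + 6.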
2.1.5.2 Signed-Digit Adder (SDA)
Signed-digit (SD) number representation systems have been defined for any radix r with digit
values ranging over the set {−α, ..., −1, 0, 1, ..., α}, where α is an arbitrary
integer in the range (r − 1)/2 ≤ α ≤ r − 1. Such number representation
systems possess sufficient redundancy to allow carry or borrow chains to be broken and
hence result in fast propagation-free addition and subtraction. The result of the addition uses
the signed-digit representation, that is, a fixed-radix representation with digit values from a
signed-integer set.
A. Avizienis [13] proposed a redundant binary number (a radix-2 signed-digit number). With
this type of number, carry propagation is absorbed into the redundancy, so the
addition time is independent of the number of digits and the addition can be executed in only two
steps. More detail on computing t_i and on the representation of operands is given in [14].
2.1.6 Multi-operand Addition
A common structure for adding several operands is an adder tree, such as a Wallace tree,
Dadda tree or carry-save adder tree. In this thesis, the carry-save adder tree structure and the
Wallace tree are used. The primitive operation performed on the input bit-array is reduction,
to achieve an output bit-array with a small number of bits. Two methods are used:
reduction by rows and reduction by columns. The carry-save adder tree belongs to the first method and
the Wallace tree belongs to the second. Modules that reduce the rows are called adders and
modules that reduce the columns are called counters.
In Figure 10, each column contains k bits and there are p levels. We can use [3:2] adders
to reduce the rows to 2 bit vectors. No carry propagation is required except on
the last two rows, which speeds up the computation.
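Reduction by rows with [3:2] adders can be sketched as below. The grouping policy (rows taken three at a time, leftover rows passed unchanged to the next level) is an assumption made for illustration, and the function names are hypothetical; the level count is obtained by simulation rather than from a closed-form expression.

```python
def csa(x, y, z):
    """[3:2] adder: three rows in, sum row and (shifted) carry row out."""
    return x ^ y ^ z, ((x & y) | (y & z) | (x & z)) << 1


def csa_tree_reduce(operands):
    """Reduce the rows to at most two vectors; returns (rows, levels used)."""
    rows = list(operands)
    levels = 0
    while len(rows) > 2:
        nxt = []
        full = len(rows) - len(rows) % 3      # rows that form complete groups of three
        for i in range(0, full, 3):
            nxt.extend(csa(rows[i], rows[i + 1], rows[i + 2]))
        nxt.extend(rows[full:])               # leftover rows skip this level
        rows = nxt
        levels += 1
    return rows, levels
```

For eight operands the row count evolves as 8, 6, 4, 3, 2, so four CSA levels are used before the final carry-propagate addition of the two remaining rows.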
Fig2.11 Reduction by rows
In Figure 11, the number of input vectors is reduced row by row. Finally, the
number of levels of the CSA tree is estimated as
Fig. 3.1. Input bit-matrix A of adder tree for N = 8 and w = 8. y0 and y1, respectively,
represent the outputs of FL-AT and FX-AT-PT
Fig. 3.2. (a) Full-width adder-tree for N = 8 and w = 8. (b) Fixed-width post-truncated adder-
tree (FX-AT-PT) for N = 8 and w = 8. (c) Fixed-width with 1-bit direct-truncated adder-tree
(FX-AT-DT) for N = 8 and w = 8. (d) Function of carry-generator used in final stage of FX-
AT-PT
The bit-matrix A is partitioned into two parts named the most significant part (MSP) and the least
significant part (LSP). The lower p columns of A form the LSP and the upper (w − p)
columns form the MSP. LSP columns are partially or fully truncated to design a fixed-width
AT, as discussed in the following section.
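The three reference trees of Fig. 3.2 can be modelled at the word level as follows. This is a behavioural sketch only: it assumes unsigned inputs and an input-vector size that is a power of two, and the function names are hypothetical.

```python
def adder_tree(values, drop_lsb_per_stage=False):
    """Pairwise adder tree; optionally drops one LSB at every adder output."""
    level = list(values)
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            s = level[i] + level[i + 1]
            if drop_lsb_per_stage:
                s >>= 1                     # direct truncation of one lower-order bit
            nxt.append(s)
        level = nxt
    return level[0]


def fl_at(values):
    """Full-width adder tree: (w + p)-bit result."""
    return adder_tree(values)


def fx_at_pt(values):
    """Post-truncation: drop the p LSBs of the final adder output."""
    p = (len(values) - 1).bit_length()      # p = ceil(log2 N)
    return adder_tree(values) >> p


def fx_at_dt(values):
    """Direct truncation: drop one LSB after every adder."""
    return adder_tree(values, drop_lsb_per_stage=True)
```

For the input vector [7, 0, 7, 0, 1, 0, 1, 0], the full-width sum is 16, post-truncation returns 2 and direct truncation returns 1, which shows how the per-stage truncation of DT can accumulate a larger error than PT.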
Using (3), the fixed-bias for N = 8 and N = 16 is found to be σ = 4 and σ = 8, respectively. The input
bit-matrix of the TFX-AT with fixed-bias is shown in Fig. 3.3(a) for N = 8 and w = 8. The binary value
of σ = 4 is added to the least significant column of the MSP for error-compensation. The structure of the
proposed TFX-AT with fixed-bias is shown in Fig. 3.3(b).
3.4 Proposed Improved Truncated Fixed-width Adder-Tree
To estimate the bias of the truncated part more precisely, the LSP of A is further partitioned
into two parts, a major part and a minor part. The most significant column of the LSP constitutes
the major part (MJP) and the remaining (p − 1) lower-order columns constitute the minor part
(MNP). The bias in this case is estimated using the MJP and MNP of the LSP as:
σ = σmajor + σminor (4)
where σmajor and σminor are the estimated biases of the MJP and MNP of the LSP. σmajor is estimated
accurately using the actual signal values of the MJP, whereas σminor is estimated using the
probabilistic approach. The quantized value of σmajor is estimated using the relation:
The input bit-matrix of the proposed improved truncated fixed-width adder-tree (ITFX-AT) is
shown in Fig. 3.4 for N = 8 and w = 8. The logic-block corresponding to σmajor calculates the
carry bits {c0,c1,c2,c3,c4,c5,c6}. These carry bits and the fixed-bias corresponding to σminor are
added to the least significant column of the MSP. According to (6b), the value of σminor for N = 8
is found to be 2. The structure of the proposed ITFX-AT is shown in Fig. 3.5(a). The seven
half-adders (A) connected in a tree structure calculate the carry bits {c0,c1,c2,c3,c4,c5,c6}
corresponding to σmajor. Out of these 7 half-adders, the 4 half-adders of the first tree level are replaced
with four full-adders with fixed input-carry '1' to add the fixed-bias (+4) to the MJP of the LSP
instead of the least significant column of the MSP. Full-adders with fixed input-carry '1' are further
optimized into a modified half-adder (A*) comprising an XNOR gate and an OR gate.
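The half-adder tree and the modified cell A* can be modelled as below. The relation for the quantized σmajor is not reproduced in this section, so this sketch infers the behaviour from the structure described above: a binary tree of half-adders over the N bits of the MJP column, with the first-level cells replaced by A*. It assumes N is a power of two and unsigned inputs; all function names are hypothetical.

```python
def half_adder(a, b):
    """Cell A: returns (sum, carry)."""
    return a ^ b, a & b


def modified_half_adder(a, b):
    """Cell A*: full adder with carry-in fixed at 1.

    sum = a xor b xor 1 = XNOR(a, b); carry = ab + (a xor b) = OR(a, b).
    """
    return 1 ^ a ^ b, a | b


def sigma_major_carries(mjp_bits, inject_bias=True):
    """Carry bits produced by the adder tree over the MJP column."""
    carries = []
    sums = list(mjp_bits)
    first_level = True
    while len(sums) > 1:
        cell = modified_half_adder if (first_level and inject_bias) else half_adder
        nxt = []
        for i in range(0, len(sums), 2):
            s, c = cell(sums[i], sums[i + 1])
            nxt.append(s)
            carries.append(c)
        sums = nxt
        first_level = False
    return carries


def itfx_at(values, p):
    """Truncated inputs plus the carries of the MJP tree (bias folded in by A*)."""
    mjp = [(v >> (p - 1)) & 1 for v in values]
    return sum(v >> p for v in values) + sum(sigma_major_carries(mjp))
```

For N = 8 the tree yields seven carries, and their sum equals floor((k + 4) / 2), where k is the number of ones in the MJP column: the four A* cells inject +4 at the MJP weight, which is equivalent to σminor = 2 at the MSP weight. For eight inputs all equal to 255 with p = 3, itfx_at returns 254, compared with 255 for post-truncation of the full-width sum.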
Fig. 3.3 (a) Input bit-matrix of proposed truncated fixed-width adder-tree
(TFX-AT) with fixed-bias for error-compensation. (b) Structure of proposed TFX-AT design
Fig. 3.4 Input bit-matrix of proposed improved truncated fixed-width adder tree (ITFX-AT)
for N = 8, w = 8.
MJP and MNP represent the major part and minor part of the LSP. {c0,c1,c2,c3,c4,c5,c6} represent
the carry bits corresponding to the estimate of σmajor
Fig.3.5 (a) Structure of proposed ITFX-AT. (b) Logic function of half-adder (A).
(c) Logic function of modified adder (A*) (d) Logic function of carry cell
Fig. 3.6 (a) Structure of approximate fixed-width adder-tree (APX-FX-AT) using accurate
RCA and 3-bit approximate (APX) RCA of [3] for N = 8 and w = 8. (b) Approximate Full
Adder (AFA) {Type-1, Type-2, Type-3, Type-4}
Approximate full-adders of [3] are considered to add the LSP of the input matrix for reducing
the logic complexity and CPD of the FL-AT. The approximate adder (AXA) is implemented
using the approximate full-adder (AFA) {Type-1, Type-2, Type-3 and Type-4} of [3]. The
post-truncated approximate FX-AT (APX-FX-AT-PT) is obtained from the approximate full-width
adder-tree (APX-FL-AT) to study the performance of the proposed FX-AT designs. Note that
APX-FL-AT-Type-4 is identical to the APX-AT of [7]. The structure of APX-FX-AT-PT is
shown in Fig. 3.6(a) using accurate RCA and 3-bit approximate RCA for N = 8 and w = 8.
CHAPTER 4
SOFTWARE REQUIREMENT
Keywords
Verilog HDL reserves a set of keywords, e.g. always
Escaped Identifier
An escaped identifier begins with a backslash and may contain any printable ASCII
character, e.g. \Gate
Comments
The two ways for writing the comments are
a)/*Four-bit shift register*/
b) // Four-bit shift register
Format
Construct is written using one or multiple lines.
eg1:
initial
begin
A=0;
Y=0; …
end
Value Set
It contains four values. They are
1. 0 [logic 0]
2. 1 [logic 1]
3. Z [High impedance]
4. X [Unknown value]
4.3 Modeling
In Verilog HDL, three modelling styles are used to write code. They are gate-level, dataflow and
behavioural modelling.
The above gates have multiple inputs and only one output.
Syntax: gate type [instance name] (Output, input 1, input 2,.....input N)
Tristate Gates
Tristate gates have one output, one data input and one control input. These are
bufif1 (output = z if control is 0; otherwise data is transferred to the output)
bufif0 (output = z if control is 1; otherwise data is transferred to the output)
notif1 (output = z if control is 0; otherwise output = ~(input))
notif0 (output = z if control is 1; otherwise output = ~(input))
Gate Delays
The time taken for propagation of signal from the input of gate to the output of gate is
specified using gate delays.
Syntax: gate name [delay] [instance name] (terminal list);
Array of Instances
Syntax: gate name[delay] instance name[left bound : right bound](terminal list)
Continuous Assignments
This one assigns value to a net.
Syntax: assign target = expression;
Various Delays
assign #(3,6) z = x & y; (Rise delay = 3, Fall delay = 6, Turn-off delay = 3, the smaller of the two)
assign z = x & y; (No delay)
The behavioural modelling is the third type of modelling used in this HDL.
Procedural Constructs
Procedural constructs are of two types. They are
1) Initial statement
2) Always statement
1) Initial Statement
Syntax: initial [timing control] procedural statement;
2) Always Statement
Syntax: always [timing control] procedural statement;
4.4 Modelsim
ModelSim simulates VHDL, Verilog HDL and SystemC designs and includes a built-in C debugger.
The simulation is performed through a GUI. It uses a unified kernel to simulate all
supported languages. It enables simulation, debugging and verification of designs in those
languages.
4.5 Quartus II
Quartus II software is provided by Altera for PLD design. It analyses and synthesizes
HDL designs. The developer can compile designs, perform timing analysis,
configure the target device, examine RTL diagrams and
implement HDL designs.
An FPGA consists of configurable logic blocks along with configurable interconnects. Design
engineers configure such devices to perform a variety of tasks. The main advantage of an
FPGA is the parallel execution of code. Also, the requirement of RAM is reduced. Another
advantage is that it is not tied to a fixed word length; efficiency is highest when the
smallest word length is used. It also allows greater flexibility.
Fig4.1: FPGA Structure
FEATURES
View Pane
The View pane radio buttons enable us to view the source modules associated with the
Implementation or Simulation Design View in the Hierarchy pane. If we select Simulation,
we must select a simulation phase from the drop-down list.
Hierarchy Pane
The Hierarchy pane displays the project name, the target device, user documents, and
design source files associated with the selected Design View. The View pane at the
top of the Design panel allows you to view only those source files associated with the
selected Design View, such as Implementation or Simulation.
Each file in the Hierarchy pane has an associated icon. The icon indicates the file type
(HDL file, schematic, core, or text file, for example). For a complete list of possible
sources types and their associated icons, see the “Source File Types” topic in the ISE
Help. From Project Navigator, select Help > Help Topics to view the ISE Help.
If a file contains lower levels of hierarchy, the icon has a plus symbol (+) to the left of
the name. We can expand the hierarchy by clicking the plus symbol (+). We can open a file
for editing by double-clicking on the filename.
Processes Pane
The Processes pane is context sensitive, and it changes based upon the source
type selected in the Sources pane and the top-level source in our project. From the Processes
pane, we can run the functions necessary to define, run, and analyse our design. The
Processes pane provides access to the following functions:
• Design Summary/Reports
Provides access to design reports, messages, and summary of results data.
Message filtering can also be performed.
• Design Utilities
Provides access to symbol generation, instantiation templates, viewing
command line history, and simulation library compilation.
• User Constraints
Provides access to editing location and timing constraints.
• Synthesis
Provides access to Check Syntax, Synthesis, View RTL or Technology Schematic, and
synthesis reports. Available processes vary depending on the synthesis tools we use.
• Implement Design
Provides access to implementation tools and post-implementation analysis tools.
The Processes pane incorporates dependency management technology. The tools keep track
of which processes have been run and which processes need to be run. Graphical status
indicators display the state of the flow at any given time. When you select a process in the
flow, the software automatically runs the processes necessary to get to the desired step. For
example, when you run the Implement Design process, Project Navigator also runs the
Synthesis process because implementation is dependent on up-to-date synthesis results. To
view a running log of command line arguments used on the current project, expand Design
Utilities and select View Command Line Log File.
Files Panel
The Files panel provides a
flat, sortable list of all the source files in the project. Files can be sorted by any of the
columns in the view. Properties for each file can be viewed and modified by right-clicking
on the file and selecting Source Properties.
Libraries Panel
The Libraries panel enables you to manage HDL libraries and their associated HDL source
files. You can create, view, and edit libraries and their associated sources.
Console Panel
The Console provides all standard output from processes run from Project Navigator. It
displays errors, warnings, and information messages. Errors are signified by a red X next to
the message; while warnings have a yellow exclamation mark (!).
Errors Panel
The Errors panel displays only error messages. Other console messages are filtered out.
Warnings Panel
The Warnings panel displays only warning messages. Other console messages are filtered
out.
Workspace
The Workspace is where design editors, viewers, and analysis tools open. These include
ISE Text Editor, Schematic Editor, Constraint Editor, Design Summary/Report Viewer,
RTL and Technology Viewers, and Timing Analyzer. Other tools such as the PlanAhead™
tool for I/O planning and floorplanning, ISim, third-party text editors, Xpower Analyzer,
and iMPACT open in separate windows outside the main Project Navigator environment
when invoked.
Design Summary/Report Viewer
The Design Summary provides a summary of key design data as well as access to all of
the messages and detailed reports from the synthesis and implementation tools. The
summary lists high-level information about your project, including overview information, a
device utilization summary, performance data gathered from the Place and Route (PAR)
report, constraints information, and summary information from all reports with links to the
individual reports. A link to the System Settings report provides information on
environment variables and tool settings used during the design implementation. Messaging
features such as message filtering, tagging, and incremental messaging are also available
from this view
This chapter guides you through a typical HDL-based design procedure using a design
of a runner’s stopwatch. The design example used in this tutorial demonstrates many device
features, software features, and design flow practices you can apply to your own design.
This design targets a Spartan®-3A device; however, all of the principles and flows taught
are applicable to any Xilinx® device family, unless otherwise noted. The design is
composed of HDL elements and two cores. You can synthesize the design using Xilinx
Synthesis Technology (XST), Synplify/Synplify Pro, or Precision software
Required Software
To perform this tutorial, you must have Xilinx ISE® Design Suite installed. This tutorial
assumes that the software is installed in the default location c:\xilinx\release_number\
ISE_DS\ISE. If you installed the software in a different location, substitute your installation
path in the procedures that follow.
Note: For detailed software installation instructions, refer to the Xilinx Design Tools:
Installation and Licensing Guide (UG798) available from the Xilinx website
VERILOG
This software supports both VHDL and Verilog designs and applies to both designs
simultaneously, noting differences where applicable. You will need to decide which HDL
language you would like to work through for the tutorial and download the appropriate files
for that language. XST can synthesize a mixed-language design. However, this tutorial
does not cover the mixed language feature.
Starting the ISE Design Suite
To start the ISE Design Suite, double-click the Project Navigator icon on your desktop, or
select Start > All Programs > Xilinx ISE Design Suite > Xilinx Design Suite 14 > ISE Design Tools >
Project Navigator.
Fig4.3: Project navigator Desktop
Inputs
The following are input signals for the tutorial stopwatch design:
• strtstop
Starts and stops the stopwatch. This is an active low signal which acts like the start/
stop button on a runner’s stopwatch.
• reset
Puts the stopwatch in clocking mode and resets the time to 0:00:00.
• clk
Externally generated system clock.
• mode
Toggles between clocking and timer modes. This input is only functional while the
clock or timer is not counting.
• lap_load
This is a dual function signal. In clocking mode, it displays the current clock value
in the ‘Lap’ display area. In timer mode, it loads the pre-assigned values from the
ROM to the timer display when the timer is not counting.
Outputs
The following are output signals for the design:
• lcd_e, lcd_rs, lcd_rw
These outputs are the control signals for the LCD display of the Spartan-3A demo
board used to display the stopwatch times.
• sf_d[7:0]
Provides the data values for the LCD display.
Functional Blocks
The completed design consists of the following functional blocks:
• clk_div_262k
Macro that divides a clock frequency by 262,144. Converts 26.2144 MHz clock into
100 Hz 50% duty cycle clock.
• dcm1
Clocking Wizard macro with internal feedback, frequency controlled output, and
duty-cycle correction. The CLKFX_OUT output converts the 50 MHz clock of the
Spartan-3A demo board to 26.2144 MHz.
• debounce
Schematic module implementing a simplistic debounce circuit for the
strtstop, mode, and lap_load input signals.
• lcd_control
Module controlling the initialization of and output to the LCD display.
• statmach
State machine HDL module that controls the state of the stopwatch.
• timer_preset
CORE Generator™ tool 64x20 ROM. This macro contains 64 preset times from
0:00:00 to 9:59:99 that can be loaded into the timer.
• time_cnt
Up/down counter module that counts between 0:00:00 and 9:59:99 decimal. This macro
has five 4-bit outputs, which represent the digits of the stopwatch time.
Changing the design flow results in the deletion of implementation data. We have not yet
created any implementation data in this tutorial. For projects that contain
implementation data, Xilinx recommends making a backup copy of the project using File >
Copy Project before continuing.
Synthesizing the Design Using XST
Now that we have created and analyzed the design, the next step is to
synthesize the design. During synthesis, the HDL files are translated into gates and
optimized for the target architecture. Processes available for synthesis using XST are as
follows:
• Check Syntax
Verifies that the HDL code is entered properly.
Now we are ready to synthesize our design. To take the HDL code and generate a compatible
netlist, do the following:
1. In the Hierarchy pane, select stopwatch.vhd (or stopwatch.v).
2. In the Processes pane, double-click the Synthesize process.
Using the RTL/Technology Viewer
XST can generate a schematic representation
of the HDL code that we have entered. A schematic view of the code helps us
analyze our design by displaying a graphical connection between the various
components that XST has inferred. Following are the two forms of schematic
representation:
• RTL View: Pre-optimization of the HDL code.
• Technology View: Post-synthesis view of the HDL design mapped to the target
technology.
To view a schematic representation of the HDL code, do the following:
1. In the Processes pane, expand Synthesize, and double-click View RTL Schematic
or View Technology Schematic.
2. If the Set RTL/Tech Viewer Startup Mode dialog appears, select Start with the
Explorer Wizard.
3. In the Create Schematic start page, select the clk_divider and lap_load_debounce
components from the Available Elements list, and then click the Add button to move
the selected items to the Selected Elements list.
4. Click Create Schematic. The schematic viewer allows us to select the portions of
the design to display as schematics. When the schematic is displayed, double-click on
the symbol to push into the schematic and view the various design elements and
connectivity. Right-click the schematic to view the various operations that can be
performed in the schematic viewer.
In this research, a novel scheme is presented to obtain a fixed-width AT design using truncated
input. A bias estimation formula based on a probabilistic approach is presented to compensate for
the truncation error. Based on the proposed scheme, two separate fixed-width AT designs are
derived. Both the proposed designs offer a substantial amount of area and CPD saving over
the existing fixed-width AT designs. For vector sizes 8 and 16, the proposed ITFX-AT offers
(37%,23%,22%) and (51%,30%,27%) ADP saving for word-length sizes (8,12,16),
respectively, and calculates the output with almost the same accuracy as the post-truncated
fixed-width AT, which has the highest accuracy among the existing fixed-width ATs. Further,
we observed that a Walsh-Hadamard transform based on the proposed adder design reconstructs
higher-texture images with higher PSNR and moderate-texture images with almost the same
PSNR compared to those obtained from the existing fixed-width adder designs. Besides, the
use of truncated input samples in the fixed-width AT design is an interesting feature which
creates an additional opportunity to optimize other blocks that appear upstream of the AT in a
complex design.
Advantages:
The proposed system uses truncated input for an efficient fixed-width adder-tree design.
Better area-delay product (ADP).
A bias estimation formula based on a probabilistic approach compensates for the
truncation error.
REFERENCES
[1] B. K. Mohanty and V. Tiwari, “Modified probabilistic estimation bias formulation for
hardware efficient fixed-width Booth multiplier”, Circuits, Systems and Signal
Processing, Springer, vol.33, no.12, pp. 3981– 3994, Dec., 2014
[4] J. Liang, J. Han and F. Lombardi, “New metrics for the reliability of approximate and
probabilistic adders”, IEEE Transactions on Computers, vol. 62, no. 9, pp. 1760–1771,
Sept. 2013.
[5] H. Jiang, J. Han, F. Qiao and F. Lombardi, “Approximate radix-8 Booth multipliers for
low-power and high-performance operations”, IEEE Transactions on Computers, vol.
65, no. 8, pp. 2638–2644, Aug. 2016.
[6] Y. Pan and P. K. Meher, “Bit-level optimization of adder-trees for multiple constant
multiplications for efficient FIR filter implementation”, IEEE Transactions on Circuits
and Systems-I, Regular Papers, vol. 61, no. 2, pp. 455–462, Feb. 2014.
[8] D. Gizopoulos, M. Psarakis, A. Paschalis, and Y. Zorian, “Easily Testable Cellular Carry
Lookahead Adders,” Journal of Electronic Testing: Theory and Applications 19, 285-
298, 2003.
[9] S. Xing and W. W. H. Yu, “FPGA Adders: Performance Evaluation and Optimal
Design,” IEEE Design & Test of Computers, vol. 15, no. 1, pp. 24-29, Jan. 1998.
[11] P. M. Kogge and H. S. Stone, “A Parallel Algorithm for the Efficient Solution of a
General Class of Recurrence Equations,” IEEE Trans. on Computers, Vol. C-22, No 8,
August 1973.
[12] P. Ndai, S. Lu, D. Somesekhar, and K. Roy, “Fine-Grained Redundancy in Adders,”
Int. Symp. on Quality Electronic Design, pp. 317-321, March 2007.
[13] T. Lynch and E. E. Swartzlander, “A Spanning Tree Carry Lookahead Adder,” IEEE
Trans. on Computers, vol. 41, no. 8, pp. 931-939, Aug. 1992.
N. H. E. Weste and D. Harris, CMOS VLSI Design, 4th edition, Pearson–Addison-Wesley, 2011.
[14] R. P. Brent and H. T. Kung, “A regular layout for parallel adders,” IEEE Trans.
Comput., vol. C-31, pp. 260-264, 1982.
[15] D. Harris, “A Taxonomy of Parallel Prefix Networks,” in Proc. 37th Asilomar
Conf. Signals Systems and Computers, pp. 2213–7, 2003.
[17] http://seamless-pixels.blogspot.in/2014/07/grass-2-turf-lawn-greenground-field.html
[18] https://www.decoist.com/2013-02-28/flower-beds/
[19] http://www.cosasexclusivas.com/2014/06/daily-overview-el-planetatierra-visto.html
[20] https://healthtipsfr.blogspot.com/2017/04/blog-post−40.html
[21] http://www.ugaoo.com/knowledge-center/how-to-design-a-flower-bed/