Major

CHAPTER-1
INTRODUCTION
With the latest advancement of VLSI technology, digital signal processing plays a
pivotal role in many areas of electrical and electronics engineering. High speed
convolution and deconvolution is central to many applications of Digital Signal
Processing and Image Processing, Convolution and Deconvolution having extreme
importance in Digital signal processing. Convolution is having wide area of application
like designing the digital filter, correlation etc. However it is quite difficult for the new
candidate to perform convolution and deconvolution as convolution and deconvolution
method is so lengthy and time consuming, beginners often struggle with convolution and
deconvolution because the concept and computation requires a number of steps that are
tedious and slow to perform. Many methods are proposed for performing convolution,
one of a tough approach is a Graphical method, it is quite sophisticated and systematic
but, it is very lengthy and time consuming.
Therefore many of the researchers have been trying to improve performance

parameters of convolution and deconvolution system using new algorithms and hardware.
Complexity and excess time consumption are always the major concern of engineers
which motivates them to focus on more advance and simpler techniques. Pierre and John
have implemented a fast method for computing linear convolution. This method is similar
to the multiplication of two decimal numbers and this similarity makes this method easy
to learn and quick to compute. Also to compute deconvolution of two finite length
sequences, a novel method is used. This method is similar to computing long-hand
division and polynomial division. Principal components required for implementation of
convolution calculation are adder and multiplier for partial multiplication. Therefore the
partial multiplication and addition are bottleneck in deciding the overall speed of the
convolution implementation technique. As adder is also an important block for the
proposed method, so all required possible adders are studied. Adders which have the
highest speed and occupy a comparatively less area, are selected for implementing
convolution. Since the execution time in most DSP algorithms mainly depends upon the
time required for multiplication, so there is a need of high speed multiplier.
1
Now a days, time required in multiplication process is still the dominant factor in
determining the instruction cycle time of a DSP chip. Traditionally shift and add
algorithm is being used for designing. However this is not suitable for VLSI
implementation and also from delay point of view. Some of the important algorithms
proposed in literature for VLSI implementable fast multiplication are Booth multiplier,
array multiplier and Wallace tree multiplier. Although these multiplication techniques
have been effective over conventional “shift and add” technique but their disadvantage of
time consumption has not been completely removed.
2
CHAPTER-2
VEDIC MATHEMATICS
The “Vedic Mathematics” is called so because of its origin from Vedas. To be
more specific, it has originated from “Atharva Vedas” the fourth Veda. “Atharva Veda”
deals with the branches like Engineering, Mathematics, sculpture, Medicine, and all other
sciences with which we are today aware of. The Sanskrit word Veda is derived from the
root Vid, meaning to know without limit. The word Veda covers all Veda-Sakhas known
to humanity. The Veda is a repository of all knowledge, fathomless, ever revealing as it is
delved deeper. Vedic mathematics, which simplifies arithmetic and algebraic operations,
has increasingly found acceptance the world over. Experts suggest that it could be a
handy tool for those who need to solve mathematical problems faster by the day. It is an
ancient technique, which simplifies multiplication, divisibility, complex numbers,
squaring, cubing, square roots and cube roots. Even recurring decimals and auxiliary
fractions can be handled by Vedic mathematics. Vedic Mathematics forms part of Jyotish
Shastra which is one of the six parts of Vedangas. The Jyotish Shastra or Astronomy is
made up of three parts called Skandas. A Skanda means the big branch of a tree shooting
out of the trunk. This subject was revived largely due to the efforts of Jagadguru Swami
Bharathi Krishna Tirtha Ji of Govardhan Peeth, Puri Jaganath (1884-1960). Having
researched the subject for years, even his efforts would have gone in vain but for the
enterprise of some disciples who took down notes during his last days. The basis of Vedic
mathematics, are the 16 sutras, which attribute a set of qualities to a number or a group of
numbers. The ancient Hindu scientists (Rishis) of Bharat in 16 Sutras (Phrases) and 120
words laid down simple steps for solving all mathematical problems in easy to follow 2 or
3 steps. Vedic Mental or one or two line methods can be used effectively for solving
divisions, reciprocals, factorisation, HCF, squares and square roots, cubes and cube roots,
algebraic equations, multiple simultaneous equations, quadratic equations, cubic
equations, biquadratic equations, higher degree equations, differential calculus, Partial
fractions, Integrations, Pythogorus Theoram, Apollonius Theoram, Analytical Conics and
so on. Vedic scholars did not use figures for big numbers in their numerical notation.
Instead, they preferred to use the Sanskrit alphabets, with each alphabet constituting a
number.
3
Several mantras, in fact, denote numbers; that includes the famed Gayatri Mantra,
which adds to 108 when decoded. How fast you can solve a problem is very important.
There is a race against time in all the competitions. Only those people having fast
calculation ability will be able to win the race. Time saved can be used to solve more
problems or used for difficult problems. Given the initial training in modern maths in
today’s schools, students will be able to comprehend the logic of Vedic mathematics after
they have reached the 8th standard. It will be of interest to everyone but more so to
younger students keen to make their mark in competitive entrance exams. India’s past
could well help them make it in today’s world. It is amazing how with the help of 16
Sutras and 13 sub-sutras, the Vedic seers were able to mentally calculate complex
mathematical problems.
Vedic Mathematics provides unique solution for this problem. The Urdhva-
Tiryagbhyam Sutra or vertically and crosswise algorithm for multiplication is discussed
and then used to develop digital multiplier architecture. For division, different division
algorithms are studied, by comparing drawbacks and advantages of each algorithm,
Paravartya Algorithm based on Vedic mathematics is modified according to need and
then used. Many engineering application areas use this Vedic Mathematics, especially in
signal processing. It describes 16 sutras and sub-sutras which cover all the branches of
mathematics such as arithmetic, algebra, geometry, trigonometry, statistics etc.
Implementation of these algorithms in processors has found out to be advantageous in
terms of reduction in power and area along with considerable increase in speed
requirements. These Sutras are given in Vedas centuries ago. To be specific, these sutras
are described in ATHARVA-VEDA. The sutras and sub-sutras were reintroduced to the
world by Swami Bharati Krishna Tirthaji Maharaja in the form of book Vedic
Mathematics.
2.1 WORKING OF URDHVA-TIRYAK SUTRA
In Vedic Mathematics Sixteen sutras and sub-sutras are available to solve

mathematical problems. Some of these sutras are applicable to specific problems and
some sutras are for generalized problem. Urdhva-Tiryak sutra in Vedic Mathematics is a
generalized sutra which can be applied for multiplication of two numbers having p digits.
The meaning of Urdhva-Tiryak sutra is simple: ”Urdhva means Vertically and Tiryak
means Cross-wise”. This sutra has manifold applications; multiplication of numbers is
one such application.
4
Example of multiplication of two numbers by Urdhva-Tiryak sutra is shown .
Here, multiplication of two numbers 12 and 14 is shown. First multiply 2 by 4 to get 8 as
the answer. This process is known as Urdhva - Vertically. The answer obtained by this
process is to be placed in ”Ones” digit place in the final answer .Next 1 is multiplied by 4
and 1 is multiplied by 2 to get the answer 6. This process is known as Tiryak - Cross-
wise. The answer obtained by this process is to be placed in ”Tens” digit place in the
answer as shown in Fig 2.1 (b). Next is to multiply 1 by 1. This is again the Urdhva -
Vertically process. The product of this process is to be placed in ”Hundreds” digit place
in the answer as shown in Fig 2.1 (c). Hence final answer of multiplication of 12 and 14
is 168. In this way Urdhva-Tiryak sutra gives one line answer following a mental
multiplication of smaller digit number and addition. Urdhva-Tiryak sutra is applicable to
two numbers having p digits. Let us represent it in notation form as N2Dp meaning N = 2
numbers are to be multiplied having D = p digits. Where N represents numbers and D
represents digits.
Fig. 2.1 Multiplication of two numbers by Urdhva-Tiryak sutra
2.2 SIMPLE & EASY SYSTEM
Practitioners of this striking method of mathematical problem-solving opine that

Vedic maths is far more systematic, coherent and unified than the conventional system. It
is a mental tool for calculation that encourages the development and use of intuition and
innovation, while giving the student a lot of flexibility, fun and satisfaction. Therefore, it's
direct and easy to implement in schools - a reason behind its enormous popularity among
educationists and academicians.
G. Ramanjaneya Reddy and A. Srinivasulu [1] presented convolution process

using hardware computing and implementations of discrete linear convolution of two
finite length sequences (NXN). This implementation method is realized by simplifying
the convolution building blocks.
5
The purpose of this analysis is to prove the feasibility of an FPGA that performs a
convolution on an acquired image in real time. The proposed implementation uses a
changed hierarchical design approach, which efficiently and accurately quickens
computation. The efficiency of the proposed convolution circuit is tested by embedding it
during a prime level FPGA. It additionally provides the required modularity,
expandability, and regularity to form different convolutions. This particular model has the
advantage of being fine tuned for signal processing; in this case it uses the mean squared
error measurement and objective measures of enhancement to achieve a more effective
signal processing model. They have coded their design using the Verilog hardware
description language and have synthesized it for FPGA products using ISE, Modelsim
and DC compiler for other processor usage.
Mohammed Hasmat Ali, Anil Kumar Sahani [2], presented the detailed study of
different multipliers based on Array Multiplier, Constant coefficient multiplication
(KCM) and multiplication based on vedic mathematics. Multiplication-based operations
such as Multiply and Accumulate(MAC) and inner product are among some of the
frequently used Computation Intensive Arithmetic Functions(CIAF) currently
implemented in many Digital Signal Processing (DSP) applications such as convolution,
Fast Fourier Transform(FFT), filtering and in ALU of microprocessors. Since
multiplication dominates the execution time of most DSP algorithms, so there is a need of
high speed multiplier. All these multipliers are coded in Verilog HDL (Hardware
Description Language) and simulated in ModelSimXEIII6.4b and synthesized in EDA
tool Xilinx_ISE12.As multiplication dominates the execution time of the most Digital
Signal Processing algorithms, so there is a need of high speed multiplier. The demand for
high speed processing has been increasing as a result of expanding computer and signal
processing applications. Higher throughput arithmetic operations are important to achieve
the desired performance in many real-time signal and image processing applications. One
of the key arithmetic operations in such applications is multiplication and the
development of fast multiplier circuit has been a subject of interest over decades. This
paper presented the study of different multipliers.
6
2.3 CONVENTIONAL VS. VEDIC MULTIPLICATION SCHEME
The Vedic mathematics is the ancient system of mathematics which has a unique
technique for fast mental calculations, based on 16 sutras [6]. This approach is completely
different from other multiplication algorithms and considered very close to the way a
human mind works. Any ordinary human can perform mental operations for very small
magnitude of numbers and hence Vedic mathematics provides techniques to solve
operations with large magnitude of numbers easily. It covers explanation of several
modern mathematical terms including arithmetic, trigonometry, plain, calculus, quadratic
equations, factorization and spherical geometry. In [5], author has presented a hierarchical
implementation of multiplication based on an array of array technique. This multiplier
architecture is based on generating all partial products and their sums. We have
considered this architecture name as HAOM.The author claim that HAOM his faster than
array multipliers and booth multipliers.
Conventional method Vedic method
6 5 × 6 5 6 5 × 6 5=4 2 2 5
6 5
× 6 5 (Multiply the previous
3 2 5 digit ‘6’ by one more
3 9 0 × than itself ‘7’ then
4 2 2 5 write’25’)
2.4 RESULTS AND COMPARISON
A comparison of the processing times for Vedic and conventional mathematical

methods in the case of two- and three-digit multiplications reveals the details listed
below. Two-digit multiplication yields the results shown in Table 1. As evidenced from
Table 1, a time saving of approximately 52% can be achieved using the Vedic method.
Three-digit multiplication gives the results shown in Table 2. In the case of three-digit
multiplication,
7
Approximately 39% of the processing time is saved. Similar results can be
obtained on other processors as well. The above results are extremely encouraging so far
as a digital signal processing (DSP) are concerned. Most of the important DSP
algorithms, such as convolution, discrete Fourier transforms, fast Fourier transforms,
digital filters, etc. incorporate multiply-accumulate computations [7]. Since the
multiplication time is generally far greater than the addition time, the total processing
time for any DSP algorithm primarily depends upon the number of multiplications.
Table 1: Two-digit multiplication.
Table 2: Three-digit multiplication
8
CHAPTER-3
VLSI
Very-large-scale integration (VLSI) is the process of creating integrated circuits
by combining thousands of transistor-based circuits into a single chip. VLSI began in the
1970s when complex semiconductor and communication technologies were being
developed. The microprocessor is a VLSI device. The term is no longer as common as it
once was, as chips have increased in complexity into the hundreds of millions of
transistors.
3.1 OVERVIEW
The first semiconductor chips held one transistor each. Subsequent advances added
more and more transistors, and, as a consequence, more individual functions or systems
were integrated over time. The first integrated circuits held only a few devices, perhaps as
many as ten diodes, transistors, resistors and capacitors, making it possible to fabricate
one or more logic gates on a single device. Now known retrospectively as "small-scale
integration" (SSI), improvements in technique led to devices with hundreds of logic gates,
known as large-scale integration (LSI), i.e. systems with at least a thousand logic gates.
Current technology has moved far past this mark and today's microprocessors have many
millions of gates and hundreds of millions of individual transistors.
At one time, there was an effort to name and calibrate various levels of large-scale
integration above VLSI. Terms like Ultra-large-scale Integration (ULSI) were used. But
the huge number of gates and transistors available on common devices has rendered such
fine distinctions moot. Terms suggesting greater than VLSI levels of integration are no
longer in widespread use. Even VLSI is now somewhat quaint, given the common
assumption that all microprocessors are VLSI or better.
As of early 2008, billion-transistor processors are commercially available, an

example of which is Intel's Montecito Itanium chip. This is expected to become more
commonplace as semiconductor fabrication moves from the current generation of 65 nm
processes to the next 45 nm generations (while experiencing new challenges such as
increased variation across process corners). Another notable example is NVIDIA’s 280
series GPU.
9
This microprocessor is unique in the fact that its 1.4 Billion transistor count,
capable of a teraflop of performance, is almost entirely dedicated to logic (Itanium's
transistor count is largely due to the 24MB L3 cache). Current designs, as opposed to the
earliest devices, use extensive design automation and automated logic synthesis to lay out
the transistors, enabling higher levels of complexity in the resulting logic functionality.
Certain high-performance logic blocks like the SRAM cell, however, are still designed by
hand to ensure the highest efficiency (sometimes by bending or breaking established
design rules to obtain the last bit of performance by trading stability).
3.2 ABOUT VLSI

VLSI stands for "Very Large Scale Integration". This is the field which involves
packing more and more logic devices into smaller and smaller areas.
Advantages of VLSI are:

 An Integrated circuit comprises of many transistors on one chip.
 Design/fabrication of extremely small, complex circuitry using appropriate

semiconductor material
 Integrated circuit (IC) may contain millions of transistors, each a few µm in size
 Applications wide ranging: most electronic logic devices
3.3 HISTORY OF SCALE INTEGRATION

1) Late 40s Transistor invented at Bell Labs
2) Late 50s First IC (JK-FF by Jack Kilby at TI)
3) Early 60s Small Scale Integration (SSI), 10s of transistors on a chip
4) Late 60s Medium Scale Integration (MSI), 100s of transistors on a chip
5) Early 70s Large Scale Integration (LSI), 1000s of transistor on a chip
6) Early 80s VLSI 10,000s of transistors on a chip,(later 100,000s & now 1,000,000s)
SSI - Small-Scale Integration (0-102)
MSI - Medium-Scale Integration (102-103)
LSI - Large-Scale Integration (103-105)
LSI - Very Large-Scale Integration (105-107)
ULSI-Ultra Large-Scale Integration (>=107)
10
3.4 ADVANTAGES OF ICS OVER DISCRETE COMPONENTS
While we will concentrate on integrated circuits, the properties of integrated

circuits-what we can and cannot efficiently put in an integrated circuit-largely determine
the architecture of the entire system. Integrated circuits improve system characteristics in
several critical ways. ICs have three key advantages over digital circuits built from
discrete components:
 Size: Integrated circuits are much smaller-both transistors and wires are
shrunk to micrometer sizes, compared to the millimeter or centimeter
scales of discrete components. Small size leads to advantages in speed and
power consumption, since smaller components have smaller parasitic
resistances, capacitances, and inductances.
 Speed: Signals can be switched between logic 0 and logic 1 much quicker
within a chip than they can between chips. Communication within a chip
can occur hundreds of times faster than communication between chips on a
printed circuit board. The high speed of circuits on-chip is due to their
small size-smaller components and wires have smaller parasitic
capacitances to slow down the signal.
 Power Consumption : Logic operations within a chip also take much less
power. Once again, lower power consumption is largely due to the small
size of circuits on the chip-smaller parasitic capacitances and resistances
require less power to drive them.
3.5 VLSI AND SYSTEMS
These advantages of integrated circuits translate into advantages at the system level:
 Smaller physical size: Smallness is often an advantage in itself-consider portable

televisions or handheld cellular telephones.
 Lower power consumption: Replacing a handful of standard parts with a single
chip reduces total power consumption. Reducing power consumption has a ripple
effect on the rest of the system: a smaller, cheaper power supply can be used;
since less power consumption means less heat, a fan may no longer be necessary;
11
a simpler cabinet with less shielding for electromagnetic shielding may be
feasible, too.
 Reduced cost: Reducing the number of components, the power supply
requirements, cabinet costs, and so on, will inevitably reduce system cost. The
ripple effect of integration is such that the cost of a system built from custom ICs
can be less, even though the individual ICs cost more than the standard parts they
replace.
Understanding why integrated circuit technology has such profound influence on

the design of digital systems requires understanding both the technology of IC
manufacturing and the economics of ICs and digital systems.
3.6 APPLICATIONS OF VLSI
 Electronic system in cars.

 Digital electronics control VCRs
 Transaction processing system, ATM
 Personal computers and Workstations
 Medical electronic systems etc…
Electronic systems now perform a wide variety of tasks in daily life. Electronic
systems in some cases have replaced mechanisms that operated mechanically,
hydraulically, or by other means; electronics are usually smaller, more flexible, and easier
to service. In other cases electronic systems have created totally new applications.
Electronic systems perform a variety of tasks, some of them visible, some more hidden:
 Personal entertainment systems such as portable MP3 players and DVD players
perform sophisticated algorithms with remarkably little energy.
 Electronic systems in cars operate stereo systems and displays; they also control
fuel injection systems, adjust suspensions to varying terrain, and perform the
control functions required for anti-lock braking (ABS) systems.
 Digital electronics compress and decompress video, even at high-definition data
rates, on-the-fly in consumer electronics.
 Low-cost terminals for Web browsing still require sophisticated electronics,
despite their dedicated function.
12
 Personal computers and workstations provide word-processing, financial analysis,
and games. Computers include both central processing units (CPUs) and special-
purpose hardware for disk access, faster screen display, etc.
 Medical electronic systems measure bodily functions and perform complex
processing algorithms to warn about unusual conditions. The availability of these
complex systems, far from overwhelming consumers, only creates demand for
even more complex systems.
The growing sophistication of applications continually pushes the design and

manufacturing of integrated circuits and electronic systems to new levels of complexity.
And perhaps the most amazing characteristic of this collection of systems is its variety-as
systems become more complex, we build not a few general-purpose computers but an
ever wider range of special-purpose systems. Our ability to do so is a testament to our
growing mastery of both integrated circuit manufacturing and design, but the increasing
demands of customers continue to test the limits of design and manufacturing
3.7 ASIC
An Application-Specific Integrated Circuit (ASIC) is an integrated circuit (IC)

customized for a particular use, rather than intended for general-purpose use. For
example, a chip designed solely to run a cell phone is an ASIC. Intermediate between
ASICs and industry standard integrated circuits, like the 7400 or the 4000 series, are
application specific standard products (ASSPs).
As feature sizes have shrunk and design tools improved over the years, the
maximum complexity (and hence functionality) possible in an ASIC has grown from
5,000 gates to over 100 million. Modern ASICs often include entire 32-bit processors,
memory blocks including ROM, RAM, EEPROM, Flash and other large building blocks.
Such an ASIC is often termed a SoC (system-on-a-chip). Designers of digital ASICs use a
hardware description language (HDL), such as Verilog or VHDL, to describe the
functionality of ASICs.
Field-programmable gate arrays (FPGA) are the modern-day technology for

building a breadboard or prototype from standard parts; programmable logic blocks and
programmable interconnects allow the same FPGA to be used in many different
applications. For smaller designs and/or lower production volumes, FPGAs may be more
cost effective than an ASIC design even in production.
13
CHAPTER-4
INTRODUCTION TO VERILOG
Verilog is a Hardware Description Language; a textual format for describing
electronic circuits and systems. Applied to electronic design, Verilog is intended to be
used for verification through simulation, for timing analysis, for test analysis (testability
analysis and fault grading) and for logic synthesis.
The Verilog HDL is an IEEE standard - number 1364. The first version of the
IEEE standard for Verilog was published in 1995. A revised version was published in
2001; this is the version used by most Verilog users. The IEEE Verilog standard
document is known as the Language Reference Manual, or LRM. This is the complete
authoritative definition of the Verilog HDL.
A further revision of the Verilog standard was published in 2005, though it has
little extra compared to the 2001 standard. SystemVerilog is a huge set of extensions to
Verilog, and was first published as an IEEE standard in 2005. See the appropriate
Knowhow section for more details about SystemVerilog.
IEEE Std 1364 also defines the Programming Language Interface, or PLI. This is
a collection of software routines which permit a bidirectional interface between Verilog
and other languages (usually C).
Note that VHDL is not an abbreviation for Verilog HDL - Verilog and VHDL are two
different HDLs. They have more similarities than differences, however.
4.1 A BRIEF HISTORY OF VERILOG
Design Automation developed a logic simulator, Verilog-XL, and with it a

hardware description language.
Cadence Design Systems acquired Gateway in 1989, and with it the rights to the
language and the simulator. In 1990, Cadence put the language (but not the simulator)
into the public domain, with the intention that it should become a standard, non-
proprietary language.
14
The Verilog HDL is now maintained by a non profit making organisation,
Accellera, which was formed from the merger of Open Verilog International (OVI) and
VHDL International. OVI had the task of taking the language through the IEEE
standardisation procedure.
In December 1995 Verilog HDL became IEEE Std. 1364-1995. A significantly

revised version was published in 2001: IEEE Std. 1364-2001. There was a further
revision in 2005 but this only added a few minor changes.
Accellera have also developed a new standard, SystemVerilog, which extends

Verilog. SystemVerilog became an IEEE standard (1800-2005) in 2005. For more details,
see the Systemverilog section of KnowHow
There is also a draft standard for analog and mixed-signal extensions to Verilog,
Verilog-AMS.
4.2 DESIGN FLOW USING VERILOG
The diagram below summarises the high level design flow for an ASIC (ie. gate
array, standard cell) or FPGA. In a practical design situation, each step described in the
following sections may be split into several smaller steps, and parts of the design flow
will be iterated as errors are uncovered.
15
System Aanalysis and Partitioning
Write verilog code write test cases
Per block in verilog
Simulate the verilog
Code
Synthesize to Gates
Verification is now possible
Simulate at Gate level before gates
Program or make device
Fig 4.1 Design flow
4.3 SYSTEM LEVEL VERIFICATION
As a first step, Verilog may be used to model and simulate aspects of the complete
system containing one or more ASICs or FPGAs. This may be a fully functional
description of the system allowing the specification to be validated prior to commencing
detailed design. Alternatively, this may be a partial description that abstracts certain
properties of the system, such as a performance model to detect system performance
bottle-necks.
16
Verilog is not ideally suited to system-level modelling. This is one motivation for
SystemVerilog, which enhances Verilog in this area.
4.4 RTL DESIGN AND TESTBENCH CREATION
Once the overall system architecture and partitioning is stable, the detailed design
of each ASIC or FPGA can commence. This starts by capturing the design in Verilog at
the register transfer level, and capturing a set of test cases in Verilog. These two tasks are
complementary, and are sometimes performed by different design teams in isolation to
ensure that the specification is correctly interpreted. The RTL Verilog should be
synthesizable if automatic logic synthesis is to be used. Test case generation is a major
task that requires a disciplined approach and much engineering ingenuity: the quality of
the final ASIC or FPGA depends on the coverage of these test cases.
For today's large, complex designs, verification can be a real bottleneck. This
provides another motivation for SystemVerilog - it has features for expediting testbench
development. See the SystemVerilog section of Knowhow for more details.
4.5 RTL VERIFICATION
The RTL Verilog is then simulated to validate the functionality against the
specification. RTL simulation is usually one or two orders of magnitude faster than gate
level simulation, and experience has shown that this speed-up is best exploited by doing
more simulation, not spending less time on simulation.
In practice it is common to spend 70-80% of the design cycle writing and

simulating Verilog at and above the register transfer level, and 20-30% of the time
synthesizing and verifying the gates.
4.6 LOOK-AHEAD SYNTHESIS
Although some exploratory synthesis will be done early on in the design process,
to provide accurate speed and area data to aid in the evaluation of architectural decisions
and to check the engineer's understanding of how the Verilog will be synthesized, the
main synthesis production run is deferred until functional simulation is complete. It is
pointless to invest a lot of time and effort in synthesis until the functionality of the design
is validated.
17
4.7 LEVELS OF ABSTRACTION
Verilog descriptions can span multiple levels of abstraction i.e. levels of detail,
and can be used for different purposes at various stages in the design process
Stocastic Performance Modelling
System analysis
Algorithmic
And partitioing
Register transfer level Digital Design
Synthesis
Gate level
Gate level verification
Switch level
Fig 4.2 levels of abstraction
At the highest level, Verilog contains stochastical functions (queues and random
probability distributions) to support performance modelling.
Verilog supports abstract behavioural modeling, so can be used to model the

functionality of a system at a high level of abstraction. This is useful at the system
analysis and partitioning stage.
Verilog supports Register Transfer Level descriptions, which are used for the
detailed design of digital circuits. Synthesis tools transform RTL descriptions to gate
level.
18
Verilog supports gate and switch level descriptions, used for the verification of
digital designs, including gate and switch level logic simulation, static and dynamic
timing analysis, testability analysis and fault grading.
Verilog can also be used to describe simulation environments; test vectors,

expected results, results comparison and analysis. With some tools, Verilog can be used
to control simulation e.g. setting breakpoints, taking checkpoints, restarting from time 0,
tracing waveforms. However, most of these functions are not included in the 1364
standard, but are proprietary to particular simulators. Most simulators have their own
command languages; with many tools this is based on Tcl, which is an industry-standard
tool language.
19
CHAPTER-5
CONVOLUTION AND DECONVOLUTION

5.1 CONVOLUTION
Convolution is considered to be heart of the digital signal processing. It is the

mathematical way of combining two signals to obtain a third signal. Linear systems
characteristics are completely specified by the systems impulse response, as governed
bythe mathematics of convolution. Convolution is an operation which takes two functions
as input, and produces a single function output (much like addition or multiplication of
functions).Consider two finite length sequences f(n) and h(n) on which the convolution
operation is to be performed with lengths l and m respectively. The output of convolution
operation y(n) contains l+m-1 number of samples.
The linear convolution of f(n) and h(n) is given by :
Y(n)=x(n)*h(n)
∞
Y[n]=∑ x ( k ) h(n−k )
−∞
The impulse response goes by a different name in some applications. If the system
being considered is a filter, the impulse response is called the filter kernel, the
convolution kernel, or simply, the kernel. In image processing, the impulse response is
called the point spread function. While these terms are used in slightly different ways,
they all mean the same thing, the signal produced by a system when the input is a delta
function.
20
21
Fig 5.1 The sequences f(n) and g(n), shown in (a), are graphically convolved
in (b), resulting in the sequence y(n), shown in (c)
5.2 DECONVOLUTION
If the impulse response and the output of a system are known, then the procedure
to obtain the unknown input is referred to as deconvolution. The concept of
deconvolution is also widely used in the techniques of signal processing and image
processing. In general, the object of deconvolution is to find the solution of a convolution
equation of the form:
x*h = y
Usually, y is some recorded signal, and x is some signal that wish to recover, but
has been convolved with some other signal h before get recorded. The function h might
represent the transfer function of an instrument or a driving force that was applied to a
physical system. If one Know h or at least form of h, then one can perform deterministic
deconvolution. If the two sequences x(n) and h (n) are causal, then the convolution sum is
22
n
y[n]=∑ x ( k ) h(n−k )
k=0
Therefore, solving for x(n) given h(n) and y(n) results in

n−1
y ( n )−∑ x ( k ) h(n−k )
x(n) = k=0
h(n)
Where,
y (0)
x(0) =
h(0)
5.3 CONVENTIONAL MULTIPLIER
An array multiplier is a digital combinational circuit that is used for the

multiplication of two binary numbers by employing an array of full adders and half
adders. This array is used for the nearly simultaneous addition of the various product
terms involved. To form the various product terms, an array of AND gates is used before
the Adder array. To clarify more on the concept, let us consider a 4×4 bit multiplication
with A and B being the multiplicand and the multiplier respectively. Assuming A = (1 0 0
1) and B= (1 0 1 0), the various bits of the final product can be written as:-
(1 0 1 1×1 1 0 1) = 1 0 0 0 1 1 1 1
1 0 1 1
1 1 0 1 ×
1 0 1 1
0 0 0 0 Left shift by 1 bit
1 0 0 0 1 1 1 1
Fig 5.2: Example of conventional multiplier
For the above multiplication, an array of sixteen AND gates is required to form
the various product terms and an Adder array is required to calculate the sums involving
23
the various product terms and carry combinations in order to get the final Product bits.
The Hardware requirement for an m x n bit array multiplier is given as:-
(m x n) AND gates,
(m-1).n Adders in which n HA(Half Adders) and
(m-2).n FA(full adders).
Here from the above example it is inferred that partial products are generated
sequentially, which reduces the speed of the multiplier. However the structure of the
multiplier is regular. Also, in multiplier worst case delay would be (2n+1) td. Multiplier
gives more power consumption as well as optimum number of components required, but
delay for this multiplier is larger. It also requires larger number of gates because of which
area is also increased; due to this multiplier is less economical.
5.4 BLOCK DIAGRAM
A block diagram of proposed system is shown in Figure 2. It consists of multiplier

based on vedic sutra i.e.Urdhva-Tiryagbhyam that are embedded into convolution of two
finite sequence and vedic divider that are embedded in deconvolution process to recover
the original data.System block diagram is shown in Figure 2.
Primary requirement of any application to work fast is that increase the speed of
their basic building block. Multiplier and Divider is the heart of convolution and
deconvolution respectively as shown in above fig. It is most important but, slowest unit of
the system and consumes much time in the system. Many methods are invented to
improve the speed of the Multiplier and Divider, amongst all Vedic Multiplier and
Divider is under focus because, of faster working and low power consumption. In this
project the speed of Convolution and Deconvolution module is improved using Vedic
multiplier and Divider. It consists of multiplier based on vedic sutra i.e. URDHVA
Tiryagbhyam that are embedded into convolution of two finite sequence and
divider based on Vedic sutra i.e. Paravartya Sutra that are embedded in deconvolution
process to recover the original data.
24
Input Sequence Impulse response
N-bit Multiplicand Multiplier

Architecture
Convolution of
(n by n
input data block
multiply
x[n] and h[n]
N-bit Multiplier operation)
Final sequence
Y[n]=x[n]*h[n]
Deconvolution of
Division y(n)
N-bit divider
Architecture
N-bit dividend
Final output as
x(n)
25 X(n)=y(n)h(n)
Fig 5.3 System Block Diagram
Consider 4x4 multiplications, say A= A3 A2 A1 A0 and B= B3 B2 B1 B0. The

output line for this multiplication is P7 P6 P5P4P3P2 P1 P0. Using the fundamental of
Array Multiplication, taking partial product addition is carried out in Carry save form; we
can have the following structure for multiplication.
Fig 5.4 Structure for conventional multiplier
Urdhva Tiryagbhyam Sutra is a general multiplication formula applicable to all

cases of multiplication. It literally means “Vertically and crosswise”. It is based on a
novel concept through which the generation of all partial products can be done with the
concurrent addition of these partial products. Since the partial products and their sums are
calculated in parallel, the multiplier is independent of the clock frequency of the
processor. Thus the multiplier will require the same amount of time to calculate the
product and hence is independent of the clock frequency. The net advantage is that it
reduces the need of microprocessors to operate at increasingly high clock frequencies.
While a higher clock frequency generally results in increased processing power, its
disadvantage is that it also increases power dissipation which results in higher device
operating temperatures. By adopting the Vedic multiplier, microprocessors designers can
easily circumvent these problems to avoid catastrophic device failures.Due to its regular
structure, it can be easily layout in a silicon chip. The Multiplier has the advantage that as
26
the number of bits increases, gate delay and area increases very slowly as compared to
other multipliers. Therefore it is time, space and power efficient. The main advantage of
the Vedic Multiplication algorithm (Urdhva Tiryagbhyam Sutra) stems from the fact that
it can be easily implemented in FPGA due to its simplicity and regularity.
5.5 VEDIC MULTIPLIER
Vedic mathematics is part of four Vedas (books of wisdom). It is part of

Sthapatya-Veda (book on civil engineering and architecture), which is an upa-veda
(supplement) of Atharva Veda. It gives explanation of several mathematical terms
including arithmetic, geometry (plane, co-ordinate), trigonometry, quadratic equations,
factorization and even calculus. His Holiness Jagadguru Shankaracharya Bharati Krishna
Teerthaji Maharaja (1884-1960) comprised all this work together and gave its
mathematical explanation while discussing it for various applications. The sutras and sub-
sutras were reintroduced to the world by Swami Bharati Krishna Tirthaji Maharaja in the
form of book Vedic Ancient Indian.
The work presented here, makes use of Vedic Mathematics. “Urdhva

Tiryagbhyam Sutra” or “Vertically and Crosswise Algorithm” of Vedic mathematics for
multiplication I digital multiplier architecture. The multiplication of two 4 bit number
using Urdhva Tiryagbhyam.
Step1: Step2: Step3: Step4:
1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1
1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0
Step5: Step6: Step7:
1 1 0 1 1 1 0 1 1 1 0 1
1 0 1 0 1 0 1 0 1 0 1 0
27
Fig.5.5 Multiplication of two 4 bit numbers using Urdhva Tiryagbhyam method
Thus, the product terms can be calculated as
A1=a0*b0
A2=a0*b1+a1*b0+prevcarry
A3=a0*b2+a1*b1+a2*b0+prevcarry
A4=a0*b3+a1*b2+a2*b1+a3*b0+prevcarry
A5=a1*b3+a2*b2+a3*b1+prevcarry
A6=a2*b3+a3*b2 + prevcarry
A7=a3*b3+ prevcarry
After comparative study of conventional multipliers and vedic multiplier, Urdhva

Tiryagbhyam sutra is shown to be an efficient multiplication algorithm.Therefore we are
using vedic multiplier in convolution system to improve its performance.
The use of Vedic Mathematics lies in the fact that it reduces the typical
calculations in conventional mathematics to very simple ones. Urdhva Tiryagbhyam Sutra
is a general multiplication formula applicable to all cases of multiplication Because of
parallelism in generation of partial products and their summation obtained, speed is
improved. In this algorithm the small block can be wisely utilized for designing bigger
NxN multiplier. For higher no. of bits in input, little modification is required. Divide the
no. of bit in the inputs equally in two parts. Let’s analyse 4x4 multiplications, say
A3A2A1A0 and B3B2B1B0. Following are the output line for the multiplication result
S7S6S5S4S3S2S1S0. Let’s divide A and B into two parts, say A3A2&A1A0 for A and
B3B2&B1B0 for B. Using the fundamental of Vedic multiplication, taking two bit at a
time and using 2 bit multiplier block, we can have the following structure for
multiplication.
28
Fig 5.6 Block diagram presentation for 4x4 multiplications
Each block as shown above is 2x2 multiplier. First 2x2 multiplier inputs are A1A0
and B1B0. The last block is 2x2 multiplier with inputs A3A2 and B3B2. The middle one
shows two, 2x2 multiplier with inputs A3A2 and B1B0 and A1A0 and B3B2. So the final
result of multiplication, which is of 8 bit, S7S6S5S4S3S2S1S0, can be interpreted as
given below.
Assuming the output of each multiplication is as given above. For the final result,
add the middle product term along with the term shown below.
29
Result of addition of the middle terms by using two, 4 bit full adders will forms
output line from S5S4S3S2. One of the full adder will be used to add (S23S22S21S20)
and (S13S12S11S10) and then the second full adder is required to add the result of 1st
full adder with (S31S30S03S02). The respective sum bit of the 2nd full adder will be
S5S4S3S2. Now the carry generated during 1st full adder operation and that during 2nd
full adder operation should be added using half adder so that the final carry and sum to be
added with next stage i.e. with S33S32 to get S7S6. The same can be extended for input
bits 8, 16, 32.
The hardware architecture of 2X2, 4x4 and 8x8 bit Vedic multiplier module are
displayed in the below sections. Here, “Urdhva-Tiryagbhyam” (Vertically and Crosswise)
sutra is used to propose such architecture for the multiplication of two binary numbers.
The beauty of Vedic multiplier is that here partial product generation and additions are
done concurrently. Hence, it is well adapted to parallel processing. The feature makes it
more attractive for binary multiplications. This in turn reduces delay, which is the
primary motivation behind this work.
The 2X2 Vedic multiplier module is implemented using four input AND gates &
two half-adders which is displayed in its block diagram in Fig. 3. It is found that the
hardware architecture of 2x2 bit Vedic multiplier is same as the hardware architecture of
2x2 bit conventional Array Multiplier [2]. Hence it is concluded that multiplication of 2
bit binary numbers by Vedic method does not made significant effect in improvement of
the multiplier‟s efficiency. Very precisely we can state that the total delay is only 2-half
adder delays, after final bit products are generated, which is very similar to Array
multiplier. So we switch over to the implementation of 4x4 bit Vedic multiplier which
uses the 2x2 bit multiplier as a basic building block. The same method can be extended
for input bits 4 & 8. But for higher no. of bits in input, little modification is required.
30
Fig 5.7 Block Diagram of 2x2 bit Vedic Multiplier
31
CHAPTER- 6
EXISTING METHOD
6.1 WALLACE TREE MULTIPLIER
Multipliers have gained the significant importance with the introduction of the
digital computers. Multipliers are most often used in digital signal processing applications
and microprocessors designs. In contrast to process of addition and subtraction,
multipliers consume more time and more hardware resources. With the recent advances in
technology, a number of multiplication techniques have been implemented for fulfilling
the requirement of producing high speed, low power consumption, less area or a
combination of them in one multiplier. Speed and area are the two major constraints
which conflict each other. Therefore, it is the designer’s task to decide proper balance in
selecting an appropriate multiplication technique as per requirements. Parallel multipliers
are the high speed multipliers. Therefore, the enhanced speed of the multiplication
operation is achieved using various schemes and Wallace tree is one of them [1].
There are three phases in the multiplier architecture:
1. The first phase is the generation of partial products;
2. Accumulation of partial product in second phase; and
3. The third phase is the final addition phase.
A fast process for multiplication of two numbers was developed by Wallace [7].
Using this method, a three step process is used to multiply two numbers; the bit products
are formed, the bit product matrix is reduced to a two row matrix where sum of the row
equals the sum of bit products, and the two resulting rows are summed with a fast adder
to produce a final product. In the Wallace Tree method, three bit signals are passed to a
one bit full adder (“3W”) which is called a three input Wallace Tree circuit, and the
output signal (sum signal) is supplied to the next stage full adder of the same bit, and the
carry output signal thereof is passed to the next stage full adder of the same no of bit, and
the carry output signal thereof is supplied to the next stage of the full adder located at a
one bit higher position[5].
32
Fig 6.1 Logic used in 4 bit Wallace Tree Multiplier
Fig 6.2 Wallace Tree Multiplier
33
Fig 6.3 RTL Schematic of4 bit Wallace Tree Multiplier
Fig 6.4 Logic used in 8 bit Wallace Tree Multiplier
A 8-bit multiplier is constructed by using Wallace tree architecture .The

architecture has been shown in Figure 3. Partial products are added in 6 steps.
34
In the Wallace Tree method, the circuit layout is not easy although the speed of
the operation is high since the circuit is quite irregular [3]. The delay generated in wallace
tree multiplier can be further reduced by using modified tree structures called
compressors.
35
CHAPTER- 7
PROPOSED METHOD
7.1 PROPOSED CONVOLUTION
The linear convolution of x(n) and h(n) is y(n) = f(n) * g(n). This can be solved by
several methods, resulting in the sequence y(n). In this approach for calculating the
convolution sum is set up like multiplication (except carries are not performed out of a
column).where the convolution of x(n) and h(n) is performed as shown in fig. 5.5. To get
convolution of two sequences, where each sequence consist of 4 samples, sixteen partial
products are calculated and afterwards they are added to get convolution sequence y[n].
In this paper, Partial products are calculated by using vedic multiplier based on Urdhva
Tiryagbhyam algorithm. Here to minimize hardware, width of each input sample is
restricted to 4 bit. Hence maximum possible input sample value would be (1111)2 or
(15)10 or (F) h. Multiplier required is 4×4 bit. Each multiplier gives 8 bit long partial
product. Convolution outputs y[6] and y[0] are direct Partial products. While remaining
obtained after addition of intermediate partial products. Let two discrete length sequences
are x[n] and h[n].Where x[n] = {x3 x2 x1 x0} and h[n] = {h3 h2 h1 h0} are convolved.
As each sample is four bit long ,each partial product is eight bit long e.g. x0h0, x3h0,
x3h3 all are eight bit long. y[n] = x[n] * h[n], in a way as mentioned above. Procedure is
rearranged as shown
Fig 7.1 Convolution of x[n] and h[n]
36
In propose system to generate sixteen partial products, sixteen vedic multipliers
are used and to perform further operations of addition, all the outputs are latched
X0 h0 x1 h0 x2 h0 x3 h3
- - - - - - - - - - -- - - - - -
4×4 4×4 bit 4×4 bit 4×4 bit

-----
bitV.M.1 V.M.2 V.M.3 V.M.16
-------------
- -------------------
Combinational Logic
Y6 Y5 Y4 Y3 Y2 Y1 Y0
Conventional output
Fig 7.2 Block Diagram for convolution
37
4 bit long samples are applied to 4X4 bit vedic multipliers (V.M.).Output of each
Vedic multiplier is 8 bit long partial product. Vedic multiplier uses Urdhva Tiryagbhyam
algorithm for multiplication. In parallel processing, to generate sixteen partial products,
sixteen Vedic multipliers are used to boost speed. To perform further operation of
addition, all outputs are latched. And corresponding output Y0, Y1, Y2, Y3, Y4, Y5 and
Y6 are produced. Maximum possible length of Y0 and Y6 is 8 bit, while of Y1 toY5 is 9
bit. The design is built in VHDL and implemented on an FPGA.
7.2 CONVENTIONAL DIVIDER
Division is most complex and very time consuming if it is done straight forwardly,
because we need to compare the remainder with the divisor after every subtraction.
Basically division algorithm is classified as multiplicative and subtractive approaches.
Multiplicative division algorithms do not compute the quotient directly, but use
successive approximations to converge to the quotient. Normally, such algorithms only
yield a quotient, but with an additional step the final remainder can be computed, if
needed. Consider the following example, Assuming A=(11100110) and B=(110)
Fig 7.3 Example of conventional divider
Division is always considered to be bulky and one of the most difficult operations
in arithmetic and hence all the implementations of division algorithms in VLSI
architecture have higher orders of time and space complexities. Vedic Mathematics on the
other hand offers a new holistic approach to mathematics.
38
7.3 VEDIC DIVIDER
The word “Vedic” is derived from the word “Veda” which means the store-house
of all knowledge. Vedic mathematics is mainly based on sixteen sutras & from that
paravartya sutra is for division. Sanskrit term PARAYARTYA means all Transpose and
apply.
In this project a systematic method is used for division which based on Paravartya
Sutra. Paravartya Sutra help to minimize computation and maintain accuracy even as the
number of iteration is reduced. This complemented digit is initially multiplied with the
most significant digit of the dividend and this multiplication result is added with columns
of dividend.
The result of addition is again multiplied with complemented digits of Divisor and
added with the remaining column of the dividend, followed successive multiplication and
addition of consecutive column. The summation of all columns results forms quotient and
remainder. Implementation of the algorithm is illustrated using an example. Assume the
dividend is 1111 and divisor is 101. The division of this two numbers using paravartya
sutra is
Divisor Dividend
1 0 -1 1 1 1 1
-0 -1 -0 -1
-0 -1
1 1 0 0
Quotient=11 Remainder-00
Table 7.1 Division using paravartya sutra
In the proposed work we are optimizing the process of multiplication and

complement ,addition in the paravartya sutra to ANDing and substraction method
respectively. Thus the above example can be written as
39
Divisor Dividend
1 0 1 1 1 1 1
0 1 0 1
1 1
1 1 0 0
Quotient= 11 Remainder=00
Optimization in paravartya sutra
Table 7.3.1: Optimization in paravartya sutra
7.4 PROPOSED DECONVOLUTION
The linear deconvolution of two finite length sequences can be solved by several
method, in this approach for calculating the deconvolution ,deconvolution operation is set
up like long-hand division and polynomial division ,just as the propose convolution
method is similar to multiplication. In this approach division operation is implemented by
using Paravartya algorithm based on vedic mathematics while to obtain partial products
vedic multiplier is used.
To illustrate the method Consider the example 2, let Y(n) be the convolved
sequence equal to (8,38,77,80,49,18,3) and h(n) be the finite length sequence equal to
(2,7,9,3).
40
4 5 3 1
2 7 9 3 8 38 77 80 49 18 3
8 28 36 12
0 10 41 68 49 18 3
10 35 45 15
0 0 6 23 34 18 3
6 21 27 9
0 0 0 2 7 9 3
2 7 9 3
0 0 0 0 0 0
Fig 7.4 Proposed Deconvolution
In the above method division 8/2,10/2,6/3,2/2 is carried out using above

paravartya sutra and multiplication of 4*2,4*7 ,4*9,4*3 and so on are carried out using
vedic multiplier to recover the input sequence x(n)=(4 5 3 1).
41
CHAPTER-8
APPLICATIONS
 The main application of such system is in digital image processing as convolution

plays an important role in many algorithms in edge detection and related
processes.
 It helps in radiotherapy treatment planning systems, where most part of all modern
codes of calculation applies a convolution –superposition algorithm.
 Speeding up convolution and deconvolution using a Hardware Description
Language for design entry not only increases (improves) the level of abstraction,
but also opens new possibilities for using programmable devices.
 Any kind of application where high speed convolution and deconvolution
algorithm is needed.
8.1 ADVANTAGES
 Vedic mathematics enriches knowledge and understanding of mathematics ,which

shows clear links.
 Problems are reduced to one-line answers.
 Reduces dependence on calculators.
 Leads to improvement in mental ability,sharpness,creativity.
8.2 DISADVANTAGES
 For complex multiplication the system becomes complex.
42
CHAPTER-9
SIMULATIONS AND RESULTS
9.1 SIMUALTION OF CONVOLUTION
The vedic convolution algorithm proposed in this paper is simulated and

synthesised using the Xilinx Design Suit 14.2 with the device family as Spartan3 and
device XC3S400- 5fg320. The table 1 below shows the synthesis report of the proposed
work for convolution using vedic maths with the logic resource utilization.
TABLE 1. Device utilization summary for convolution
The simulated results of convolution and circular convolution are shown below:
Fig 9.1 Simulation results of Convolution using vedic mathematics
43
Fig. 9.2 Simulation results of Circular Convolution using Vedic Mathematics
9.2 SIMULATION OF DECONVOLUTION
The divider is simulated on Modelsim simulator. The figure 13 shows the

simulation result of divider. Dividend and divisor are the inputs of 4 bit each and quot
gives division result.
The simulated result of conventional divider is shown below
Fig 9.3 Simulation of divider
The Vedic divider is simulated on Modelsim simulator. The figure8.3 shows the
simulation result of Vedic divider. x and y are the inputs of 4 bit each and quot gives
division result based on paravartya sutra.
44
The simulated result of Vedic divider is shown below
Fig 9.4 Simulation of vedic divider
The Deconvolution module is simulated on Modelsim simulator. The figure 15

shows the simulation result of Deconvolution module using Vedic Mathematics. y0, y1,
y2, y3, y4, y5, y6 and g0, g1, g2, g3 are the inputs of 8 bit each and x0, x1, x2, x3 are the
outputs. The simulated result of Deconvolution using Vedic Divider is shown below
Fig 9.5 Simulation results of Deconvolution using Vedic mathematics
TABLE 2. Execution times for conventional method, ola method using vedic maths and
proposed method.
45
CONCLUSION
This propose system provides a method for calculating the linear convolution and
deconvolution with the help of vedic algorithms that is easy to learn and perform.It
presents faster implementation of linear convolution and deconvolution. The execution
time and area required for propose convolution and deconvolution using vedic
multiplication and division algorithm respectively compare with that of conventional
convolution and deconvolution with the simple multiplication and division is less from
the simulated result. The project, presents speedy implementation of linear convolution
and deconvolution. This particular model has the advantage of being fine tuned for any
signal processing application.
46
REFERENCES
[1] J. G. Proakis and D. G. Manolakis, “Digital Signal Processing: Principles, Algorithm,
and Applications,” 2nd Edition. New York Macmillan,1992.
[2] Pierre, John W. “A novel method for calculating the convolution sum of two finite
length sequences.” Education, IEEE Transactions on 39.1 (1996).
[3] Rashmi K. Lomte (Mrs.Rashmi R. Kulkarni), Prof.Bhaskar P.C, “High Speed

Convolution and Deconvolution Using Urdhva Triyagbhyam”. VLSI (ISVLSI), 2011
IEEE Computer Society Annual Symposium on. IEEE, 2011.
[4] Thapliyal, Himanshu, and M. B. Srinivas. ”High Speed Efficient NxN Bit Parallel H
Mathematics.” Enformatika Trans 2 (2004): 225-228.
[5] Senapati, Ratiranjan, Bandan Kumar Bhoi, and Manoranjan Pradhan ”Novel binary
divider architecture for high speed VLSI applications.” Information & Communication
Technologies (ICT), 2013 IEEE Conference on. IEEE, 2013.
47

Major

Uploaded by

Copyright:

Available Formats

Major

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Major

Uploaded by

Copyright:

Available Formats

CHAPTER-1

Therefore many of the researchers have been trying to improve performance

2.1 WORKING OF URDHVA-TIRYAK SUTRA

In Vedic Mathematics Sixteen sutras and sub-sutras are available to solve

Fig. 2.1 Multiplication of two numbers by Urdhva-Tiryak sutra

2.2 SIMPLE & EASY SYSTEM

Practitioners of this striking method of mathematical problem-solving opine that

G. Ramanjaneya Reddy and A. Srinivasulu [1] presented convolution process

Conventional method Vedic method

× 6 5 (Multiply the previous

3 2 5 digit ‘6’ by one more

3 9 0 × than itself ‘7’ then

2.4 RESULTS AND COMPARISON

A comparison of the processing times for Vedic and conventional mathematical

Table 1: Two-digit multiplication.

Table 2: Three-digit multiplication

As of early 2008, billion-transistor processors are commercially available, an

3.2 ABOUT VLSI

Advantages of VLSI are:

 Design/fabrication of extremely small, complex circuitry using appropriate

 Applications wide ranging: most electronic logic devices

3.3 HISTORY OF SCALE INTEGRATION

2) Late 50s First IC (JK-FF by Jack Kilby at TI)

3) Early 60s Small Scale Integration (SSI), 10s of transistors on a chip

4) Late 60s Medium Scale Integration (MSI), 100s of transistors on a chip

5) Early 70s Large Scale Integration (LSI), 1000s of transistor on a chip

SSI - Small-Scale Integration (0-102)

MSI - Medium-Scale Integration (102-103)

LSI - Large-Scale Integration (103-105)

LSI - Very Large-Scale Integration (105-107)

ULSI-Ultra Large-Scale Integration (>=107)

While we will concentrate on integrated circuits, the properties of integrated

3.5 VLSI AND SYSTEMS

 Smaller physical size: Smallness is often an advantage in itself-consider portable

Understanding why integrated circuit technology has such profound influence on

3.6 APPLICATIONS OF VLSI

 Electronic system in cars.

The growing sophistication of applications continually pushes the design and

An Application-Specific Integrated Circuit (ASIC) is an integrated circuit (IC)

Field-programmable gate arrays (FPGA) are the modern-day technology for

4.1 A BRIEF HISTORY OF VERILOG

Design Automation developed a logic simulator, Verilog-XL, and with it a

In December 1995 Verilog HDL became IEEE Std. 1364-1995. A significantly

Accellera have also developed a new standard, SystemVerilog, which extends

4.2 DESIGN FLOW USING VERILOG

Write verilog code write test cases

Per block in verilog

Simulate the verilog

Verification is now possible

Simulate at Gate level before gates

Program or make device

Fig 4.1 Design flow

4.3 SYSTEM LEVEL VERIFICATION

4.4 RTL DESIGN AND TESTBENCH CREATION

4.5 RTL VERIFICATION

In practice it is common to spend 70-80% of the design cycle writing and

4.6 LOOK-AHEAD SYNTHESIS

Stocastic Performance Modelling

Register transfer level Digital Design