Ijet17 09 05 301
ISSN (Online) : 0975-4024 Prabhat Chandra Shrivastava et al. / International Journal of Engineering and Technology (IJET)
II. PRELIMINARIES
This section presents the basic preliminaries and definitions.
A. The 2-D General model Filter:
The state space representation of 2-D discrete linear shift invariant General model [9] can be defined by
$$x(i+1, j+1) = A_1 x(i, j+1) + A_2 x(i+1, j) + A_3 x(i, j) + B_1 u(i, j+1) + B_2 u(i+1, j) + B_3 u(i, j) \tag{1.1a}$$
$$y(i, j) = C x(i, j) + D u(i, j) \tag{1.1b}$$
$$i \ge 0, \quad j \ge 0 \tag{1.1c}$$
where $x(i, j) \in \mathbb{R}^{n \times 1}$ is the state vector, $A_1, A_2, A_3 \in \mathbb{R}^{n \times n}$ are the known system matrices, $B_1, B_2, B_3 \in \mathbb{R}^{n \times m}$ are the known input matrices, $u(i, j) \in \mathbb{R}^{m \times 1}$ is the input vector, $y(i, j)$ is a scalar output, $C \in \mathbb{R}^{1 \times n}$, and $D \in \mathbb{R}^{1 \times m}$. For simplicity, we take m = 1 and n = 2 throughout the manuscript.
The initial conditions associated with the system are that there exist two positive integers $r_1$ and $r_2$ such that
$$x(i, 0) = 0, \; i \ge r_1, \qquad x(0, j) = 0, \; j \ge r_2 \tag{1.2}$$
The equilibrium $x(i, j) = 0$ of the above system is said to be globally asymptotically stable if
$$\lim_{i \to \infty \text{ and/or } j \to \infty} x(i, j) = \lim_{i + j \to \infty} x(i, j) = 0 \tag{1.3}$$
The transfer function of the General model is given by
$$H(z_1, z_2) = C (z_1 z_2 I - z_2 A_1 - z_1 A_2 - A_3)^{-1} (z_2 B_1 + z_1 B_2 + B_3) + D \tag{1.4}$$
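As a brief sketch of where (1.4) comes from, one can take the 2-D z-transform of (1.1a) and (1.1b) under zero initial conditions. The symbols $X$, $U$ and $Y$ (the transforms of x, u and y) are notation introduced here for illustration, and a unit advance in i or j is assumed to correspond to multiplication by $z_1$ or $z_2$ respectively.

```latex
\begin{align*}
z_1 z_2 X &= z_2 A_1 X + z_1 A_2 X + A_3 X + (z_2 B_1 + z_1 B_2 + B_3)\, U \\
(z_1 z_2 I - z_2 A_1 - z_1 A_2 - A_3)\, X &= (z_2 B_1 + z_1 B_2 + B_3)\, U \\
Y = C X + D U &= \left[ C (z_1 z_2 I - z_2 A_1 - z_1 A_2 - A_3)^{-1} (z_2 B_1 + z_1 B_2 + B_3) + D \right] U
\end{align*}
```

The bracketed factor in the last line is the transfer function $H(z_1, z_2)$.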
In (1.1), if $B_1 = B_2 = 0$ and $B_3 = B$, then (1.1) becomes the Fornasini-Marchesini (FM) first 2-D state-space model [6]. Similarly, when $A_3 = B_3 = 0$, (1.1) becomes the FM second state-space model [8].
The block diagram of the 2-D General model state-space system is shown in Figure 1. The block contains multiplier, adder and shifter circuits, which are discussed in the next section.
[Fig. 1: Block diagram of the 2-D General model state-space system. MAT MUL blocks for A1, A2, A3, B1, B2, B3, C and D, delay elements z1^-1 and z2^-1, inputs u(i, j), states x(i, j) and output y(i, j).]
[Fig. 2: Matrix multiplication structures built from MUL blocks: (a) Method-I, (b) Method-II, (c) conventional MAC unit.]
Method-I, shown in Fig. 2(a), uses four multipliers and two adders. Multipliers are the most important part of any MMTU because they account for most of the area and time of the operation. Method-II, shown in Fig. 2(b), uses fewer multipliers for the same operation at the cost of some small additional circuits, namely 2x1 multiplexers and an accumulator. Owing to its simplicity and speed, Method-I is well suited to the proposed design. Finally, Fig. 2(c) shows a conventional MAC unit.
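The trade-off between the two methods can be illustrated with a small behavioural model in Python. This is only a sketch under assumptions: the function names are hypothetical, and the Method-II model assumes two multipliers that are reused over two cycles through a multiplexer select and an accumulator, which the text does not spell out.

```python
def matvec_method1(a, x):
    """2x2 by 2x1 product using four multipliers and two adders in parallel."""
    (a11, a12), (a21, a22) = a
    x1, x2 = x
    # Four products, one per hardware multiplier.
    p11, p12, p21, p22 = a11 * x1, a12 * x2, a21 * x1, a22 * x2
    # Two sums, one per hardware adder.
    return [p11 + p12, p21 + p22]


def matvec_method2(a, x):
    """Same product with fewer multipliers (assumed: two), reused over two cycles."""
    acc = [0, 0]                      # one accumulator per output element
    for cycle in range(2):            # 'cycle' plays the role of the mux select
        for row in range(2):          # the two multipliers work side by side
            acc[row] += a[row][cycle] * x[cycle]
    return acc


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    x = [5, 6]
    print(matvec_method1(a, x))  # [17, 39]
    print(matvec_method2(a, x))  # [17, 39]
```

Both functions return the same result; the second needs an extra cycle, which is the speed cost the paragraph above refers to.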
C. Processing Elements (PEs)
In this section, we consider the architecture of a processing element (PE) for the 2-D state-space system defined by (1.1). The basic function of a PE is to calculate the next state vector $x(i+1, j+1)$, for which it needs multiple inputs. A simplified block diagram of a basic PE with state inputs is shown in Figure 3(a). To calculate the next state vector $x(i+1, j+1)$ in accordance with (1.1), a PE has to perform matrix multiplications and matrix additions, as shown in Figure 3(b). The design problem is therefore twofold: first, the design of efficient matrix multiplication units (MMTUs), and second, the design of a matrix addition unit (MAU). The design of the MMTUs is further complicated by the fact that, because the matrices in (1.1) differ in size, several kinds of matrix multiplication are required to calculate the next state. Since we assume m = 1 and n = 2, as stated with (1.1), we require one MMTU that multiplies a 2 x 2 matrix by a 2 x 1 matrix and another MMTU that multiplies a 2 x 1 matrix by a 1 x 2 matrix. In Figure 3(b), the blocks MAT MUL-1, MAT MUL-2 and MAT MUL-3 (in blue) require the first type of MMTU, while the blocks MAT MUL-4, MAT MUL-5 and MAT MUL-6 (in green) require the second type. The outputs of the MMTUs are added in an MAU to generate the next state vector $x(i+1, j+1)$, as shown in Figure 3(b). The values of the state vectors $x(i, j+1)$, $x(i+1, j)$ and $x(i, j)$ are updated after each processing stage so that the calculations at the next processing stage are done correctly. Usually a systolic array technique, discussed in the following section of the manuscript, is employed for this purpose.
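The computation carried out by one PE can be sketched in software as a direct evaluation of (1.1a). The following Python model is illustrative only: the helper names (`mat_vec`, `vec_add`, `pe_next_state`) are hypothetical, plain lists stand in for the hardware data paths, and the input matrices are taken as n x m acting on an m x 1 input as in (1.1).

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m_rk * v_k for m_rk, v_k in zip(row, v)) for row in m]


def vec_add(*vectors):
    """Element-wise sum of several equally sized vectors (the MAU)."""
    return [sum(parts) for parts in zip(*vectors)]


def pe_next_state(A, B, x_in, u_in):
    """Evaluate (1.1a).

    A = (A1, A2, A3), B = (B1, B2, B3),
    x_in = (x(i, j+1), x(i+1, j), x(i, j)),
    u_in = (u(i, j+1), u(i+1, j), u(i, j)).
    """
    state_terms = [mat_vec(Ak, xk) for Ak, xk in zip(A, x_in)]   # three A-side products
    input_terms = [mat_vec(Bk, uk) for Bk, uk in zip(B, u_in)]   # three B-side products
    return vec_add(*state_terms, *input_terms)


if __name__ == "__main__":
    A = ([[1, 0], [0, 1]], [[0, 1], [1, 0]], [[0, 0], [0, 0]])
    B = ([[1], [0]], [[0], [0]], [[0], [2]])
    x_in = ([1, 2], [3, 4], [5, 6])
    u_in = ([1], [1], [3])
    print(pe_next_state(A, B, x_in, u_in))  # [6, 11]
```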
Fig. 3: (a) Block diagram of PE. (b) Internal architecture of processing element (PE).
III. PROPOSED ARCHITECTURE
A simplified architecture for the 2-D state-space system described by the General model (1.1) is shown in Figure 4. The architecture is divided into three parts: an external input-data distributer, the General model state-space system, which is a systolic array of PEs (Figure 5), and a memory management unit (MMU) for storing the next state vectors. Each of these parts is discussed in detail in the following subsections.
A. External Input-Data Distributer:
The external input-data distributer (EIDD) hosts arrays of 8-bit shift registers. The task of the EIDD is to receive and store the known matrices $A_1, A_2, A_3, B_1, B_2, B_3, C$ and $D$ in a linear manner, then to receive and store the initial values of the state vectors $x(i, j+1)$, $x(i+1, j)$ and $x(i, j)$, and finally to receive and store the values of the input vectors $u(i, j+1)$, $u(i+1, j)$ and $u(i, j)$, such that $0 \le i \le 7$ and $0 \le j \le 7$. The values of the input vectors are required for the state-space filter only. The values of A and B are connected directly to all the processing stages through the EIDD. The values of C and D are reserved to generate the output $y(i, j)$ according to (1.1b), as and when required.
B. Systolic Array of PEs:
The second part of the proposed architecture is a linear systolic array of PEs. Systolic arrays provide an attractive solution for mapping signal processing algorithms onto very large scale integrated (VLSI) hardware. The linear systolic array contains 7 x 7 PEs, each of which is a combination of adders and multipliers as shown in Figure 3(b). The number of PEs used in this architecture equals the number of rows multiplied by the number of columns of the systolic array. Because of its simple structure and the localized communication between PEs, the systolic array can be extended to N x N PEs, where N is the number of rows/columns of the array. Furthermore, the communication between PEs and the computation within PEs can take place at the same time without interrupting the data flow. Since the systolic array of PEs should be initialized to zero, the output of each PE should be connected directly to the corresponding entries. Neither delays nor other circuits need to be connected between two corresponding PEs, as shown in Figure 5. Only the computed value of the next state vector $x(i+1, j+1)$ moves systolically from cell to cell, while the other inputs remain fixed for all PEs. Owing to the recursive property of the proposed architecture, the state value at $(i+1, j+1)$ cannot be processed until the instances $(i, j+1)$, $(i+1, j)$ and $(i, j)$ are available. For example, $x(1, 1)$ depends upon $x(0, 1)$, $x(1, 0)$ and $x(0, 0)$; $x(2, 1)$ depends upon $x(1, 1)$, $x(2, 0)$ and $x(1, 0)$; $x(1, 2)$ depends upon $x(0, 2)$, $x(1, 1)$ and $x(0, 1)$; and so on.
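The data dependencies listed above imply that all cells on one anti-diagonal (equal i + j) can be computed together once the previous anti-diagonals are done. The Python sketch below illustrates such a wavefront ordering. It is an assumption-laden illustration, not the authors' scheduler: the function names are hypothetical, and the grid is taken to cover cells 1..N in each index with index 0 treated as boundary data.

```python
def wavefront_schedule(n):
    """Group cells (i, j), 1 <= i, j <= n, by anti-diagonal i + j."""
    steps = []
    for diagonal in range(2, 2 * n + 1):
        cells = [(i, diagonal - i) for i in range(1, n + 1)
                 if 1 <= diagonal - i <= n]
        steps.append(cells)
    return steps


def dependencies(i, j):
    """Cells whose states are needed to compute x(i, j), following (1.1a)."""
    return [(i - 1, j), (i, j - 1), (i - 1, j - 1)]


def schedule_is_valid(steps):
    """True if every dependency is boundary data or was computed in an earlier step."""
    done = set()
    for cells in steps:
        for (i, j) in cells:
            for (p, q) in dependencies(i, j):
                if p >= 1 and q >= 1 and (p, q) not in done:
                    return False
        done.update(cells)
    return True


if __name__ == "__main__":
    steps = wavefront_schedule(7)
    print(len(steps))                # 13
    print(steps[1])                  # [(1, 2), (2, 1)]
    print(schedule_is_valid(steps))  # True
```

For N = 7 the 49 cells are covered in 2N - 1 = 13 wavefront steps.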
Fig. 4: Proposed architecture for the 2-D General model state-space system
[Figure residue: PE diagram with blocks A2, A3, B1, B2, B3 and gains k1, k2, k3, inputs x(i, j+1), x(i+1, j) and x(i, j), and output x(i+1, j+1).]
It is worth noting that the PE shown in Figure (6) is a simple extension of the PE shown in Figure (3b), in which the inputs to the various MMTUs of a PE are redefined. Hence, the architecture proposed in Figure (4) and Figure (5) remains valid.
A simple inspection of the PE proposed in Figure (6) reveals that it is a dedicated architecture that can be used only for control applications. It is desirable, however, that the same hardware architecture be usable for control applications or for other data processing applications, depending upon the situation.
V. UNIFIED STRUCTURE FOR DATA PROCESSING AND CONTROL:
We mentioned in the previous section that the proposed structure with the PEs of Fig. (6) is suitable only for control applications. In this section we therefore consider a unified structure that is suitable for a variety of data processing and control applications. A comparison between the PE architectures of Figure (3b) and Figure (6) shows that the difference lies in the inputs given to the various MMTUs. To resolve the issue in a simple manner, we introduce a multiplexer that selects the input fed to the MAT MUL-4, MAT MUL-5 and MAT MUL-6 blocks of Figure (3b). This is shown in Figure (7). A control signal, named controller/filter, is provided in the structure to select the input data either for control applications or for signal processing applications. When this control signal is LOW, the structure effectively works as the one proposed in Figure (6), and when it is HIGH, the structure works as the one proposed in Figure (3b). Even though a more efficient realization of the structure proposed in Figure (7) may be obtained, the authors have chosen the above structure to keep the realization simple. Readers are encouraged to explore other realizations that are simple as well as efficient.
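The role of the controller/filter signal can be pictured with a small behavioural model of the multiplexer. This is a sketch only: the function names are hypothetical, and the controller-side input is assumed here to be a state-feedback term k·x, since the text does not define that input explicitly.

```python
def mux2(select_high, high_input, low_input):
    """2-to-1 multiplexer: pass high_input when the select line is HIGH."""
    return high_input if select_high else low_input


def mat_mul_input(controller_filter, u, k, x):
    """Input fed to a MAT MUL-4/5/6 block.

    controller_filter HIGH (1): external input u, i.e. filter mode as in Figure (3b).
    controller_filter LOW  (0): assumed feedback k.x, i.e. controller mode as in Figure (6).
    """
    feedback = [sum(k_n * x_n for k_n, x_n in zip(k, x))]  # scalar k.x as a 1x1 vector
    return mux2(controller_filter, u, feedback)


if __name__ == "__main__":
    u, k, x = [3], [1, 2], [4, 5]
    print(mat_mul_input(1, u, k, x))  # [3]
    print(mat_mul_input(0, u, k, x))  # [14]
```

One select line shared by all three blocks is enough to switch the whole PE between the two modes.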
Type of Architecture | Timing (per unit) | Area (per unit)
State Space System | TTEIDD + 13(2T8MUL + 72T2FA) + TMMTU | 219A8MUX + 49(18A8MUL + 272A2FA) + 128A8DFF
Controller Structure | TTEIDD + 13(2T8MUL + 88T2FA) + TMMTU | 219A8MUX + 49(30A8MUL + 448A2FA) + 128A8DFF
Unified Structure | TTEIDD + 13(3T8MUL + 72T2FA + TMUX) + TMMTU | 219A8MUX + 49(24A8MUL + 348A2FA + 3A8MUX) + 128A8DFF
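To make the area expressions easier to compare, the per-PE coefficients can be expanded over the array. The Python sketch below does only that arithmetic. It assumes that the factor 49 is the number of PEs in the 7 x 7 array and that each coefficient counts units of the named component area; the dictionary layout and function name are hypothetical.

```python
PE_COUNT = 49  # assumed: the 7 x 7 systolic array

# Per-PE coefficients copied from the area column of the table.
PER_PE = {
    "state_space": {"A8MUL": 18, "A2FA": 272},
    "controller": {"A8MUL": 30, "A2FA": 448},
    "unified": {"A8MUL": 24, "A2FA": 348, "A8MUX": 3},
}


def array_totals(name):
    """Multiply each per-PE coefficient by the number of PEs."""
    return {component: PE_COUNT * count
            for component, count in PER_PE[name].items()}


if __name__ == "__main__":
    for name in PER_PE:
        print(name, array_totals(name))
    # state_space {'A8MUL': 882, 'A2FA': 13328}
    # controller {'A8MUL': 1470, 'A2FA': 21952}
    # unified {'A8MUL': 1176, 'A2FA': 17052, 'A8MUX': 147}
```

These totals cover only the bracketed PE terms; the 219A8MUX and 128A8DFF terms are common to all three expressions.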
C. Synthesis Result of Proposed Architecture:
We have coded the proposed designs and processing elements in Verilog HDL. The designs are synthesized with Synopsys Design Compiler using a 90-nm standard CMOS library. The word length of the input samples and weights is taken to be 8 bits. The synthesis results of the processing elements and of the proposed designs are given in TABLE-II and TABLE-III respectively.
From TABLE-II, it can be observed that the PE for the controller requires more area, power and DAT (data arrival time) than the PE of the unified structure, even though the unified structure works as both the state space system and the controller with feedback. The main reason is that the PE of the controller uses twelve 8X8 and twelve 8X16 multipliers, whereas the PE of the unified structure uses twenty-four 8X8 multipliers.
TABLE-III shows the synthesis report of the proposed structures at a frequency of 10 MHz. From the results shown in the tables, the unified structure is more efficient than the other structures.
TABLE II (SYNTHESIS REPORT OF PROCESSING ELEMENT)
VII. CONCLUSION
In this paper, the hardware realization problem of the two-dimensional linear General model state-space system and controller has been addressed. The realization of this architecture has been approached from a system-theoretic point of view as well as from a hardware point of view. Using the proposed scheme, we have derived a parallel architecture based on linear systolic arrays with a diagonal scanning method. By using a linear systolic array instead of a single processing element, the processing speed and throughput rate can be increased significantly. We have also presented two other architectures: a controller for the 2-D system and a unified structure, which can be used either as a state space system or as a controller by means of control signals.
The ASIC synthesis results show that the proposed unified structure for a system/controller of order 7 occupies approximately 22% more area than the state space system and 50% less area than the controller with feedback. Finally, the proposed architecture is implemented and analysed using Verilog HDL and Synopsys Design Compiler with 90-nm TSMC target libraries.
REFERENCES
[1] N. K. Bose, "Applied Multidimensional System Theory," Van Nostrand Reinhold, New York, 1982; T. Kaczorek, "Two-Dimensional Linear Systems," Springer-Verlag, Berlin, 1985.
[2] W.-S. Lu and A. Antoniou, "Two-Dimensional Digital Filters," Marcel Dekker, Electrical Engineering and Electronics, vol. 80, New York, 1992.
[3] R. N. Bracewell, "Two-Dimensional Imaging," Prentice-Hall Signal Processing Series, Prentice-Hall, Englewood Cliffs, 1995.
[4] R. P. Roesser, "A discrete state-space model for linear image processing," IEEE Trans. Automat. Control, vol. 20, pp. 1-10, 1975.
[5] E. Fornasini and G. Marchesini, "State-space realization theory of two-dimensional filters," IEEE Transactions on Automatic Control, vol. 21, no. 4, pp. 484-492, 1976, DOI: 10.1109/TAC.1976.1101305.
[6] M. Tiwari and A. Dhawan, "A survey on stability of 2-D discrete systems described by Fornasini-Marchesini first model," 2010 International Conference on Power, Control and Embedded Systems (ICPCES), pp. 1-4, 2010, DOI: 10.1109/ICPCES.2010.5698674.
[7] E. Fornasini and G. Marchesini, "Doubly indexed dynamical systems: state-space models and structural properties," Math. Syst. Theory, vol. 12, pp. 59-72, 1978.
[8] M. Tiwari and A. Dhawan, "A survey on stability of 2-D discrete systems described by Fornasini-Marchesini second model," Circuits and Systems, vol. 3, no. 1, pp. 17-22, 2012, DOI: 10.4236/cs.2012.31003.
[9] J. E. Kurek, "The General State-Space Model for a Two-Dimensional Linear Digital System," IEEE Transactions on Automatic Control, vol. 30, no. 6, pp. 600-602, 1985, DOI: 10.1109/TAC.1985.1103998.
[10] J. Y. Zhang and W. Steenaart, "VLSI implementation of high speed two-dimensional state-space recursive filtering," IEEE International Symposium on Circuits and Systems, vol. 2, pp. 1099-1102, 1989, DOI: 10.1109/ISCAS.1989.100544.
[11] M. R. Azimi-Sadjadi and A. R. Rostampour, "Parallel and Pipeline Architectures for 2-D Block Processing," IEEE Transactions on Circuits and Systems, vol. 36, no. 3, pp. 443-448, 1989, DOI: 10.1109/31.17593.
[12] J. Y. Zhang and W. Steenaart, "High speed architectures for two-dimensional state-space recursive filtering," IEEE Transactions on Circuits and Systems, vol. 37, no. 6, pp. 831-836, 1990, DOI: 10.1109/31.55044.
[13] Y. Iwata, M. Kawamata and T. Higuchi, "Design of fine grain VLSI array processor for real time 2D digital filtering," 1993 IEEE International Symposium on Circuits and Systems (ISCAS '93), vol. 3, pp. 1559-1562, 1993, DOI: 10.1109/ISCAS.1993.394034.
AUTHOR PROFILE
Prabhat Chandra Shrivastava received first-class degrees of B.E. (Honours) in Electronics & Instrumentation Engineering from Rajiv Gandhi Technical University, Bhopal in 2008 and M. Tech in Electronics & Communication Engineering from Motilal Nehru National Institute of Technology (MNNIT) Allahabad, India in 2010. Currently, he is a Senior Research Fellow in Electronics and Communication Engineering, MNNIT Allahabad, India. His research interests include digital signal processing, digital filters and state-space filtering architecture design.
Prashant Kumar received first-class degrees of B. Tech in Electronics & Communication Engineering from SMVDU, Jammu & Kashmir, in 2012 and M. Tech in Electronics & Communication Engineering from Motilal Nehru National Institute of Technology (MNNIT) Allahabad, India in 2015. Currently, he is a Junior Research Fellow in Electronics and Communication Engineering, MNNIT Allahabad, India. His research interests include digital signal processing and VLSI architecture design.