Variation-Aware Low-Power Synthesis Methodology For Fixed-Point FIR Filters
shown in Fig. 2(b), the FAL value of three determines the filter
throughput. It is interesting to note that as we relax the number
of adders in the critical path beyond the MAL value, CSE can
be utilized more effectively to reduce the number of adders.
For example, if FAL = 2, as shown in Fig. 2, the common
subexpression $(X + X \ll 2)$ can be shared between $F_1$ and
$F_2$. On the other hand, FAL = 3 allows for more sharing
$(X + X \ll 2 + X \ll 3)$. In general, a larger FAL increases
the possibility of sharing hardware for coefficients, significantly
reducing the power/area overhead.
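The sharing idea above can be sketched in a few lines of Python. The two coefficients used here (21 and 13) are hypothetical and are not the ones in Fig. 2; they are chosen only because both contain the subexpression $X + X \ll 2$, so that every output stays within two adder levels.

```python
def multiplier_block(x):
    """Shift-and-add products for two hypothetical coefficients, 21 and 13."""
    # Shared subexpression: 5*x = x + (x << 2). One adder, adder level 1.
    s = x + (x << 2)
    # Each output reuses s and adds one more shifted term (adder level 2).
    f1 = s + (x << 4)  # 21*x = 5*x + 16*x
    f2 = s + (x << 3)  # 13*x = 5*x + 8*x
    return f1, f2


# Adder counts for this toy example (assumed, not taken from the paper):
ADDERS_SHARED = 3    # s, f1, f2
ADDERS_UNSHARED = 4  # two adders for each three-term coefficient
```

In this toy case, sharing the level-1 sum saves one adder without lengthening the critical path.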
rently, in the FIR filter design. We explain this with the help of
an example. Let us consider a transposed-form FIR filter where
the filter output is computed as
A. Problem Formulation
Given a set of fixed-point coefficients $c_0$–$c_n$ (that have been
divided into $k$ subsets $\{S_1, S_2, \ldots, S_k\}$) and the levels of
computation $\{L_1, L_2, \ldots, L_k\}$ corresponding to each subset
($S_1$, $S_2$, etc.), find the minimum number of adders required to
implement the filter. In this problem, $S_1$ is a subset containing
the “most important coefficients,” and $S_k$ contains the “least
important ones.” As defined before, the number of computation
levels for each subset ($L_1$ for $S_1$, etc.) denotes the maximum
number of adders allowed in the critical path for any coefficient
belonging to the corresponding subset. The computation
levels $\{L_1, L_2, \ldots, L_k\}$ are user-defined constraints, and $L_k$,
the level for the least important subset $S_k$, is equal to the
FAL. The individual level constraints should be satisfied, which
implies that for each $L_i$, $L_i \ge \max\{\mathrm{MAL}\}$ over the respective
set $S_i$.
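As a rough illustration of the feasibility condition above, the following Python sketch checks user-supplied levels against each subset. It assumes that the MAL of a coefficient is $\lceil \log_2 N \rceil$, where $N$ is the number of nonzero digits in its CSD form; that definition, the function names, and the sample subsets are assumptions made for this sketch and are not stated in this section.

```python
from math import ceil, log2


def csd_digits(n):
    """CSD digits of a non-negative integer, least significant first, each in {-1, 0, 1}."""
    digits = []
    while n:
        if n & 1:
            d = 2 - (n & 3)  # +1 if n % 4 == 1, -1 if n % 4 == 3
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def mal(coeff):
    """Assumed MAL: ceil(log2(number of nonzero CSD digits))."""
    nonzero = sum(1 for d in csd_digits(abs(coeff)) if d)
    return ceil(log2(nonzero)) if nonzero > 1 else 0


def violations(subsets, levels):
    """List (subset index, coefficient, MAL, level) wherever MAL exceeds the subset level."""
    return [(i, c, mal(c), level)
            for i, (subset, level) in enumerate(zip(subsets, levels), start=1)
            for c in subset if mal(c) > level]
```

For example, with the hypothetical subsets `[[25, 3], [7, 33]]` and levels `[1, 2]`, the coefficient 25 has three nonzero CSD digits and therefore needs two adder levels, so it violates the level-1 constraint on the first subset.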
Fig. 5. (a) Synthesis of a FIR filter using conventional CSE. The important computations with longer delays might not be computed under process variation, resulting in low filter quality. (b) Proposed design methodology, where important computations are constrained by the “intelligent” CSE procedure. Under process variation and voltage scaling, high filter quality is maintained.
the description of the algorithm, there are several points that should be considered for our design methodology.
1) Our technique is generic and applicable to any constant-coefficient FIR filter design (low-pass, high-pass, and bandpass).
2) Our LCCSE algorithm utilizes the CSE method proposed in [7]. While most CSE algorithms in the literature focus on minimizing the number of adders without considering the delay of the critical path, the CSE algorithm in [7] can constrain the number of adders in the critical path. Hence, this CSE method can directly trade off the critical path delay (ALs) against the hardware overhead (the number of adders), which is an essential property for our design methodology.
3) We might not get the same number of adders obtained by the standard CSE technique [7] in our LCCSE implementation. While in the standard CSE implementation only the maximum levels or the maximum number of adders in the critical path would be specified, in LCCSE, we constrain coefficients to certain levels based on their sensitivities. As explained in Section II, this might result in fewer sharing opportunities and increase the number of required adders compared with the standard CSE implementation. However, filters designed with LCCSE
B. LCCSE Algorithm
As mentioned earlier, our LCCSE algorithm utilizes the CSE method in [7], which considers not only resource sharing among the filter coefficients but also the length of the critical path in the multiplier block. The proposed LCCSE algorithm also takes into consideration the level constraints of each coefficient based on its sensitivity. LCCSE differs from the previous CSE method [7] in that a different level constraint can be given for each coefficient, so that we can assert tighter timing bounds for more important coefficients. Only when a single level constraint (FAL) is specified for all coefficients does LCCSE yield results identical to those of conventional CSE. Before going into the details of the proposed LCCSE technique, we define some notations used to design the algorithm (reintroduced from [7]).
1) Decomposed set (DS): the set of absolute values of CSD numbers that have been decomposed as a sum of other CSD numbers. For example, $10\bar{1}001$ (25) can be decomposed as $(10\bar{1} \ll 3) + 1$ or $100001 - (1 \ll 3)$ ($25 = 3 \times 2^3 + 1 = 33 - 1 \times 2^3$).
2) Undecomposed set (UDS): the set of absolute values of CSD numbers waiting to be decomposed with other CSD numbers. Initially, {UDS} contains all the filter coefficients.
3) All possible combination set (APCS): the set of candidate CSD numbers which can be utilized to decompose CSD numbers in {UDS}. This set is constructed in the following manner. All the possible combinations of nonzero terms of a coefficient are extracted, and the extracted CSD numbers are continuously right-shifted until they become odd. The absolute values of these numbers (except the value of one) are added to {APCS}. For instance, from a coefficient “$10\bar{1}001$” (25), we can extract six CSD numbers $\{100001, 10\bar{1}000, \bar{1}001, 100000, \bar{1}000, 1\}$, which are right-shifted to become $\{100001, 10\bar{1}, \bar{1}001, 1, \bar{1}, 1\}$, respectively. Finally, their absolute values, except one, are added to {APCS}.
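The APCS construction for a single coefficient can be sketched as follows. This is a minimal illustration, not the implementation from [7]: the function names are hypothetical, and “all the possible combinations of nonzero terms” is read here as every non-empty proper subset of the signed CSD terms, which matches the six numbers listed for 25.

```python
from itertools import combinations


def csd_terms(n):
    """Signed power-of-two terms of the CSD form of a positive integer."""
    terms, weight = [], 1
    while n:
        if n & 1:
            d = 2 - (n & 3)  # +1 if n % 4 == 1, -1 if n % 4 == 3
            terms.append(d * weight)
            n -= d
        n >>= 1
        weight <<= 1
    return terms


def apcs_candidates(coeff):
    """Candidate values that one coefficient contributes to {APCS}."""
    terms = csd_terms(coeff)
    found = set()
    for size in range(1, len(terms)):  # non-empty proper subsets of the terms
        for combo in combinations(terms, size):
            value = abs(sum(combo))    # never zero for distinct powers of two
            while value % 2 == 0:      # right-shift until odd
                value >>= 1
            if value != 1:             # the value one is excluded
                found.add(value)
    return found
```

For the coefficient 25, `csd_terms(25)` gives the terms 1, -8, and 32, and `apcs_candidates(25)` yields the values 3, 7, and 33, the absolute values of $10\bar{1}$, $\bar{1}001$, and $100001$.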
92 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 28, NO. 1, JANUARY 2009
TABLE II
MAXIMUM RIPPLE FOR 25-TAP FILTER. CASE D (3/4/5) VERSUS CASE B (CONV.)
TABLE III
POWER FOR NOMINAL/SCALED Vdd . CASE B (CONV.)/D (PROP.)
TABLE V
POWER FOR NOMINAL/SCALED Vdd FOR 121/32-TAP FILTER
low-energy requirements and tolerance to large process variations while maintaining a reasonably accurate filter response. This is achieved by restricting the “important filter coefficients” to fewer computation steps than the maximum allowed in a CSE-based filter implementation. The later computation stages are allowed to compute only the “less contributive coefficients.” Under Vdd overscaling, conventional FIR architectures fail to provide the desired filter response. On the other hand, as the voltage scales down, our filter architectures provide a very graceful degradation in the output response. This enables us to operate these filters at low voltages to save power at the cost of minor quality degradation. We believe that the proposed concept is applicable to other areas of signal processing where a proper tradeoff between power and quality of service is required.
REFERENCES
[1] K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation. New York: Wiley, 1999.
[2] D. R. Bull and D. H. Horrocks, “Primitive operator digital filters,” Proc. Inst. Elect. Eng.—Circuits Devices Syst., vol. 138, no. 3, pp. 401–412, Jun. 1991.
[3] A. G. Dempster and M. D. Macleod, “Use of minimum-adder multiplier blocks in FIR digital filters,” IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 42, no. 9, pp. 569–577, Sep. 1995.
[4] H.-J. Kang and I.-C. Park, “FIR filter synthesis algorithms for minimizing the delay and the number of adders,” IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 48, no. 8, pp. 770–777, Aug. 2001.
[5] A. Dempster et al., “Designing multiplier blocks with low logic depth,” in Proc. ISCAS, May 2002, vol. 5, pp. 773–776.
[6] Y. Takahashi and M. Yokoyama, “New cost-effective VLSI implementation of multiplierless FIR filter using common subexpression elimination,” in Proc. ISCAS, May 2005, vol. 2, pp. 1445–1448.
[7] C. Yao et al., “A novel common-subexpression-elimination method for synthesizing fixed-point FIR filters,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 51, no. 11, pp. 2211–2215, Nov. 2004.
[8] A. Hosangadi et al., “Algebraic methods for optimizing constant multiplications in linear systems,” J. VLSI Signal Process. Syst., vol. 49, no. 1, pp. 31–50, Oct. 2007.
[9] O. Gustafsson and L. Wanhammar, “ILP modelling of the common subexpression sharing problem,” in Proc. 9th IEEE ICECS, Dec. 2002, vol. 3, pp. 1171–1174.
[10] S. Vijay et al., “A greedy common subexpression elimination algorithm for implementing FIR filters,” in Proc. ISCAS, May 2007, pp. 3451–3454.
[11] R. M. Hewlitt and E. S. Swartzlander, “Canonical signed digit representation for FIR digital filters,” in Proc. IEEE Workshop SiPS, 2000, pp. 416–426.
[12] H. Shaffeu et al., “Improved design procedure for multiplierless FIR digital filters,” Electron. Lett., vol. 27, no. 13, pp. 1142–1144, Jun. 1991.
[13] D. Ait-Boudaoud and R. Cemes, “Modified sensitivity criterion for the design of powers-of-two FIR filters,” Electron. Lett., vol. 29, no. 16, pp. 1467–1469, Aug. 1993.
[14] Z. Ye and C.-H. Chang, “Local search method for FIR filter coefficients synthesis,” in Proc. 2nd IEEE Int. Workshop DELTA, Jan. 2004, pp. 255–260.
[15] S. Borkar, “Designing reliable systems from unreliable components: The challenges of transistor variability and degradation,” IEEE Micro, vol. 25, no. 6, pp. 10–16, Nov./Dec. 2005.
[16] K. J. Kuhn, “Reducing variation in advanced logic technologies: Approaches to process and design for manufacturability of nanoscale CMOS,” in IEDM Tech. Dig., Dec. 2007, pp. 471–474.
[17] K. Bernstein et al., “High-performance CMOS variability in the 65-nm regime and beyond,” IBM J. Res. Develop., vol. 50, no. 4/5, pp. 433–449, Jul. 2006.
[18] C. H. Kim et al., “On-die CMOS leakage current sensor for measuring process variation in sub-90 nm generations,” in VLSI Symp. Tech. Dig., Jun. 2004, pp. 250–251.
[19] Synopsys Inc., Synopsys Design Compiler.
[20] Synopsys Inc., Nanosim.
[21] Predictive Technology Model. [Online]. Available: http://www.eas.asu.edu/~ptm
[22] Y. C. Lim and S. R. Parker, “Discrete coefficient FIR digital filter design based upon an LMS criteria,” IEEE Trans. Circuits Syst., vol. CAS-30, no. 10, pp. 723–739, Oct. 1983.
[23] J. G. Proakis and D. G. Manolakis, Digital Signal Processing: Principles, Algorithms and Applications. Englewood Cliffs, NJ: Prentice-Hall, 1996.
Jung Hwan Choi (S’07) received the B.S. and M.S. degrees in electrical engineering from Seoul National University, Seoul, Korea, in 1998 and 2000, respectively. He is currently working toward the Ph.D. degree in electrical and computer engineering in the School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN.
In the summer of 2006 and 2007, he was with Intel Corporation, Austin, TX, as an Intern. His research interests include low-power DSP circuit design, statistical design methodologies under process variation, and thermal modeling and analysis.
Nilanjan Banerjee received the B.S. degree in electrical engineering from Jadavpur University, Calcutta, India, the M.S. degree from Arizona State University, Tempe, and the Ph.D. degree from Purdue University, West Lafayette, IN, in 2008.
He was a Software Design Engineer with Infosys Technologies, Bangalore, India. He is currently a Research Scientist with the Microprocessor Research Laboratory, Intel, Santa Clara, CA. His research interests include developing low-power and error-resilient circuits and systems.
Dr. Banerjee was the recipient of academic excellence awards in 1999 and 2002, and the Intel Fellowship and Henry Ford Scholarship Award in 2007.
Kaushik Roy (S’83–M’83–SM’95–F’02) received the B.Tech. degree in electronics and electrical communications engineering from Indian Institute of Technology, Kharagpur, India, and the Ph.D. degree from the University of Illinois, Urbana, in 1990.
He was with the Semiconductor Process and Design Center, Texas Instruments Incorporated, Dallas, TX, where he worked on field-programmable gate array (FPGA) architecture development and low-power circuit design. Since 1993, he has been with the School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, where he is currently a Professor and a University Faculty Scholar and holds the Roscoe H. George Chair of Electrical and Computer Engineering. He has published more than 400 papers in refereed journals and conferences. He is the coauthor of two books, namely, Low-Power CMOS VLSI Circuit Design (New York, NY: Wiley, 2000) and Low Voltage, Low Power VLSI Subsystems (New York, NY: McGraw Hill, 2004). He is the holder of eight patents. His research interests include very large scale integration (VLSI) design/computer-aided design for nanoscale silicon and nonsilicon technologies, low-power electronics for portable computing and wireless communications, VLSI testing and verification, and reconfigurable computing.
Dr. Roy has been on the editorial board of IEEE DESIGN AND TEST, IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS, and IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION SYSTEMS. He was a Guest Editor for a Special Issue on Low-Power VLSI in the IEEE DESIGN AND TEST, in 1994; the IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION SYSTEMS, in June 2000, and the Institution of Electrical Engineers Proceedings—Computers and Digital Techniques in July 2002. He was the recipient of the National Science Foundation Career Development Award in 1995; IBM Faculty Partnership Award; AT&T/Lucent Foundation Award; 2005 SRC Technical Excellence Award; Semiconductor Research Corporation Inventors Award; 2005 IEEE Circuits and System Society Outstanding Young Author Award (Chris Kim); 2006 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION SYSTEMS Best Paper Award; and the Best Paper Awards at the 1997 International Test Conference, 2000 IEEE International Symposium on Quality of IC Design, 2003 IEEE Latin American Test Workshop, 2003 IEEE Nano, 2004 IEEE International Conference on Computer Design, and 2006 IEEE/ACM International Symposium on Low Power Electronics and Design.