Efficient On-Chip Crosstalk Avoidance CODEC Design
Authorized licensed use limited to: Vardhaman College of Engineering. Downloaded on September 6, 2009 at 08:38 from IEEE Xplore. Restrictions apply.
552 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 17, NO. 4, APRIL 2009
It has been discovered relatively recently that encoding the bus can eliminate some classes of data patterns with much lower area overhead compared to the shielding techniques [5], [6]. These codes are commonly referred to as crosstalk avoidance codes (CACs). CACs can be further divided into two categories: memory-less and memory-based. The memory-based coding approaches generate a codeword based on the previously transmitted code and the current dataword to be transmitted [6], [9]. On the receiver side, the data is recovered based on the received codewords from the current and previous cycles. The memory-less coding approaches use a fixed code book to generate a codeword to transmit, solely based on the input data. The corresponding receiver decoder uses the current received codeword as the only input to recover the data.

The theoretical lower bound of the area overhead for memory-based codes is lower compared to memory-less codes. However, the memory-based CODECs are much more complex and the only known codeword generation method is an exhaustive search and pruning-based method.

Several different types of memory-less CACs have been proposed. The code designs are discussed in [3]–[6]. These codes offer the same degree of delay reduction as the passive shielding technique, with much less area overhead (ranging from 44% to 62.5%). Unfortunately, none of the referenced papers addresses the mapping between datawords and codewords for the CODECs. So far, all the CODEC design approaches are based on bus partitioning (which breaks a big bus into a number of small groups (lanes) and applies CAC coding on each group independently). Such an approach has to deal with the crosstalk across the group boundaries. Several different schemes are proposed to handle this inter-group crosstalk, such as group inversion and bit overlapping [4], [5]. In all cases, more wires are needed and therefore the overall area overhead is higher than the theoretical lower bound.

In this paper, we offer a systematic CODEC construction solution for the forbidden-pattern-free crosstalk avoidance code (FPF-CAC). The mapping scheme we propose is based on the representation of numbers in the Fibonacci numeral system. We show that all datawords can be represented in the Fibonacci-based numeral system with FPF vectors. We propose several different coding schemes that allow the CODECs to be constructed for any arbitrary bus size. With such a systematic mapping, the CODEC for a wider bus is constructed by a simple extension of the CODEC for a smaller bus. The first CODEC proposed in the paper is proven to have near-optimal area overhead performance. We further offer an improved coding scheme that achieves optimal overhead performance. We also propose modifications to our near-optimal CODEC that will reduce the complexity and improve the delay performance of the CODEC.

The key contributions of this paper include the following.
• We define a deterministic mapping scheme for the FPF-CAC based on the Fibonacci-based binary numeral system.
• Based on the mapping scheme, we propose coding algorithms that allow systematic CODEC constructions so that the CODEC for a wider bus is obtained as an extension of the CODEC for a smaller bus.
• We show that the CODEC gate count grows quadratically with bus size as opposed to the exponential growth for the existing approaches.

The remainder of this paper is organized as follows. Section II first provides some background on delay and power analysis of the bus in the presence of crosstalk. The classification of crosstalk is given in Section II-A. In Section II-C, the forbidden-pattern-free crosstalk avoidance code is defined and its performance is discussed, including codeword generation and overhead computation. A lower bound of the area overhead is established. Section III focuses on the construction of the CODEC for FPF-CAC. We give the mathematical basis for the CODEC construction and discuss the overhead performance of different CODECs. In Section IV, we investigate the circuit implementation details of the proposed CODECs. Experimental results are also presented. Conclusions are drawn in Section V.

A. Notation

For clarity, throughout this paper, unless specified otherwise, an bit bus is represented by a vector , with being the most significant bit and the least significant bit.

II. FORBIDDEN PATTERN-BASED CAC

A. Crosstalk Classification

As stated in Section I, the degree of crosstalk in an on-chip bus is dependent on data transition patterns on the bus. Based on the model shown in Fig. 1, the delay of the th wire in a data bus is given as [1]

(1)

where is a constant determined by the driver strength and wire resistance, is the voltage change on the th line and is the relative voltage change between the th and th line. Since on-chip busses are generally full-swing binary busses, we can assume that the two output voltage levels are and 0 V and hence and . If we let , (1) can be rewritten as

(2)

Here is the normalized voltage change on th line. is the normalized relative voltage change on th line (relative to the th or th line). The term corresponds to the intrinsic delay and the remaining two terms correspond to the crosstalk induced delay. Since , the first term has negligible contribution to the delay.

If we define as the effective total capacitance of the driver of th line, we have

(3)

and

(4)

From (4), we get and depending on the transition pattern on the wire
DUAN et al.: EFFICIENT ON-CHIP CROSSTALK AVOIDANCE CODEC DESIGN 553
TABLE I
CLASSES OF CROSSTALK

TABLE II
FPF-CAC CODEWORDS FOR 2-, 3-, 4-, AND 5-BIT BUSSES

(5)

where is the effective capacitance on the th wire defined earlier.

Equation (5) shows that the bus energy is the summation of the energy consumption of each given bit and that the crosstalk also has an effect on energy consumption. Therefore, avoiding crosstalk could result in reduction of the overall energy consumption of a bus as we show later.

C. Forbidden Pattern Based Crosstalk Avoidance

The forbidden pattern-based crosstalk avoidance code was first proposed in [5]. The forbidden patterns are defined as 3-bit patterns “101” and “010”. A code is forbidden pattern free (FPF) if there is no forbidden pattern in any three consecutive bits. As examples, is not forbidden pattern free; 1100110 is FPF. It has been shown in [5] that for a code that contains only FPF codewords, the bus that transmits only these codewords will experience maximum crosstalk of no greater than . Therefore, by encoding the datawords to FPF codewords, we can speed up the bus by 100%. This type of code is referred to herein as FPF-CAC.

The FPF-CAC can be generated using an inductive procedure [5]. Let be the set of -bit FPF-CAC codewords, an -bit vector is a codeword. Any code-

Table II lists the codewords of the 3-, 4-, and 5-bit FPF-CACs generated by Algorithm 1.

From Algorithm 1, we can see that for each bit codeword with last two digits , two -bit codewords can be generated. For with the last two digits , only one -bit codeword can be generated. The total number of FPF-CAC codewords can be computed based on the following equations.

Definition 1: For an -bit vector , we define the following quantities:
• is the total number of distinct -bit vectors;
• is the total number of FPF vectors;
• is the total number of non-FPF vectors;
• is the number of FPF vectors that satisfy ;
• is the number of FPF vectors that satisfy .

For the base case, a 3-bit bus :
• ;
• ;
• ;
• .

For busses with more than 3 bits :

(6)
(7)
(8)
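The FPF definition and the codeword growth rule of Section II-C can be checked by brute force. The following is a minimal sketch, not the authors' Algorithm 1: all function names are hypothetical, the enumeration is exhaustive rather than inductive, and the per-wire crosstalk class in `max_crosstalk_class` is an assumed model in which a switching wire's class is the sum of its transition differences with each neighbor (a missing neighbor at the bus edge contributes zero). Under that assumption, the sketch confirms that transitions between FPF codewords never reach the 3C or 4C classes.

```python
from itertools import product

FORBIDDEN = ("010", "101")  # forbidden 3-bit patterns from Section II-C


def is_fpf(word):
    """True if no three consecutive bits form a forbidden pattern."""
    return not any(p in word for p in FORBIDDEN)


def fpf_codewords(n):
    """All n-bit FPF vectors, found by exhaustive enumeration."""
    words = ("".join(bits) for bits in product("01", repeat=n))
    return [w for w in words if is_fpf(w)]


def extensions(word):
    """FPF words obtained by appending one bit to an FPF word."""
    return [word + b for b in "01" if is_fpf(word + b)]


def max_crosstalk_class(before, after):
    """Worst per-wire crosstalk class for one bus transition (assumed model)."""
    delta = [int(a) - int(b) for b, a in zip(before, after)]  # -1, 0, or +1
    worst = 0
    for j, d in enumerate(delta):
        if d == 0:
            continue  # a quiet wire has no transition delay to classify
        left = abs(d - delta[j - 1]) if j > 0 else 0
        right = abs(d - delta[j + 1]) if j + 1 < len(delta) else 0
        worst = max(worst, left + right)
    return worst


if __name__ == "__main__":
    for n in range(3, 7):
        code = fpf_codewords(n)
        worst = max(max_crosstalk_class(a, b) for a in code for b in code)
        print(n, len(code), worst)
```

In this sketch, a codeword whose last two digits are equal has two one-bit extensions, while a codeword whose last two digits differ has only one, which is the growth rule stated above.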
Based on Algorithm 1 and the definitions of , , and , we get

(9)
(10)

Equation (6) can be rewritten as

(11)

The relationship shown in (11) is the same as the relationship of elements in the Fibonacci sequence. (A more detailed discussion about the Fibonacci sequence will be given in Section III.) With the initial conditions and , we have

(12)

where is the th element in the Fibonacci sequence. gives the maximum cardinality of the -bit FPF-CAC code. To encode an -bit binary bus into FPF-CAC code, the minimum number of bits needed is the smallest integer that satisfies (13). We can also compute the lower bound of the area overhead , which is defined as the ratio between the additional area required for the coded bus and the area of the uncoded bus

(13)
(14)

For the Fibonacci sequence, , also known as the golden ratio, is the asymptotic ratio of two consecutive elements of the sequence [12]. Hence, , where is a constant. Therefore, for large busses, the lower bound of the overhead is

(15)

III. FPF-CAC CODEC DESIGN

As discussed in Section II, the 3C and 4C crosstalk classes can be avoided if the bus is encoded using the FPF code. We provided the recursive procedure for generating the codewords and showed how to compute the total number of codewords and the lower bound for the area overhead. However, the mapping scheme between the input datawords and the output codewords was not discussed, nor was it shown how a CODEC for the FPF-CAC can be constructed.

Conceptually, the mapping between the datawords and the codewords is flexible, provided it can be reversed by the decoder. In the case when the size of the code book is not a power of two, a 1-to-1 mapping is not required. A 1-to-many mapping for certain datawords may reduce the CODEC complexity further.

When the data bus width is small, the CODEC can be implemented and the mapping flexibility can be exploited to optimize the speed and/or the area of the CODEC. However, as the data bus width increases, the CODEC size grows exponentially. Fig. 2 shows the number of two-input gates required for CODECs of data bus widths varying from 3 to 12 (see footnote 1). The total number of mapping permutations also grows rapidly. Even for a small 4- to 5-bit CAC encoder, there are over 2 possible mapping permutations. In addition, the CAC codes are nonlinear and therefore it is difficult to extend a mapping scheme for smaller busses to larger busses. Several different schemes have been proposed for CODEC construction for FPF-CAC or other memory-less CACs [4]–[6]. These schemes are all based on bus partitioning, which breaks up a wide bus into smaller groups or lanes (typically 3–5 bits) and exhaustively searches for an optimal mapping that yields the most efficient CODEC for the groups. Unfortunately, in order to handle crosstalk across the group boundaries, these schemes all inevitably suffer from additional area overhead.

In this section, we propose two coding schemes that allow us to encode data to the FPF-CAC without partitioning the bus. These coding schemes allow us to systematically construct the FPF-CAC CODECs for busses of arbitrary size. By “systematically,” we mean that the CODEC for a larger size bus is extended from the CODEC of a smaller bus. The gate counts of the proposed CODEC implementation roughly grow quadratically with respect to the bus size, instead of exponentially for previous approaches [4]. Both our schemes are based on the Fibonacci numeral system.

A. Fibonacci-Based Binary Numeral System

The Fibonacci binary numeral system was first used in CAC designs in [3] for crosstalk avoidance coding. The paper proposed an inductive codeword generation algorithm for a type of CAC called self-shielding code (see footnote 2). The inductive algorithm is similar to the ones proposed in [5] and [6]. However, none of these papers address the mapping scheme and CODEC designs as we do in the rest of this section.

A numeral system is “a framework where numbers are represented by numerals in a consistent manner” [13]. The most commonly used numeral system in digital design is the binary numeral system, which uses powers of two as the base. A number’s binary representation is defined in (16). The binary numeral system is complete and unambiguous, which means that each number has one and only one representation in the binary numeral system.

Definition 2:

(16)
(17)

The Fibonacci-based numeral system is the numeral system that uses the Fibonacci sequence as the base. The definition of the Fibonacci sequence is given in (18). A number is represented as the summation of some Fibonacci numbers

Footnote 1: Even though the CODEC is for a slightly different crosstalk avoidance code, we feel that the results can be used as a benchmark.
Footnote 2: In some of the literature, this type of code is also called forbidden transition free code.
FPF-CAC( )
  MSB stage:
    if then
      ;
      ;
    else
      ;
      ;
    end if
  other stages:
    for do
      if then
        ;
      else if then
        ;
      else
        ;
      end if
      ;
    end for
  LSB
    ;
  return ( );

Algorithm 2 shows that an -bit FPF vector is generated in stages. Each stage outputs one bit of the output vector ( ) and the remainder ( ) that is the input to the following stage. In the th stage, the input is compared to two Fibonacci numbers and . If , is coded as 1; if , is coded as 0; if the value is in between, is coded to the same value as . The remainder is computed accordingly based on the value of . We shall refer to the ranges , , and as the force-1 zone, gray zone, and force-0 zone of the th stage, respectively. The most significant bit (MSB) stage is slightly different from other stages since no bit precedes it. It encodes by comparing the input with only one Fibonacci number .

The decoder is a straightforward implementation of (17), which converts the Fibonacci vector back to the binary vector. Fig. 3 shows the encoder and decoder structures based on Algorithm 2.

Fig. 3. CODEC structure (based on Algorithm 2).

The correctness of Algorithm 2 can be proven by showing that if after the th stage, the partially generated output vector is FPF, adding the output of the th stage, will not introduce a forbidden pattern.

We first recognize that if , no forbidden pattern will be produced regardless of the value of . Therefore, we only need to show that when , will satisfy . Based on Algorithm 2, occurs only when is forced to be 0 or 1. The proof is reduced to proving that when is forced to one particular value, will be coded to the same value as . The proof is given as follows.

Proof:
• If and [ in force-1 zone] and ; ; [ not in force-0 zone] ; ;
• If and [ in force-0 zone] and ; [ not in force-1 zone]; .

In Table III the complete 6-bit codewords generated using Algorithm 2 are listed as CODE-1. As stated earlier, the MSB stage is different from other stages since there is no preceding bit and, for the values in the gray zone, can be coded to be either value. In Algorithm 2, we arbitrarily choose to code the MSB ( ) to be 0 when the input value is in the gray zone. If we code to be 1 for all values in the gray zone, we end up with a different set of codewords as listed in CODE-2 in the table. All codewords in both CODE-1 and CODE-2 are FPF. For clarity, we only list codewords in CODE-2 that are different from codewords in CODE-1.

Based on (19), we can easily see that the total numbers of codewords in both CODE-1 and CODE-2 are , slightly smaller than the maximum cardinality of given in (14). Since , we know for a given size input data vector , the number of bits needed for the proposed CODEC is no more than 1 bit more than the minimum number of bits required . Table IV lists the number of bits needed to encode the binary data from 3 to 32 bits: denotes the number of bits for the input binary bus; the number of bits required for the optimal code; the number of bits needed for the proposed CODEC and the difference between the two.

C. Optimal CODEC

A quick examination shows that ALL the valid FPF-CAC codewords are actually listed in Table III. There are a total
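Returning to the near-optimal scheme, the staged encoding that Algorithm 2 describes and the summation decoder can be sketched in a few lines. This is a reconstruction from the prose description rather than the authors' listing: the names `fib`, `encode`, and `decode`, the indexing f_1 = f_2 = 1, and the exact stage thresholds (force-1 when the remainder is at least f_(k+1), force-0 when it is below f_k, gray zone otherwise) are assumptions. The MSB stage resolves its gray zone to 0, which corresponds to the CODE-1 choice described above.

```python
def fib(k):
    """k-th Fibonacci number with f_1 = f_2 = 1 (assumed indexing)."""
    a, b = 1, 1
    for _ in range(k - 1):
        a, b = b, a + b
    return a


def encode(value, m):
    """Encode value into an m-bit Fibonacci-weighted vector, MSB first."""
    if not 0 <= value < fib(m + 2):
        raise ValueError("value does not fit in an m-bit Fibonacci code")
    # MSB stage: a single comparison; gray-zone values are coded as 0.
    d = 1 if value >= fib(m + 1) else 0
    r = value - d * fib(m)
    bits = [d]
    # Remaining stages, from bit m-1 down to bit 1.
    for k in range(m - 1, 0, -1):
        if r >= fib(k + 1):
            d = 1          # force-1 zone: lower bits cannot cover r alone
        elif r < fib(k):
            d = 0          # force-0 zone: f_k would overshoot r
        # otherwise gray zone: keep d, repeating the previous bit
        r -= d * fib(k)
        bits.append(d)
    assert r == 0          # every unit of the input has been accounted for
    return bits


def decode(bits):
    """Sum the Fibonacci weights of the set bits (MSB first)."""
    m = len(bits)
    return sum(d * fib(m - i) for i, d in enumerate(bits))


if __name__ == "__main__":
    m = 6
    for v in range(fib(m + 2)):
        word = "".join(str(b) for b in encode(v, m))
        print(v, word, decode(encode(v, m)))
```

The sketch stays forbidden-pattern-free because a bit differs from its predecessor only when it is forced, and a forced bit leaves a remainder that cannot fall in the opposite forced zone of the next stage, so the next bit repeats it. With these assumptions an m-bit code covers the values 0 to f_(m+2) − 1, so the 6-bit case yields 21 codewords, compared with 26 FPF vectors of that length.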
TABLE IV
OVERHEAD COMPARISON
if
(22)
otherwise.
(23)

This modification is to simply code the output MSB bit to the input MSB bit. The outputs are still FPF codes because to code a -bit binary code to an -bit Fibonacci code, and satisfy:

and we have

(24)

Therefore, the bit input binary data can be broken into the MSB bit and a bit vector. We simply construct an encoder for the bit vector. The MSB bit controls the output bit value when the bit input value is in the gray zone.

Fig. 5 shows the modified CODEC with the simplified MSB stage. On the encoder side, the MSB of the input is mapped directly to the MSB of the output . The rest of the input vector becomes the input of the th stage.

On the decoder side, the first input of the summation stage is , instead of as in Fig. 3.

To evaluate the complexity of the CODECs, we implemented the near-optimal CODEC in both a field-programmable gate array (FPGA) (Xilinx XC4VLX15-12 [15]) and ASIC (both a TSMC 90-nm process [16] and a TSMC 130-nm process). Fig. 6 plots the resource and delay of the FPGA implementation and Fig. 7 plots the equivalent gate counts of the encoder for input bus widths from 8 to 32 implemented using a TSMC 90-nm process [16]. For an input data width of 12 bits, the equivalent number of two-input gates is 369. This is roughly two orders of magnitude lower than the gate count reported in Fig. 2 [4]. For the input data width of 32 bits, the total area is 17 865 µm² and the equivalent gate count is 2762. The sizes grow quadratically with the bus size, as we expected. The gate counts for the 130-nm process are very close to those for the 90-nm process.

Fig. 6. Resource count and delay of the proposed encoder for different input bus sizes (FPGA implementation).

Fig. 8 compares the gate count for three different encoder implementations: a single level lookup table (LUT) implementation using random mapping, a single level LUT with Fibonacci numeral system mapping and the staged design proposed in this paper. All the designs are implemented using the same TSMC 90-nm process. We can see clearly that the size of the LUT-based designs grows exponentially. Fig. 8 shows that the mapping schemes affect the encoder complexity: on average, a LUT-based encoder using the Fibonacci mapping is 50% smaller than a randomly mapped encoder. It also shows that for small busses,
the proposed design is not advantageous compared with LUT-based design. However, for busses with 8 or more bits, the proposed staged designs offer significant savings in terms of gate count.

Fig. 8. Gate count comparison for different encoder implementations in a TSMC 90-nm process: a single level LUT-based implementation using Fibonacci mapping, a single level LUT using random mapping and the proposed staged encoder.

Figs. 6 and 7 also show the delays of the encoder. Understandably, the delay increases as the input bus size goes up since the total delay is the accumulated delay of all stages. Unlike the single level implementation, our design allows pipeline stages to be inserted between stages to mitigate this problem.

Bus partitioning can also be used to reduce the total size and improve the speed of the decoder. Our experiment confirmed that the maximum input-to-output delay of a non-pipelined -bit encoder is . Reducing the bus in half can improve the bus speed by approximately a factor of 4. Similarly, the total area has a quadratic relation with the number of input bits and therefore partitioning the bus will reduce the total area by 50%.

The decoder structure is simpler than the encoder and has no ripple delay. However, as the bus width grows, the summation stage size goes up and more delay will be incurred. Note that there is no multiplication or AND operation in the actual implementation. Since is a constant, it is a simple case of connecting to the nonzero bit positions of .

Fig. 9 illustrates a structure where an -bit input bus is split into two -bit groups. Each group is encoded and decoded independently. The maximum delays of the encoder and decoder stages are and , instead of and .

The crosstalk that occurs across the boundary must be dealt with when the bus partitioning technique is employed. In Fig. 9, we simply duplicate the boundary lines. This requires two extra wires for each added partition. Other approaches can be applied to minimize the number of additional wires, such as group inversion as proposed in [5]. One additional wire has to be used as an inversion indication. The tradeoff between area/speed and overhead can be balanced to achieve the required speed while minimizing the additional area overhead.

The FPF code was originally proposed to improve the bus speed. However, our simulations show that by applying the FPF encoding, the bus energy consumption can be reduced as well. We randomly generated 10 000 input vectors for 8-, 12-, 16-, and 32-bit data and transmitted these randomly generated data on an uncoded bus and a coded bus, respectively. For each bus width, a normalized total energy consumption, is computed based on (5) with and both set to 1. Table VI gives the normalized energy consumption for coded and uncoded busses. The comparison is technology independent and is also independent of bus configurations, i.e., bus length, wire sizing, provided that is guaranteed. The results indicate that even with 44% more wires, coded busses have lower total energy consumption. It is important to point out that such saving is achieved using random sequences. For busses that transmit data with regularity, the results may vary. We also would like to point out that in our simulation, we do not include the power consumption of the CODEC.

TABLE VI
POWER CONSUMPTION COMPARISON BETWEEN CODED AND UNCODED BUSSES

V. CONCLUSION

Crosstalk avoidance codes are shown to be able to reduce the inter-wire crosstalk and therefore boost the maximum speed on the data bus. They have the advantage of consuming less area overhead than shielding techniques. Even though several different types of codes have been proposed in the past few years, no mapping scheme was given which facilitates the CODEC implementation. Compounded by the nonlinear nature of the CAC, the lack of a solution to the systematic construction of the CODEC has hampered the wide use of CAC in practice.

In this paper, we give what we believe is the first solution to this problem. We showed that data can be coded to a forbidden pattern free vector in the Fibonacci numeral system. We first give a straightforward mapping algorithm that produces a set of FPF codes with near-optimal cardinality. The area overhead of this coding scheme is near the theoretical lower bound. The CODEC based on this coding scheme is systematic and has very low complexity. The size of the CODEC grows quadratically
with the data bus size as opposed to exponentially in a brute-force implementation. Our systematic coding scheme allows the code design of arbitrarily large busses without having to resort to bus partitioning.

We further proposed an improved coding scheme which yields a set of FPF codes with maximum cardinality. The area overhead of this optimal coding scheme matches the theoretical lower bound. We gave the corresponding modification in the CODEC design as well.

This paper also discusses issues associated with CODEC implementations. We proposed a modified coding scheme that eliminates the MSB stage in the encoder and simplifies the decoder side as well. The modification reduces the total gate count and improves the CODEC speed.

We implemented the near-optimal CODEC in both an FPGA and a 90-nm ASIC process. The reported results show that the size of our CODEC is several orders of magnitude lower than a previously reported design for a 12-bit bus. We also investigated the possibility of combining our approach with bus partitioning in very large busses to address the propagation delay issue as well as to reduce the total size of the CODEC.

We compared the average bus energy consumption of uncoded and FPF coded busses in simulation. Our experimental results show that FPF coding offers on average 20% power saving.

Even though this work is strictly limited to one class of crosstalk avoidance code (the FPF-CAC), we believe that the approach can be easily adapted to other varieties of crosstalk avoidance codes as well.

REFERENCES

[1] P. Sotiriadis and A. Chandrakasan, “Low power bus coding techniques considering inter-wire capacitance,” in Proc. IEEE-CICC, 2000, pp. 507–510.
[2] K. Kim, K. Baek, N. Shanbhag, C. Liu, and S.-M. Kang, “Coupling driven signal encoding scheme for low-power interface design,” in Proc. IEEE/ACM Int. Conf. Comput.-Aided Des., Nov. 2000, pp. 318–321.
[3] M. Mutyam, “Preventing crosstalk delay using Fibonacci representation,” in Proc. Int. Conf. VLSI Des., 2004, pp. 685–688.
[4] S. R. Sridhara, A. Ahmed, and N. R. Shanbhag, “Area and energy-efficient crosstalk avoidance codes for on-chip busses,” in Proc. ICCD, 2004, pp. 12–17.
[5] C. Duan, A. Tirumala, and S. P. Khatri, “Analysis and avoidance of cross-talk in on-chip bus,” in Proc. 9th Symp. High Perform. Interconnects (HOTI), 2001, pp. 133–138.
[6] B. Victor and K. Keutzer, “Bus encoding to prevent crosstalk delay,” in Proc. ICCAD, 2001, pp. 57–63.
[7] S. P. Khatri, “Cross-talk noise immune VLSI design using regular layout fabrics,” Ph.D. dissertation, Dept. Elect. Eng. Comput. Sci., Univ. California Berkeley, Berkeley, 1999.
[8] C. Duan and S. P. Khatri, “Exploiting crosstalk to speed up on-chip busses,” in Proc. Conf. Des. Autom. Test Eur., 2004, pp. 778–783.
[9] C. Duan, K. Gulati, and S. P. Khatri, “Memory-based cross-talk canceling CODECs for on-chip busses,” in Proc. ISCAS, 2006, pp. 1119–1123.
[10] J. Ma and L. He, “Formulae and applications of interconnect estimation considering shield insertion and net ordering,” in Proc. ICCAD, 2001, pp. 327–332.
[11] H. Kaul, D. Sylvester, and D. Blaauw, “Active shielding of RLC global interconnects,” in Proc. 8th ACM/IEEE Int. Workshop Timing Issues Specification Synth. Digit. Syst., 2002, pp. 98–104.
[12] Wikipedia, “Fibonacci number,” 2007. [Online]. Available: http://en.wikipedia.org/wiki/Fibonacci_number
[13] Wikipedia, “Numeral system,” 2007. [Online]. Available: http://en.wikipedia.org/wiki/Numeral_system
[14] P. K. Saraswat, G. Haghani, and A. K. Bernard, “A low power design of gray and T0 codecs for the address bus encoding for system level power optimization,” 2005. [Online]. Available: www.studentimaster.usilu.net/saraswap/prabhat/projects/
[15] Xilinx, San Jose, CA, “Xilinx Virtex4 family datasheet,” 2007. [Online]. Available: http://www.xilinx.com/products/silicon_solutions/fpgas/virtex/virtex4/index.htm
[16] Taiwan Semiconductor Manufacturing Company, Hsinchu, Taiwan, “TSMC 90 nm process,” 2007. [Online]. Available: http://www.tsmc.com/english/b_technology/b01_platform/b010101_90nm.htm

Chunjie Duan (M’04) received the B.Sc. degree from Tsinghua University, Beijing, China, and the M.Sc. degree from Colorado State University, Fort Collins, and the Ph.D. degree from University of Colorado, Boulder, all in electrical engineering.
He worked as a System Engineer with Alcatel from 1993 to 1996. He worked with Qualcomm Inc. and later Ericsson Wireless Communications as a Senior Engineer on CDMA system/hardware designs from 1998 to 2004. Since 2004, he has been with Mitsubishi Electric Research Labs, Cambridge, MA, where he is currently a Principal Technical Staff. His research interests include digital communications and networking including wireless, satellite and optical communications, signal processing, and high speed, low power VLSI designs.

Victor H. Cordero Calle received the B.Sc. degree from Pontificia Universidad Catolica del Peru, Peru, in 2002, and the M.Sc. degree from the University of Idaho, Boise, in 2006, both in electrical engineering. He is currently pursuing the Ph.D. degree in computer engineering from Texas A&M University, College Station.
During the M.Sc., he participated on the design of a field programmable processor array for NASA while working for the Center of Advanced Microelectronics and Biomolecular Research. He currently studies radiation detection VLSI, global clock distribution using standing waves, sub-threshold VLSI and variability computational complexity reduction. During the Fall of 2007, he interned with Advanced Micro Devices Inc. (AMD), the Global Circuit Design Group and with IBM Watson Research Labs during the Summer 2008 with the Scalable Server Networks and Memory Systems Group.

Sunil P. Khatri (M’86) received the B.Tech. (EE) degree from IIT Kanpur, Kanpur, India, the M.S. degree in electrical engineering and computers from the University of Texas, Austin, and the Ph.D. degree in electrical engineering and computer science from the University of California, Berkeley.
He worked with Motorola, Inc., for four years, where he was a member of the design teams of the MC88110 and PowerPC 603 RISC microprocessors. He is currently an Assistant Professor with the Department of Electrical Computer Engineering, Texas A&M University, College Station. His research interests include logic synthesis, novel VLSI design approaches to address issues such as power, cross-talk, and radiation tolerance, as well as cross-disciplinary applications of these topics. He has coauthored about 120 technical publications, 5 U.S. Patents, one book, and a book chapter. His research is supported by Intel Corporation, Nascentric Inc., Lawrence Livermore National Laboratories, and the National Science Foundation.
Dr. Khatri was a recipient of two Best Paper Awards and two Best Paper Nominations.