
A Novel Robust Signaling Scheme For High-Speed Low-Power Communication Over Long Wires



Marshnil Dave, Maryam Shojaei Baghini, Dinesh Sharma IIT-Bombay

Abstract: This paper describes a new capacitively coupled driver and a receiver with a new analog equalizer for high-speed, low-power communication over long on-chip wires. The proposed signaling scheme improves on the state-of-the-art capacitively driven interconnect scheme in both robustness and energy efficiency. It has been designed in a 90nm CMOS process. Simulations indicate that the proposed scheme can transmit and receive data at 3.22Gbps over a 10mm long channel while consuming only 0.107pJ/bit. This is the lowest reported energy/bit for high-speed communication over long on-chip wires. Monte Carlo and process corner simulations show that the proposed scheme supports data rates of up to 3Gbps even in the presence of intra-die and inter-die process variations.

I. INTRODUCTION

With the increased speed of logic circuits on a chip and increased die sizes, the bandwidth of long on-chip wires has become a determining factor in the overall speed of high-performance integrated circuits. Traditionally, buffer insertion has been used to allow high data-rate communication over long wires. However, buffer-inserted long wires consume a large amount of power [1], [2]. Recently, many alternative signaling schemes have been suggested. Among these, schemes with low voltage swing on the line and equalization at the transmitter and/or receiver are the most promising for high-speed, energy-efficient communication [3]–[7]. These schemes employ circuits in the transmitter and/or receiver to compensate for high-frequency losses in the channel so as to allow very high data rates. Also, the voltage swing on the line is reduced to hundreds of millivolts, which leads to very low dynamic power consumption in the wire. Table I shows the reported data rate/channel and total energy/bit/channel of the most promising signaling schemes.

The signaling schemes of [4]–[7] and ours are differential for low sensitivity to noise. The scheme proposed in [4] employs a series capacitor at the transmitter end for channel equalization. In this scheme the AC-coupled transmitter blocks the low-frequency components of the input bit stream. A prototype of the scheme relies on the input bit pattern having a nearly equal number of 1s and 0s [4]. However, input bits can have any number of 1s and 0s in practice. The capacitively coupled transmitter based schemes presented in [6] and [7] have a direct-coupled weak driver (in addition to the capacitively coupled driver) for the low-frequency components of the input and/or a resistive termination at the receiving end. This allows reliable communication for all activity rates of the input bit sequence. The scheme proposed in [6] employs three series capacitors in the transmitter and a series capacitor at the receiving end for better equalization. On the other hand, the scheme proposed in [7] employs an analog decision feedback equalizer in the receiver for better equalization. From Table I, the scheme proposed in [7] consumes less energy than the scheme given in [6] even for a 10mm long wire. The transmitter proposed in [5] injects an input-pattern-dependent current into the wire, performing a 3-tap feed-forward equalization (FFE) using switched current sources. The 3-tap FFE allows data rates up to 4Gbps even for a 10mm line but consumes more energy than the scheme proposed in [7]. The scheme given in [7] is the most energy-efficient among the published schemes. However, this scheme has some limitations, which are explained in the next section.

TABLE I
                      [4]     [5]     [6]     [7]     Our Work          Our Work
                                                      (scheme of [7])   (Proposed)
Technology            180nm   90nm    90nm    90nm    90nm              90nm
Length (mm)           10      10      5       10      10                10
Data Rate (Gbps)      1       4       4.9     2       2.94              3.22
Energy/bit (fJ/bit)   1       356     340     280     240               107

Data Rate Density (Gbps/µm) and FoM (Gbps/µm/pJ/bit) rows: only the values 5.61, 12.94 and 4.15 survive; their columns are not recoverable.
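The energy/bit figures in Table I translate directly into average link power, since power is energy per bit times bit rate. The short sketch below works this out for the schemes whose numbers are confirmed elsewhere in the text ([5], [7], and the two simulated designs). The dictionary name and function names are illustrative assumptions, not part of the original work.

```python
# Average link power from Table I: P = (energy per bit) x (bit rate).
# Units: Gbps x fJ/bit = 1e9 x 1e-15 W = 1e-6 W, i.e. microwatts.

# (data rate in Gbps, energy in fJ/bit); names are illustrative labels only
TABLE_I = {
    "[5]": (4.0, 356.0),
    "[7] reported": (2.0, 280.0),
    "[7] re-simulated": (2.94, 240.0),
    "Proposed": (3.22, 107.0),
}


def power_uw(rate_gbps, energy_fj_per_bit):
    """Average power in microwatts at the maximum data rate."""
    return rate_gbps * energy_fj_per_bit


def energy_saving_pct(reference_fj, improved_fj):
    """Relative energy/bit reduction of `improved` against `reference`."""
    return 100.0 * (reference_fj - improved_fj) / reference_fj


if __name__ == "__main__":
    for name, (rate, energy) in TABLE_I.items():
        print(f"{name:18s} {power_uw(rate, energy):8.1f} uW")
    saving = energy_saving_pct(240.0, 107.0)
    print(f"Energy/bit saving vs re-simulated [7]: {saving:.1f}%")
```

Under these numbers the proposed scheme draws roughly 345 µW at 3.22Gbps, and its energy/bit is a little over 55% lower than that of the re-simulated scheme of [7].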

In this paper, we propose an improved capacitively coupled driver and a receiver with a new analog equalizer. With the proposed transmitter-receiver pair, data rates of up to 3.22Gbps and a data rate density of 1.87Gbps/µm at an energy consumption of 0.107pJ/bit are achieved for communication over a 10mm long channel, considering nominal values of transistor parameters. Process corner and Monte Carlo simulations show that the throughput of the proposed signaling scheme reduces by at most 13% even in the extreme case of intra-die and inter-die parameter variations.

The paper is organized as follows. Section II explains the signaling scheme proposed in [7] and discusses its limitations. Section III presents the working and robustness of the proposed signaling scheme. Section IV discusses its performance. Section V reports the major conclusions of this work.

II. SIGNALING SCHEME PROPOSED IN [7]

Most area-efficient long on-chip interconnects exhibit low-pass-filter-like behavior in modern CMOS technologies [8]. A capacitor connected in series with the wire can be used to compensate for high-frequency losses in the wire. Fig. 1 shows the capacitively coupled transmitter based differential signaling scheme proposed in [7]. It shows the complete transmitter circuit and the termination circuit of the scheme. In the

978-1-4673-1036-9/12/$31.00 2012 IEEE


13th Int'l Symposium on Quality Electronic Design



Fig. 1. Capacitively Coupled Transmitter based Differential Signaling Scheme Proposed in [7] (figure labels: Wire, Vlinerxp, Mn, 6 µA, Cs, RL = 16 kΩ)

transmitter, tapered inverter chains are connected to the long on-chip wires through series capacitors. The series capacitors pre-emphasize the input transitions [7]. N-channel MOSFETs Mn and Mnb and P-channel MOSFETs Mp and Mpb form paths for the low-frequency components of the input. When the input is logic 0 (IN=0), transistor Mn sinks a small current (Istatic ≈ 6µA) from the line, which sets the voltage on Vlinerxp to Vdd − Istatic·RL. Here, RL is the ON-resistance of the P-channel MOSFET Mp. Also, transistor Mnb is OFF; as a result, Vlinerxn is pulled up to Vdd. Similarly, when the input is logic 1 (IN=1), node Vlinerxp is pulled up to Vdd and the voltage on Vlinerxn is set to Vdd − Istatic·RL. In other words, the differential voltage developed across the terminating resistors is Istatic·RL at low data rates.

At the receiving end, a sense amplifier based flip-flop (SAFF) amplifies the differential voltage developed across the terminating transistors and takes it to digital logic levels. The receiver proposed in [7] also employs an analog equalizer for better compensation of the wire response. Fig. 2 shows the receiver circuit proposed in [7]. It employs a double-tail sense amplifier, a NOR latch and an analog equalizer. The analog equalizer subtracts the residue voltage corresponding to the previous bits on the line from the present bit. The outputs of the NOR latch are directly connected to a low-pass RC filter which mimics the line characteristics. The output of the low-pass filter is fed back to the double-tail sense amplifier through another differential pair (Mfbn and Mfbp) [7].

Limitations of the Scheme Proposed in [7]

Fig. 2. Receiver proposed in [7]

The scheme proposed in [7] sets the DC common-mode voltage of the line close to Vdd. The conventional sense-amplifier based flip-flop (SAFF-Con) has a large input offset voltage and is slow for near-supply input common-mode voltage [9]. Hence, a double-tail sense amplifier (SAFF-DT), which can operate at high data rates even for near-supply input common-mode voltage, is preferred in [7]. However, the double-tail sense amplifier has two stages, unlike the conventional sense amplifier, and hence it is architecturally slower and more power consuming than the conventional sense amplifier for an input common-mode voltage of Vdd/2. Our analysis shows that the conventional sense amplifier consumes 35% less power than the double-tail sense amplifier when both are optimized for speed and input offset for an input common-mode voltage of Vdd/2. In all, the speed and power consumption of a capacitively coupled transmitter based scheme can be improved by setting the line common-mode voltage near Vdd/2 and using SAFF-Con in the receiver.

Due to the near-supply line common-mode voltage, the line voltage near the transmitter end exceeds Vdd during transitions, which can cause long-term reliability issues, especially in smaller technology nodes.

In the receiver circuit, the low-pass RC filter is directly connected to the output of the NOR latch. The low-pass filter is designed to mimic the channel response for better equalization and hence it loads the NOR latch significantly.

Moreover, the performance of the signaling scheme in the presence of intra-die and inter-die process variations is not discussed in [7].

III. PROPOSED CAPACITIVELY COUPLED TRANSMITTER AND ANALOG EQUALIZATION BASED RECEIVER

Fig. 3 shows the improved capacitively coupled transmitter based signaling scheme proposed in this paper. The proposed transmitter employs tapered inverter chains connected to the line through series capacitors. Each series capacitor is realized using two source-drain shorted P-channel MOSFETs. The source/drain terminals and gate terminals of the P-channel MOSFETs are cross connected. This ensures that the effective capacitance seen by the transmitter is the same for both 0-to-1 and 1-to-0 input transitions. The transmitter also employs weak drivers which are directly connected to the line. The weak drivers are essentially switched current sources biased by a simple bias generator. They source or sink a small current (2.8µA) from the line depending on whether the input is logic 1 or 0. The line is terminated at the receiving end by active resistors made up of transmission gates. The line common-mode voltage is set to around Vdd/2 by a voltage divider made up of PMOS transistors. The MOSFETs used in the voltage divider and the bias generator circuit in the transmitter














have small aspect ratios and hence consume a very small static current (5µA).

Fig. 3. Proposed Modified Capacitively Coupled Transmitter Based Differential Signaling Scheme

At the receiving end, a conventional sense-amplifier based flip-flop amplifies the small voltage swing on the line and takes it to digital logic levels. Fig. 4 shows the circuit diagram of the proposed receiver, in which a novel analog equalizer is also employed. The outputs of the NAND latch control switches (Mp, Mpb, Mn, Mnb) connected to a low-pass RC filter. MOSFETs Mp, Mpb, Mn and Mnb present a small capacitive load to the output latch. The low-pass RC filter is realized using MOSFETs as capacitors and a transmission gate. When outP is logic 1 and outN is logic 0, transistors Mpb and Mn are ON. Hence, a current flows through transistors Mpr-Mpb-Mn-Mnr. Similarly, when outP is logic 0 and outN is logic 1, a current flows through transistors Mpr-Mp-Mnb-Mnr. The differential voltage developed across Vfbp and Vfbn corresponds to the residue voltage on the line due to previous bits. This is given to the input of the sense amplifier via another differential pair. The through current (Idc) in the analog equalizer is very small (4µA). Hence the proposed signaling scheme consumes a very small amount of static power.

Fig. 4. Proposed Receiver Circuit Employing an Analog Equalizer

In the proposed receiver an improved NAND latch, given in [10], is used to enhance the speed of the circuit. In the traditional NAND latch, when the inputs change from 11 to 01 (or 10), the two outputs change sequentially, which makes the latch slow [10]. In the improved NAND latch, the outputs change simultaneously when the latch switches from the hold state to the transparent state. Fig. 5 shows voltage waveforms at key nodes of the proposed signaling scheme. The output of the analog equalizer follows the line voltage with a delay of 1 clock cycle.

Fig. 5. Voltage Waveforms at the key nodes of the Proposed Signaling Scheme (Line Length=10mm and Data Rate=3.22Gbps)

A. Robustness of the Proposed Scheme

The throughput of the proposed signaling scheme is a strong function of the following parameters:
• Low data rate voltage swing on the line
• High-frequency boost provided by the series capacitors
• Feedback from the analog equalizer

The low data rate voltage swing on the line is given by Istatic·RL, where Istatic is the current supplied by the weak drivers and RL is the resistance of the terminating transmission gates. The current sources in the weak drivers are biased by a voltage divider made up of diode-connected P-channel and N-channel MOSFETs. In the extreme case of process variation where both P-channel and N-channel MOSFETs are slow (SS) or fast (FF), the bias voltage does not change and hence Istatic is decreased or increased from its nominal value. On the other hand, the resistance of the terminating transmission gates is also increased/decreased from its nominal value in the SS/FF process corner. As a result, in the SS and FF corners Istatic·RL remains nearly the same as designed in the nominal case. In the skewed corners, Slow NMOS Fast PMOS (SNFP) and Fast NMOS Slow PMOS (FNSP), the bias voltage Vb changes such that Istatic does not change much from its nominal value.

Also, in the skewed corners RL does not deviate much from its designed value. Hence, Istatic·RL remains nearly the same as its nominal value in the skewed corners as well.

The high-frequency boost depends on the output impedance of the last inverters in the inverter chains and the value of the series capacitance. In the SS and FF corners, the decrease/increase in the strength of the inverters is partly compensated by the increase/decrease in the series capacitance. In the skewed corners, the difference in the strengths of the PMOS and NMOS transistors is partly compensated by the differential nature of the proposed scheme. As a result, the eye height at the receiving end remains nearly the same in all process corners.

The feedback gain in the equalizer depends on the differential voltage between Vfbp and Vfbn. This voltage is controlled by the through current Idc and the resistance of the transmission gate in the analog equalizer. In the SS and FF corners, the decrease/increase in Idc is compensated by the increase/decrease in the resistance of the transmission gate. In the skewed corners, the deviation of both Idc and the resistance of the transmission gates from their nominal values is small. The throughput of the signaling scheme is also a function of the RC time constant of the equalizer. However, practical variation in the RC time constant does not degrade the throughput of the proposed signaling scheme significantly.

IV. PERFORMANCE OF PROPOSED SIGNALING SCHEME

The capacitively coupled transmitter based signaling scheme (CC-LVS) presented in [7] and the modified signaling scheme (CC-LVS-Mod) proposed in this paper have been designed in a 90nm CMOS process with Vdd=1V using nominal-Vt devices. The signaling schemes were designed to drive 10mm long optimally twisted differential wires [8]. The wire geometries, and hence the resistance and capacitance of the wires, were chosen to be the same as given in [7] and [11].
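The low data rate swing Istatic·RL from Section III-A also fixes the design points used in this section, so it is worth checking numerically. The sketch below evaluates the product for the weak-driver currents quoted in the text (about 6µA for CC-LVS, 2.8µA for CC-LVS-Mod) with a 16 kΩ termination. The corner function rests on an idealized assumption that is not taken from the paper: the termination resistance scales by exactly the reciprocal of the current scale factor. The function names and the 0.8 and 1.25 scale factors are hypothetical.

```python
# Low data rate differential swing: V = Istatic * RL.
# Units: microamps x kilo-ohms = millivolts.


def low_rate_swing_mv(i_static_ua, r_term_kohm):
    """Swing developed across the termination at low data rates, in mV."""
    return i_static_ua * r_term_kohm


def swing_in_corner_mv(i_static_ua, r_term_kohm, current_scale):
    """Swing in an SS/FF-like corner.

    Idealized assumption (not from the paper): the current changes by
    `current_scale` while the termination resistance changes by its
    reciprocal, so the product stays constant.
    """
    i_corner = i_static_ua * current_scale
    r_corner = r_term_kohm / current_scale
    return low_rate_swing_mv(i_corner, r_corner)


if __name__ == "__main__":
    print("CC-LVS swing (mV):    ", low_rate_swing_mv(6.0, 16.0))
    print("CC-LVS-Mod swing (mV):", low_rate_swing_mv(2.8, 16.0))
    for scale in (0.8, 1.25):  # hypothetical slow and fast corners
        print(f"scale {scale}: {swing_in_corner_mv(2.8, 16.0, scale):.2f} mV")
```

With these inputs the CC-LVS swing comes out at 96mV, matching the design value, and the CC-LVS-Mod swing at about 44.8mV, close to the 46mV design target. Real corners only compensate approximately, which is why the text says the swing remains "nearly" the same.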
Both signaling schemes were designed for a worst-case differential eye height of 16mV at a data rate of 1.25Gbps without an equalizer in the receiver. CC-LVS is designed for a low data rate line voltage swing of 96mV and series capacitors of 311fF, as suggested in [7]. The effective capacitance offered by the N-channel MOSFETs was found from a foundry-specified document on Cgate-Vgate of MOSFETs. CC-LVS-Mod is designed for a low data rate line voltage swing of 46mV and hence requires series capacitors of only 150fF. The termination resistance in both schemes is chosen to be 16kΩ. Both signaling schemes were designed such that their transmitters offer an input capacitance equivalent to one minimum-sized inverter and the receivers drive a load capacitance of 4fF. The analog equalizers in the receivers of the CC-LVS and CC-LVS-Mod schemes have been designed to maximize the achievable data rates.

A. Performance in Nominal Case

The capacitively coupled transmitter based signaling schemes, CC-LVS and CC-LVS-Mod, have two separate paths for the high-frequency and low-frequency components of the input. As a result, the input bit patterns 0001000 and 1110111 cause very little eye height at the receiving end. The performance of the two signaling schemes is analyzed for these worst-case bit sequences and for a random bit sequence with an activity factor of 0.5 (p=0.5). The worst-case bit sequences (0001000 and 1110111) are less likely to occur in a bit sequence generated by a Markov source with p=0.5, and hence they were generated separately using a piecewise linear source. Fig. 6 shows the eye diagram of the differential voltage at the receiving end of the line in CC-LVS and CC-LVS-Mod for a data rate of 1.25Gbps. The minimum eye height in both CC-LVS and CC-LVS-Mod is around 16.5mV (Fig. 6.a and Fig. 6.d). In CC-LVS, the eye opening (both eye height and eye width) for the worst-case bit pattern is much smaller (eye height=16.67mV) than for random bits with p=0.5 (eye height=29.62mV). This is indicative of a suboptimal design of CC-LVS. CC-LVS-Mod is designed such that the eye height for the worst-case bit sequences and the eye height for random bits with p=0.5 are nearly equal (two equally open eyes in Fig. 6.c). CC-LVS-Mod requires series capacitors of only 150fF for this design. As a result, 1000 random bits with a 50% activity rate result in an eye height of 17.55mV in CC-LVS-Mod.

Fig. 6. Eye-diagram of the differential voltage at the receiving end of a 10mm line at 1.25Gbps

Table II shows the maximum achievable data rate (throughput) and energy/bit of the two signaling schemes without the equalizer in the receiver. Throughput is found by sending 1000 random bits with an activity factor of 0.5 as well as the above-mentioned worst-case bit sequences. Table II shows that the proposed signaling scheme CC-LVS-Mod can achieve a 14% higher data rate than CC-LVS. This is primarily due to the better speed and resolution of SAFF-Con as compared to SAFF-DT. In [7] measurement results are reported, whereas in this paper simulation results are reported. This leads to a minor difference between the results of CC-LVS reported in [7] and designed by us. Table II also reports the energy consumed by the two schemes for a random bit sequence with a 50% activity factor. It indicates that the total energy/bit consumption of CC-LVS-Mod is around 55% less than that of CC-LVS. In the proposed signaling scheme, the modified transmitter design and SAFF-Con each consume around 50% less energy than the transmitter and receiver circuits of CC-LVS.

TABLE II
                                  Reported in [7]   Our Work   Our Work
                                  CC-LVS            CC-LVS     CC-LVS-Mod
Max. Data Rate (Gbps)             1.35              1.37       1.56
Energy/bit @1.35Gbps
  Total Avg. (fJ/bit)             260               226        97.0
  Transmitter and Wire (fJ/bit)   160               160        66.4
  Receiver (fJ/bit)               100               66         30.6

Table III shows the performance of the two signaling schemes with the analog equalizer. It shows that the throughput of both signaling schemes can be enhanced by more than 70% by an analog equalizer at the receiving end. It also shows that the additional energy consumption in the receiver due to the analog equalizer is very small.

TABLE III
PERFORMANCE OF CC-LVS AND CC-LVS-MOD WITH ANALOG EQUALIZER FOR NOMINAL PROCESS PARAMETERS
                                  Reported in [7]   Our Work   Our Work
                                  CC-LVS            CC-LVS     CC-LVS-Mod
Max. Data Rate (Gbps)             2                 2.94       3.22
Energy/bit @2Gbps
  Total (fJ/bit)                  280               240        107
  Transmitter and Wire (fJ/bit)   160               160        67
  Receiver (fJ/bit)               120               80         40

TABLE IV (row labels missing in source; rows in source order)
Max. Data Rate (Gbps)
  CC-LVS   CC-LVS-Mod
  1.37     1.56
  1.37     1.58
  1.23     1.56
  1.11     1.53
  1.38     1.53
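To build intuition for why isolated-bit patterns such as 0001000 are hard to detect on a low-pass line, and how subtracting the residue of previous bits helps, the toy model below may be useful. It is only a sketch under strong assumptions and is not the authors' circuit: the wire is reduced to a first-order discrete-time low-pass filter sampled once per bit, there is no capacitive pre-emphasis path, the receiver's replica filter matches the line exactly, and the coefficient `a = 0.8` is a hypothetical value chosen for illustration.

```python
# Toy model: first-order low-pass "wire" plus residue-subtracting slicer.
# All names and the coefficient a are illustrative assumptions.


def rc_line(bits, a=0.8):
    """Sample a first-order low-pass line once per bit (levels -1/+1).

    The line is assumed to have settled to the level of the first bit.
    """
    y = 1.0 if bits[0] else -1.0
    samples = []
    for b in bits:
        x = 1.0 if b else -1.0
        y = a * y + (1.0 - a) * x
        samples.append(y)
    return samples


def slice_plain(samples):
    """Decide each bit by the sign of the sample alone."""
    return [1 if v > 0 else 0 for v in samples]


def slice_with_feedback(samples, a=0.8, first_bit=0):
    """Subtract the residue predicted from past decisions, then slice.

    A replica of the line filter is driven by the receiver's own
    decisions, so its state tracks the line one bit behind.
    """
    replica = 1.0 if first_bit else -1.0
    decisions = []
    for v in samples:
        d = 1 if (v - a * replica) > 0 else 0
        decisions.append(d)
        x_hat = 1.0 if d else -1.0
        replica = a * replica + (1.0 - a) * x_hat
    return decisions


if __name__ == "__main__":
    pattern = [0, 0, 0, 1, 0, 0, 0]
    rx = rc_line(pattern)
    print("plain slicer:   ", slice_plain(rx))
    print("with feedback:  ", slice_with_feedback(rx))
```

In this model the lone 1 never crosses the threshold at the plain slicer, so the bit is lost, whereas the feedback slicer recovers the pattern because the replica removes the residue left by the preceding 0s. The real schemes add a high-frequency path through the series capacitors, so the actual eye heights reported in this section cannot be reproduced with such a model.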

B. Effect of Parameter Variations

The throughput of the two signaling schemes without an analog equalizer in four process corners is shown in Table IV. The throughput of CC-LVS-Mod degrades by at most 2%. The throughput of CC-LVS remains as desired in the SS and FF corners but degrades significantly in the skewed process corners. In the FNSP corner, the low data rate voltage swing on the line is larger than in the nominal case. As a result, for the worst-case bit sequences the eye height becomes very small, which degrades throughput. In the SNFP corner, the low data rate voltage swing on the line is reduced, which leads to overshoot of the line voltage. As a result, for random bits the eye height becomes very small, which reduces throughput. In the SNFP and FNSP corners the logic frequency degrades by only 3%, and hence a 10% to 18% reduction in the throughput of CC-LVS can cause interconnects to become the bottleneck in the overall performance of the chip [12].

Table V shows the throughput of the two signaling schemes with the analog equalizer in different process corners. In the analog equalizer of CC-LVS, an ideal current reference, which controls the gain of the feedback, has been used. Even with an ideal current reference circuit, the throughput of CC-LVS degrades by around 10% in the SS, SNFP and FNSP corners. The speed of logic circuits degrades by around 20% in the SS corner [12], and hence a reduction of 10% in the throughput of CC-LVS in the SS corner should be acceptable. However, in the SNFP and FNSP corners interconnects become the bottleneck with the CC-LVS scheme. The throughput of the proposed scheme remains nearly the same in all corners except the SS corner. The throughput degradation in the SS corner is 16%, which is less than the logic frequency reduction in the SS corner.

TABLE V
PERFORMANCE OF CC-LVS AND CC-LVS-MOD WITH ANALOG EQUALIZER IN DIFFERENT PROCESS CORNERS
Max. Data Rate (Gbps)
            CC-LVS   CC-LVS-Mod
Nominal     2.94     3.22
FF          2.94     3.22
SNFP        2.70     3.22
FNSP        2.63     3.22
SS          2.63     2.70
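The bottleneck argument above compares how much the link slows down in a corner with how much the logic slows down in the same corner. The sketch below applies that comparison to the Table V data, using the logic slowdown figures quoted in the text (about 20% in SS and 3% in the skewed corners). The rule "the interconnect is a bottleneck when its throughput degrades more than the logic does" is a simplified reading of the argument, and all identifiers are illustrative.

```python
# Corner-by-corner check of link degradation against logic slowdown.
# Data from Table V (with analog equalizer); names are illustrative.

RATES_GBPS = {
    "CC-LVS": {"Nominal": 2.94, "FF": 2.94, "SNFP": 2.70, "FNSP": 2.63, "SS": 2.63},
    "CC-LVS-Mod": {"Nominal": 3.22, "FF": 3.22, "SNFP": 3.22, "FNSP": 3.22, "SS": 2.70},
}

# Logic slowdown per corner as quoted in the text; FF is not stated, so omitted.
LOGIC_SLOWDOWN_PCT = {"SS": 20.0, "SNFP": 3.0, "FNSP": 3.0}


def degradation_pct(scheme, corner):
    """Throughput loss in a corner relative to the nominal case, in percent."""
    nominal = RATES_GBPS[scheme]["Nominal"]
    return 100.0 * (nominal - RATES_GBPS[scheme][corner]) / nominal


def bottleneck_corners(scheme):
    """Corners where the link degrades more than the logic does."""
    return [
        corner
        for corner, slowdown in LOGIC_SLOWDOWN_PCT.items()
        if degradation_pct(scheme, corner) > slowdown
    ]


if __name__ == "__main__":
    for scheme in RATES_GBPS:
        for corner in LOGIC_SLOWDOWN_PCT:
            print(f"{scheme:11s} {corner:5s} {degradation_pct(scheme, corner):5.1f}%")
        print(f"{scheme} bottleneck corners: {bottleneck_corners(scheme)}")
```

Under this reading, CC-LVS is flagged in the SNFP and FNSP corners while CC-LVS-Mod is not flagged in any corner, which agrees with the discussion above; the SS degradation of CC-LVS-Mod evaluates to about 16%.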

Fig. 7. Effect of Intra-die Variations on Bit Error Rate of CC-LVS-Mod Scheme at 3.12Gbps

The capacitively coupled transmitter based signaling schemes do not employ repeaters even for 10mm long wires. As a result, the transmitters and receivers of the signaling schemes are very likely to be in different parts of the chip. Hence, it is important to analyze the performance of the schemes in the presence of intra-die variations. We performed Monte Carlo simulations of the CC-LVS-Mod scheme for this purpose. In these simulations, transistors in the transmitter were considered to be correlated with each other, as were transistors in the receiver, but no correlation between transmitter and receiver transistors was assumed. Fig. 7 shows a histogram of the number of erroneous bits in 200 Monte Carlo simulations at 3.12Gbps. It shows that in a few runs more than 1 bit out of 1000 bits was found to be erroneous. Monte Carlo simulations at 3.0Gbps show zero erroneous bits for all 200 runs. This shows that the proposed signaling scheme is robust against intra-die variations as well. CC-LVS-Mod was also tested for temperature variations. The scheme can work at data rates of 3.12Gbps even at 90°C. The schemes are expected to be robust against supply noise and crosstalk because of their differential nature.

V. CONCLUSION

In this paper, the capacitively coupled transmitter based signaling scheme proposed in [7] is analyzed and an improved capacitively coupled transmitter based signaling scheme is presented. A novel receiver with an analog equalizer is also proposed. The proposed signaling scheme allows data rates of up to 3.22Gbps over 10mm long wires while consuming only 0.107pJ/bit. Detailed Monte Carlo and process corner simulations indicate that the proposed signaling scheme allows data rates of up to 3Gbps even in the presence of intra-die and inter-die process variations.

VI. ACKNOWLEDGMENTS

The authors would like to thank Tata Consultancy Services (TCS) and the Government of India (SMDP) for the student scholarship and for financing EDA tools. The authors are also thankful to Europractice and Faraday for providing fabrication service and ESD protection circuits, respectively.

REFERENCES
[1] N. Magen, A. Kolodny, U. Weiser, and N. Shamir, "Interconnect-power dissipation in a microprocessor," in Int. Workshop on System-Level Interconnect Prediction, April 2004, pp. 7–13.
[2] J. D. Owens, W. J. Dally, R. Ho, D. N. Jayasimha, S. W. Keckler, and L.-S. Peh, "Research challenges for on-chip interconnection networks," IEEE Micro, 2007.
[3] M. Dave, M. Shojaei, and D. Sharma, "Energy-efficient current-mode signaling scheme for on-chip interconnects," in Proc. of Asian Solid State Conf., 2010.
[4] R. Ho et al., "High speed and low energy capacitively driven on-chip wires," IEEE J. Solid-State Circuits, vol. 43, no. 1, pp. 52–60, Jan. 2008.
[5] B. Kim and V. Stojanovic, "A 4Gb/s/ch 356fJ/b 10mm equalized on-chip interconnect with nonlinear charge-injecting transmit filter and transimpedance receiver in 90nm CMOS," in IEEE ISSCC Dig. Tech. Papers, 2010.
[6] J.-S. Seo, R. Ho, J. Lexau, M. Dayringer, D. Sylvester, and D. Blaauw, "High-bandwidth and low-energy on-chip signaling with adaptive pre-emphasis in 90nm CMOS," in IEEE ISSCC Dig. Tech. Papers, 2010.
[7] E. Mensink, D. Schinkel, E. A. M. Klumperink, E. van Tuijl, and B. Nauta, "Power efficient gigabit communication over capacitively driven RC-limited on-chip interconnects," IEEE J. Solid-State Circuits, 2010.
[8] D. Schinkel, E. Mensink, E. A. M. Klumperink, E. van Tuijl, and B. Nauta, "A 3-Gb/s/ch transceiver for 10-mm uninterrupted RC-limited global on-chip interconnects," IEEE J. Solid-State Circuits, 2006.
[9] D. Schinkel, E. Mensink, E. Klumperink, E. van Tuijl, and B. Nauta, "A double-tail latch-type voltage sense amplifier with 18ps setup+hold time," in IEEE ISSCC Dig. Tech. Papers, 2007.
[10] B. Nikolic, V. G. Oklobdzija, V. Stojanovic, W. Jia, J. K.-S. Chiu, and M. M.-T. Leung, "Improved sense-amplifier-based flip-flop: design and measurements," IEEE J. Solid-State Circuits, pp. 876–884, June 2000.
[11] E. Mensink, "High-speed global on-chip interconnects and transceivers," Ph.D. dissertation, 2007.
[12] M. Dave, M. Shojaei, and D. Sharma, "A process variation tolerant, high-speed and low-power current mode signaling scheme for on-chip interconnects," in Proc. of GLSVLSI, May 2009, pp. 389–392.
