IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: EXPRESS BRIEFS, VOL. 54, NO. 12, DECEMBER 2007
A “Flying-Adder” On-Chip Frequency Generator for
Complex SoC Environment
Liming Xiu, Senior Member, IEEE
Abstract—The spirit of system-on-chip (SoC) approach is to integrate more and more system functions into one single chip. Consequently, the on-chip clock requirement could be very complicated
due to the various functions the chip has to support. To fulfill those
clock needs, it is not uncommon for several phase-locked
loops (PLLs) to be used within one such large chip. Designing these
on-chip PLLs is a very challenging task in terms of cost and performance. To solve this problem for an HDTV SoC of over 50 million
transistors, a “flying-adder” architecture based PLL (FAPLL) is
constructed. This generic FAPLL is instantiated multiple times in
this SoC for different functions, resulting in significant chip cost
reduction.
Index Terms—Flying-adder (FA), frequency synthesis,
phase-locked loop (PLL), voltage-controlled oscillator (VCO).
I. INTRODUCTION
In today’s system-on-chip (SoC) environment, a chip can include many subsystems. As a result, clock requirements can be very complicated. Many frequencies must be generated on-chip to support sophisticated operations. One example is a multimillion-gate HDTV chip which integrates the functions of an MPEG2 decoder, an NTSC video decoder, OSD, a graphics accelerator, and an AC3 audio processor. To fully support these functions and their many frequencies, several phase-locked loops (PLLs) are needed. If care is not taken, the implementation difficulty associated with these PLLs could easily make the chip impractical. In other words, the cost and technical barriers of clock implementation alone could be enough to kill the viability of the chip.
During the design process of this large SoC, integer-N, fractional-N, and all-digital PLL (ADPLL) architectures [1]–[3] were investigated. To generate all the frequencies required, either several cascaded PLLs have to be used (integer-N PLL) or a compensation circuit needs to be incorporated inside the PLL (fractional-N PLL). The drawbacks of these implementations are larger size or greater analog complexity without noticeable performance gain.
In this brief, a “flying-adder” architecture based PLL (FAPLL) design approach [4]–[6] is demonstrated to provide an elegant alternative to this challenge. Compared to conventional PLL-based synthesis techniques, it has many unique features. Its most prominent advantage is its capability of generating many frequencies. This is graphically depicted in Fig. 1. As shown, the “flying-adder” frequency synthesizer can be viewed as a phase divider which provides an additional level of frequency resolution. Moreover, with the aid of a technique called “post divider fractional bit recovery (PDFR),” even more frequencies can be generated. Furthermore, if a fractional number is allowed in the frequency control word, any requested frequency can be produced. In this SoC, this distinguishing feature of ample frequencies is used to solve many difficult problems. The five new contributions of this brief are: 1) PDFR for generating more frequencies; 2) PDFR for reducing the number of voltage-controlled oscillator (VCO) stages; 3) building multiple synthesizers on one VCO; 4) using the fine resolution of this architecture to achieve the VCXO function; 5) using fine resolution for frame rate synchronization.
Fig. 1. “Flying-adder” architecture: more frequencies.
In this brief, Section II describes the working principle and structure of the FAPLL. Section III discusses the new contributions. Section IV reports the measurement results. Section V is the conclusion.
Manuscript received June 19, 2007. This brief was recommended by Associate Editor S. Pavan.
The author is with Texas Instruments Incorporated, Dallas, TX 75243 USA (e-mail: limingxiu@ti.com).
Digital Object Identifier 10.1109/TCSII.2007.906943
1549-7747/$25.00 © 2007 IEEE
Authorized licensed use limited to: TEXAS INSTRUMENTS VIRTUAL LIBRARY. Downloaded on April 9, 2009 at 23:19 from IEEE Xplore. Restrictions apply.
II. FAPLL ARCHITECTURE
A. Working Principle of FAPLL
As depicted in Fig. 2, the FAPLL is based on a conventional PLL plus a “flying-adder” synthesizer. The frequency transfer function of the synthesizer can be expressed as [6]
$T = FREQ \cdot \Delta$ (1)
where $T$ is the desired period (frequency), $FREQ$ is the frequency control word, and $\Delta$ is the time difference between any two adjacent VCO output phases. When the VCO is locked to the input, the following relationships will be established:
[...] (2)
[...] (3)
[...] (4)
Without the “flying-adder” synthesizer, the PLL’s output and input relationship is shown as
[...] (5)
Comparing (4) and (5), the FAPLL’s advantage can be appreciated immediately. Since two more variables have been introduced into the equation, the solution space is greatly enriched. Furthermore, if a fractional number is allowed in $FREQ$, this FAPLL can achieve virtually any frequency desired.
Fig. 2. FAPLL.
B. VCO Structure and the Loop
The VCO structure chosen for this FAPLL is shown in Fig. 3. As shown, the basic delay stage inside the VCO is two cross-coupled NAND gates. Four of those stages form the VCO, with eight outputs P1-8. The supply of the NANDs comes from the loop control voltage so that the VCO oscillation frequency can be adjusted with it. The VCO is optimized around 700 MHz to 1.4 GHz for small silicon area. Also, the number of delay stages is minimized to four so that area can be further reduced and layout matching among all the stages can be better achieved. Although the VCO is designed around 0.7 to 1.4 GHz, the synthesizer can boost its output well above 2 GHz. Unlike a conventional PLL, where the highest output frequency is limited by the VCO’s oscillating capability, the FAPLL can generate a much higher frequency than that of the VCO since it is a “phase divider.” Its highest achievable frequency is limited only by the speed of digital logic.
Fig. 3. PLL/VCO structure.
Fig. 4. PDFR.
The loop bandwidth is designed around 2 MHz due to the input of a 27-MHz crystal. The PLL loop is designed in such a way that the VCO will lock to the input reference in less than 10 μs after power-up. Also, the loop is targeted primarily at minimizing the impact of the noise from the input side. The VCO is intentionally constructed to operate over a narrow range, making it less sensitive to the control voltage so that better overall jitter performance can be achieved.
C. “Flying-Adder” Cycle-Prolong and Its Impact
When a fractional number is used, the output clock signal will have a prolonged cycle from time to time. Assume $FREQ = I + r$, where $I$ is an integer and $r$ is a fraction; then the period (frequency) of the output signal is $T = (I + r) \cdot \Delta$. Structurally, the output waveform is composed of two types of cycles: $T_A = I \cdot \Delta$ and $T_B = (I + 1) \cdot \Delta$. Whenever cycles of $T_B$ occur, there will be a cycle-to-cycle jitter of $\Delta$. However, this introduced “jitter” is deterministic in nature and is in the safe direction for digital operation, because it makes the setup constraint easier to meet. For hold-check, the cycle-prolong is irrelevant since hold-check uses only one clock edge. Therefore, the “jitter” associated with the fraction has no impact on digital operation. This is the crucial difference between “flying-adder” cycle-prolong and conventional jitter.
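The cycle-prolong behavior can be illustrated with a small behavioral model. The sketch below is not the circuit of Fig. 2; it only assumes that the fractional part of the control word is accumulated every cycle and that each carry out of this accumulator lengthens one cycle by one phase step. The function name `flying_adder_cycles` and the example control word 8.25 are hypothetical, and cycle lengths are expressed in units of the phase step.

```python
from fractions import Fraction


def flying_adder_cycles(freq, n_cycles):
    """Return the length of each output cycle, in units of the phase step.

    Behavioral model only: the fraction r of freq = I + r is accumulated
    every cycle, and each carry produces one prolonged cycle of I + 1.
    """
    freq = Fraction(freq)
    integer = freq.numerator // freq.denominator   # I
    frac = freq - integer                          # r
    acc = Fraction(0)
    cycles = []
    for _ in range(n_cycles):
        acc += frac
        if acc >= 1:                      # carry from the fractional accumulator
            acc -= 1
            cycles.append(integer + 1)    # prolonged cycle
        else:
            cycles.append(integer)        # regular cycle
    return cycles


# Hypothetical control word 8.25 (I = 8, r = 0.25)
cycles = flying_adder_cycles(Fraction(33, 4), 8)
average = Fraction(sum(cycles), len(cycles))
print(cycles)    # two cycle types only: 8 and 9 phase steps
print(average)   # long-term average period equals the control word
```

In this model the output contains only two cycle lengths, and the average over a full accumulator period equals the control word, which is why the prolonged cycles change the timing only in the direction that relaxes setup.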
III. NEW CONTRIBUTIONS AND THEIR APPLICATIONS
A. PDFR
As addressed above, when the frequency control word $FREQ$ contains a fractional part, the synthesizer’s output will bear cycle-prolong due to the periodic carry-in from the accumulator. In certain cases, the prolonged cycles can be recovered by the post divider $M$. The working mechanism can be demonstrated through the example of Fig. 4.
In Fig. 4, the top waveform shows the synthesizer output signal for a control word whose fractional part is 0.25 ($FREQ = I + 0.25$, with $I$ an integer). Due to the fraction 0.25, the signal has one long cycle of $(I + 1)\Delta$ and three short cycles of $I\Delta$ in every four cycles, where $\Delta$ is the time difference between adjacent VCO phases. Thus, it contains cycle-to-cycle “jitter” of $\Delta$. However, if the post divider $M$ is set at 4, the divided output signal will have a fixed period of $(4I + 1)\Delta$ for all its cycles, free of such jitter. This technique is called PDFR.
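The recovery can be checked numerically. The following sketch reuses a simple accumulator model of the synthesizer (an assumption, not the actual circuit) and models the post divider as summing consecutive input cycles. The control word 8.25 and the helper names are hypothetical.

```python
from fractions import Fraction


def synthesizer_cycles(freq, n_cycles):
    """Cycle lengths, in phase steps, for control word freq = I + r."""
    freq = Fraction(freq)
    integer = freq.numerator // freq.denominator
    frac = freq - integer
    acc = Fraction(0)
    cycles = []
    for _ in range(n_cycles):
        acc += frac
        if acc >= 1:                     # carry: one prolonged cycle
            acc -= 1
            cycles.append(integer + 1)
        else:
            cycles.append(integer)
    return cycles


def post_divide(cycles, m):
    """Period of each divide-by-m output cycle: the sum of m input cycles."""
    return [sum(cycles[i:i + m]) for i in range(0, len(cycles) - m + 1, m)]


raw = synthesizer_cycles(Fraction(33, 4), 16)   # hypothetical FREQ = 8.25
divided_by_4 = post_divide(raw, 4)              # 4 matches the fraction 1/4
divided_by_3 = post_divide(raw, 3)              # 3 does not
print(raw)
print(divided_by_4)
print(divided_by_3)
```

With a divide-by-4 every output cycle contains exactly one prolonged input cycle, so all output periods are equal; with a divide-by-3 the prolonged cycles land unevenly and the irregularity survives.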
In general, if $FREQ$ has a fractional part $r$ and $1/r$ is one of the factors of the post divider $M$, the irregular cycle caused by $r$ can be recovered by the post divider $M$. For any given $M$, we can first find all its factors. Then, the inverses of all these factors and their one’s complements (one minus the inverse) can safely be used in the fractional part of $FREQ$ without negative impact. For example, for $M = 16$, the factors are 2, 4, 8, and 16. Thus, the fractions 0.5, 0.25, 0.75, 0.125, 0.875, 0.0625, and 0.9375 can all be used. This feature can be utilized to produce more frequency points. Fig. 5 shows the number of fractions that are recoverable for $M$ between 2 and 1000. For $M = 2$, there is only one recoverable fraction of 0.5 since $1/M$ and $1 - 1/M$ collapse. For all the prime numbers, there are only two recoverable fractions ($1/M$ and its one’s complement).
Fig. 5. Number of recoverable fractions for each M.
Fig. 6. Multiple independent clocks from one VCO.
TABLE I
COMPARISON BETWEEN 32 OUTPUTS AND 8 OUTPUTS
Fig. 7. Using FAPLL to achieve VCXO function.
From (4), it is derivable that, with $FREQ = I + r$ ($I$ an integer and $r$ a fraction),
$f = \frac{C}{FREQ} = \frac{C}{I + r}$
where $C$ is a constant when all its factors are fixed. Therefore, the new frequencies added by the recoverable fractions lie between $C/(I + 1)$ and $C/I$. Furthermore, these frequencies are distributed almost linearly between $C/(I + 1)$ and $C/I$, proportional to the magnitude of $r$. This can be shown as follows:
$f = \frac{C}{I + r} = \frac{C}{I} \cdot \frac{1}{1 + r/I} \approx \frac{C}{I}\left(1 - \frac{r}{I}\right), \quad r/I \ll 1$
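The factor rule for recoverable fractions and the near-linear placement of the resulting frequencies can both be checked with a short script. This is a sketch under stated assumptions: the output frequency is taken to be inversely proportional to the control word with all other settings fixed (normalized here to 1 divided by the control word), and the function names and the integer part 8 are hypothetical.

```python
from fractions import Fraction


def recoverable_fractions(m):
    """Fractions 1/f and 1 - 1/f for every factor f > 1 of post divider m."""
    fractions = set()
    for factor in range(2, m + 1):
        if m % factor == 0:
            fractions.add(Fraction(1, factor))
            fractions.add(1 - Fraction(1, factor))
    return sorted(fractions)


def linearity_error(integer, fractions):
    """Largest relative gap between 1/(I + r) and the straight line
    drawn from 1/I (r = 0) to 1/(I + 1) (r = 1)."""
    f_hi = Fraction(1, integer)
    f_lo = Fraction(1, integer + 1)
    worst = Fraction(0)
    for r in fractions:
        exact = 1 / (integer + r)
        linear = f_hi - r * (f_hi - f_lo)
        worst = max(worst, abs(exact - linear) / exact)
    return float(worst)


fractions_16 = recoverable_fractions(16)
print([float(r) for r in fractions_16])   # the seven fractions listed for M = 16
print(len(recoverable_fractions(2)))      # 1/M and 1 - 1/M collapse for M = 2
print(linearity_error(8, fractions_16))   # well below 1% for I = 8
```

Counting `len(recoverable_fractions(m))` for each `m` is one way to reproduce the kind of data plotted in Fig. 5, under the same factor rule.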
B. Using PDFR to Reduce VCO Stages
“Flying-adder” architecture is based on multiple VCO outputs, or stages. The more stages there are, the more frequencies can be generated. However, more stages translate into higher implementation cost. PDFR can be used to reduce the number of VCO stages without sacrificing the resolution (assuming the same stage delay, or $\Delta$). This is demonstrated in Table I.
In this table, the first two columns show the relationship between control word and output period for the case of VCO32 (a VCO with 32 outputs). When control word F32[5:0] varies from 000010b to 111111b, the output period (Period32) changes from $2\Delta$ to $63\Delta$. The remaining columns are for the case of VCO8 (a VCO with 8 outputs). When F8[3:0] varies from 0010b to 1111b, the output period (Period8) shifts from $2\Delta$ to $15\Delta$. However, if we add a divide-by-4 post divider after the synthesizer and borrow two bits from the fractional part (F8_f[1:0]), we can produce periods from $8\Delta$ to $63\Delta$. This technique has been used in constructing this SoC’s video PLL to mimic a VCO32 FAPLL used in previous projects (to accommodate legacy issues).
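The equivalence between the two configurations can be enumerated directly. The sketch below assumes a post divider of 4 (implied by borrowing two fractional bits) and expresses every period in units of the common stage delay; the variable names are illustrative only.

```python
from fractions import Fraction

# Periods are in units of the stage delay, assumed equal for both VCOs.
# VCO32: integer control word F32[5:0] from 000010b (2) to 111111b (63).
vco32_periods = {Fraction(f32) for f32 in range(2, 64)}

POST_DIVIDER = 4   # assumed divide ratio, matching two fractional bits

# VCO8: integer part F8[3:0] from 0010b (2) to 1111b (15),
# plus two borrowed fractional bits F8_f[1:0] (steps of 0.25).
vco8_pdfr_periods = set()
for f8 in range(2, 16):
    for f8_f in range(4):
        control_word = f8 + Fraction(f8_f, 4)
        vco8_pdfr_periods.add(POST_DIVIDER * control_word)

print(min(vco8_pdfr_periods), max(vco8_pdfr_periods))
print(vco8_pdfr_periods <= vco32_periods)
```

Every integer period from 8 to 63 stage delays is reachable with the 8-output VCO, so the step size of the 32-output VCO is preserved over that range; only the shortest periods (2 to 7 stage delays) are not covered in this model.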
C. Building Multiple Synthesizers on One VCO
Unlike conventional PLLs, where each PLL can only produce one independent clock, the FAPLL can support multiple independent clocks from one VCO since several synthesizers can be constructed from the same VCO. This can reduce the PLL count and is extremely helpful in trimming the cost of large SoCs. Fig. 6 shows one example of such an implementation. From one VCO, three independently controllable clocks are generated: clk_usb, which is directly derived from the VCO, and clk_ddr and clk_arm, which are based on two synthesizers that are separately controlled by their own frequency control words.
D. Using Fine Resolution to Achieve VCXO Function
From (4), it is understandable that fine frequency resolution can be achieved at the output if a fraction is used in $FREQ$. Furthermore, it can be proved that when $FREQ$ varies in a small region, the clock frequency follows the change of $FREQ$ linearly. An example is shown in Fig. 7. This feature can be used to replace an external VCXO chip or an on-chip VCXO component in a clock recovery system. It has been utilized in this SoC with significant cost reduction since this feature is “free” with the FAPLL.
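A rough numerical sketch shows how small control-word changes translate into parts-per-million frequency steps. It assumes the eight VCO phases are evenly spaced, so that the synthesizer output frequency is 8 times the VCO frequency divided by the control word. The VCO frequency of 864 MHz, the nominal control word of 8, and the 20-bit fractional width are hypothetical choices for illustration.

```python
K = 8                  # VCO output phases (eight outputs in this design)
F_VCO_HZ = 864e6       # example VCO frequency, assumed
FRACTION_BITS = 20     # hypothetical width of the fractional control word


def output_frequency(control_word):
    """Synthesizer output frequency, assuming evenly spaced VCO phases."""
    return K * F_VCO_HZ / control_word


def ppm_offset(control_word, nominal_word):
    """Frequency offset in ppm relative to the nominal control word."""
    return (output_frequency(control_word) / output_frequency(nominal_word) - 1) * 1e6


nominal = 8.0
lsb = 2.0 ** -FRACTION_BITS

step_ppm = ppm_offset(nominal + lsb, nominal)            # one LSB of the control word
pull_ppm = ppm_offset(nominal * (1 + 1e-4), nominal)     # +100 ppm change in the word
print(step_ppm)
print(pull_ppm)
```

In this model one fractional LSB moves the output by roughly 0.12 ppm, and a 100-ppm change of the control word moves the frequency by about 100 ppm in the opposite direction, with a deviation from linearity of only about 0.01 ppm. That is the behavior a VCXO replacement needs over a typical pull range.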
Fig. 8. Overall on-chip clock structure.
Fig. 9. Jitter measurement at 148.5 MHz.
E. Fine Resolution and Frame Rate Synchronization
The fine resolution of the FAPLL can also be used to solve problems for systems that require nonstandard frequencies, such as the frame rate synchronization problem in HDTV applications. In TV broadcasting, different contents (movies, advertisements, sports, etc.) can be broadcast at different frame rates, such as 60 or 59.94 Hz. However, the pixel clock, which is used to drive the display device, is usually obtained from a PLL which can only produce several commonly used frequencies. For example, in HDTV 720p mode, the pixel clock required for a 60-Hz frame rate is 74.25 MHz and the PLL is designed for this frequency. But for 59.94 Hz, the needed frequency is 74.17575 MHz. To solve this problem, a video frame can be added or deleted once in a while in the video stream to match the rate, but this results in visible artifacts. Alternatively, a dedicated crystal or VCXO can be added just to accommodate the odd frame rates. With its fine resolution, the FAPLL can easily solve this problem by generating 74.17575 MHz precisely.
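The arithmetic behind the two pixel clocks, and the control word change needed to move between them, can be worked through as follows. The sketch assumes evenly spaced VCO phases and a post divider after the synthesizer; the VCO frequency of 1188 MHz, the post divider of 16, and the 20-bit fractional width are hypothetical values chosen so that the 60-Hz case lands on an integer control word.

```python
from fractions import Fraction

K = 8                          # VCO output phases
F_VCO_MHZ = Fraction(1188)     # hypothetical VCO frequency within 0.7 to 1.4 GHz
POST_DIVIDER = 16              # hypothetical post divider
FRACTION_BITS = 20             # hypothetical fractional width of the control word

pixel_clock_60 = Fraction(7425, 100)                       # 74.25 MHz
pixel_clock_5994 = pixel_clock_60 * Fraction(5994, 6000)   # scaled by 59.94/60


def control_word(f_target_mhz):
    """Control word giving f_target = K * f_vco / (word * post divider)."""
    return K * F_VCO_MHZ / (POST_DIVIDER * f_target_mhz)


word_60 = control_word(pixel_clock_60)
word_5994 = control_word(pixel_clock_5994)

# Quantize the fractional word to the available bits and check the residual error.
quantized = Fraction(round(word_5994 * 2 ** FRACTION_BITS), 2 ** FRACTION_BITS)
error_ppm = float((word_5994 / quantized - 1) * 10 ** 6)

print(float(pixel_clock_5994))        # 74.17575
print(word_60, float(word_5994))
print(error_ppm)
```

Under these assumptions the 59.94-Hz clock needs a control word of 8000/999 (about 8.008) instead of 8, and quantizing it to 20 fractional bits leaves an average frequency error below 0.06 ppm.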
F. FAPLLs in This SoC
This large HDTV SoC has several subsystems; each has its
own clock requirement. The key design challenge is to reduce
the number of PLLs without sacrificing the performance. Fig. 8
is the overall on-chip clock structure of this SoC. In this section,
the FAPLL-oriented implementations will be demonstrated.
The system-clock domain presents the simplest design constraint. It only requires 216, 108, 54, 27, and 13.5 MHz. Thus,
the VCO is set to run at 864 MHz and five dividers, along
with phase alignment circuit and glitch-free clock switches,
are used to produce those frequencies simultaneously. For this
clock domain, the challenging issue is to constantly adjust
the PLL to track the clock embedded inside the video stream.
This is elegantly realized by the FAPLL’s VCXO capability of
Section III-D.
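The divider ratios implied by this clock plan follow from simple division. The snippet below only restates the frequencies given above; treating the 27-MHz crystal as the reference of this PLL is an assumption.

```python
from fractions import Fraction

VCO_MHZ = Fraction(864)
CRYSTAL_MHZ = Fraction(27)     # assumed reference for the system PLL
targets_mhz = [Fraction(216), Fraction(108), Fraction(54), Fraction(27), Fraction(27, 2)]

# Each system clock is the VCO frequency divided by a power of two.
divide_ratios = [VCO_MHZ / target for target in targets_mhz]
vco_multiple_of_crystal = VCO_MHZ / CRYSTAL_MHZ

print([int(ratio) for ratio in divide_ratios])
print(int(vco_multiple_of_crystal))
```

All five ratios are powers of two (4 through 64), which is consistent with the use of simple dividers plus phase alignment rather than a synthesizer for this domain.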
In terms of frequencies required, audio-clock domain
presents the toughest challenge. Table II shows all the frequencies needed for this SoC’s audio application. The first column
is the sampling frequency in kilohertz, whereas the first row is
the over-sample rate. The numbers presented in the table are the
audio clock frequencies in megahertz.
Conventionally, to generate all these frequencies, two or three
cascaded PLLs are required. By using FAPLL, this problem
can be solved by just one component. As shown in Fig. 8, the
input reference of this audio PLL is 86.4 MHz, which can be
obtained from the system PLL’s VCO frequency of 864 MHz.
TABLE II
THE AUDIO CLOCK FREQUENCIES
By setting the dividers and the frequency control word appropriately, all the required frequencies can be generated under the constraints of the VCO being in the optimized range of 700 MHz to 1.2 GHz and the PFD frequency being in the range of 17.28 to 28.8 MHz. For the example of 45.1584 MHz, one setting could be: [...] MHz, [...] MHz, [...] MHz, and [...] MHz. During the design, an algorithm has been developed to search for the divider and control word settings based on (4). The PDFR technique introduced in Section III-A is also incorporated in the algorithm. For a requested frequency, the algorithm will likely produce more than one solution. In those cases, it can select the best-fit solution based on user-provided criteria, such as the VCO frequency being in the middle of the optimized range, the PFD frequency being reasonably high, the synthesizer output frequency being as low as possible, the output signal’s duty cycle being as close to 50% as possible, etc. The priority of those criteria determines which solution wins.
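A brute-force version of such a search can be sketched as follows. This is not the author’s algorithm. It assumes a frequency model in which the 86.4-MHz input is divided by `p` to give the PFD frequency, multiplied by `n` to give the VCO frequency, passed through a flying-adder synthesizer on 8 evenly spaced phases, and finally divided by a post divider `m`. The names `p`, `n`, and `m`, the search limits, and the ranking order are all hypothetical.

```python
from fractions import Fraction

K = 8  # VCO output phases (eight outputs in this design)


def is_recoverable(frac, m):
    """True if frac is 0, 1/f, or 1 - 1/f for some factor f of post divider m."""
    if frac == 0:
        return True
    d = frac.denominator
    return m % d == 0 and frac.numerator in (1, d - 1)


def search_settings(f_in, f_target, pfd_range, vco_range,
                    max_p=16, max_n=128, max_m=64, freq_range=(2, 16)):
    """Enumerate (p, n, freq, m) with f_in / p * n * K / (freq * m) == f_target."""
    solutions = []
    for p in range(1, max_p + 1):
        f_pfd = f_in / p
        if not pfd_range[0] <= f_pfd <= pfd_range[1]:
            continue
        for n in range(1, max_n + 1):
            f_vco = f_pfd * n
            if not vco_range[0] <= f_vco <= vco_range[1]:
                continue
            for m in range(1, max_m + 1):
                freq = K * f_vco / (m * f_target)      # exact control word
                if not freq_range[0] <= freq < freq_range[1]:
                    continue
                frac = freq - freq.numerator // freq.denominator
                if is_recoverable(frac, m):            # integer or PDFR-recoverable
                    solutions.append((p, n, freq, m))
    return solutions


def rank_key(solution, f_in, vco_range):
    """Prefer mid-range VCO, then high PFD frequency, then low synthesizer output."""
    p, n, freq, m = solution
    f_pfd = f_in / p
    f_vco = f_pfd * n
    vco_mid = Fraction(vco_range[0] + vco_range[1], 2)
    return (abs(f_vco - vco_mid), -f_pfd, K * f_vco / freq)


F_IN = Fraction(864, 10)                 # 86.4 MHz input reference
F_TARGET = Fraction(451584, 10000)       # 45.1584 MHz
PFD_RANGE = (Fraction(1728, 100), Fraction(288, 10))
VCO_RANGE = (700, 1200)

solutions = search_settings(F_IN, F_TARGET, PFD_RANGE, VCO_RANGE)
best = min(solutions, key=lambda s: rank_key(s, F_IN, VCO_RANGE))
print(len(solutions), best)
```

Under this assumed model, one of the solutions found for 45.1584 MHz is a PFD frequency of 17.28 MHz (`p = 5`), a VCO frequency of 846.72 MHz (`n = 49`), a control word of 10 (677.376 MHz at the synthesizer output), and a post divider of 15. This is only what the sketch produces and is not necessarily the setting used in the chip. Exact rational arithmetic via `Fraction` is used so that a candidate is accepted only when it hits the target frequency exactly.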
The ARM/DDR PLL needs to support three independent clocks: clk_usb, clk_ddr, and clk_arm. From 100 to 250 MHz, several frequencies are required. However, the precise frequency is not important as long as there are enough frequency points available in this range to guarantee system operation. The more frequencies there are in the range, the more flexibility the system can enjoy in operation. The structure of multiple synthesizers on one VCO (Fig. 6) has been used to fulfill this need at minimum cost. Moreover, PDFR has been utilized to increase the available frequency points in this range from 21 to 60.
Fig. 10. Spectrum measurement for three frequencies.
The display PLL is used to drive the display engine, which supports the display modes of 480i, 480p, 720p, 1080i, 1080p, etc. The major frequencies required are: 13.5, 27, 31.5, 36, 33.75, 40, 50, 49.5, 56.25, 74.25, 75, 78.75, 85.5, 94.5, 108, 148.5, 135, 156, 157.5, and 162 MHz. These frequencies can all be generated by the FAPLL without the use of fractional bits in $FREQ$. Additional frequencies, such as 25.175, 35.5, 50, 56.25, 65, 68.25, 75, 79.5, 101, 102.25, 117.5, and 121.75 MHz, are required for some graphic modes. They can be produced with the help of the fractions. The fine resolution of the FAPLL has also been used to solve the frame rate synchronization problem as discussed in Section III-E.
The video PLL is used for an on-chip video decoder that converts an analog NTSC composite signal to a digital component video signal. In this application, the frequencies required are not predetermined but could be any value in real applications. Due to this special requirement, the FAPLL is the only choice; there are hardly any other alternatives. As mentioned in Section III-B, the PDFR technique is used in the video PLL to make the 8-phase FAPLL mimic the legacy 32-phase FAPLL of previous projects.
IV. THE TEST REPORT
The SoC, which contains multiple FAPLLs, has been used in real TV applications with no clock-related problem reported. The size of the FAPLL (Fig. 2) is [...] in a 90-nm CMOS process. Its supplies are 1.8 V for analog circuitry and 1.1 V for digital blocks. The power consumption is around 10 mW. The jitter measured at one HDTV frequency, 148.5 MHz, is 90 ps peak-to-peak, as shown in Fig. 9. The spectrum measurement has been performed for all the audio frequencies listed in Table II. Fig. 10 shows the plots of three frequencies: 6.144 (left), 45.1584 (center), and 65.536 MHz (right). For the 65.536-MHz case, the setting of [...] and [...] is used. In this case, the fraction [...] is used and it is recovered by the post divider. The spectra of all the other frequencies look similar. These measurements show that: 1) there is no mismatch in layout (such mismatch is deterministic and would show up as spurs); 2) the PDFR faithfully recovers the “flying-adder” cycle-prolong, with no spur resulting in the output clock.
V. CONCLUSION
The “flying-adder” frequency synthesis architecture was invented several years ago. Since then, many improvements have been made at the circuit level. In this brief, a FAPLL design approach has been established at the system level, which initiates a new direction in PLL design. This is mainly due to the design-solution-space enrichment and the fine resolution provided by the new approach. The power of this new FAPLL method has been demonstrated through a real example of a large SoC with the achieved goal of “cheaper, better, and faster.”
ACKNOWLEDGMENT
The author thanks G. Cook, D. Dudek, J. Nave, G. Xu, and B. Parthasarthy for their invaluable help.
REFERENCES
[1] J. Hakkinen and J. Kostamovaara, “Speeding up an integer-N PLL by controlling the loop filter charge,” IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 50, no. 7, pp. 343–354, Jul. 2003.
[2] S. E. Meninger and M. H. Perrott, “A 1-MHz bandwidth 3.6-GHz 0.18-μm CMOS fractional-N synthesizer utilizing a hybrid PFD/DAC structure for reduced broadband phase noise,” IEEE J. Solid-State Circuits, vol. 41, no. 4, pp. 966–980, Sep. 2004.
[3] R. B. Staszewski and P. T. Balsara, “Phase-domain all-digital phase-locked loop,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 52, no. 3, pp. 159–163, Mar. 2005.
[4] H. Mair and L. Xiu, “An architecture of high-performance frequency and phase synthesis,” IEEE J. Solid-State Circuits, vol. 35, no. 6, pp. 835–846, Jun. 2000.
[5] L. Xiu and Z. You, “A ‘Flying-Adder’ architecture of frequency and phase synthesis with scalability,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 10, no. 5, pp. 637–649, Oct. 2002.
[6] L. Xiu and Z. You, “A new frequency synthesis method based on ‘Flying-Adder’ architecture,” IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., pp. 130–134, Mar. 2003.