Cordic 1
Introduction
Power series may also be used to calculate these same functions without using
look-up tables; however, these calculations have the disadvantage of being slow
to converge to a desired precision. In effect, look-up table size is traded
for computation time.
All CORDIC algorithms are based on the fact that any number may be represented
by an appropriate alternating series. For example an approximate value for e
may be represented as follows:
It may also be shown that the series for e is also irregular if the expansion
is continued for a few additional terms.
$$z = \log_{10}(x) = \sum_{i=1}^{B} a_i \cdot 2^{-i}$$
In this case the values for a_i are either 0 or 1 and represent bits in the
binary representation of z. The value for z is determined one bit at a time by
looking at the previously calculated value for z, which is correct to i-1 bits.
If this estimate of z is too low, we correct the current estimate by adding a
correction factor, obtained from a look-up table, to the current value of z. If
the current estimate of z is too high, we subtract a correction factor, also
from the look-up table. Depending on whether we add or subtract from the
current value of z, the i-th bit will be set to the correct value of 0 or 1. The
less significant bits from i+1 to B may change during this process because the
estimate for z is only accurate to i bits.
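$$
\begin{aligned}
z &= x \cdot y \\
  &= y \cdot \sum_{i=1}^{B} a_i \cdot 2^{-i} \\
  &= \sum_{i=1}^{B} y \cdot a_i \cdot 2^{-i} \\
  &= \sum_{i=1}^{B} a_i \cdot \left( y \cdot 2^{-i} \right)
\end{aligned}
$$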
multiply(x,y){
  z = 0;
  for (i=1; i<=B; i++){
    if (x > 0){
      x = x - 2^(-i);
      z = z + y*2^(-i);
    }
    else{
      x = x + 2^(-i);
      z = z - y*2^(-i);
    }
  }
  return(z);
}
This calculation assumes that both x and y are fractional ranging from -1 to 1.
The algorithm is valid for other ranges as long as the decimal point is allowed
to float. With a few extensions, this algorithm would work well with floating
point data.
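The multiply loop above can be sketched in C as follows. This is a minimal illustration rather than the author's implementation: it uses `double` arithmetic for readability, and the names `cordic_multiply`, `bits`, `step` and `ystep` are assumptions introduced here. In a fixed-point version the two halvings would be one-bit right shifts.

```c
/* Shift-and-add multiply, z = x * y, assuming -1 < x < 1.
 * `bits` plays the role of B in the text (hypothetical parameter name). */
double cordic_multiply(double x, double y, int bits)
{
    double z = 0.0;
    double step = 0.5;       /* 2^(-i), starting at i = 1 */
    double ystep = y * 0.5;  /* y * 2^(-i) */

    for (int i = 1; i <= bits; i++) {
        if (x > 0.0) {       /* residual too high: remove 2^(-i) from x */
            x -= step;
            z += ystep;
        } else {             /* residual too low: add 2^(-i) back to x */
            x += step;
            z -= ystep;
        }
        step *= 0.5;         /* a right shift in fixed point */
        ystep *= 0.5;
    }
    return z;
}
```

After i passes the residual x is no larger than 2^(-i) in magnitude, so with 16 passes the result differs from x*y by at most |y| * 2^(-16).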
$$x - y \cdot \sum_{i=1}^{B} a_i \cdot 2^{-i} = 0$$

$$x - \sum_{i=1}^{B} a_i \cdot \left( y \cdot 2^{-i} \right) = 0$$
This final form of the equation shows that the quotient z may be estimated one
bit at a time by driving x to zero using right shifted versions of y. If the
current residual is positive, the ith bit in z is set. Likewise if the residual
is negative the ith bit in z is cleared.
divide(x,y){
  z = 0;
  for (i=1; i<=B; i++){
    if (x > 0){
      x = x - y*2^(-i);
      z = z + 2^(-i);
    }
    else{
      x = x + y*2^(-i);
      z = z - 2^(-i);
    }
  }
  return(z);
}
divide_4q(x,y){
  z = 0;
  for (i=1; i<=B; i++){
    if (x > 0){
      if (y > 0){
        x = x - y*2^(-i);
        z = z + 2^(-i);
      }
      else{
        x = x + y*2^(-i);
        z = z - 2^(-i);
      }
    }
    else{
      if (y > 0){
        x = x + y*2^(-i);
        z = z - 2^(-i);
      }
      else{
        x = x - y*2^(-i);
        z = z + 2^(-i);
      }
    }
  }
  return(z);
}
As with all division algorithms, the case where y is zero should be trapped as
an exception. Once again, a few extensions would allow this algorithm to work
well with floating point data.
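A possible C rendering of the four-quadrant divide is sketched below. It assumes `double` arithmetic and |x| < |y| so that the quotient is fractional; the name `cordic_divide` and the use of `assert` to trap a zero divisor are choices made for this illustration only.

```c
#include <assert.h>

/* Four-quadrant shift-and-subtract divide, z = x / y, assuming |x| < |y|.
 * `bits` plays the role of B in the text (hypothetical parameter name). */
double cordic_divide(double x, double y, int bits)
{
    assert(y != 0.0);         /* trap division by zero, as the text advises */

    double z = 0.0;
    double step = 0.5;        /* 2^(-i), starting at i = 1 */
    double ystep = y * 0.5;   /* y * 2^(-i) */

    for (int i = 1; i <= bits; i++) {
        /* Residual and divisor share a sign: the correction is positive. */
        if ((x > 0.0) == (y > 0.0)) {
            x -= ystep;
            z += step;
        } else {
            x += ystep;
            z -= step;
        }
        step *= 0.5;
        ystep *= 0.5;
    }
    return z;
}
```

The single sign comparison collapses the four branches of divide_4q into two, since the branches for matching signs perform the same update, as do the branches for differing signs.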
Algorithms for Log10(x) and 10^x
$$\log_{10}\left( x \cdot \prod_{i=1}^{B} b_i \right) = \log_{10}(x) + \sum_{i=1}^{B} \log_{10}(b_i)$$

If the b_i are chosen such that x*b_1*b_2*b_3...*b_B = 1, we see that the left
hand side reduces to Log10(1), which is 0. With these choices for b_i, we are
left with the following equation for Log10(x):

$$\log_{10}(x) = -\sum_{i=1}^{B} \log_{10}(b_i)$$
Since quantities for Log10(b_i) may be stored in a look-up table, the base 10
logarithm of x may be calculated by summing selected entries from the table.
The trick now is to choose the correct b_i such that we drive the product of x
and all of the b_i to 1. This may be accomplished by examining the current
product. If the current product is less than 1, we choose coefficient b_i such
that b_i is greater than 1. On the other hand, if the current product is greater
than 1, the coefficient should be chosen such that its value is less than 1.
An additional constraint is that the b_i should be chosen such that
multiplication by any of the b_i is accomplished by a shift and add operation.
Two coefficients which have the desired properties are:

$$b_i = 1 + 2^{-i}$$

and

$$b_i = 1 - 2^{-i}$$
In choosing these values for the b_i, it is seen that the limit as i approaches
infinity of the product of x and the b_i's will be 1 as long as x is in the
range:

$$\left( \prod_{i=1}^{B} (1 + 2^{-i}) \right)^{-1} < x < \left( \prod_{i=1}^{B} (1 - 2^{-i}) \right)^{-1}$$
This represents the range of convergence for this algorithm which may be
calculated as approximately:
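The two bounds can be evaluated numerically straight from the products in the range expression. The sketch below does so in C; the function names and the `bits` parameter are assumptions made for illustration.

```c
/* Lower bound: 1 / prod_{i=1}^{bits} (1 + 2^(-i)). */
double log10_range_lower(int bits)
{
    double product = 1.0;
    double step = 0.5;            /* 2^(-i), starting at i = 1 */
    for (int i = 1; i <= bits; i++) {
        product *= 1.0 + step;
        step *= 0.5;
    }
    return 1.0 / product;
}

/* Upper bound: 1 / prod_{i=1}^{bits} (1 - 2^(-i)). */
double log10_range_upper(int bits)
{
    double product = 1.0;
    double step = 0.5;
    for (int i = 1; i <= bits; i++) {
        product *= 1.0 - step;
        step *= 0.5;
    }
    return 1.0 / product;
}
```

With 12 iterations these evaluate to roughly 0.42 and 3.46 respectively.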
log10(x){
  z = 0;
  for (i=1; i<=B; i++){
    if (x > 1){
      x = x - x*2^(-i);
      z = z - log10(1-2^(-i));
    }
    else{
      x = x + x*2^(-i);
      z = z - log10(1+2^(-i));
    }
  }
  return(z);
}
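As an illustration of the table-driven form, the sketch below runs the log10 loop in C using the magnitudes from the log10_rom listing that appears later in this document (4.12 format, 12 entries). Keeping x and z in `double` and dividing each table entry by 4096 are simplifications assumed here; the assembly listing works in 16 bit fixed point throughout. The name `cordic_log10` is hypothetical.

```c
/* Magnitudes copied from the log10_rom listing (4.12 format), i = 1..12:
 * column 0 is -log10(1 - 2^(-i)), column 1 is log10(1 + 2^(-i)). */
static const unsigned short log10_rom[12][2] = {
    {0x04d1, 0x02d1}, {0x0200, 0x018d}, {0x00ee, 0x00d2}, {0x0073, 0x006c},
    {0x0038, 0x0037}, {0x001c, 0x001c}, {0x000e, 0x000e}, {0x0007, 0x0007},
    {0x0004, 0x0004}, {0x0002, 0x0002}, {0x0001, 0x0001}, {0x0000, 0x0000}
};

double cordic_log10(double x)
{
    double z = 0.0;
    double scale = 0.5;                          /* 2^(-i) */

    for (int i = 1; i <= 12; i++) {
        if (x > 1.0) {
            x -= x * scale;                      /* x = x * (1 - 2^(-i)) */
            z += log10_rom[i - 1][0] / 4096.0;   /* z = z - log10(1 - 2^(-i)) */
        } else {
            x += x * scale;                      /* x = x * (1 + 2^(-i)) */
            z -= log10_rom[i - 1][1] / 4096.0;   /* z = z - log10(1 + 2^(-i)) */
        }
        scale *= 0.5;
    }
    return z;
}
```

Because the table is quantized to 12 fractional bits, results are only good to about three decimal places. One caveat worth checking before relying on this loop: for inputs just above 1 (roughly 1 < x < 1.26) the first step halves x, and the remaining growth factors multiply to only about 1.59, so the product cannot be driven back to 1.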
As the exponent is driven to zero, z is seen to approach the product of all the
successive coefficients, $\prod_{i=1}^{B} b_i$. The final algorithm becomes:
10_to_power(x){
  z = 1;
  for (i=1; i<=B; i++){
    if (x > 0){
      x = x - log10(1+2^(-i));
      z = z + z*2^(-i);
    }
    else{
      x = x - log10(1-2^(-i));
      z = z - z*2^(-i);
    }
  }
  return(z);
}
The range of convergence for this algorithm is determined by the range for
which x can be driven to zero. By inspection of the algorithm this is
determined to be:
$$\sum_{i=1}^{B} \log_{10}(1 - 2^{-i}) < x < \sum_{i=1}^{B} \log_{10}(1 + 2^{-i})$$
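The exponential loop can be sketched in C in the same table-driven style. As an assumption for this illustration, it reuses the magnitudes from the log10_rom listing later in this document (4.12 format) and keeps x and z in `double`; the name `cordic_pow10` is hypothetical.

```c
/* Magnitudes copied from the log10_rom listing (4.12 format), i = 1..12:
 * column 0 is -log10(1 - 2^(-i)), column 1 is log10(1 + 2^(-i)). */
static const unsigned short log10_rom[12][2] = {
    {0x04d1, 0x02d1}, {0x0200, 0x018d}, {0x00ee, 0x00d2}, {0x0073, 0x006c},
    {0x0038, 0x0037}, {0x001c, 0x001c}, {0x000e, 0x000e}, {0x0007, 0x0007},
    {0x0004, 0x0004}, {0x0002, 0x0002}, {0x0001, 0x0001}, {0x0000, 0x0000}
};

double cordic_pow10(double x)
{
    double z = 1.0;
    double scale = 0.5;                          /* 2^(-i) */

    for (int i = 1; i <= 12; i++) {
        if (x > 0.0) {
            x -= log10_rom[i - 1][1] / 4096.0;   /* x = x - log10(1 + 2^(-i)) */
            z += z * scale;                      /* z = z * (1 + 2^(-i)) */
        } else {
            x += log10_rom[i - 1][0] / 4096.0;   /* x = x - log10(1 - 2^(-i)) */
            z -= z * scale;                      /* z = z * (1 - 2^(-i)) */
        }
        scale *= 0.5;
    }
    return z;
}
```

For example, an input of 0.25 yields a value close to 1.778. A caveat to verify for any particular use: for inputs at or slightly below zero (roughly -0.1 < x <= 0) the first step adds about 0.301 to x, while the remaining subtractive steps total only about 0.201, so the exponent cannot be driven back to zero.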
The matrix

$$R(a) = \begin{bmatrix} \cos a & -\sin a \\ \sin a & \cos a \end{bmatrix}$$

will rotate a vector, $\begin{bmatrix} x_0 \\ y_0 \end{bmatrix}$,
counter-clockwise by a radians in two dimensional space. If this rotation
matrix is applied to the initial vector $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$,
the result will be a vector with co-ordinates of
$\begin{bmatrix} \cos a \\ \sin a \end{bmatrix}$. It is easily seen that the
CORDIC method could be applied to calculate the functions Sin(x) and Cos(x) by
applying successive rotations to the initial vector
$\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and gradually driving the angle a to
zero.
A problem arises when an attempt is made to set up the rotation matrix such
that all rotations are accomplished by right shifts. Notice that if a_i is
chosen such that cos(a_i) = 2^(-i), then sin(a_i) is not necessarily a power of
2. It is not possible to choose the successive angle rotations, a_i, such that
both cos(a_i) and sin(a_i) amount to right shifts.

$$R(a_i) = \cos(a_i) \cdot \begin{bmatrix} 1 & -\tan(a_i) \\ \tan(a_i) & 1 \end{bmatrix}$$

Now the rotation angles a_i may be chosen such that tan(a_i) = 2^(-i), or
rather a_i = tan^-1(2^(-i)). The result is the final incremental rotation
matrix:

$$R(a_i) = \cos(a_i) \cdot \begin{bmatrix} 1 & -2^{-i} \\ 2^{-i} & 1 \end{bmatrix}$$

Where:

$$a_i = \tan^{-1}(2^{-i})$$
With these choices for the a_i, rotation is accomplished using only right
shifts. If the cos(a_i) term is neglected in order to avoid the multiplication
operations, the length of the initial vector is increased each time it is
rotated by using right shifts only. This increase may be compensated for by
decreasing the length of the vector prior to rotation. Since the algorithm will
use B successive rotations, all rotations may be compensated for initially
using one collective length correction factor, C. The value of C is found by
grouping all of the cos(a_i) terms together as follows:
$$C = \left( \prod_{i=0}^{B} \cos\left( \tan^{-1}(2^{-i}) \right) \right)^{-1}$$
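sin(z){
  x = 0.6073;   /* 1/C, where C = 1.6468 */
  y = 0;
  for (i=0; i<=B; i++){
    xs = x*2^(-i);   /* shifted copies taken before x and y are updated */
    ys = y*2^(-i);
    if (z > 0){
      x = x - ys;
      y = y + xs;
      z = z - arctan(2^(-i));
    }
    else{
      x = x + ys;
      y = y - xs;
      z = z + arctan(2^(-i));
    }
  }
  return(y);
}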
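cos(z){
  x = 0.6073;   /* 1/C, where C = 1.6468 */
  y = 0;
  for (i=0; i<=B; i++){
    xs = x*2^(-i);   /* shifted copies taken before x and y are updated */
    ys = y*2^(-i);
    if (z > 0){
      x = x - ys;
      y = y + xs;
      z = z - arctan(2^(-i));
    }
    else{
      x = x + ys;
      y = y - xs;
      z = z + arctan(2^(-i));
    }
  }
  return(x);
}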
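A C sketch of the rotation loop follows. It is an illustration under stated assumptions: the angle table is copied from the 8.24 atantab listing later in this document and converted to radians by dividing by 2^24, the starting vector is shortened to 1/1.6468 so that the result has unit length, and `double` arithmetic stands in for fixed point. The names `cordic_rotate`, `cordic_sin` and `cordic_cos` are hypothetical.

```c
/* arctan(2^(-i)) for i = 0..24, copied from the atantab listing (8.24 format). */
static const long atantab[25] = {
    13176795, 7778716, 4110059, 2086331, 1047214, 524117, 262123, 131069,
    65536, 32768, 16384, 8192, 4096, 2048, 1024, 512, 256, 128, 64, 32,
    16, 8, 4, 2, 1
};

/* Rotate the pre-scaled vector (1/C, 0) through z radians. */
static void cordic_rotate(double z, double *x_out, double *y_out)
{
    double x = 1.0 / 1.6468;       /* 1/C, compensates for the rotation gain */
    double y = 0.0;
    double scale = 1.0;            /* 2^(-i), starting at i = 0 */

    for (int i = 0; i < 25; i++) {
        double xs = x * scale;     /* saved so both updates use the old x, y */
        double ys = y * scale;
        double angle = atantab[i] / 16777216.0;   /* 8.24 to radians */
        if (z > 0.0) {
            x -= ys;  y += xs;  z -= angle;
        } else {
            x += ys;  y -= xs;  z += angle;
        }
        scale *= 0.5;
    }
    *x_out = x;
    *y_out = y;
}

double cordic_sin(double z) { double x, y; cordic_rotate(z, &x, &y); return y; }
double cordic_cos(double z) { double x, y; cordic_rotate(z, &x, &y); return x; }
```

The temporaries `xs` and `ys` matter: if x were overwritten first, the update of y would use the new x rather than the value from the previous iteration, and the step would no longer be a rotation.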
It may be determined that the previous two algorithms will converge as long as:

$$-\sum_{i=0}^{B} \tan^{-1}(2^{-i}) < z < \sum_{i=0}^{B} \tan^{-1}(2^{-i})$$

or approximately -1.7433 < z < 1.7433 radians.
Since the region of convergence includes both the first and fourth quadrants,
the algorithms will converge for any z such that -π/2 < z < π/2.
The previously discussed algorithms show that CORDIC based computation methods
require minimal hardware features to implement. These are:
Once the processor and shift register style is chosen, the next choice to be
made involves the data format. Since standard C does not provide a fixed-point
data type, the designer has a lot of freedom in choosing the format of the
data. It is a good idea, however, to choose a format that fits into 16 or 32
bit words. Even though most CORDIC routines are written in assembly language
for speed, 16 or 32 bit words allow data to be passed as either 'int' or 'long
int' data types within higher level C subroutines. The format used in the
following examples is a 16 bit format with 4 bits to the left of the decimal
point and 12 fractional bits to the right, which is often referred to as 4.12
format. The constants used are found by multiplying by 2^12 (4096), rounding,
and converting to hexadecimal. Take the constant e for example:
All of the data tables necessary for CORDIC computing may be built up this way
using a calculator.
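The conversion to 4.12 format can also be scripted rather than done by hand. The C sketch below follows the procedure just described (multiply by 4096, round); the function names `to_q4_12` and `from_q4_12` are assumptions for this illustration, and the input is assumed to lie in the representable range of -8 up to just under 8.

```c
#include <stdint.h>

/* Convert a real value to 4.12 fixed point: scale by 2^12, round to nearest. */
int16_t to_q4_12(double v)
{
    double scaled = v * 4096.0;
    return (int16_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}

/* Convert a 4.12 fixed-point value back to a real value. */
double from_q4_12(int16_t q)
{
    return q / 4096.0;
}
```

For e = 2.718281828, the product is 11134.08, which rounds to 11134, or 2B7E in hexadecimal. The same procedure applied to -log10(1 - 2^(-1)) = 0.30103 gives 1233, or 04D1 hexadecimal, which matches the first entry of the log10_rom table.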
Finally, with the data format and constant tables established, coding of the
algorithms proceeds in a straightforward manner. The following examples
demonstrate CORDIC algorithms implemented on the 8051, 68HC11 and 68332
micro-controllers. These code fragments were assembled with the INTEL MCS-51
Macro-Assembler and Motorola Freeware Assemblers and tested on hardware
development systems.
Conclusion
CORDIC algorithms have been around for some time. Volder’s original paper
describing the CORDIC technique for calculating trigonometric functions
appeared in the 1959 IRE transactions. However, the reasons for using CORDIC
algorithms have not changed. The algorithms are efficient in terms of both
computation time and hardware resources. In most micro-controller systems,
especially those performing control functions, these resources are normally
already at a premium. Using CORDIC algorithms may allow a single chip solution
where algorithms using the look-up table method may require a large ROM size or
where power series calculations require a separate co-processor because of the
computation time required.
tan^-1(x), sqrt(x^2 + y^2), and e^(jΘ). Among the references, Jarvis gives an
excellent table of the functions possible using CORDIC routines.
;----------------------------------------------------------------------------;
; ;
; The ROM table for -log10(1+2^(-i)) and -log10(1-2^(-i)). The values are ;
; interlaced and stored with the lower byte first. This format speeds up ;
; the algorithm for the 8051 processor. ;
; ;
;----------------------------------------------------------------------------;
powr10tab: db 02fh, 0fdh, 0d1h, 004h
db 073h, 0feh, 000h, 002h
db 02eh, 0ffh, 0eeh, 000h
db 094h, 0ffh, 073h, 000h
db 0c9h, 0ffh, 038h, 000h
db 0e4h, 0ffh, 01ch, 000h
db 0f2h, 0ffh, 00eh, 000h
db 0f9h, 0ffh, 007h, 000h
db 0fch, 0ffh, 004h, 000h
db 0feh, 0ffh, 002h, 000h
db 0ffh, 0ffh, 001h, 000h
db 000h, 000h, 000h, 000h
end
******************************************************************************
* *
* LOG10.ASM *
* *
* Calculation of log10(x) for the 68HC11 using CORDIC methods. On entry *
* x is on the top of the stack. On exit z = log10(x) is at the top of the *
* stack. All data is 16 bits long using 4.12 format. *
* *
* The stack frame is used as follows: *
* *
* 0,x ==> j - shift register counter *
* 1,x ==> i - outer loop counter *
* 2,x ==> z - output *
* 4,x ==> xs - x shift register *
* 6,x ==> return address *
* 8,x ==> x - input *
* *
* Comment: This routine is meant to reflect the structure of the algorithm *
* without being optimized for speed. As written the algorithm *
* requires a maximum of 2836 clock cycles or approximately 1.4 ms. *
* The execution time could be greatly improved by using internal *
* memory to hold variables. *
* *
* Author: Mike Pashea 3-11-2000 *
* *
******************************************************************************
log10 ldy #log10_rom ; y points to the ROM table
ldx #0 ;
pshx ; local space for xs
pshx ; local space for z
pshx ; local space for i and j
tsx ; x points to the top of the stack
ldaa #1 ; initialize the loop counter
staa 1,x ;
log10_1 ldd 8,x ; load the shift register with x
std 4,x ;
ldaa 1,x ;
staa 0,x ; shift counter equal to loop counter
log10_2 asr 4,x ; perform one arithmetic shift right
ror 5,x ;
dec 0,x ;
bne log10_2 ; repeat until all shifts are complete
log10_3 ldd 8,x ; is x greater than 1 ?
subd #4096 ;
ble log10_4 ; no, x should be increased
ldd 8,x ; yes, x should be decreased
subd 4,x ; x = x - x*2^(-i)
std 8,x ;
ldd 2,x ; z = z + -log10(1-2^(-i))
addd 0,y ;
std 2,x ;
bra log10_5 ;
log10_4 ldd 8,x ; x = x + x*2^(-i)
addd 4,x ;
std 8,x ;
ldd 2,x ; z = z - log10(1+2^(-i))
subd 2,y ;
std 2,x ;
log10_5 iny ; increment the ROM pointer so that it
iny ; points to the next set of entries in
iny ; the ROM table.
iny ;
inc 1,x ; increment loop counter
lda #12 ;
cmpa 1,x ; have we calculated each bit?
bge log10_1 ; no, loop until we are done
ldd 2,x ; yes, replace x with z
std 8,x ;
pulx ; remove local variables from stack
pulx ;
pulx ;
rts ; return z = log10(x)
******************************************************************************
* *
* The ROM table for log10(1-2^(-i)) and log10(1+2^(-i)). The values are *
* interlaced and only the magnitude is stored. The software adds or *
* subtracts each magnitude as the algorithm requires. *
* *
******************************************************************************
log10_rom fdb $04d1, $02d1
fdb $0200, $018d
fdb $00ee, $00d2
fdb $0073, $006c
fdb $0038, $0037
fdb $001c, $001c
fdb $000e, $000e
fdb $0007, $0007
fdb $0004, $0004
fdb $0002, $0002
fdb $0001, $0001
fdb $0000, $0000
******************************************************************************
* *
* The ROM table for arctan(2^(-i)). All constants are in 8.24 format. *
* *
******************************************************************************
atantab dc.l 13176795
dc.l 7778716
dc.l 4110059
dc.l 2086331
dc.l 1047214
dc.l 524117
dc.l 262123
dc.l 131069
dc.l 65536
dc.l 32768
dc.l 16384
dc.l 8192
dc.l 4096
dc.l 2048
dc.l 1024
dc.l 512
dc.l 256
dc.l 128
dc.l 64
dc.l 32
dc.l 16
dc.l 8
dc.l 4
dc.l 2
dc.l 1
References