On The Optimum Constructions of Composite Field For The AES Algorithm
Abstract—In the hardware implementations of the Advanced Encryption Standard (AES) algorithm, employing composite field arithmetic not only reduces the complexity but also enables deep subpipelining such that higher speed can be achieved. In addition, it is more efficient to employ composite field arithmetic only in the SubBytes transformation of the AES algorithm. Composite fields can be constructed by using different irreducible polynomials. Nevertheless, how the different constructions affect the complexity of the composite implementation of the SubBytes has not been analyzed in prior works. This brief presents 16 ways to construct the composite field $GF(((2^2)^2)^2)$ for the AES algorithm. Analytical results are provided for the effects of the irreducible polynomial coefficients on the complexity of each involved subfield operation. In addition, for each construction, there exist eight isomorphic mappings that map the elements in $GF(2^8)$ to those in composite fields. The complexities of these mappings vary. An efficient algorithm is proposed in this brief to find all isomorphic mappings. Based on the complexities of both the subfield operations and the isomorphic mappings, the optimum constructions of the composite field for the AES algorithm are selected to minimize gate count and critical path.

Index Terms—Advanced Encryption Standard (AES) algorithm, composite field, isomorphic mapping, multiplicative inversion.

I. INTRODUCTION

Cryptography plays an important role in the security of data transmission. The development of computing technology imposes stronger requirements on the cryptography schemes. The Data Encryption Standard (DES) has been the U.S. government standard since 1977. However, now, it can be cracked quickly and inexpensively. In 2000, the Advanced Encryption Standard (AES) [1] replaced the DES to meet the ever-increasing requirements for security.

The AES algorithm has broad applications, such as smart cards and cell phones, WWW servers and automated teller machines, and digital video recorders. Numerous architectures have been proposed for the hardware implementations of the AES algorithm [2]–[11]. Among these architectures, the design in [2] can achieve the highest speed while it is more efficient than the prior designs. The key idea in [2] is to employ composite field arithmetic in the computation of the multiplicative inversion in the SubBytes/InvSubBytes transformation of the AES algorithm. As a result, deep subpipelining is enabled, and hardware complexity is reduced.

Different construction schemes for composite fields are proposed for the AES algorithm in [7], [8], and [11]. In the design in [7], $GF(2^8)$ is decomposed into $GF((2^4)^2)$, and composite field arithmetic is applied to all the transformations in the AES algorithm. The optimum construction scheme for $GF((2^4)^2)$ is selected based on minimizing the total gate count in the implementation of all transformations. However, it is more efficient to apply composite field arithmetic only in the computation of the multiplicative inversion in the SubBytes and InvSubBytes transformations [2]. In this case, the construction scheme selected in [7] is no longer optimum. The schemes proposed in [8] and [11] apply composite field arithmetic only to the multiplicative inversion. In [8], $GF(2^8)$ is decomposed into $GF(((2^2)^2)^2)$, while in [11], $GF(2^8)$ is decomposed into $GF((2^4)^2)$. Nevertheless, each of them proposed only one possible way to construct the composite field. There exist other construction schemes with smaller gate counts and shorter critical paths.

Different irreducible polynomials can be used to construct the composite fields of the same order. This brief presents 16 ways to construct $GF(((2^2)^2)^2)$. Using composite field arithmetic, the complicated multiplicative inversion in $GF(2^8)$ is mapped to operations in subfields. This brief provides the analytical results of how the coefficients in the irreducible polynomials affect the complexities of the subfield operations. In addition, for each construction scheme, there exist eight isomorphic mappings with various complexities to map the elements between $GF(2^8)$ and $GF(((2^2)^2)^2)$. An efficient algorithm is proposed in this brief to find all the isomorphic mappings. Moreover, the lowest mapping complexity is provided for each proposed composite field construction scheme. Based on the complexities of both the subfield operations and the isomorphic mappings, the optimum constructions of the composite field $GF(((2^2)^2)^2)$ for the AES algorithm are proposed. Other composite field construction optimization approaches have been published recently [12], [13]. However, the approach in [12] is optimized based only on the complexity of isomorphic mappings. The approach in [13] optimizes for overall area requirement. Nevertheless, the critical path issue is ignored in the optimization process.

The structure of this brief is as follows. In Section II, the architecture for the implementation of SubBytes using composite field arithmetic is introduced. Section III provides different construction schemes for the composite field $GF(((2^2)^2)^2)$. Section IV discusses how the coefficients of the field polynomials affect the complexity of each block in the SubBytes implementation. An efficient scheme is presented in Section V to find all the isomorphic mappings for each possible construction of the composite field. The lowest achievable mapping complexity for

Manuscript received July 12, 2005; revised April 14, 2006. This work was supported by the Army Research Office under Grant W911NF-04-1-0272. This paper was recommended by Associate Editor C.-T. Lin.
X. Zhang is with the Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106-7071 USA (e-mail: xinmiao.zhang@case.edu).
K. K. Parhi is with the Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN 55455 USA (e-mail: parhi@ece.umn.edu).
Digital Object Identifier 10.1109/TCSII.2006.882217
1057-7130/$20.00 © 2006 IEEE
1154 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: EXPRESS BRIEFS, VOL. 53, NO. 10, OCTOBER 2006
Fig. 2. Implementations of individual blocks. (a) Multiplier in $GF((2^2)^2)$. (b) Multiplier in $GF(2^2)$. (c) Squarer in $GF(2^4)$. (d) Constant multiplier ([…]). (e) Constant multiplier ([…]).
respectively. It can be derived that there are eight possible values of […] that make […] irreducible over […] constructed by using each of […] and […]. These values of […] are

[…] (3)

multiplier does not change with […] or […]. Accordingly, the complexity of the […] multiplier is the same for the two possible values of […].

Compared to a general multiplier, the implementation of a squarer can be simplified. In the case […], taking the 4 bits of […] as […], it can be derived that each bit in […] can be computed as
Therefore, two values need to be computed for the constant multiplication by […], i.e.,

[…] (4)
[…] (5)

One way to simplify the computations in the above two equations is to make […] or […] zero such that the terms consisting of the multiplications with […] or […] can be eliminated from the computation. Alternatively, if […], the terms […] and […] can be cancelled out from (4). From the discussions in Section III, […] can only be […] or […]. Therefore, the values of […], which minimize the complexity of […], can be […], […], […], or […]. Table I lists the gate counts for the implementations of […] for all possible combinations of the two field polynomial coefficients. The critical path for each implementation has two XOR gates. It can be observed from Table I that the coefficient combinations that minimize the cost of the constant multiplication by […] are those listed in Table II. The results in Table II agree with the analytical results.

V. OPTIMUM ISOMORPHIC MAPPINGS

For a fixed set of irreducible polynomials in (1), there exist multiple isomorphic mappings between $GF(2^8)$ and $GF(((2^2)^2)^2)$. The complexities of these mappings vary. In this section, the lowest achievable complexity of the isomorphic mappings for each combination of the two field polynomial coefficients is provided.

An algorithm is proposed in [14, Ch. 2.2] to find the isomorphic mapping matrices when the involved field polynomials are primitive. However, the irreducible polynomial $P(x) = x^8 + x^4 + x^3 + x + 1$ specified for the AES algorithm is not primitive. Therefore, the algorithm in [14] cannot be applied directly.

Algorithm 1
initialization: t = 1, stop = 0
    flag(i) = 0 for i = 1, 2, ..., 2^q - 1
while stop == 0
{
    ω = dectobin(t, q)
    compute P(ω)
    if P(ω) == 0
        mapping found, output ω
        stop = 1
    else
        index(j) = bintodec(ω^(2^j)), for j = 0, 1, 2, ..., q - 1
        flag(index(j)) = 1, for j = 0, 1, 2, ..., q - 1
        find the minimum integer l > t, such that flag(l) = 0
        t = l
}

In Algorithm 1, dectobin(t, q) means first convert the integer t to a q-bit binary number, and then take these bits as the standard basis representation for $\omega$. Similarly, bintodec($\omega^{2^j}$) stands for taking the standard basis representation of $\omega^{2^j}$ as a q-bit binary number, and then converting this number into an integer as the value of index(j). It may be noted that the computation of $P(\omega)$ is carried out according to the field operations specified in the composite field. The basic idea of this algorithm is to test if an element $\omega$ of the composite field is a root of $P(x)$. If not, then none of the other elements in the same conjugacy class as $\omega$, namely $\omega^2, \omega^4, \ldots, \omega^{2^{q-1}}$, is a root of $P(x)$. The next element to be tested is selected excluding the elements in the conjugacy classes of all tested elements. It can be derived that, on average, […] checks are needed to find isomorphic mappings for a field of order $2^q$.
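As a minimal sketch of Algorithm 1, the Python code below searches $GF(((2^2)^2)^2)$ for a root of the AES polynomial $P(x) = x^8 + x^4 + x^3 + x + 1$. The tower of field polynomials $x^2 + x + 1$, $x^2 + x + \phi$, and $x^2 + x + \lambda$, the sample coefficients `PHI` and `LAMBDA`, and all function names are assumptions made for illustration; they are not values or identifiers stated in this section.

```python
# Illustrative sketch only. The polynomial forms and the coefficient values
# PHI and LAMBDA are assumed sample choices, not values stated in the text.
PHI = 0b10       # assumed coefficient of x^2 + x + PHI over GF(2^2)
LAMBDA = 0b1100  # assumed coefficient of x^2 + x + LAMBDA over GF((2^2)^2)


def mul2(a, b):
    """Multiply two 2-bit elements of GF(2^2) modulo x^2 + x + 1."""
    a1, a0, b1, b0 = a >> 1, a & 1, b >> 1, b & 1
    hi = (a1 & b1) ^ (a1 & b0) ^ (a0 & b1)
    lo = (a0 & b0) ^ (a1 & b1)
    return (hi << 1) | lo


def mul4(a, b):
    """Multiply two 4-bit elements of GF((2^2)^2) modulo x^2 + x + PHI."""
    a1, a0, b1, b0 = a >> 2, a & 3, b >> 2, b & 3
    t = mul2(a1, b1)
    hi = t ^ mul2(a1, b0) ^ mul2(a0, b1)
    lo = mul2(a0, b0) ^ mul2(t, PHI)
    return (hi << 2) | lo


def mul8(a, b):
    """Multiply two 8-bit elements of GF(((2^2)^2)^2) modulo x^2 + x + LAMBDA."""
    a1, a0, b1, b0 = a >> 4, a & 15, b >> 4, b & 15
    t = mul4(a1, b1)
    hi = t ^ mul4(a1, b0) ^ mul4(a0, b1)
    lo = mul4(a0, b0) ^ mul4(t, LAMBDA)
    return (hi << 4) | lo


def aes_poly(w):
    """Evaluate P(w) = w^8 + w^4 + w^3 + w + 1 with composite field arithmetic."""
    w2 = mul8(w, w)
    w4 = mul8(w2, w2)
    w8 = mul8(w4, w4)
    return w8 ^ w4 ^ mul8(w2, w) ^ w ^ 1


def conjugacy_class(w, q=8):
    """Return [w, w^2, w^4, ..., w^(2^(q-1))] as integers."""
    out = []
    for _ in range(q):
        out.append(w)
        w = mul8(w, w)
    return out


def find_root(q=8):
    """Test one element per conjugacy class until a root of P is found."""
    flag = [False] * (1 << q)
    t = 1
    while t < (1 << q):
        w = t  # dectobin: the q bits of t are the standard basis coordinates
        if aes_poly(w) == 0:
            return w
        for c in conjugacy_class(w, q):
            flag[c] = True  # conjugates of a non-root cannot be roots
        t += 1
        while t < (1 << q) and flag[t]:
            t += 1  # minimum integer l > t with flag(l) == 0
    return None


root = find_root()
print(f"root = {root:08b}")
print([aes_poly(c) for c in conjugacy_class(root)])
```

Because field elements are stored as integers whose bits are the standard basis coordinates, the dectobin and bintodec conversions of Algorithm 1 reduce to using `t` and the conjugates directly as list indices.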
ZHANG AND PARHI: ON THE OPTIMUM CONSTRUCTIONS OF COMPOSITE FIELD FOR THE AES ALGORITHM 1157
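Each root $\beta$ of $P(x)$ in the composite field defines one isomorphic mapping: the binary mapping matrix has the standard basis representation of $\beta^i$ as column $i$, so that $x^i$ in $GF(2^8)$ maps to $\beta^i$. The sketch below enumerates the roots, builds the columns of each mapping matrix, and counts the 1s in it. The polynomial forms $x^2 + x + 1$, $x^2 + x + \phi$, and $x^2 + x + \lambda$, the sample coefficients, and all names are assumptions for illustration. The count of 1s is only a rough proxy for cost, since the real gate count depends on how many terms can be shared.

```python
# Illustrative sketch only; PHI, LAMBDA, and the polynomial forms are assumed.
PHI, LAMBDA = 0b10, 0b1100


def mul2(a, b):
    """Multiply in GF(2^2) modulo x^2 + x + 1."""
    a1, a0, b1, b0 = a >> 1, a & 1, b >> 1, b & 1
    return (((a1 & b1) ^ (a1 & b0) ^ (a0 & b1)) << 1) | ((a0 & b0) ^ (a1 & b1))


def mul4(a, b):
    """Multiply in GF((2^2)^2) modulo x^2 + x + PHI."""
    a1, a0, b1, b0 = a >> 2, a & 3, b >> 2, b & 3
    t = mul2(a1, b1)
    return ((t ^ mul2(a1, b0) ^ mul2(a0, b1)) << 2) | (mul2(a0, b0) ^ mul2(t, PHI))


def mul8(a, b):
    """Multiply in GF(((2^2)^2)^2) modulo x^2 + x + LAMBDA."""
    a1, a0, b1, b0 = a >> 4, a & 15, b >> 4, b & 15
    t = mul4(a1, b1)
    return ((t ^ mul4(a1, b0) ^ mul4(a0, b1)) << 4) | (mul4(a0, b0) ^ mul4(t, LAMBDA))


def power(w, n):
    """Compute w^n in the composite field by repeated multiplication."""
    r = 1
    for _ in range(n):
        r = mul8(r, w)
    return r


def is_root(w):
    """True if w is a root of P(x) = x^8 + x^4 + x^3 + x + 1."""
    return (power(w, 8) ^ power(w, 4) ^ power(w, 3) ^ w ^ 1) == 0


def mapping_columns(beta):
    """Columns of the mapping matrix: column i is beta^i."""
    return [power(beta, i) for i in range(8)]


def to_composite(a, cols):
    """Map a GF(2^8) element to the composite field (matrix-vector product)."""
    r = 0
    for i in range(8):
        if (a >> i) & 1:
            r ^= cols[i]
    return r


def aes_mul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial (0x11B)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


roots = [w for w in range(256) if is_root(w)]
for beta in roots:
    ones = sum(bin(c).count("1") for c in mapping_columns(beta))
    print(f"beta = {beta:08b}, ones in mapping matrix = {ones}")
```

A mapping built this way can be checked by confirming that it preserves multiplication, that is, mapping the product `aes_mul(a, b)` gives the same result as multiplying the mapped operands with `mul8`.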
The element computed by Algorithm 1 is not the only element to which the root of $P(x)$ can be mapped. The root can also be mapped to the other elements in the same conjugacy class as the computed element. It can be computed that for each combination of the two field polynomial coefficients, there are eight isomorphic mappings. The optimum isomorphic mapping can be selected based on minimizing gate count. Table III shows the complexity of the optimum isomorphic mapping and its inverse for each combination of the two coefficients.

The optimum constructions of the composite field for the AES algorithm can be selected by taking the complexities of the involved subfield operations and isomorphic mappings into account. From the discussions in Section IV and Section V, it can be concluded that the implementation of the multiplicative inversion and the affine transformation in SubBytes has the least gate count and the shortest critical path when $GF(((2^2)^2)^2)$ is constructed by using either […], […] or […], […]. The lowest complexity for isomorphic mapping and inverse can be achieved for these two constructions when the root of $P(x)$ is mapped to […] and […], respectively.

Table IV shows some comparison results with prior works. Using the proposed optimum construction of composite field, the SubBytes can be implemented by 120 XOR gates and 35 AND gates with 19 XOR gates and 4 AND gates in the critical path. In [12], the complexity of the isomorphic mapping is measured by the number of 1s in the matrix. This is an incorrect approach when substructure sharing is employed. The matrix with the largest number of 1s may have more terms that can be shared, which may lead to a smaller total gate number. Considering substructure sharing, the optimum approach proposed in [12] uses only three fewer gates than that in [8], with two fewer gates in the critical path. The approach in [13] achieves minimum gate count by using normal basis for field element representation and sharing terms between multipliers with a common input. However, the critical path of this approach is much longer than that of the construction proposed in this brief. In addition, further area reduction can be achieved by introducing NOR gates and NOT gates in the design [13]. The numbers in Table IV are based on a theoretical analysis. These numbers are not the result of synthesizing by using any cell library.

VI. SUMMARY

In this brief, the optimum constructions of the composite field for the AES algorithm are presented. How the coefficients of the field polynomials affect each block in the composite field implementation of the SubBytes transformation is analyzed. In addition, an efficient algorithm is proposed to find isomorphic mappings when the involved field polynomials are not primitive. The optimum constructions are selected by considering the complexities of both the involved subfield operations and the isomorphic mappings. Future work will address composite field constructions using irreducible polynomials in other forms.

REFERENCES
[1] Advanced Encryption Standard (AES), FIPS PUB 197, Nov. 26, 2001, Federal Information Processing Standards publication 197.
[2] X. Zhang and K. K. Parhi, “High-speed VLSI architecture for the AES algorithm,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 12, no. 9, pp. 957–967, Sep. 2004.
[3] K. U. Jarvinen, M. T. Tommiska, and J. O. Skytta, “A fully pipelined memoryless 17.8 Gbps AES-128 encryptor,” in Proc. Int. Symp. FPGA, Monterey, CA, Feb. 2003, pp. 207–215.
[4] G. P. Saggese, A. Mazzeo, N. Mazocca, and A. G. M. Strollo, “An FPGA based performance analysis of the unrolling, tiling and pipelining of the AES algorithm,” in Proc. FPL, Portugal, Sep. 2003, pp. 292–302.
[5] F. Standaert, G. Rouvroy, J. Quisquater, and J. Legat, “Efficient implementation of Rijndael encryption in reconfigurable hardware: Improvements and design tradeoffs,” in Proc. CHES, Cologne, Germany, Sep. 2003, pp. 334–350.
[6] X. Zhang and K. K. Parhi, “Implementation approaches for the advanced encryption standard algorithm,” IEEE Circuits Syst. Mag., vol. 2, no. 4, pp. 24–46, Fourth Quarter 2002.
[7] A. Rudra, P. K. Dubey, C. S. Jutla, V. Kumar, J. R. Rao, and P. Rohatgi, “Efficient implementation of Rijndael encryption with composite field arithmetic,” in Proc. CHES, Paris, France, May 2001, pp. 171–184.
[8] A. Satoh, S. Morioka, K. Takano, and S. Munetoh, “A compact Rijndael hardware architecture with S-box optimization,” in Proc. ASIACRYPT, Gold Coast, Australia, Dec. 2000, pp. 239–254.
[9] M. McLoone and J. V. McCanny, “Rijndael FPGA implementation utilizing look-up tables,” in Proc. IEEE Workshop Signal Process. Syst., Sep. 2001, pp. 349–360.
[10] H. Kuo and I. Verbauwhede, “Architectural optimization for a 1.82 Gbits/sec VLSI implementation of the AES Rijndael algorithm,” in Proc. CHES, Paris, France, May 2001, pp. 51–64.
[11] J. Wolkerstorfer, E. Oswald, and M. Lamberger, “An ASIC implementation of the AES S-boxes,” in Proc. RSA Conf., San Jose, CA, Feb. 2002, pp. 67–78.
[12] N. Mentens, L. Batina, B. Preneel, and I. Verbauwhede, “A systematic evaluation of compact hardware implementations for the Rijndael S-box,” in Proc. Topics Cryptology—CT-RSA, San Francisco, CA, Feb. 2005, pp. 323–333.
[13] D. Canright, “A very compact S-box for AES,” in Proc. Cryptographic Hardware and Embedded Syst., Edinburgh, U.K., Sep. 2005, pp. 441–455.
[14] C. Paar, “Efficient VLSI architecture for bit-parallel computations in Galois field,” Ph.D. dissertation, Inst. Exp. Math., Univ. Essen, Essen, Germany, 1994.