A VHDL Scalable-Encryption-Algorithm
SEA is a scalable encryption algorithm targeted at small embedded applications. It was initially designed for software implementations in controllers, smart cards, or processors. In this letter, we investigate its performance in recent field-programmable gate array (FPGA) devices. For this purpose, a loop architecture of the block cipher is presented. Beyond its low cost, a significant advantage of the proposed architecture is its full flexibility for any parameter of the scalable encryption algorithm, obtained through generic VHDL coding. The letter also carefully describes the implementation details that allow us to keep the area requirements small. Finally, a comparative performance discussion of SEA with the Advanced Encryption Standard Rijndael and ICEBERG (a cipher designed for efficient FPGA implementations) is proposed. It illustrates the interest of platform/context-oriented block cipher design and, as far as SEA is concerned, its low area requirements and reasonable efficiency.

The scalable encryption algorithm (SEA) is a parametric block cipher for resource-constrained systems (e.g., sensor networks, RFIDs) that was introduced in [1]. It was initially designed as a low-cost encryption/authentication routine (i.e., with small code size and memory) targeted at processors with a limited instruction set (i.e., AND, OR, XOR gates, word rotation, and modular addition). Additionally, and contrary to most recent block ciphers (e.g., the DES [2] and AES Rijndael [3], [4]), the algorithm takes the plaintext, key, and bus sizes as parameters and can therefore be straightforwardly adapted to various implementation contexts and/or security requirements. Compared to older solutions for low-cost encryption such as the tiny encryption algorithm (TEA) [5] or Yuval's proposal [6], SEA also benefits from a stronger security analysis, derived from recent advances in block cipher design and cryptanalysis.
In practice, SEA has proven to be an efficient solution for embedded software applications using microcontrollers, but its hardware performance has not yet been investigated. Consequently, and as a first step towards hardware performance analysis, this letter explores the features of a low-cost field-programmable gate array (FPGA) encryption/decryption core for SEA. In addition to the performance evaluation, we show that the algorithm's scalability can be turned into a fully generic VHDL design, so that any text, key, and bus size can be straightforwardly reimplemented without any modification of the hardware description, using standard synthesis and implementation tools.
CONTENTS
CHAPTER 1: Introduction to VLSI
  1.1 Introduction
  1.2 VLSI Design Style
  1.3 VLSI Design Flow
  1.4 VLSI Features
CHAPTER 2: Introduction to VHDL
  2.1 Introduction
  2.2 Capabilities
  2.3 Abstraction levels of VHDL
  2.4 Basic Terminology
  2.5 Modeling Techniques for VHDL
  2.6 Process Statements
  2.7 Conditional Statements
  2.8 Active HDL Overview
  2.9 Macro language
  2.10 Compilation
  2.11 Simulation
  2.12 Xilinx
CHAPTER 3: Introduction to SEA
  3.1 Specifications
  3.2 Design properties
  3.3 Overall Structure
  3.4 Security Analysis
  3.5 Performance Analysis
CHAPTER 4: An Exposition of SEA
  4.1 Overview of SEA
CHAPTER 5: SEA Architecture
  5.1 Key Generation
  5.2 Encryption
  5.3 Decryption
Appendix-I
Appendix-II
Appendix-III
Appendix-IV
Advantages
Conclusion
The first digital circuits were built from discrete electronic components such as vacuum tubes and transistors. Integrated circuits (ICs) were invented later, allowing a designer to place a digital circuit of fewer than 10 gates on a chip, a scale called SSI (Small Scale Integration). With the advent of new fabrication techniques, designers could place more than 100 gates on an IC, a scale called MSI (Medium Scale Integration). At this level, one can create digital sub-blocks (adders, multiplexers, counters, registers, etc.) on an IC. The next level is LSI (Large Scale Integration), at which designers succeeded in building digital subsystems (microprocessors, I/O peripheral devices, etc.) on a chip. At this point the design process became very complicated: manual conversion from schematic level to gate level, or from gate level to layout level, was a lengthy process, and verifying the functionality of digital circuits at the various levels became critical. This created new challenges for digital designers as well as circuit designers, who felt the need to automate these processes. Meanwhile, rapid advances in software technology and the development of new higher-level programming languages took place. It became possible to develop CAD/CAE (Computer Aided Design/Computer Aided Engineering) tools for designing electronic circuits with the assistance of software programs. Functional verification and logic verification of a design can be done with CAD simulation tools with greater efficiency, so it became easy for a designer to verify the functionality of a design at various levels. With the advent of CMOS (Complementary Metal Oxide Semiconductor) process technology, one can fabricate a chip containing more than a million gates. At this scale the design process becomes even more critical if the design is converted manually from one level to another; the latest CAD tools solve this problem.
With logic synthesis tools, a design engineer can easily translate a higher-level design description into lower-level descriptions. This way of designing (using CAD tools) is certainly a revolution in the electronics industry, and it is leading to the development of sophisticated electronic products for both consumer and business use. Implementing a system in hardware generally gives better results than software in terms of speed, reliability, and performance. Using CMOS VLSI design methodology, a designer can design and fabricate ICs in much less time than with the traditional way of designing.
Figure: VLSI design flow (specification, behavioral description, RTL description, logic synthesis, gate-level netlist, automatic P&R, layout, and fabrication, with behavioral, functional, logic, and layout-level simulation along the way).
Process technology can be classified into four categories, evolving from micron technology and extending to VDSM.
Submicron: The technology below 1 µm is known as submicron technology. It generally extends to 0.36 µm.
Deep submicron: The technology extending to 0.18 µm is known as deep submicron technology.
VDSM (Very Deep Sub Micron technology): The presently used technology is VDSM. It extends to 0.09 µm.
1.4 FEATURES:
Waveform generation language VHDL
This language not only defines the syntax but also defines very clear simulation semantics for each language construct. Therefore, models written in this language can be verified using a VHDL simulator. A core subset of the language is usually sufficient to model most applications. The complete language, however, has sufficient power to capture descriptions ranging from the most complex chips to a complete electronic system.
HISTORY: The requirements for the language were first generated in 1981 under the VHSIC program, which developed chips for the Department of Defense (DoD). Reprocurement and reuse were also a big issue. Thus, a need arose for a standardized hardware description language for the design, documentation, and verification of digital systems. The IEEE standardized the VHDL language in December 1987; this version of the language is known as IEEE Std 1076-1987. The official language description appears in the IEEE Standard VHDL Language Reference Manual, available from the IEEE. The language has also been recognized as an American National Standards Institute (ANSI) standard. According to IEEE rules, an IEEE standard has to be reballoted every 5 years so that it may remain a standard. Consequently, the language was upgraded with new features, the syntax of many constructs was made more uniform, and many ambiguities present in the 1987 version of the language were resolved. This new version of the language is known as IEEE Std 1076-1993.
2.2 CAPABILITIES: The following are the major capabilities that the language provides, along with the features that differentiate it from other hardware description languages.
The language can be used as an exchange medium between chip vendors and CAD tool users. Different chip vendors can provide VHDL descriptions of their components to system designers.
The language can be used as a communication medium between different CAD and CAE tools.
The language supports hierarchy; that is, a digital system can be modeled as a set of interconnected components, and each component, in turn, can be modeled as a set of interconnected subcomponents.
The language supports flexible design methodologies: top-down, bottom-up, or mixed.
It supports both synchronous and asynchronous timing models.
Various digital modeling techniques, such as finite state machine descriptions and Boolean equations, can be modeled using the language.
The language is publicly available, human-readable, and machine-readable.
The language supports three basic description styles: structural, dataflow, and behavioral.
It supports a wide range of abstraction levels, ranging from abstract behavioral descriptions to very precise gate-level descriptions.
Arbitrarily large designs can be modeled using the language; the language imposes no limitations on the size of the design.
2.3 HARDWARE ABSTRACTION: VHDL is used to describe a model for a digital hardware device. This model specifies the external view of the device and one or more internal views. The internal view of the device specifies functionality or structure, while the external view specifies the interface of the device through which it communicates with the other modules in the environment. In VHDL each device model is treated as a distinct representation of a unique device, called an Entity. The Entity is thus a hardware abstraction of the actual hardware device. Each Entity is described using one model, which contains one external view and one or more internal views.
Once an entity has been modeled, it needs to be validated by a VHDL system. A typical VHDL system consists of an analyzer and a simulator. The analyzer reads in one or more design units contained in a single file and compiles them into a design library after validating the syntax and performing some static checks. The language is case insensitive; that is, lowercase and uppercase characters are treated alike. The language is also free format. Comments are specified by preceding the text with two consecutive dashes (--).
Entity Declaration: The entity declaration specifies the name of the entity being modeled and lists the set of interface ports. Ports are signals through which the entity communicates with other models in its external environment.
EXAMPLE: The entity declaration for the half adder circuit is

    entity half_adder is
      port (A, B : in bit; sum, carry : out bit);
    end half_adder;

The entity called half_adder has two input ports, A and B, and two output ports, sum and carry. Bit is a predefined type of the language.
Architecture Body: An architecture body specifies the internal details of an entity using any of the following modeling styles:
1. As a set of interconnected components (to represent structure)
2. As a set of concurrent assignment statements (to represent data flow)
3. As a set of sequential assignment statements (to represent behavior)
4. As any combination of the above three.
2.5 Structural style of modeling: In this style, an entity is described as a set of interconnected components. Such a model for the half_adder entity is described in an architecture body:

    architecture ha of half_adder is
      component Xor2
        port (X, Y : in bit; Z : out bit);
      end component;
      component And2
        port (L, M : in bit; N : out bit);
      end component;
    begin
      X1 : Xor2 port map (A, B, SUM);
      A1 : And2 port map (A, B, CARRY);
    end ha;

The name of the architecture body is ha. The entity declaration for half_adder specifies the interface ports for this architecture body. The architecture body is composed of two parts: the declarative part and the statement part. Two component declarations are present in the declarative part of the architecture body. The declared components are instantiated in the statement part of the architecture body using component instantiation statements. The signals in the port map of a component instantiation and the port signals in the component declaration are associated by position.
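The structural model above refers to the components Xor2 and And2 without defining them. The following is a minimal sketch of a complete, self-contained version; the two gate entities and their `dataflow` architectures are assumptions added for illustration, and the sketch relies on default binding (each component is bound to the entity of the same name in the working library).

```vhdl
-- Assumed gate models: the text uses Xor2 and And2 without defining them.
entity Xor2 is
  port (X, Y : in bit; Z : out bit);
end Xor2;

architecture dataflow of Xor2 is
begin
  Z <= X xor Y;
end dataflow;

entity And2 is
  port (L, M : in bit; N : out bit);
end And2;

architecture dataflow of And2 is
begin
  N <= L and M;
end dataflow;

-- Half adder built structurally from the two gates.
entity half_adder is
  port (A, B : in bit; SUM, CARRY : out bit);
end half_adder;

architecture ha of half_adder is
  component Xor2
    port (X, Y : in bit; Z : out bit);
  end component;
  component And2
    port (L, M : in bit; N : out bit);
  end component;
begin
  -- Positional association: A -> X, B -> Y, SUM -> Z (and likewise for And2).
  X1 : Xor2 port map (A, B, SUM);
  A1 : And2 port map (A, B, CARRY);
end ha;
```

Because the component port names match the entity port names, no explicit configuration should be needed in a typical simulator; a configuration declaration could be added if a different gate model had to be selected.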
DATAFLOW STYLE OF MODELING: In this modeling style, the flow of data through the entity is expressed primarily using concurrent signal assignment statements. The dataflow model for the half adder is described using two concurrent signal assignment statements. In a signal assignment statement, the symbol <= implies an assignment of a value to a signal.
BEHAVIORAL STYLE OF MODELING: The behavioral style of modeling specifies the behavior of an entity as a set of statements that are executed sequentially in the specified order. These sequential statements, which are specified inside a process statement, do not explicitly specify the structure of the entity but merely its functionality. A process statement is a concurrent statement that can appear within an architecture body.
MIXED STYLE OF MODELING: It is possible to mix the three modeling styles in a single architecture body. That is, within an architecture body, we can use component instantiation statements, concurrent signal assignment statements, and process statements.
MODEL ANALYSIS: Once an entity is described in VHDL, it can be validated using the analyzer and the simulator that are part of a VHDL system. The first step in the validation process is analysis. The analyzer takes a file that contains one or more design units and compiles them into an intermediate form. The generated intermediate form is stored in a specific design library that has been designated as the working library. There is a design library with the logical name STD predefined by the VHDL language environment. This library contains two packages: STANDARD and TEXTIO. The STANDARD package contains declarations for all the predefined types of the language. The TEXTIO package contains procedures and functions that are necessary for supporting formatted text read and write operations. There also exists an IEEE standard package called STD_LOGIC_1164, which contains its associated subtypes, overloaded operator functions, and other useful utilities. This standard is called IEEE Std 1164-1993.
SIMULATION: For a hierarchical entity to be simulated, all of its lowest-level components must be described at the behavioral level. A simulation can be performed on either one of the following:
1. An entity declaration and an architecture body pair.
2. A configuration.
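As an illustration of entity and architecture pairs that can be simulated, the sketch below gives the half adder in the dataflow and behavioral styles described earlier, together with a small test bench. The architecture names and the test bench (`half_adder_tb`) are hypothetical, chosen only for this example.

```vhdl
-- Half adder interface (same ports as the entity declared earlier).
entity half_adder is
  port (A, B : in bit; SUM, CARRY : out bit);
end half_adder;

-- Dataflow style: two concurrent signal assignment statements.
architecture ha_dataflow of half_adder is
begin
  SUM   <= A xor B;
  CARRY <= A and B;
end ha_dataflow;

-- Behavioral style: sequential statements inside a process.
architecture ha_behavior of half_adder is
begin
  process (A, B)
  begin
    if A = B then
      SUM <= '0';
    else
      SUM <= '1';
    end if;
    if A = '1' and B = '1' then
      CARRY <= '1';
    else
      CARRY <= '0';
    end if;
  end process;
end ha_behavior;

-- Hypothetical test bench: an entity without ports that drives the design.
entity half_adder_tb is
end half_adder_tb;

architecture sim of half_adder_tb is
  signal a, b, s, c : bit;
begin
  -- Direct entity instantiation (VHDL-93) selects the dataflow architecture.
  dut : entity work.half_adder(ha_dataflow)
    port map (A => a, B => b, SUM => s, CARRY => c);

  stimulus : process
  begin
    a <= '0'; b <= '0'; wait for 10 ns;
    assert s = '0' and c = '0' report "0+0 failed" severity error;
    a <= '0'; b <= '1'; wait for 10 ns;
    assert s = '1' and c = '0' report "0+1 failed" severity error;
    a <= '1'; b <= '0'; wait for 10 ns;
    assert s = '1' and c = '0' report "1+0 failed" severity error;
    a <= '1'; b <= '1'; wait for 10 ns;
    assert s = '0' and c = '1' report "1+1 failed" severity error;
    wait;  -- suspend forever so that no further events are scheduled
  end process;
end sim;
```

Replacing `ha_dataflow` with `ha_behavior` in the instantiation should exercise the behavioral model with the same stimulus, since both architectures share one entity declaration.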
Preceding the actual simulation are two major steps:
1. Elaboration phase: In this phase, the hierarchy of the entity is expanded and linked, components are bound to entities in a library, and the top-level entity is built as a network of behavioral models that is ready to be simulated.
2. Initialization phase: Driving and effective values for all explicitly declared signals are computed, implicit signals are assigned values, processes are executed once until they suspend, and simulation time is set to 0 ns.
Simulation commences by advancing time to that of the next event. Signals that are scheduled to receive values at this time are updated. If the value of a signal changes, and if that signal is present in the sensitivity list of a process, the process is executed until it suspends. Simulation stops when an assertion violation occurs (depending on the implementation of the VHDL system) or when the maximum time as defined by the language is reached.
Entity Declaration: An entity declaration describes the external interface of the entity. It specifies the name of the entity, the names of the interface ports, their modes, and the types of the ports. The syntax for an entity declaration is:

    entity entity_name is
      [generic (list-of-generics-and-their-types);]
      [port (list-of-interface-port-names-and-their-types);]
      [entity-item-declarations]
    [begin
      entity-statements]
    end [entity] [entity_name];

The entity name is the name of the entity, and the interface ports are the signals through which the entity passes information to and from its external environment. Each interface port can have one of the following modes:
1. in: The value of an input port can only be read within the entity model.
2. out: The value of an output port can only be updated within the entity model.
3. inout: The value of a bidirectional port can be read and updated within the entity model.
4. buffer: The value of a buffer port can be read and updated within the entity model. It cannot have more than one source.
Declarations that are placed in the entity are common to all the design units that are associated with that entity declaration.
ARCHITECTURE BODY: An architecture body describes the internal view of an entity. It describes the functionality or the structure of the entity.
    architecture architecture_name of entity_name is
    begin
      concurrent-statements; these are:
        process-statement
        block-statement
        concurrent-signal-assignment-statement
        component-instantiation-statement
        generate-statement
    end [architecture] [architecture_name];

The concurrent statements describe the internal composition of the entity. All concurrent statements are executed in parallel. The internal composition of an entity can be expressed in terms of structure, dataflow, and sequential behavior. Here we describe an entity using the behavioral model. A process statement, which is a concurrent statement, is the primary mechanism used to describe the functionality of an entity in this modeling style.
2.6 PROCESS STATEMENT: A process statement contains sequential statements that describe the functionality of a portion of an entity in sequential terms. The syntax for the process statement is:

    [process-label:] process [(sensitivity-list)] [is]
    begin
      sequential-statements; these are:
        variable-assignment-statement
        signal-assignment-statement
        wait-statement
        if-statement
        case-statement
        loop-statement
        null-statement
        exit-statement
        next-statement
        assertion-statement
        report-statement
        procedure-call-statement
        return-statement
    end process [process-label];

The set of signals to which the process is sensitive is defined by the sensitivity list. In other words, each time an event occurs on any of the signals in the sensitivity list, the sequential statements within the process are executed in sequential order, that is, in the order in which they appear. The process then suspends after executing the last sequential statement and waits for another event to occur on a signal in the sensitivity list.
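The entity syntax (with its optional generic clause) and the process statement can be seen together in a short sketch. The register below is a hypothetical example, not part of the SEA design; the names `reg_n`, `N`, `clk`, `rst`, `d`, and `q` are assumptions chosen for illustration.

```vhdl
-- Hypothetical example: an N-bit register with asynchronous reset.
entity reg_n is
  generic (N : positive := 8);                  -- width, set at instantiation
  port (clk, rst : in  bit;
        d        : in  bit_vector(N-1 downto 0);
        q        : out bit_vector(N-1 downto 0));
end entity reg_n;

architecture behavior of reg_n is
begin
  -- The process resumes whenever an event occurs on clk or rst.
  store : process (clk, rst)
  begin
    if rst = '1' then
      q <= (others => '0');                     -- reset has priority
    elsif clk'event and clk = '1' then
      q <= d;                                   -- capture d on a rising edge
    end if;
  end process store;
end architecture behavior;
```

An instantiation such as `generic map (N => 4)` would produce a 4-bit register from the same description, which is the mechanism the introduction refers to when it speaks of generic VHDL coding.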
VARIABLE ASSIGNMENT STATEMENT: Variables can be declared and used inside a process statement. A variable is assigned a value using the variable assignment statement, which typically has the form

    variable-object := expression;

The expression is evaluated when the statement is executed, and the computed value is assigned to the variable object instantaneously, that is, at the current simulation time. A variable can also be declared outside of a process or subprogram. Such a variable can be read and updated by more than one process. These variables are called shared variables.
SIGNAL ASSIGNMENT STATEMENT: Signals are assigned values using a signal assignment statement. The simplest form of a signal assignment statement is:

    signal-object <= expression [after delay-value];

A signal assignment statement can appear within a process or outside of a process. If it occurs outside of a process, it is considered to be a concurrent signal assignment statement. When a signal assignment statement appears within a process, it is considered to be a sequential signal assignment statement and is executed in sequence with respect to the other statements that appear within the process.
2.7 CONDITIONAL STATEMENTS:
IF STATEMENT: An if statement selects a sequence of statements for execution based on the value of a condition. The condition can be any expression that evaluates to a Boolean value. The general form of an if statement is:

    if boolean-expression then
      sequential-statements
    {elsif boolean-expression then
      sequential-statements}
    [else
      sequential-statements]
    end if;

The if statement is executed by checking each condition sequentially until the first true condition is found; the set of sequential statements associated with this condition is then executed. An if statement is also a sequential statement.
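The difference between variable and signal assignment, and the first-true-condition rule of the if statement, can be sketched with a small priority encoder. The entity `prio_enc` and its port names are hypothetical and serve only as an illustration.

```vhdl
-- Hypothetical example: 3-input priority encoder.
entity prio_enc is
  port (req   : in  bit_vector(2 downto 0);
        grant : out bit_vector(1 downto 0);   -- index of highest request + 1
        valid : out bit);                     -- '1' if any request is set
end prio_enc;

architecture behavior of prio_enc is
begin
  process (req)
    variable idx : bit_vector(1 downto 0);    -- updated immediately by :=
  begin
    -- Conditions are checked in order; only the first true branch runs.
    if req(2) = '1' then
      idx := "11";
    elsif req(1) = '1' then
      idx := "10";
    elsif req(0) = '1' then
      idx := "01";
    else
      idx := "00";
    end if;

    -- idx already holds its new value here, so it can be reused at once.
    grant <= idx;                             -- signals update after suspend
    if idx = "00" then
      valid <= '0';
    else
      valid <= '1';
    end if;
  end process;
end behavior;
```

Had `idx` been a signal assigned with `<=`, the second if statement would have tested its old value, because a signal assigned inside a process does not change until the process suspends.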
CASE STATEMENT: The format of a case statement is:

    case expression is
      when choices => sequential-statements
      when choices => sequential-statements
    end case;

The case statement selects one of the branches for execution based on the value of the expression. The expression value must be of a discrete type or of a one-dimensional array type. Choices may be expressed as single values, as a range of values, or by using others. The others clause can be used as a choice to cover the catch-all values and, if present, must be the last branch in the case statement.
LOOP STATEMENTS: A loop statement is used to iterate through a set of sequential statements. The syntax for a loop statement is:

    [loop-label:] iteration-scheme loop
      sequential-statements
    end loop [loop-label];
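A brief sketch of both statements in one process follows: a case statement selects one data bit, and a for loop computes the parity of all data bits. The entity `mux_parity` and its ports are hypothetical names used only for this example.

```vhdl
-- Hypothetical example: 4-to-1 multiplexer plus parity of the data word.
entity mux_parity is
  port (sel    : in  bit_vector(1 downto 0);
        data   : in  bit_vector(3 downto 0);
        y      : out bit;     -- data bit selected by sel
        parity : out bit);    -- XOR of all data bits
end mux_parity;

architecture behavior of mux_parity is
begin
  process (sel, data)
    variable p : bit;
  begin
    -- Exactly one branch is executed; others must come last.
    case sel is
      when "00"   => y <= data(0);
      when "01"   => y <= data(1);
      when "10"   => y <= data(2);
      when others => y <= data(3);
    end case;

    -- Labeled for loop: the iteration scheme walks over the index range.
    p := '0';
    xor_bits : for i in data'range loop
      p := p xor data(i);
    end loop xor_bits;
    parity <= p;
  end process;
end behavior;
```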
supports the verification and testing of hardware designs, the communication of hardware design and test verification data, the maintenance, modification and procurement of hardware system.
2.10 Compilation: Compilation is the process of analyzing a source file. Analyzed design units contained within the file are placed into the working library in a format understandable to the simulator. In Active-HDL, a source file can be one of the following:
VHDL file (.vhd)
Verilog file (.v)
EDIF netlist file
State diagram file (.asf)
Block diagram file (.bde)
In the case of a block or state diagram file, the compiler analyzes the intermediate VHDL, Verilog, or EDIF file containing the HDL code (or netlist) generated from the diagram. A netlist is a set of statements that specifies the elements of a circuit (for example, transistors or gates) and their interconnection. Active-HDL provides three compilers, for VHDL, Verilog, and EDIF respectively. When you choose a menu command or toolbar button for compilation, Active-HDL automatically employs the compiler appropriate for the type of the source file being compiled.
2.11 Simulation: The purpose of simulation is to verify that the circuit works as desired. The Active-HDL simulator provides two simulation engines:
Event-Driven Simulation
Cycle-Based Simulation
The simulator supports hybrid simulation: some portions of a design can be simulated in the event-driven kernel while the others are simulated in the cycle-based kernel. Cycle-based simulation is significantly faster than event-driven simulation.
2.12 XILINX:
Integrated Software Environment (ISE) is the Xilinx design software suite. This overview explains the general progression of a design through ISE from start to finish.
ISE enables you to start your design with any of a number of different source types, including:
HDL (VHDL, Verilog HDL, ABEL)
Schematic design files
EDIF
NGC/NGO
State Machines
IP Cores
From your source files, ISE enables you to quickly verify the functionality of these sources using the integrated simulation capabilities, including ModelSim Xilinx Edition and the HDL Bencher test bench generator. HDL sources may be synthesized using the Xilinx Synthesis Technology (XST) as well as partner synthesis engines used standalone or integrated into ISE. The Xilinx implementation tools continue the process into a placed and routed FPGA or fitted CPLD, and finally produce a bit stream for your device configuration.
Design Entry:
ISE Text Editor - The ISE Text Editor is provided in ISE for entering design code and viewing reports.
Schematic Editor - The Engineering Capture System (ECS) is a graphical user interface (GUI) that allows you to create, view, and edit schematics and symbols for the Design Entry step of the Xilinx design flow.
CORE Generator - The CORE Generator System is a design tool that delivers parameterized cores optimized for Xilinx FPGAs, ranging in complexity from simple arithmetic operators such as adders to system-level building blocks such as filters, transforms, FIFOs, and memories.
Constraints Editor - The Constraints Editor allows you to create and modify the most commonly used timing constraints.
PACE - The Pinout and Area Constraints Editor (PACE) allows you to view and edit I/O, Global logic, and Area Group constraints.
StateCAD State Machine Editor - StateCAD allows you to specify states, transitions, and actions in a graphical editor. The state machine is then generated in HDL.
Implementation:
Translate - The Translate process runs NGDBuild to merge all of the input netlists as well as design constraint information into a Xilinx database file.
Map - The Map program maps a logical design to a Xilinx FPGA.
Place and Route (PAR) - The PAR program accepts the mapped design, places and routes the FPGA, and produces output for the bit stream generator.
Floorplanner - The Floorplanner allows you to view a graphical representation of the FPGA, and to view and modify the placed design.
FPGA Editor - The FPGA Editor allows you to view and modify the physical implementation, including routing.
Timing Analyzer - The Timing Analyzer provides a way to perform static timing analysis on FPGA and CPLD designs. With Timing Analyzer, analysis can be performed immediately after mapping, placing, or routing an FPGA design, and after fitting and routing a CPLD design.
Fit (CPLD only) - The CPLDFit process maps one or more netlists into specified devices and creates the JEDEC programming file.
Chip Viewer (CPLD only) - The Chip Viewer tool provides a graphical view of the inputs and outputs, macrocell details, equations, and pin assignments.
Device Download and Program File Formatting:
BitGen - The BitGen program receives the placed and routed design and produces a bit stream for Xilinx device configuration.
iMPACT - The iMPACT tool generates various programming file formats, and subsequently allows you to configure your device.
XPower - XPower enables you to interactively and automatically analyze power consumption for Xilinx FPGA and CPLD devices.
Integration with ChipScope Pro.
CHAPTER 3: Introduction to SEA
Most present symmetric encryption algorithms result from a tradeoff between implementation cost and resulting performance. In addition, they generally aim to be implemented efficiently on a large variety of platforms. In this paper, we take an opposite approach and consider a context where we have very limited processing resources and throughput requirements. For this purpose, we propose low-cost encryption routines (i.e. with small code size and memory) targeted at processors with a limited instruction set (i.e. AND, OR, XOR gates, word rotation and modular addition). The proposed design is parametric in the text, key and processor size, and allows efficient combination of encryption/decryption and on-the-fly key derivation; its security against a number of recent cryptanalytic techniques is discussed. Target applications for such routines include any context requiring low-cost encryption and/or authentication.
In this paper, we consequently consider a general context where we have very limited processing resources (e.g. a small processor) and throughput requirements. It yields design criteria such as low memory requirements, small code size, and a limited instruction set. In addition, we propose flexibility as another, unusual design principle: SEA$_{n,b}$ is parametric in the text, key and processor size. Such an approach was motivated by the fact that many algorithms behave differently on different platforms (e.g. 8-bit or 32-bit processors). By contrast, SEA$_{n,b}$ makes it possible to obtain a small encryption routine targeted at any given processor, the security of the cipher being adapted as a function of its key size. Beyond these general guidelines, additional features were wanted, including the efficient combination of encryption and decryption and the ability to derive keys on the fly. Those goals are particularly relevant in contexts where the same constrained device has to perform both encryption and decryption operations (e.g. authentication). Finally, the simplicity of SEA$_{n,b}$ makes its implementation straightforward.
Embedded applications such as building infrastructures present a significant opportunity and challenge for such new cryptosystems. For example, introducing programmability into the configuration of lights and switches, thermostats and air handlers promises to improve the cost of construction, flexibility in occupancy, and energy efficiency of buildings.
But meeting this demand on a scale compatible with the economics of the construction industry will require secure, lightweight implementations of peer-to-peer networks in resource-constrained systems. The Internet-0 approach to end-to-end modulation for interdevice internetworking is typically appropriate in this limit [20]. SEA$_{n,b}$ constitutes a suitable solution for low-cost encryption/authentication within such networks. RFIDs and other power- or space-limited applications are similarly targeted.
3.1 Specifications:
Parameters and Definitions: SEA$_{n,b}$ operates on various text, key and word sizes. It is based on a Feistel structure with a variable number of rounds, and is defined with respect to the following parameters:
$n$: plaintext size, key size.
$b$: processor (or word) size.
$n_b = \frac{n}{2b}$: number of words per Feistel branch.
$n_r$: number of block cipher rounds.
As the only constraint, it is required that $n$ is a multiple of $6b$. For example, using an 8-bit processor, we can derive 48-, 96-, 144-, ... bit block ciphers, respectively denoted as SEA$_{48,8}$, SEA$_{96,8}$, SEA$_{144,8}$, ...
Let $x$ be an $\frac{n}{2}$-bit vector. In the following, we will consider two representations:
Bit representation: $x_b = x(\frac{n}{2}-1)\; x(\frac{n}{2}-2) \ldots x(2)\; x(1)\; x(0)$.
Word representation: $x_W = x_{n_b-1}\; x_{n_b-2} \ldots x_2\; x_1\; x_0$.
Basic Operations: Due to its simplicity constraints, SEA$_{n,b}$ is based on a limited number of elementary operations (selected for their availability in any processing device), denoted as follows: (1) bitwise XOR $\oplus$, (2) substitution box $S$, (3) word (left) rotation $R$ and inverse word rotation $R^{-1}$, (4) bit rotation $r$, (5) addition mod $2^b$ $\boxplus$. These operations are formally defined as follows:
1. Bitwise XOR $\oplus$: The bitwise XOR is defined on $\frac{n}{2}$-bit vectors:
\[ \oplus : \mathbb{Z}_2^{n/2} \times \mathbb{Z}_2^{n/2} \to \mathbb{Z}_2^{n/2} : x, y \mapsto z = x \oplus y \;\Leftrightarrow\; z(i) = x(i) \oplus y(i), \quad 0 \le i \le \tfrac{n}{2} - 1 \]
2. Substitution Box $S$: SEA$_{n,b}$ uses the following 3-bit substitution table: ST := {0, 5, 6, 7, 4, 3, 1, 2}, in C-like notation. For efficiency purposes, it is applied bitwise to any set of three words of data using the following recursive definition:
\[ S : \mathbb{Z}_{2^b}^{n_b} \to \mathbb{Z}_{2^b}^{n_b} : x \mapsto x = S(x) \;\Leftrightarrow\; \begin{cases} x_{3i} = (x_{3i+2} \wedge x_{3i+1}) \oplus x_{3i}, \\ x_{3i+1} = (x_{3i+2} \wedge x_{3i}) \oplus x_{3i+1}, \\ x_{3i+2} = (x_{3i} \vee x_{3i+1}) \oplus x_{3i+2}, \end{cases} \quad 0 \le i \le \tfrac{n_b}{3} - 1 \]
where $\wedge$ and $\vee$ respectively represent the bitwise AND and OR.
3. Word Rotation $R$: The word rotation is defined on $n_b$-word vectors:
\[ R : \mathbb{Z}_{2^b}^{n_b} \to \mathbb{Z}_{2^b}^{n_b} : x \mapsto y = R(x) \;\Leftrightarrow\; y_{i+1} = x_i, \; 0 \le i \le n_b - 2, \quad y_0 = x_{n_b-1} \]
4. Bit Rotation $r$: The bit rotation is defined on $n_b$-word vectors:
\[ r : \mathbb{Z}_{2^b}^{n_b} \to \mathbb{Z}_{2^b}^{n_b} : x \mapsto y = r(x) \;\Leftrightarrow\; y_{3i} = x_{3i} \ggg 1, \; y_{3i+1} = x_{3i+1}, \; y_{3i+2} = x_{3i+2} \lll 1, \quad 0 \le i \le \tfrac{n_b}{3} - 1 \]
where $\ggg$ and $\lll$ represent the cyclic right and left shifts inside a word.
5. Addition mod $2^b$ $\boxplus$: The mod $2^b$ addition is defined on $n_b$-word vectors:
\[ \boxplus : \mathbb{Z}_{2^b}^{n_b} \times \mathbb{Z}_{2^b}^{n_b} \to \mathbb{Z}_{2^b}^{n_b} : x, y \mapsto z = x \boxplus y \;\Leftrightarrow\; z_i = x_i \boxplus y_i, \quad 0 \le i \le n_b - 1 \]
The Round and Key Round: Based on the previous definitions, the encrypt round $F_E$, decrypt round $F_D$ and key round $F_K$ are pictured in Figure 1 and defined as the functions $F : \mathbb{Z}_{2^{n/2}}^{2} \times \mathbb{Z}_{2^{n/2}} \to \mathbb{Z}_{2^{n/2}}^{2}$ such that:
\[ [L_{i+1}, R_{i+1}] = F_E(L_i, R_i, K_i) \;\Leftrightarrow\; R_{i+1} = R(L_i) \oplus r\big(S(R_i \boxplus K_i)\big), \quad L_{i+1} = R_i \]
\[ [L_{i+1}, R_{i+1}] = F_D(L_i, R_i, K_i) \;\Leftrightarrow\; R_{i+1} = R^{-1}\big(L_i \oplus r(S(R_i \boxplus K_i))\big), \quad L_{i+1} = R_i \]
\[ [KL_{i+1}, KR_{i+1}] = F_K(KL_i, KR_i, C_i) \;\Leftrightarrow\; KR_{i+1} = KL_i \oplus R\big(r(S(KR_i \boxplus C_i))\big), \quad KL_{i+1} = KR_i \]
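The basic operations and the right-branch update of the encrypt round can be sketched as VHDL functions. This is only an illustrative sketch and not the architecture of this report: the package name `sea_sketch`, the function names, and the fixed constants (b = 8 and nb = 3, i.e. SEA with n = 48) are assumptions; a fully generic design would pass these sizes as generics instead.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

package sea_sketch is
  -- Assumed parameters: b = 8 and nb = n/(2b) = 3, i.e. n = 48.
  constant WORD_BITS : positive := 8;
  constant WORDS     : positive := 3;   -- must be a multiple of 3

  subtype word_t is unsigned(WORD_BITS-1 downto 0);
  type branch_t is array (0 to WORDS-1) of word_t;  -- element i is word x_i

  function sbox  (x : branch_t) return branch_t;     -- S
  function rot_w (x : branch_t) return branch_t;     -- R
  function rot_b (x : branch_t) return branch_t;     -- r
  function add_w (x, y : branch_t) return branch_t;  -- addition mod 2^b
  function xor_w (x, y : branch_t) return branch_t;  -- bitwise XOR
  -- New right branch of the encrypt round; the new left branch is old_r.
  function enc_right (old_l, old_r, key : branch_t) return branch_t;
end package sea_sketch;

package body sea_sketch is
  function sbox (x : branch_t) return branch_t is
    variable y : branch_t := x;
  begin
    -- Recursive definition: each line reuses the words updated just before.
    for i in 0 to WORDS/3 - 1 loop
      y(3*i)   := (y(3*i+2) and y(3*i+1)) xor y(3*i);
      y(3*i+1) := (y(3*i+2) and y(3*i))   xor y(3*i+1);
      y(3*i+2) := (y(3*i)   or  y(3*i+1)) xor y(3*i+2);
    end loop;
    return y;
  end function sbox;

  function rot_w (x : branch_t) return branch_t is
    variable y : branch_t;
  begin
    for i in 0 to WORDS-2 loop
      y(i+1) := x(i);
    end loop;
    y(0) := x(WORDS-1);
    return y;
  end function rot_w;

  function rot_b (x : branch_t) return branch_t is
    variable y : branch_t := x;          -- words 3i+1 pass through unchanged
  begin
    for i in 0 to WORDS/3 - 1 loop
      y(3*i)   := rotate_right(x(3*i), 1);
      y(3*i+2) := rotate_left(x(3*i+2), 1);
    end loop;
    return y;
  end function rot_b;

  function add_w (x, y : branch_t) return branch_t is
    variable z : branch_t;
  begin
    for i in branch_t'range loop
      z(i) := x(i) + y(i);               -- unsigned "+" wraps modulo 2^b
    end loop;
    return z;
  end function add_w;

  function xor_w (x, y : branch_t) return branch_t is
    variable z : branch_t;
  begin
    for i in branch_t'range loop
      z(i) := x(i) xor y(i);
    end loop;
    return z;
  end function xor_w;

  function enc_right (old_l, old_r, key : branch_t) return branch_t is
  begin
    return xor_w(rot_w(old_l), rot_b(sbox(add_w(old_r, key))));
  end function enc_right;
end package body sea_sketch;
```

A quick way to check the bitslice substitution box is to feed it the words 10101010, 11001100 and 11110000 as x0, x1 and x2, so that bit column j carries the 3-bit value j. The result is 01101010, 10101100 and 00011110, whose columns read 0, 5, 6, 7, 4, 3, 1, 2, matching the table ST.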
The Complete Cipher: The cipher iterates an odd number $n_r$ of rounds. The following pseudo-C code encrypts a plaintext P under a key K and produces a ciphertext C. P, C and K have a parametric bit size $n$. The operations within the cipher are performed considering parametric $b$-bit words.
C = SEAn,b(P, K)
{
  % initialization:
  L_0 & R_0 = P;
  KL_0 & KR_0 = K;
  % key scheduling:
  for i in 1 to floor(nr/2)
    [KL_i, KR_i] = FK(KL_{i-1}, KR_{i-1}, C(i));
  switch KL_{floor(nr/2)}, KR_{floor(nr/2)};
  for i in ceil(nr/2) to nr - 1
    [KL_i, KR_i] = FK(KL_{i-1}, KR_{i-1}, C(nr - i));
  % encryption:
  for i in 1 to ceil(nr/2)
    [L_i, R_i] = FE(L_{i-1}, R_{i-1}, KR_{i-1});
  for i in ceil(nr/2) + 1 to nr
    [L_i, R_i] = FE(L_{i-1}, R_{i-1}, KL_{i-1});
  % final:
  C = R_nr & L_nr;
  switch KL_{nr-1}, KR_{nr-1};
},
where & is the concatenation operator, KR_{floor(nr/2)} is taken before the switch, and C(i) is an $n_b$-word vector of which all the words have value 0 except the LSW, which equals i. Decryption is exactly the same, using the decrypt round $F_D$.
3.2 Design Properties of the Components
Substitution Box S: The substitution box was searched exhaustively in order to meet the following security and efficiency criteria: $\lambda$-parameter of 1/2; $\delta$-parameter of 1/4; maximum nonlinear order, namely 2; recursive definition; minimum number of instructions. Remark that, if 3-operand instructions are available, the recursive definition allows the substitution box to be computed in 2 operations per word of data. As a comparison, the $3 \times 3$ bitwise substitution box used in 3-WAY [15] requires 3. The counterpart of this efficiency is the presence of two fixed points in the table.
Bit and Word Rotations r and R: The cyclic rotations were defined in order to provide predictable low-cost diffusion within the cipher, when combined with the bitslice substitution box. This is illustrated in Figure 2 for a single substitution box scheme with parameters $n = 48$, $b = 8$, $n_b = 3$. Looking at the figure, it can be seen that SEA$_{n,b}$ divides its data into $2n_b/3$ blocks of 3 words. The substitution box is applied in parallel to these blocks. Therefore, the diffusion process (starting with one single active bit in the left branch) is divided into two steps. The first phase is obtained by the combination of the word rotation $R$ (which is the only transform to provide inter-word diffusion) with the substitution box. It requires at most $n_b$ rounds to be completed (in our example, $n_b = 3$, which yields 3 rounds). Once every word has at least one active bit, the combination of $r$ and $S$ yields six more active bits per block in each round. Therefore, finishing the diffusion of all the blocks requires at most $\lfloor b/2 \rfloor$ rounds. Combining these observations, the diffusion is complete after $n_b + \lfloor b/2 \rfloor$ rounds.
Addition mod $2^b$ $\boxplus$: Using a mod $2^b$ key addition in place of a bitwise XOR was motivated by different reasons: (1) improvement of the diffusion process, (2) improvement of the non-linearity, (3) same cost/speed as the bitwise XOR in most processors, (4) necessity to avoid structural attacks.
3.3 Overall Structure: The overall structure of the cipher follows the Feistel strategy. However, a few points are specific to SEA$_{n,b}$, namely the key schedule and the position of $R$, $R^{-1}$ in the encrypt/decrypt rounds. The key schedule is designed such that the master key is encrypted during half the rounds and decrypted during the other half. This yields a particular structure of the sequence of round keys such that the key expansion is exactly the same in encryption and decryption. Namely, we have:
$K_0, K_1, K_2, \ldots, K_{\lfloor n_r/2 \rfloor}, K_{\lfloor n_r/2 \rfloor - 1}, \ldots, K_2, K_1, K_0$
As a consequence of this structure, the encryption/decryption rounds cannot keep the traditional Feistel structure: it would result in having identical encryption and decryption functions. This is the reason for moving the word rotation to the left branch of the Feistel round.
3.4 Security Analysis
Resistance Against Known Attacks
Linear and Differential Cryptanalysis: From the properties of the substitution box, we can compute bounds for the best linear and differential characteristics through the cipher. We first use the following lemma [29]:
Lemma 1. Let $f$ be the bijective nonlinear function of a 3-round Feistel cipher.
Assuming that the linear parameter of $f$ is smaller than $\lambda$ and its differential parameter is smaller than $\delta$, then the linear and differential parameters of the 3-round cipher, $\Lambda$ and $\Delta$, are respectively smaller than $\lambda^2$ and $\delta^2$.
Since our nonlinear function $S$ has $\delta$-parameter $2^{-2}$ and $\lambda$-parameter $2^{-1}$, it implies that 3 rounds of SEA$_{n,b}$ have their differential and linear parameters respectively bounded by $\Delta < 2^{-4}$ and $\Lambda < 2^{-2}$. However, for an $n$-bit block cipher, it is respectively required that $\Delta \approx 2^{-n}$ and $\Lambda \approx 2^{-n/2}$ to resist against differential [4] and linear cryptanalysis [28]. In order to approach these bounds, we require that:
$\delta^{2n_r/3} = \left(2^{-2}\right)^{2n_r/3} < 2^{-n}$ and $\lambda^{2n_r/3} = \left(2^{-1}\right)^{2n_r/3} < 2^{-n/2}$. (1)
In both cases, the required number of rounds is: $n_r \ge 3n/4$. We note that we used a hybrid approach, between the provable security against linear and differential attacks that consists in bounding the parameter of the best differential/hull, as in Lemma 1, and the usual heuristics to estimate the best linear/differential characteristic through a cipher (as in the previous estimation for $n_r$). In fact, the strategy of Equation (1) is similar to the one of e.g. the AES Rijndael [17], but we only assume one active s-box per round.
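The step from Equation (1) to the bound on the number of rounds can be written out explicitly. The derivation below is a worked sketch that only assumes what the text states: a differential parameter of $2^{-2}$ and a linear parameter of $2^{-1}$ for the substitution box, with Lemma 1 applied once per group of three rounds. Up to the boundary case, both conditions give the quoted bound; for SEA$_{48,8}$ this means at least 36 rounds.

```latex
\begin{align*}
\left(2^{-2}\right)^{2n_r/3} < 2^{-n}
  &\iff \frac{4 n_r}{3} > n
  \iff n_r > \frac{3n}{4}, \\
\left(2^{-1}\right)^{2n_r/3} < 2^{-n/2}
  &\iff \frac{2 n_r}{3} > \frac{n}{2}
  \iff n_r > \frac{3n}{4}.
\end{align*}
```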
Extensions of Linear and Differential Cryptanalysis: Classical extensions of linear and differential cryptanalysis are non-linear approximations of outer rounds [26], bi-linear cryptanalysis [14], differential-linear cryptanalysis [27], multiple linear cryptanalysis [22, 10], and the boomerang [31] and rectangle [8] attacks. However, these extensions usually bring only a small improvement over the basic attacks. As a matter of fact, non-linear approximations of outer rounds improve the bias of one or two rounds only. Regarding bi-linear cryptanalysis, we quote the author of [14]: "For ciphers similar to DES, based on small substitution boxes, we claim that bilinear cryptanalysis is very closely related to LC, and we do not expect to find a bi-linear attack much faster than by LC." It is difficult to evaluate the efficiency of multiple linear cryptanalysis, but it seems more promising for big substitution boxes (as mentioned in [22]). Moreover, the improvement on classical cryptanalysis obtained in [10] for the case of DES (which shares with SEA$_{n,b}$ a Feistel structure and a poor diffusion) is limited. Finally, the complexity of differential-linear cryptanalysis and of the boomerang attack and its variants is inherently greater than that of the basic attacks. As an example, the boomerang (or rectangle) attack allows us to use two short differentials instead of a long one, but using a long differential with probability $pq$ is in general highly preferable to applying a boomerang attack with two short differentials of probability $p$ and $q$. Therefore, although these attacks can perform slightly better in specific cases, the expected improvement is never outstanding. The conclusion is that these extensions deserve to be considered in the estimation of the number of rounds necessary to achieve security, but that a reasonable multiplicative factor should be enough to take them into account.
A Dedicated Related-Key Attack Against a Modified Version.
For $x \in \mathbb{Z}_{2^b}^{n_b}$, we denote by $x \lll a$ the left rotation by $a$ bits of each of the $n_b$ words of $x$. The non-linear and diffusion layers have the following properties:
$S(x \lll a) = S(x) \lll a, \quad r(x \lll a) = r(x) \lll a, \quad R(x \lll a) = R(x) \lll a$
Consider a modified version of our cipher where key addition is performed using $\oplus$ rather than modular addition, and where all round constants $C_i$ are such that $C_i \lll a = C_i$, e.g. all $C_i$'s equal 0. As a consequence of the previous observations, the modified round $F'_E$ and the key round $F_K$ satisfy:
$F'_E(L \lll a, R \lll a, K \lll a) = F'_E(L, R, K) \lll a$
$F_K(KL \lll a, KR \lll a, 0) = F_K(KL, KR, 0) \lll a$
These properties are iterative, in the sense that they also hold for the composition of several block cipher rounds. It is immediate to deduce from them a distinguisher on the modified cipher, which requires 2 chosen encryption queries under 2 related keys $K$ and $K \lll a$. In the actual SEA$_{n,b}$, the key addition is performed word-wise mod $2^b$. As the property $(X \lll a) \boxplus (K \lll a) = (X \boxplus K) \lll a$ is prevented by certain carry propagations, it only holds with a probability $p$, which depends on $a$ and the word size $b$. For $a = 1$, $p$ rapidly converges to 3/8 as $b$ grows. It is smaller for $1 < a < b-1$. Of course, this probability is averaged over all possible $(X, K)$, and certain keys (e.g. all zeroes) yield no carry propagation at all. However, the design properties of the key schedule prevent SEA$_{n,b}$ from having such weak keys. Moreover, the round constants $C_i$ are generally not such that $C_i \lll a = C_i$ (because they are generated from a counter). Combined with the diffusion in the key schedule, it implies that the similarity between the round keys derived from $K$ and those derived from $K \lll a$ rapidly vanishes. These properties prevent this structural distinguisher from propagating through more than a few rounds of SEA$_{n,b}$.
Square Attacks: We explored square attacks [16] on SEA$_{48,8}$. More precisely, we considered all possible sets of inputs to one branch of the Feistel structure, where the input to some of the substitution boxes is active (i.e. takes all possible input values the same number of times), and the input to the other substitution boxes is constant. The other branch is also constant. Therefore the number of plaintexts considered goes from $2^3$ (when the input to only one substitution box is active) to $2^{21}$ (when the input to 7 substitution boxes is active). Our experiments showed that square attacks do not pass through more rounds than the diffusion pattern illustrated in Figure 2. It is expected that this remains the same when different parameters $n$ and $b$ are considered, which implies that $n_b + \lfloor b/2 \rfloor$ rounds are enough to prevent square attacks. Note that although our observations also hold for SEAn,b, the use of addition mod $2^b$ provides better resistance against square attacks.
Truncated and Impossible Differentials: As for square attacks, the diffusion analysis illustrated in Figure 2 provides an estimation of the number of rounds required to prevent truncated differential attacks [25]. Impossible differentials [7] are usually built by concatenating two incompatible truncated differentials. As a consequence, we estimate the number of rounds necessary to prevent the construction of an impossible differential distinguisher as $2 \cdot (n_b + \lfloor b/2 \rfloor)$.
Interpolation Attacks: The interpolation attack [21] is possible when the whole cipher can be written as a relatively simple algebraic expression. It requires the substitution box to have a compact expression, and the diffusion layer to permit the composition of these expressions. In the case of SEA$_{n,b}$, there is a priori no such expression, and the bitwise diffusion would make the combination of algebraic expressions difficult anyway.
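The rotation property behind the dedicated related-key attack, and the reason it fails for modular addition, can be checked numerically. The following sketch assumes $b = 8$ and $a = 1$; the helper names are hypothetical. It confirms that the bitslice substitution box commutes with a rotation applied to every word, and it counts exhaustively how often word-wise addition mod $2^b$ commutes with that rotation.

```python
B = 8                       # word size b (assumed)
MASK = (1 << B) - 1


def rol(w, a=1):
    """Cyclic left shift of a b-bit word by a bits."""
    return ((w << a) | (w >> (B - a))) & MASK


def sbox3(x0, x1, x2):
    """Recursive bitslice S-box on one group of three words."""
    x0 ^= x2 & x1
    x1 ^= x2 & x0
    x2 ^= x0 | x1
    return x0, x1, x2


# S(x <<< 1) == S(x) <<< 1 on a sample of word triples
s_commutes = all(
    sbox3(rol(x0), rol(x1), rol(x2)) == tuple(rol(w) for w in sbox3(x0, x1, x2))
    for x0 in range(0, 256, 5)
    for x1 in range(0, 256, 7)
    for x2 in range(0, 256, 11)
)

# XOR always commutes with the rotation ...
xor_commutes = all(
    (rol(x) ^ rol(k)) == rol(x ^ k)
    for x in range(1 << B) for k in range(1 << B)
)

# ... while addition mod 2^b only does so when no carry interferes.
hits = sum(
    ((rol(x) + rol(k)) & MASK) == rol((x + k) & MASK)
    for x in range(1 << B) for k in range(1 << B)
)
p = hits / (1 << (2 * B))
print(s_commutes, xor_commutes, p)
```

For $b = 8$ the measured probability is slightly above 3/8, in line with the convergence described in the text.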
Slide Attacks: The sequence of round keys of SEA$_{n,b}$ is the same as the one of ICEBERG. Therefore the analysis done in [30] is still valid. Namely, the non-periodicity of the sequence should make slide attacks [11, 12] irrelevant. The particular structure of this sequence also has some similarities with the one of GOST, whose vulnerability to slide attacks is examined in [12]. None of the attacks presented in [12] seems to be applicable to our cipher.
Related-Key Attacks: The first related-key attack has been described in [5]. It is the related-key counterpart of the slide attack. Such an attack is applicable when a round key $K_i$ is computed from the previous round key $K_{i-1}$ using a function $f$ which is always the same: $K_i = f(K_{i-1})$. However, in the case of SEA$_{n,b}$, a round constant that changes for each key round is used, which prevents this attack. Another type of related-key attack is the differential related-key attack [23, 24]. The non-linearity of the SEA$_{n,b}$ key schedule should prevent it. Moreover, note that the improvement of the differential related-key attack over classical differential cryptanalysis usually results from the fact that choosing a given round key difference makes it possible to counter the effect of the diffusion layer on the differential characteristic; a typical example is the attack on 3-WAY [24]. As the security of SEA$_{n,b}$ against differential cryptanalysis results from its large number of rounds rather than from its diffusion, this effect is not relevant here.
Complementation Properties: The DES has the following complementation property: if $P \xrightarrow{K} C$ denotes the fact that encryption of $P$ under key $K$ gives ciphertext $C$, then: $P \xrightarrow{K} C \Rightarrow \overline{P} \xrightarrow{\overline{K}} \overline{C}$. The non-linear key scheduling and the presence of carry propagations in the actual SEA$_{n,b}$ algorithm prevent this property. We are not aware of any other similar structural feature in the design.
Algebraic Attacks: Algebraic attacks intend to exploit the simple algebraic structure of a block cipher. For example, certain block ciphers can be written as an overdefined system of quadratic equations. Reference [13] argues that a method called XSL might provide a way to effectively solve this type of equations and recover the key from a few plaintext-ciphertext pairs. Clearly, SEA$_{n,b}$ has a simple algebraic structure, as it is based on a 3-bit substitution box. Therefore, if such an attack practically applies to a cipher like Serpent [1], it is likely applicable to one of the versions of our routines. As the complexity of XSL is supposedly polynomial in the plaintext size and number of rounds, this is especially true when those values increase. However, as the criteria for these techniques to be successful are still being discussed [9], we consider this latter point as a scope for further research. We note that resistance against algebraic attacks would anyway exclude the use of small substitution boxes and therefore the possibility to build very low cost encryption routines.
3.5 Performance Analysis: SEA$_{n,b}$ is targeted at low-cost processors, with little code size and a small instruction set. However, the simple structure of SEA$_{n,b}$ makes it easy to implement on any processor. In appendix, we propose a pseudo-assembly code of an encryption/decryption design with on-the-fly key scheduling. The implementation objectives were, in decreasing order of importance: (1) low RAM and register usage, (2) low code size and (3) speed. It is based on the following (very) reduced instruction set (assuming 2-operand instructions only): arithmetic and logic operators (XOR, AND, OR, addition mod $2^b$, left and right cyclic shifts); branch instructions (goto, subroutine call and return); comparison, load RAM in register, store register in RAM.
According to the code in appendix, the performances can be roughly estimated as follows. First, the combined number of RAM words and registers equals $5n_b + 3$. Then, the code size and implementation time (both expressed in ops.) are evaluated by summing the values given in appendix. For the code size, it directly yields $31n_b + 36$ ops. For the implementation time, the round and key round respectively require $12n_b + 11$ ops. and $10n_b + 11$ ops. It yields a total of $(n_r - 1) \times (12n_b + 11 + 10n_b + 11 + 7) + (12n_b + 11) + 8n_b + 7$. These values are summarized in Table 1. Remark that, due to the particular structure of the key scheduling, we do not need to keep the master key in memory as, at the end of an encryption/decryption, we have $K_{n_r-1} = K_0$. Remark also that this implementation uses a low number of registers, namely $n_b + 3$. However, if more registers are available, they can be traded for RAM words, which will result in lower code size and faster implementation.
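The on-the-fly key scheduling and the remark that the master key is recovered at the end of an encryption or decryption can be illustrated with a compact Python sketch of the complete cipher. It is not the appendix code: it assumes SEA$_{48,8}$ parameters, an assumed odd round count of 51, branches stored as lists of words with index 0 as the least significant word, and hypothetical function names. It carries no test vectors, so it only demonstrates the structure (round trip and key recovery), not conformance to a reference implementation.

```python
B, NB = 8, 3                      # assumed SEA_{48,8}: b = 8, n_b = 3
MASK = (1 << B) - 1


def sbox(x):
    x = list(x)
    for i in range(0, len(x), 3):
        x[i] ^= x[i + 2] & x[i + 1]
        x[i + 1] ^= x[i + 2] & x[i]
        x[i + 2] ^= x[i] | x[i + 1]
    return x


def bit_rot(x):
    y = list(x)
    for i in range(0, len(x), 3):
        y[i] = ((x[i] >> 1) | (x[i] << (B - 1))) & MASK
        y[i + 2] = ((x[i + 2] << 1) | (x[i + 2] >> (B - 1))) & MASK
    return y


def g(r, k):
    """r(S(r [+] k)): the part shared by the three round functions."""
    return bit_rot(sbox([(a + b) & MASK for a, b in zip(r, k)]))


def xor(x, y):
    return [a ^ b for a, b in zip(x, y)]


def rot(x):                       # word rotation R
    return [x[-1]] + x[:-1]


def rot_inv(x):                   # inverse word rotation
    return x[1:] + [x[0]]


def const(i):                     # C(i): all words 0 except the LSW = i
    return [i & MASK] + [0] * (NB - 1)


def sea(l, r, kl, kr, nr, decrypt=False):
    """Run nr rounds, computing each round key just before it is needed.

    Returns (left, right, KL, KR): the output block and the key state
    after the final switch, which equals the master key again.
    """
    half = nr // 2                            # floor(nr/2), nr odd
    for i in range(1, nr + 1):
        k = kr if i <= half + 1 else kl       # KR_{i-1}, then KL_{i-1}
        f = g(r, k)
        if decrypt:
            l, r = r, rot_inv(xor(l, f))      # round F_D
        else:
            l, r = r, xor(rot(l), f)          # round F_E
        if i == nr:
            break
        if i == half + 1:
            kl, kr = kr, kl                   # switch at index floor(nr/2)
        c = const(i if i <= half else nr - i)
        kl, kr = kr, xor(kl, rot(g(kr, c)))   # key round F_K
    return r, l, kr, kl                       # C = R_nr & L_nr, key switch


plain = ([1, 2, 3], [4, 5, 6])
key = ([10, 20, 30], [40, 50, 60])
cl, cr, kl_end, kr_end = sea(plain[0], plain[1], key[0], key[1], 51)
pl, pr, _, _ = sea(cl, cr, key[0], key[1], 51, decrypt=True)
print((pl, pr) == plain, (kl_end, kr_end) == key)
```

Only the current key state is kept in memory: the first half of the key rounds encrypts the master key and the second half decrypts it, so the same routine can be called again with the returned key state.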
For illustration purposes, we implemented SEA$_{n,b}$ on Atmel AVR ATtiny [3] and ARM [2] microprocessors. The Atmel ATtiny represents a typical target for such a low-cost encryption routine. We chose the ARM platform in order to provide rough comparisons between SEA$_{n,b}$ and the AES Rijndael. While direct comparisons are made difficult by their high dependency on the target devices, the following general comments can be made. SEA$_{n,b}$ designs combine encryption and decryption more efficiently than most other encryption algorithms. In particular, key agility in decryption is usually not possible (e.g. for the AES Rijndael). The combined number of RAM words and registers of SEA$_{n,b}$ implementations (i.e. $5n_b + 3$) is generally lower than for other block ciphers. The code size of SEA$_{n,b}$ is generally lower than for other block ciphers implemented on similar platforms. The flexibility of SEA$_{n,b}$ also makes it less sensitive to the choice of a processor than fixed-size algorithms, although it is obvious that large buses improve efficiency. The drawback of these limited resources is the number of cycles required for the encryption (i.e. SEA$_{n,b}$ trades space for time, which may be relevant given the speed of present processors). Looking at the code size-cycles product, the efficiency of SEA$_{n,b}$ remains similar to that of Rijndael (encryption only), which is well known for its efficient smart card implementations.
CH:4 AN EXPOSITION OF THE SEA ALGORITHM
The Schoof-Elkies-Atkin algorithm is an efficient way to count the number of points on an elliptic curve defined over a large prime field. This expository paper describes the algorithm in sufficient detail to allow a reader not familiar with arithmetic geometry to implement the algorithm. The mathematical background for the technique is then given.
Let $p$ be a large (odd) prime and let $E : y^2 = x^3 + a_4 x + a_6$ be an elliptic curve, where $a_4$ and $a_6$ are given fixed integers. In the case where $p$ does not divide $4a_4^3 + 27a_6^2$, $E$ can be reduced to an elliptic curve over $\mathbb{F}_p$. The number of points of $E$ over $\mathbb{F}_p$, denoted by $\#E(\mathbb{F}_p)$, is of cryptographic interest, since the properties of this number determine the security of elliptic curve cryptosystems based on $E$ against various known attacks. The first polynomial time algorithm for determining the number of rational points on an elliptic curve defined over a finite field is due to Schoof. He used calculations with torsion points on the curve to arrive at the number of points. At first Schoof's algorithm was considered impractical, but Elkies suggested the use of "good" primes (now known as Elkies primes), where isogenies and modular curves can be involved to speed up the calculation. Atkin also made a number of important contributions to the algorithm, which then became known as the Schoof-Elkies-Atkin (SEA) algorithm. Further improvements were later proposed by Dewaghe and Couveignes-Dewaghe-Morain. The SEA algorithm was implemented by Morain, Müller, and Izu et al.
Schoof's seminal paper [18] describes the original algorithm. He later also published a paper [19] that is a lovely overview of the developments in the subject up to 1995. Elkies' paper [9] describes the ideas of his original manuscript [8] and contains many other theoretical insights and illuminating examples. The implementations of Morain and Müller are described in [15] and [16]. The implementation of Izu, Kogure, Noro and Yokoyama, which focuses on speeding up the algorithm as much as possible, is described in [13]. Dewaghe's improvement is published in [7]. The improvement by Couveignes-Dewaghe-Morain is published in [5]. Atkin never formally published his contributions described in [1], but they are discussed extensively in [9, 19].
This paper, which is not aimed at the experts in the area, describes in detail a reasonably fast implementation of the SEA algorithm that is closely modeled upon Morain's. The algorithm considered below is probabilistic and, for a 200-bit prime $p$, succeeds with a probability of about 3/4 (which can be brought arbitrarily close to 1 by enlarging the set $A$ of auxiliary primes below). The algorithm implemented on a typical personal computer takes several minutes to find the number of points on a typical curve over $\mathbb{F}_p$, where $p$ has 200 bits. It is known that
$\#E(\mathbb{F}_p) = p + 1 - t,$
where $t$ is an integer which satisfies the Hasse bound
$-2\sqrt{p} \le t \le 2\sqrt{p}.$
The algorithm works by calculating $t$ modulo several small auxiliary primes $\ell$. When the product of the auxiliary primes exceeds $4\sqrt{p}$, the Chinese Remainder Theorem is used to recover the exact value of $t$, and hence that of $\#E(\mathbb{F}_p)$. The algorithm works its way through a fixed list of 40 candidates for auxiliary primes given below. For each candidate, a calculation has to be carried out to generate a certain polynomial associated with $\ell$ that is necessary for further calculations with this $\ell$. These polynomials do not depend on the curve $E$ under consideration and hence might be precomputed and stored if memory allows. Then for any elliptic curve $E$ we can quickly decide if our algorithm applies (the probability that the algorithm applies for a specific $E$ and $\ell$ is 1/2). For those curves where the algorithm applies, we can determine $t$ modulo $\ell$. When we have finished with all our candidates for the auxiliary primes, we can look at the elliptic curve and check whether the product of auxiliary primes that worked exceeds $4\sqrt{p}$ or not. In the former case, we succeeded in determining $t$.
A typical application for this point counting would be to take a random prime $p$ and a random elliptic curve $E$ over $\mathbb{F}_p$, with the intention of finding an $E$ with $\#E(\mathbb{F}_p) = xr$, where $r$ is a prime and $x$ is small. Given such a curve, a point $P$ of order $r$ can be located easily and the pair $(E, P)$ could be used for a number of cryptographic algorithms, such as Diffie-Hellman key exchange, El Gamal encryption, etc. If we use 200-bit primes for $p$ and require $x \le 32$, then the probability that $\#E = xr$ is about 2.5%, so we expect to have to run our algorithm on about 55 curves.
Section 2 describes the algorithm in detail. Section 3 presents the mathematical background of the algorithm. Section 4 presents ideas by which the algorithm could be improved. Section 5 contains certain tables of data that need to be hardwired into a program implementing this algorithm.
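The final recombination step, recovering $t$ from its residues modulo small primes once their product exceeds $4\sqrt{p}$, can be sketched on a toy curve. The example below is an illustration under loud assumptions: the prime 1009, the coefficients 2 and 3, and the auxiliary primes 3, 5, 7, 11 are arbitrary choices, and the residues $t \bmod \ell$ are obtained from a brute-force point count rather than from the torsion-point or isogeny computations that the actual algorithm uses.

```python
from math import isqrt


def count_points(a4, a6, p):
    """#E(F_p) for y^2 = x^3 + a4*x + a6 by brute force (toy sizes only)."""
    total = 1                                   # point at infinity
    for x in range(p):
        rhs = (x * x * x + a4 * x + a6) % p
        if rhs == 0:
            total += 1                          # single point with y = 0
        elif pow(rhs, (p - 1) // 2, p) == 1:    # Euler's criterion
            total += 2                          # two square roots
    return total


def crt(residues, moduli):
    """Combine t mod l_i into t mod prod(l_i) for pairwise coprime l_i."""
    t, m = 0, 1
    for r, l in zip(residues, moduli):
        k = ((r - t) * pow(m, -1, l)) % l       # solve t + m*k = r (mod l)
        t, m = t + m * k, m * l
    return t, m


p, a4, a6 = 1009, 2, 3                          # assumed toy parameters
assert (4 * a4**3 + 27 * a6**2) % p != 0        # curve is non-singular

t = p + 1 - count_points(a4, a6, p)             # trace, by brute force

primes = [3, 5, 7, 11]                          # product 1155 > 4*sqrt(p)
residues = [t % l for l in primes]              # stand-in for the real step
t_mod, m = crt(residues, primes)
t_rec = t_mod if t_mod <= m // 2 else t_mod - m # centre around zero
print(t, t_rec, p + 1 - t_rec)
```

Because the Hasse interval has length $4\sqrt{p}$, a modulus larger than that leaves exactly one candidate for $t$ after centring the residue around zero.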
The Algorithm
4.1 Overview: The set $A$ of potential auxiliary primes is the union of the set $A_s$ of small primes and the set $A_l$ of larger primes. For each $\ell \in A$, we need to determine a polynomial in $\mathbb{Z}[F, J]$ associated with $\ell$. For $\ell \in A_s$, this is stored in the program. For $\ell \in A_l$, it must be calculated by determining a number of coefficients of a certain q-series $f(q) \in \mathbb{Z}[[q]]$ and carrying out certain algebraic operations on it. The polynomials do not depend on the elliptic curve under consideration and therefore may be pre-calculated and stored if there is enough space for them (they require just under half a megabyte to store). We start out with a given prime $p$ and an elliptic curve $E : y^2 = x^3 + a_4 x + a_6$.
[Figure: block diagram of the design, showing the key register (KeyReg), the round data register and the computational block (S-box, XOR) on the encryption and decryption data paths between plain text data and cipher data]
FIG: 5.1
Key generation is the process of generating keys for cryptography. A key is used to encrypt and decrypt whatever data is being encrypted/decrypted. Modern cryptographic systems include symmetric-key algorithms (such as DES and AES) and public-key algorithms (such as RSA). Symmetric-key algorithms use a single shared key; keeping data secret requires keeping this key secret. Public-key algorithms use a public key and a private key. The public key is made available to anyone (often by means of a digital certificate). A sender will encrypt data with the public key; only the holder of the private key can decrypt this data. Since public-key algorithms tend to be much slower than symmetric-key algorithms, modern systems such as TLS and SSH use a combination of the two: one party receives the other's public key, and encrypts a small piece of data (either a symmetric key or some data that will be used to generate it). The remainder of the conversation uses a (typically faster) symmetric-key algorithm for encryption. In computer cryptography keys are integers. In some cases keys are randomly generated using a random number generator (RNG) or pseudorandom number generator (PRNG), the latter being a computer algorithm that produces data which appears random under analysis. PRNGs that are seeded from system entropy generally produce better results, since this makes the initial conditions of the PRNG much more difficult for an attacker to guess. In other situations, the key is created using a passphrase and a key generation algorithm, usually involving a cryptographic hash function such as SHA-1. The simplest method to read encrypted data is a brute force attack: simply attempting every number, up to the maximum length of the key. Therefore, it is important to use a sufficiently long key; the cost of a brute force attack grows exponentially with the key length, which makes such an attack impractical. Currently, key lengths of 128 bits (for symmetric-key algorithms) and 1024 bits (for public-key algorithms) are common.
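The two ways of producing a key described above, drawing it from a random source or deriving it from a passphrase, can be sketched with the Python standard library. This is a minimal sketch under assumptions: the function names are hypothetical, the passphrase and salt are placeholders, and PBKDF2 with HMAC-SHA256 is swapped in as the key generation algorithm (the text mentions SHA-1 only as an example of a hash function).

```python
import hashlib
import secrets


def random_key(bits=128):
    """Draw a key from the operating system's entropy source."""
    return secrets.token_bytes(bits // 8)


def key_from_passphrase(passphrase, salt, bits=128, iterations=200_000):
    """Derive a key with PBKDF2-HMAC-SHA256 (an assumed choice of algorithm)."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=bits // 8
    )


salt = secrets.token_bytes(16)          # stored alongside the ciphertext
key_a = random_key()
key_b = key_from_passphrase("correct horse battery staple", salt)

# Size of the space a brute-force search has to cover for a 128-bit key.
keyspace = 2 ** 128
print(len(key_a), len(key_b), keyspace)
```

The same passphrase and salt always yield the same key, so only the salt needs to be stored; a passphrase-derived key is only as hard to guess as the passphrase itself, regardless of the key length.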
Cryptography:
Cryptography is the art and science of secret writing. The term is derived from the Greek language
5.2 Encryption:
Encryption is the actual process of applying cryptography. Much of cryptography is math oriented and uses patterns and algorithms to encrypt messages, text, words, signals and other forms of communication. Cryptography has many uses, especially in the areas of espionage, intelligence and military operations. Cryptography deals with all aspects of secure messaging, authentication, digital signatures, electronic money, and other applications. Today, many security systems and companies use cryptography to transfer information over the Internet or radio for fear of interception. Some of this encryption is highly advanced; however, even simple encryption techniques can help uphold the privacy of any everyday person. The term cryptography also covered the breaking of encrypted messages until the early 1920s, when the concept of cryptanalysis came into use; it is now practically an art and science of its own. The two main areas of cryptography are cipher and code. A code involves the replacement of complete words or phrases by code words or numbers. A cipher works on the principle of replacing individual letters by other letters or numbers. Cryptographic algorithms all perform the same basic function: they take two inputs, a message and a key, and transform them into a single output. There are two ways to perform this function. Encryption, as shown in Figure 1, uses the cryptographic key to transform the original message into an encrypted form. Decryption, as shown in Figure 2, does the reverse; it uses a cryptographic key to transform an encrypted message back into its original (a.k.a. plaintext) form.
FIG 5.3
Encryption Operation
5.3 DECRYPTION :
Decryption is the process of decoding data that has been encrypted into a secret format. Decryption requires a secret key or password. It is a commonly held misconception that every encryption method can be broken. In connection with his WWII work at Bell Labs, Claude Shannon proved that the one-time pad cipher is unbreakable, provided the key material is truly random, never reused, kept secret from all possible attackers, and of equal or greater length than the message.[22] Most ciphers, apart from the one-time pad, can be broken with enough computational effort by brute force attack, but the amount of effort needed may be exponentially dependent on the key size, as compared to the effort needed to use the cipher. In such cases, effective security could be achieved if it is proven that the effort required (i.e., "work factor", in Shannon's terms) is beyond the ability of any adversary. This means it must be shown that no efficient method (as opposed to the time-consuming brute force method) can be found to break the cipher. Since no such proof is currently available, the one-time pad remains the only theoretically unbreakable cipher.
There is a wide variety of cryptanalytic attacks, and they can be classified in any of several ways. A common distinction turns on what an attacker knows and what capabilities are available. In a ciphertext-only attack, the cryptanalyst has access only to the ciphertext (good modern cryptosystems are usually effectively immune to ciphertext-only attacks). In a known-plaintext attack, the cryptanalyst has access to a ciphertext and its corresponding plaintext (or to many such pairs). In a chosen-plaintext attack, the cryptanalyst may choose a plaintext and learn its corresponding ciphertext (perhaps many times); an example is gardening, used by the British during WWII. Finally, in a chosen-ciphertext attack, the cryptanalyst may be able to choose ciphertexts and learn their corresponding plaintexts.[10] Also important, often overwhelmingly so, are mistakes (generally in the design or use of one of the protocols involved; see Cryptanalysis of the Enigma for some historical examples of this).
Cryptanalysis of symmetric-key ciphers typically involves looking for attacks against the block ciphers or stream ciphers that are more efficient than any attack that could be mounted against a perfect cipher. For example, a simple brute force attack against DES requires one known plaintext and $2^{55}$ decryptions, trying approximately half of the possible keys, to reach a point at which chances are better than even that the key sought will have been found. But this may not be enough assurance; a linear cryptanalysis attack against DES requires $2^{43}$ known plaintexts and approximately $2^{43}$ DES operations.[23] This is a considerable improvement on brute force attacks.
Public-key algorithms are based on the computational difficulty of various problems. The most famous of these is integer factorization (e.g., the RSA algorithm is based on a problem related to integer factoring), but the discrete logarithm problem is also important. Much public-key cryptanalysis concerns numerical algorithms for solving these computational problems, or some of them, efficiently (i.e., in a practical time). For instance, the best known algorithms for solving the elliptic curve-based version of discrete logarithm are much more time-consuming than the best known algorithms for factoring, at least for problems of more or less equivalent size. Thus, other things being equal, to achieve an equivalent strength of attack resistance, factoring-based encryption techniques must use larger keys than elliptic curve techniques. For this reason, public-key cryptosystems based on elliptic curves have become popular since their invention in the mid-1980s.
While pure cryptanalysis uses weaknesses in the algorithms themselves, other attacks on cryptosystems are based on actual use of the algorithms in real devices, and are called side-channel attacks. If a cryptanalyst has access to, say, the amount of time the device took to encrypt a number of plaintexts or report an error in a password or PIN character, he may be able to use a timing attack to break a cipher that is otherwise resistant to analysis. An attacker might also study the pattern and length of messages to derive valuable information; this is known as traffic analysis,[24] and can be quite useful to an alert adversary. Poor administration of a cryptosystem, such as permitting keys that are too short, will make any system vulnerable, regardless of other virtues. And, of course, social engineering and other attacks against the personnel who work with cryptosystems or the messages they handle (e.g., bribery, extortion, blackmail, espionage, torture, ...) may be the most productive attacks of all.
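The one-time pad discussed at the start of this section is simple enough to sketch directly. The snippet below is a minimal illustration with hypothetical function names; it generates a fresh random key as long as the message and combines the two with XOR, and it assumes the key is kept secret and never reused, as the conditions stated in the text require.

```python
import secrets


def otp_encrypt(message: bytes):
    """Return (key, ciphertext); the key is random and as long as the message."""
    key = secrets.token_bytes(len(message))
    ciphertext = bytes(m ^ k for m, k in zip(message, key))
    return key, ciphertext


def otp_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """XOR with the same key restores the plaintext."""
    return bytes(c ^ k for c, k in zip(ciphertext, key))


key, ct = otp_encrypt(b"attack at dawn")
print(otp_decrypt(key, ct))
```

Every plaintext of the same length is equally consistent with a given ciphertext, which is why brute force gives the attacker nothing here; the practical cost is that the key material is as large as the data it protects.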
SIMULATION RESULTS
Encryption Results
Decryption Results
SYNTHESIS REPORTS
KEY INPUT:
RTL SCHEMATIC
GATE LEVEL
SYNTHESIS REPORT:
Release 6.1i - ngdbuild G.23 Copyright (c) 1995-2003 Xilinx, Inc. All rights reserved. Command Line: ngdbuild -intstyle ise -dd c:\xilinx\bin\vasu/_ngo -i -p xc2s15-cs144-6 keyreg.ngc keyreg.ngd Reading NGO file "c:/xilinx/bin/vasu/keyreg.ngc" ... Reading component libraries for design expansion... Checking timing specifications ... Checking expanded design ... NGDBUILD Design Results Summary: Number of errors: 0 Number of warnings: 0 Total memory usage is 37996 kilobytes Writing NGD file "keyreg.ngd" ... Writing NGDBUILD log file "keyreg.bld"... Release 6.1i Map G.23 Xilinx Mapping Report File for Design 'keyreg' Design Summary -------------Number of errors: 0 Number of warnings: 0 Logic Utilization: Logic Distribution: Number of Slices containing only related logic: 0 out of 0 0% Number of Slices containing unrelated logic: 0 out of 0 0% *See NOTES below for an explanation of the effects of unrelated logic Number of bonded IOBs: 194 out of 86 225% (OVERMAPPED) IOB Flip Flops: 96 Number of GCLKs: 1 out of 4 25% Number of GCLKIOBs: 1 out of 4 25% Total equivalent gate count for design: 768 Additional JTAG gate count for IOBs: 9,360 Peak Memory Usage: 57 MB
Design Information
------------------
Command Line   : C:/Xilinx/bin/nt/map.exe -intstyle ise -p xc2s15-cs144-6 -cm area -pr b -k 4 -c 100 -tx off -o keyreg_map.ncd keyreg.ngd keyreg.pcf
Target Device  : x2s15
Target Package : cs144
Target Speed   : -6
Mapper Version : spartan2 -- $Revision: 1.16 $
Mapped Date    : Mon Mar 30 12:42:43 2009
KEY REGISTER:
Release 6.1i - xst G.23
Copyright (c) 1995-2003 Xilinx, Inc. All rights reserved.

--> Parameter TMPDIR set to __projnav
CPU : 0.00 / 0.67 s | Elapsed : 0.00 / 1.00 s

--> Parameter xsthdpdir set to ./xst
CPU : 0.00 / 0.67 s | Elapsed : 0.00 / 1.00 s

--> Reading design: keyreg.prj

TABLE OF CONTENTS
  1) Synthesis Options Summary
  2) HDL Compilation
  3) HDL Analysis
  4) HDL Synthesis
     4.1) HDL Synthesis Report
  5) Advanced HDL Synthesis
  6) Low Level Synthesis
  7) Final Report
     7.1) Device utilization summary
     7.2) TIMING REPORT

=========================================================================
*                      Synthesis Options Summary                        *
=========================================================================
---- Source Parameters
Input File Name                    : keyreg.prj
Input Format                       : mixed
Ignore Synthesis Constraint File   : NO
Verilog Include Directory          :

---- Target Parameters
Output File Name                   : keyreg
Output Format                      : NGC
Target Device                      : xc2s15-6-cs144

---- Source Options
Top Module Name                    : keyreg
Automatic FSM Extraction           : YES
FSM Encoding Algorithm             : Auto
FSM Style                          : lut
RAM Extraction                     : Yes
RAM Style                          : Auto
ROM Extraction                     : Yes
ROM Style                          : Auto
Mux Extraction                     : YES
Mux Style                          : Auto
Decoder Extraction                 : YES
Priority Encoder Extraction        : YES
Shift Register Extraction          : YES
Logical Shifter Extraction         : YES
XOR Collapsing                     : YES
Resource Sharing                   : YES
Multiplier Style                   : lut
Automatic Register Balancing       : No

---- Target Options
Add IO Buffers                     : YES
Global Maximum Fanout              : 100
Add Generic Clock Buffer(BUFG)     : 4
Register Duplication               : YES
Equivalent register Removal        : YES
Slice Packing                      : YES
Pack IO Registers into IOBs        : auto

---- General Options
Optimization Goal                  : Speed
Optimization Effort                : 1
Keep Hierarchy                     : NO
Global Optimization                : AllClockNets
RTL Output                         : Yes
Write Timing Constraints           : NO
Hierarchy Separator                : _
Bus Delimiter                      : <>
Case Specifier                     : maintain
Slice Utilization Ratio            : 100
Slice Utilization Ratio Delta      : 5

---- Other Options
lso                                : keyreg.lso
Read Cores                         : YES
cross_clock_analysis               : NO
verilog2001                        : YES
Optimize Instantiated Primitives   : NO

=========================================================================
WARNING:Xst:1885 - LSO file is empty, default list of libraries is used

=========================================================================
*                          HDL Compilation                              *
=========================================================================
Compiling vhdl file c:/xilinx/bin/vasu/KeyReg.vhd in Library work.
Architecture keyreg of Entity keyreg is up to date.

=========================================================================
*                            HDL Analysis                               *
=========================================================================
Analyzing Entity <keyreg> (Architecture <keyreg>).
Entity <keyreg> analyzed. Unit <keyreg> generated.
    Related source file is c:/xilinx/bin/vasu/KeyReg.vhd.
    Found 96-bit register for signal <Dreg>.
    Summary:
        inferred 96 D-type flip-flop(s).
Unit <keyreg> synthesized.

=========================================================================
HDL Synthesis Report

Macro Statistics
# Registers                        : 1
 96-bit register                   : 1

=========================================================================

=========================================================================
*                       Advanced HDL Synthesis                          *
=========================================================================

=========================================================================
*                         Low Level Synthesis                           *
=========================================================================
Optimizing unit <keyreg> ...
Loading device for application Xst from file '2s15.nph' in environment C:/Xilinx.
Mapping all equations...
Building and optimizing final netlist ...
Found area constraint ratio of 100 (+ 5) on block keyreg, actual ratio is 28.

=========================================================================
*                            Final Report                               *
=========================================================================
Final Results
RTL Top Level Output File Name     : keyreg.ngr
Top Level Output File Name         : keyreg
Output Format                      : NGC
Optimization Goal                  : Speed
Keep Hierarchy                     : NO

Design Statistics
# IOs                              : 195

Macro Statistics :
# Registers                        : 1
#      96-bit register             : 1

Cell Usage :
# BELS                             : 1
#      LUT1                        : 1
# FlipFlops/Latches                : 96
#      FDCE                        : 96
# Clock Buffers                    : 1
#      BUFGP                       : 1
# IO Buffers                       : 194
#      IBUF                        : 98
#      OBUF                        : 96
=========================================================================

Device utilization summary:
---------------------------
Selected Device : 2s15cs144-6

 Number of Slices:                55 out of 192    28%
 Number of Slice Flip Flops:      96 out of 384    25%
 Number of 4 input LUTs:           1 out of 384     0%
 Number of bonded IOBs:          194 out of  90   215% (*)
 Number of GCLKs:                  1 out of   4    25%
=========================================================================
TIMING REPORT

NOTE: THESE TIMING NUMBERS ARE ONLY A SYNTHESIS ESTIMATE.
      FOR ACCURATE TIMING INFORMATION PLEASE REFER TO THE TRACE REPORT
      GENERATED AFTER PLACE-and-ROUTE.

Clock Information:
------------------
-----------------------------------+------------------------+-------+
Clock Signal                       | Clock buffer(FF name)  | Load  |
-----------------------------------+------------------------+-------+
Clk                                | BUFGP                  | 96    |
-----------------------------------+------------------------+-------+

Timing Summary:
---------------
Speed Grade: -6

   Minimum period: No path found
   Minimum input arrival time before clock: 7.962ns
   Maximum output required time after clock: 6.788ns
   Maximum combinational path delay: No path found

Timing Detail:
--------------
All values displayed in nanoseconds (ns)

-------------------------------------------------------------------------
Timing constraint: Default OFFSET IN BEFORE for Clock 'Clk'
Offset:              7.962ns (Levels of Logic = 1)
  Source:            KeyEna (PAD)
  Destination:       Dreg_95 (FF)
  Destination Clock: Clk rising

  Data Path: KeyEna to Dreg_95
                                Gate     Net
    Cell:in->out      fanout   Delay   Delay  Logical Name (Net Name)
    ----------------------------------------  ------------
     IBUF:I->O            96   0.776   6.300  KeyEna_IBUF (KeyEna_IBUF)
     FDCE:CE                   0.886          Dreg_0
    ----------------------------------------
    Total                      7.962ns (1.662ns logic, 6.300ns route)
                                       (20.9% logic, 79.1% route)

-------------------------------------------------------------------------
Timing constraint: Default OFFSET OUT AFTER for Clock 'Clk'
Offset:              6.788ns (Levels of Logic = 1)
  Source:            Dreg_95 (FF)
  Destination:       KeyO<95> (PAD)
  Source Clock:      Clk rising

  Data Path: Dreg_95 to KeyO<95>
                                Gate     Net
    Cell:in->out      fanout   Delay   Delay  Logical Name (Net Name)
    ----------------------------------------  ------------
     FDCE:C->Q             1   1.085   1.035  Dreg_95 (Dreg_95)
     OBUF:I->O                 4.668          KeyO_95_OBUF (KeyO<95>)
    ----------------------------------------
    Total                      6.788ns (5.753ns logic, 1.035ns route)
                                       (84.8% logic, 15.2% route)

=========================================================================
CPU : 3.59 / 4.64 s | Elapsed : 4.00 / 5.00 s
-->
Total memory usage is 54400 kilobytes
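The totals in the two timing paths of the key register report can be cross-checked with a few lines of arithmetic. The sketch below is only a reading aid: the helper name `path_summary` is hypothetical, and the delay values are copied from the report above (gate delays count as logic, net delays as routing).

```python
def path_summary(logic_delays_ns, route_delays_ns):
    """Sum gate (logic) and net (route) delays and derive the percentage split."""
    logic = round(sum(logic_delays_ns), 3)
    route = round(sum(route_delays_ns), 3)
    total = round(logic + route, 3)
    return {
        "total_ns": total,
        "logic_ns": logic,
        "route_ns": route,
        "logic_pct": round(100 * logic / total, 1),
        "route_pct": round(100 * route / total, 1),
    }


# OFFSET IN BEFORE: KeyEna pad -> IBUF -> clock-enable pin of the FDCE flip-flops
offset_in = path_summary(logic_delays_ns=[0.776, 0.886], route_delays_ns=[6.300])

# OFFSET OUT AFTER: FDCE clock-to-Q -> OBUF -> KeyO<95> pad
offset_out = path_summary(logic_delays_ns=[1.085, 4.668], route_delays_ns=[1.035])

print(offset_in)
print(offset_out)
```

The input path is dominated by routing because the single KeyEna buffer fans out to all 96 flip-flops, while the output path is dominated by the output buffer delay.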
SBOX:
RTL SCHEMATIC
GATE LEVEL
---- Source Parameters
Input File Name                    : sbox8x3.prj
Input Format                       : mixed
Ignore Synthesis Constraint File   : NO
Verilog Include Directory          :

---- Target Parameters
Output File Name                   : sbox8x3
Output Format                      : NGC
Target Device                      : xc2s15-6-cs144

---- Source Options
Top Module Name                    : sbox8x3
Automatic FSM Extraction           : YES
FSM Encoding Algorithm             : Auto
FSM Style                          : lut
RAM Extraction                     : Yes
RAM Style                          : Auto
ROM Extraction                     : Yes
ROM Style                          : Auto
Mux Extraction                     : YES
Mux Style                          : Auto
Decoder Extraction                 : YES
Priority Encoder Extraction        : YES
Shift Register Extraction          : YES
Logical Shifter Extraction         : YES
XOR Collapsing                     : YES
Resource Sharing                   : YES
Multiplier Style                   : lut
Automatic Register Balancing       : No

---- Target Options
Add IO Buffers                     : YES
Global Maximum Fanout              : 100
Add Generic Clock Buffer(BUFG)     : 4
Register Duplication               : YES
Equivalent register Removal        : YES
Slice Packing                      : YES
Pack IO Registers into IOBs        : auto

---- General Options
Optimization Goal                  : Speed
Optimization Effort                : 1
Keep Hierarchy                     : NO
Global Optimization
RTL Output
Write Timing Constraints
Hierarchy Separator
Bus Delimiter
Case Specifier
Slice Utilization Ratio
Slice Utilization Ratio Delta
---- Other Options
lso                                : sbox8x3.lso
Read Cores                         : YES
cross_clock_analysis               : NO
verilog2001                        : YES
Optimize Instantiated Primitives   : NO

=========================================================================
WARNING:Xst:1885 - LSO file is empty, default list of libraries is used

=========================================================================
*                          HDL Compilation                              *
=========================================================================
Compiling vhdl file c:/xilinx/bin/vasu/KeyReg.vhd in Library work.
Architecture sbox8x3 of Entity sbox8x3 is up to date.

=========================================================================
*                            HDL Analysis                               *
=========================================================================
Analyzing Entity <sbox8x3> (Architecture <sbox8x3>).
INFO:Xst:1561 - c:/xilinx/bin/vasu/KeyReg.vhd line 29: Mux is complete : default of case is discarded
Entity <sbox8x3> analyzed. Unit <sbox8x3> generated.

    Related source file is c:/xilinx/bin/vasu/KeyReg.vhd.
Unit <sbox8x3> synthesized.

=========================================================================
HDL Synthesis Report

Found no macro
=========================================================================

=========================================================================
*                       Advanced HDL Synthesis                          *
=========================================================================

=========================================================================
*                         Low Level Synthesis                           *
=========================================================================
Optimizing unit <sbox8x3> ...
Loading device for application Xst from file '2s15.nph' in environment C:/Xilinx.
Mapping all equations...
Building and optimizing final netlist ...
Found area constraint ratio of 100 (+ 5) on block sbox8x3, actual ratio is 1.

=========================================================================
*                            Final Report                               *
=========================================================================
Final Results
RTL Top Level Output File Name     : sbox8x3.ngr
Top Level Output File Name         : sbox8x3
Output Format                      : NGC
Optimization Goal                  : Speed
Keep Hierarchy                     : NO

Design Statistics
# IOs                              : 7

Cell Usage :
# BELS                             : 3
#      LUT4                        : 3
# IO Buffers                       : 7
#      IBUF                        : 4
#      OBUF                        : 3
=========================================================================

Device utilization summary:
---------------------------
Selected Device : 2s15cs144-6

 Number of Slices:                 2 out of 192     1%
 Number of 4 input LUTs:           3 out of 384     0%
 Number of bonded IOBs:            7 out of  90     7%

Total memory usage is 53376 kilobytes

TRANSLATION REPORT:

Checking timing specifications ...
Checking expanded design ...

NGDBUILD Design Results Summary:
  Number of errors:   0
  Number of warnings: 0

Total memory usage is 37996 kilobytes
FLOOR PLANNING
MAPPING REPORT:
Design Summary
--------------
Number of errors:   0
Number of warnings: 0
Logic Utilization:
  Number of 4 input LUTs:                           3 out of 384     1%
Logic Distribution:
  Number of occupied Slices:                        2 out of 192     1%
  Number of Slices containing only related logic:   2 out of   2   100%
  Number of Slices containing unrelated logic:      0 out of   2     0%
    *See NOTES below for an explanation of the effects of unrelated logic
  Total Number of 4 input LUTs:                     3 out of 384     1%
  Number of bonded IOBs:                            7 out of  86     8%

Total equivalent gate count for design:  18
Additional JTAG gate count for IOBs:     336
Peak Memory Usage:                       56 MB
Mapping Report:

Device utilization summary:
  Number of External IOBs           7 out of  86     8%
    Number of LOCed External IOBs   0 out of   7     0%
  Number of SLICEs                  2 out of 192     1%

The NUMBER OF SIGNALS NOT COMPLETELY ROUTED for this design is: 0
The AVERAGE CONNECTION DELAY for this design is:        0.871
The MAXIMUM PIN DELAY IS:                               1.512
The AVERAGE CONNECTION DELAY on the 10 WORST NETS is:   0.707
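The S-box report above shows a purely combinational block with three output bits that maps onto three 4-input LUTs. A minimal software model is sketched below under one loudly stated assumption: that `sbox8x3` implements the 3-bit substitution table of the published SEA specification, {0, 5, 6, 7, 4, 3, 1, 2}. The project's VHDL source is not reproduced in this report, so the table contents and the Python names here are illustrative only. The sketch also checks that the table agrees with the three-instruction AND/OR/XOR form that the SEA specification uses for bitsliced software.

```python
# Assumed S-box contents (SEA specification); not taken from the project's VHDL.
SEA_SBOX = (0, 5, 6, 7, 4, 3, 1, 2)


def sbox_lookup(value: int) -> int:
    """Table form: what a case statement over a 3-bit input would compute."""
    return SEA_SBOX[value & 0b111]


def sbox_bitslice(x0: int, x1: int, x2: int):
    """Logic form using only AND, OR, XOR; x0 is the least significant bit.

    Each step uses the already-updated values of the earlier bits.
    """
    x0 ^= x2 & x1
    x1 ^= x2 & x0
    x2 ^= x0 | x1
    return x0, x1, x2


def sbox_via_logic(value: int) -> int:
    """Apply the logic form to a 3-bit integer and repack the result."""
    x0, x1, x2 = value & 1, (value >> 1) & 1, (value >> 2) & 1
    y0, y1, y2 = sbox_bitslice(x0, x1, x2)
    return y0 | (y1 << 1) | (y2 << 2)
```

Because every output bit depends on at most three input bits, each fits in a single 4-input LUT, which is consistent with the three LUT4 cells reported by the synthesis tool.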
KEY GENERATION:
RTL SCHEMATIC
GATE LEVEL
=========================================================================
*                      Synthesis Options Summary                        *
=========================================================================
---- Source Parameters
Input File Name                    : keygenblock.prj
Input Format                       : mixed
Ignore Synthesis Constraint File   : NO
Verilog Include Directory          :

---- Target Parameters
Output File Name                   : keygenblock
Output Format                      : NGC
Target Device                      : xc2s15-6-cs144

---- Source Options
Top Module Name                    : keygenblock
Automatic FSM Extraction           : YES
FSM Encoding Algorithm             : Auto
FSM Style                          : lut
RAM Extraction                     : Yes
RAM Style                          : Auto
ROM Extraction                     : Yes
ROM Style                          : Auto
Mux Extraction                     : YES
Mux Style                          : Auto
Decoder Extraction                 : YES
Priority Encoder Extraction        : YES
Shift Register Extraction          : YES
Logical Shifter Extraction         : YES
XOR Collapsing                     : YES
Resource Sharing                   : YES
Multiplier Style                   : lut
Automatic Register Balancing       : No

---- Target Options
Add IO Buffers                     : YES
Global Maximum Fanout              : 100
Add Generic Clock Buffer(BUFG)     : 4
Register Duplication               : YES
Equivalent register Removal        : YES
Slice Packing                      : YES
Pack IO Registers into IOBs        : auto

---- General Options
Optimization Goal
Optimization Effort
Keep Hierarchy
Global Optimization
RTL Output
Write Timing Constraints
Hierarchy Separator
Bus Delimiter
Case Specifier
Slice Utilization Ratio
Slice Utilization Ratio Delta

---- Other Options
lso                                : keygenblock.lso
Read Cores                         : YES
cross_clock_analysis               : NO
verilog2001                        : YES
Optimize Instantiated Primitives   : NO

TRANSLATION REPORT:

Release 6.1i - ngdbuild G.23
Copyright (c) 1995-2003 Xilinx, Inc. All rights reserved.

Command Line: ngdbuild -intstyle ise -dd c:\xilinx\bin\vasu/_ngo -i -p xc2s15-cs144-6 keygenblock.ngc keygenblock.ngd

Reading NGO file "c:/xilinx/bin/vasu/keygenblock.ngc" ...
Reading component libraries for design expansion...
Checking timing specifications ...
Checking expanded design ...

NGDBUILD Design Results Summary:
  Number of errors:   0
  Number of warnings: 0

Total memory usage is 42092 kilobytes

Writing NGD file "keygenblock.ngd" ...
Writing NGDBUILD log file "keygenblock.bld"...
MAPPING REPORT:

Design Summary
--------------
Number of errors:   0
Number of warnings: 0
Logic Utilization:
  Total Number Slice Registers:                   419 out of 384   109% (OVERMAPPED)
    Number used as Flip Flops:                    415
    Number used as Latches:                         4
  Number of 4 input LUTs:                       1,016 out of 384   264% (OVERMAPPED)
Logic Distribution:
  Number of occupied Slices:                      665 out of 192   346% (OVERMAPPED)
  Number of Slices containing only related logic: 648 out of 665    97%
  Number of Slices containing unrelated logic:     17 out of 665     2%
    *See NOTES below for an explanation of the effects of unrelated logic
  Total Number 4 input LUTs:                    1,066 out of 384   277% (OVERMAPPED)
    Number used as logic:                       1,016
    Number used as a route-thru:                   50
  Number of bonded IOBs:                        1,060 out of  86  1232% (OVERMAPPED)
    IOB Flip Flops:                               960
  Number of GCLKs:                                  1 out of   4    25%
  Number of GCLKIOBs:                               1 out of   4    25%

Total equivalent gate count for design:  17,572
Additional JTAG gate count for IOBs:     50,928
Peak Memory Usage:                       72 MB
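The key generation block does not fit the selected xc2s15 device: every resource marked OVERMAPPED in the mapping report above needs more than the device provides. The short sketch below recomputes the percentages from the used/available counts in that report. The dictionary and function names are hypothetical, and the use of integer truncation is an observation from the figures (for example 1,016 / 384 = 264.58%, reported as 264%) rather than documented tool behaviour.

```python
# (used, available) pairs copied from the keygenblock mapping report.
KEYGEN_MAP_USAGE = {
    "slice_registers": (419, 384),
    "lut4_logic": (1016, 384),
    "occupied_slices": (665, 192),
    "lut4_total": (1066, 384),
    "bonded_iobs": (1060, 86),
    "gclks": (1, 4),
}


def utilization_pct(used: int, available: int) -> int:
    """Percentage of a resource in use, truncated to a whole number."""
    return used * 100 // available


def overmapped(usage: dict) -> list:
    """Names of resources whose demand exceeds what the device offers."""
    return [name for name, (used, avail) in usage.items() if used > avail]


for name, (used, avail) in KEYGEN_MAP_USAGE.items():
    print(f"{name}: {used} of {avail} = {utilization_pct(used, avail)}%")
print("overmapped:", overmapped(KEYGEN_MAP_USAGE))
```

The I/O count is the most severe constraint here, which suggests that the block is meant to be instantiated inside a larger design rather than pinned out on its own.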
ENCRYPTION:
RTL SCHEMATIC
GATE LEVEL
SYNTHESIS REPORT:

=========================================================================
*                      Synthesis Options Summary                        *
=========================================================================
---- Source Parameters
Input File Name                    : encryption.prj
Input Format                       : mixed
Ignore Synthesis Constraint File   : NO
Verilog Include Directory          :

---- Target Parameters
Output File Name                   : encryption
Output Format                      : NGC
Target Device                      : xc2s15-6-cs144

---- Source Options
Top Module Name                    : encryption
Automatic FSM Extraction           : YES
FSM Encoding Algorithm             : Auto
FSM Style                          : lut
RAM Extraction                     : Yes
RAM Style                          : Auto
ROM Extraction                     : Yes
ROM Style                          : Auto
Mux Extraction                     : YES
Mux Style                          : Auto
Decoder Extraction                 : YES
Priority Encoder Extraction        : YES
Shift Register Extraction          : YES
Logical Shifter Extraction         : YES
XOR Collapsing                     : YES
Resource Sharing                   : YES
Multiplier Style                   : lut
Automatic Register Balancing       : No

---- Target Options
Add IO Buffers                     : YES
Global Maximum Fanout              : 100
Add Generic Clock Buffer(BUFG)     : 4
Register Duplication               : YES
Equivalent register Removal        : YES
Slice Packing                      : YES
Pack IO Registers into IOBs        : auto

---- General Options
Optimization Goal
Optimization Effort
Keep Hierarchy
Global Optimization
RTL Output
Write Timing Constraints
Hierarchy Separator
Bus Delimiter
Case Specifier
Slice Utilization Ratio
Slice Utilization Ratio Delta

---- Other Options
lso                                : encryption.lso
Read Cores                         : YES
cross_clock_analysis               : NO
verilog2001                        : YES
Optimize Instantiated Primitives   : NO

Translation Report:

NGDBUILD Design Results Summary:
  Number of errors:   0
  Number of warnings: 0

Total memory usage is 42092 kilobytes
DECRYPTION:
GATE LEVEL
SYNTHESIS REPORT:

=========================================================================
*                      Synthesis Options Summary                        *
=========================================================================
---- Source Parameters
Input File Name                    : decryption.prj
Input Format                       : mixed
Ignore Synthesis Constraint File   : NO
Verilog Include Directory          :

---- Target Parameters
Output File Name                   : decryption
Output Format                      : NGC
Target Device                      : xc2s15-6-cs144

---- Source Options
Top Module Name                    : decryption
Automatic FSM Extraction           : YES
FSM Encoding Algorithm             : Auto
FSM Style                          : lut
RAM Extraction                     : Yes
RAM Style                          : Auto
ROM Extraction                     : Yes
ROM Style                          : Auto
Mux Extraction                     : YES
Mux Style                          : Auto
Decoder Extraction                 : YES
Priority Encoder Extraction        : YES
Shift Register Extraction          : YES
Logical Shifter Extraction         : YES
XOR Collapsing                     : YES
Resource Sharing                   : YES
Multiplier Style                   : lut
Automatic Register Balancing       : No

---- Target Options
Add IO Buffers                     : YES
Global Maximum Fanout              : 100
Add Generic Clock Buffer(BUFG)     : 4
Register Duplication               : YES
Equivalent register Removal        : YES
Slice Packing                      : YES
Pack IO Registers into IOBs        : auto

---- General Options
Optimization Goal
Optimization Effort
Keep Hierarchy
Global Optimization
RTL Output
Write Timing Constraints
Hierarchy Separator
Bus Delimiter
Case Specifier
Slice Utilization Ratio
Slice Utilization Ratio Delta

---- Other Options
lso                                : decryption.lso
Read Cores                         : YES
cross_clock_analysis               : NO
verilog2001                        : YES
Optimize Instantiated Primitives   : NO

Translation Report:

NGDBUILD Design Results Summary:
  Number of errors:   0
  Number of warnings: 0

Total memory usage is 45122 kilobytes
ADVANTAGES
SEA is parametric in text, key, and processor size. It is a low-cost encryption routine targeted at processors with a limited instruction set. It is a small encryption routine that can be targeted to any given processor, with the security of the cipher adapted as a function of its key size. It is also suited to applications where the same constrained device has to perform both encryption and decryption.
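The parametric nature of SEA can be made concrete with a small parameter check. The sketch below follows the notation of the published SEA specification, where n is the plaintext and key size, b is the processor word size, each Feistel branch holds n / (2b) words, and n must be a multiple of 6b. The function name is hypothetical, and the choice b = 8 in the usage lines is an assumption; the reports in this document only show that the key register is 96 bits wide.

```python
def sea_parameters(n: int, b: int) -> dict:
    """Derive the word layout of SEA(n, b) and reject unsupported sizes.

    n: plaintext size and key size in bits
    b: processor (word) size in bits
    """
    if n <= 0 or b <= 0:
        raise ValueError("n and b must be positive")
    if n % (6 * b) != 0:
        raise ValueError(f"n={n} is not a multiple of 6*b={6 * b}")
    return {
        "n": n,
        "b": b,
        "branch_bits": n // 2,                # each Feistel branch holds half the block
        "words_per_branch": n // (2 * b),     # n_b in the SEA notation
    }


# Assumed example: a 96-bit block and key on an 8-bit word size.
params = sea_parameters(96, 8)
print(params)
```

Under this constraint a 96-bit instance works with 8-bit or 16-bit words but not with 32-bit words, since 96 is not a multiple of 192.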
APPLICATIONS
SEA is a low-cost encryption routine designed primarily for processors with a limited instruction set. Candidate application areas include wireless communication, mobile computing and networking systems, the encryption of JPEG2000 images, and scalable video coding.
CONCLUSION
SEA_{n,b} is a scalable encryption algorithm targeted at small embedded applications. The plaintext size, key size, and processor (or word) size are parameters of the design. The structure of SEA_{n,b} allows a fast evaluation of the cipher's efficiency on any RISC machine. Its typical performance for present key sizes and processors (e.g., a 128-bit key on a 1 MHz 8-bit RISC) is in the range of one encryption or decryption in a few milliseconds, using a few hundred bytes of ROM. One additional advantage of the design is its extreme simplicity. Based on the pseudocode provided in the original paper, it is expected that the cipher can be implemented in assembly within a few hours. We note finally that the design criteria of SEA_{n,b} do not make it a conservative algorithm by nature. Further cryptanalysis efforts are consequently required.

This work presented FPGA implementations of the scalable encryption algorithm for various sets of parameters. The presented parametric architecture keeps the flexibility of the algorithm by taking advantage of generic VHDL coding. It executes one round per clock cycle, computes the round and the key round in parallel, and supports both encryption and decryption at minimal cost. Compared to other recent block ciphers, SEA exhibits a very small area utilization that comes at the cost of reduced throughput. Consequently, it can be considered an interesting alternative for constrained environments. Directions for further research include low-power ASIC implementations intended for RFIDs as well as further cryptanalysis efforts and security evaluations.
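The trade-off between area and throughput noted above follows directly from the loop architecture: with one round per clock cycle, a block needs roughly as many cycles as the cipher has rounds. The sketch below expresses that relation. The function name and the example figures (96-bit block, 93 rounds, 100 MHz clock) are hypothetical illustration values, not measurements from this project, and the estimate ignores any cycles spent loading data or keys.

```python
def loop_throughput_bps(block_bits: int, rounds: int, clock_hz: float) -> float:
    """Estimated throughput of a one-round-per-cycle loop architecture.

    One block of `block_bits` bits is produced every `rounds` clock cycles.
    """
    if block_bits <= 0 or rounds <= 0 or clock_hz <= 0:
        raise ValueError("all arguments must be positive")
    return block_bits * clock_hz / rounds


# Hypothetical example values; none of them are taken from the reports above.
estimate = loop_throughput_bps(block_bits=96, rounds=93, clock_hz=100e6)
print(f"estimated throughput: {estimate / 1e6:.1f} Mbit/s")
```

Doubling the number of rounds halves the throughput at a fixed clock frequency, which is why a compact iterative design trades speed for area.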