This document provides instructions for Project 1 of the ECE 406 course at NC State University. The goal of the project is to design a simplified LC3 microcontroller. Students must implement the control and datapath units, including modules for the controller, fetch, execute, and memory access. The controller is a finite state machine that directs dataflow. Fetch generates the program counter. Execute performs ALU and address computations. Memory access interfaces with external memory. Students must submit Verilog code that passes test cases.
NC State University ECE 406 Spring 2009
ECE Department Design of Complex Digital Systems W. Rhett Davis
Project #1: LC3 Microcontroller
1. Introduction
The main purpose of this project is to let you start working with more complex designs and become familiar with some of the elements used within a CPU.
2. Learning Objectives
- Complete a design involving separate control and datapath with multiple modules
- Complete a design that includes most of the elements to be used in the CPU
3. Project Report
You are expected to turn in a report after the end of this project. Follow the project report format given on the Laboratories page on the course web-site. Be sure to include all items listed in that report format for full credit.
4. Wolfware Submission
You also need to submit your Verilog code electronically through Wolfware as proj1.v. This file should contain a module called SimpleLC3, as described below. It may use the include directive to include other files, if you wish, but they must also be submitted through Wolfware. Your code must successfully execute with the test-bench provided on the course web-site (called proj1test.v) and give the expected output as listed in the test-bench. The memory file (proj1.dat) is also given, along with the expected output of the instruction set simulator (proj1.out). This program is taken from the example given in class. In addition, your code will be tested with a second program that is not provided.
5. Lab Design: simplified LC3 microcontroller
The microcontroller designed in this lab assignment is a simplified version of the original LC3 microcontroller. Specifically, four simplifications are made:
1) A smaller instruction set: the ISA you need to implement does NOT contain the control instructions RTI and TRAP. All other instructions must be implemented.
2) No off-chip memory: the instructions of the program are assumed to be in the cache.
3) The programs consist of valid instructions ONLY, i.e., you do not have to perform error checking to detect bad instructions.
4) No overflow detection is required.
5.1 Top-level module:
module SimpleLC3(clock, reset, addr, din, dout, rd, complete);
  input         clock;      // Global system clock
  input         reset;      // Global system reset
  output [15:0] addr, din;  // Address and Data-In lines for Memory
  input  [15:0] dout;       // Data-Out lines from Memory
  output        rd;         // Memory signal to indicate read or write
  input         complete;   // Signal to indicate completion of read/write

[Figure: SimpleLC3 Schematic, with the shared address bus and the shared read line labeled]
As shown in the schematic above, the top-level module is instantiated in the test-bench along with the memory.
Special Signals: The SimpleLC3 module should be connected exactly as shown in the schematic. Note that there is a shared read-line and a shared address-bus for the memory, which means that each of these signals is driven from two sources (the Fetch and MemAccess blocks). In addition, note that the VSR2 field (which in this schematic is the least significant 16 bits of the D_Data signal to the Execute block) goes into the MemAccess block as the M_Data signal. All other signals are simple inputs and outputs.
The specifications for the Controller, Fetch, and Execute blocks were given in Homeworks #5 and #6 and are repeated below. The specifications for the remaining 3 blocks (MemAccess, Writeback, and Decode) are given here.
5.2 Controller
module Controller(clock, reset, state, C_Control, complete);
  input        clock, reset;  // system clock and reset
  output [3:0] state;         // system state
  input  [5:0] C_Control;     // control from Decode
  input        complete;      // complete from Memory
The Controller module is a finite state machine that controls the dataflow and therefore the execution of all the instructions in the microcontroller. A sketch of the state transition diagram is given below, in which the vertices represent states, with the corresponding operations described inside, and the edges denote transitions. The condition of each transition is determined by the current state and/or the input signal C_Control generated by the Decode module. C_Control can be broken down further into 4 fields: Instruction Type, Store PC, Memory Access Mode, and Load.
The self-pointing edges are used in Project 2 to cope with memory latency. Such looping transitions occur only when the complete signal is zero, which never happens in Project 1. State transitions occur only at positive edges of the clock signal. When the reset signal is high, the next state should be the Fetch Instruction state.
[Figure: Controller state transition diagram. States: Fetch Instruction, Decode, Execute ALU Operations, Compute Target PC, Compute Memory Address, Indirect Address Read, Read Memory, Write Memory, Update Register File, Update PC, and Invalid State. Transition conditions: Instruction Type==0, 1, or 2; Store PC==0 or 1; Memory Access Mode==0, 1, 2, or 3; Load==0 or 1; complete==1 to advance and complete==0 for the self-pointing edges.]
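As a starting point, the clocked behavior described above (transitions only on positive clock edges, reset forcing Fetch Instruction, and a self-loop while complete is zero) could be sketched as follows. This is only a skeleton under stated assumptions: the module name ControllerSketch, the numeric state encodings, and the Fetch Instruction to Decode transition on complete==1 are illustrative guesses, and every other transition still has to be filled in from the state diagram.

```verilog
// Skeleton only: encodings and the single transition shown are assumptions.
module ControllerSketch(clock, reset, state, C_Control, complete);
  input        clock, reset;
  output [3:0] state;
  input  [5:0] C_Control;
  input        complete;

  // Hypothetical state encodings; the handout does not assign values.
  parameter FETCH  = 4'd0;
  parameter DECODE = 4'd1;

  reg [3:0] state;
  reg [3:0] next_state;

  // Combinational next-state logic
  always @(state or C_Control or complete) begin
    next_state = state;                  // default: hold the current state
    case (state)
      FETCH: if (complete)               // self-loop while complete == 0
               next_state = DECODE;      // assumed successor of Fetch
      // DECODE and the remaining states branch on the C_Control fields;
      // they are intentionally left out of this sketch.
      default: next_state = FETCH;       // placeholder for unlisted states
    endcase
  end

  // State register: updates only on the positive clock edge
  always @(posedge clock)
    if (reset) state <= FETCH;
    else       state <= next_state;
endmodule
```

Separating the combinational next-state logic from the state register keeps the reset and clocking behavior in one small block, which makes the transition table easier to check against the diagram.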
5.3 Fetch
module Fetch(clock, reset, state, pc, npc, rd, taddr, br_taken);
  input         clock;     // system clock
  input         reset;     // system reset
  input         br_taken;  // signal from decoder, 1 means branch taken
  input  [15:0] taddr;     // target address of control instructions
  input  [3:0]  state;     // system state from controller
  output [15:0] pc, npc;   // current PC and next PC, i.e., pc+1
  output        rd;        // memory read control signal
The Fetch module generates the program counter (PC), which contains the address of the instruction to be fetched. The PC should be updated on the rising edge of the clock, and only when the system is in the Update PC state, as determined by the Controller block. The signal rd should be high-impedance during the Read Memory, Write Memory, and Indirect Address Read states, because the MemAccess block drives the shared memory bus during these cycles. In all other states, this signal should be high. pc is the memory address and should be high-impedance at the same times that rd is high-impedance. The first program instruction is located at the address 16'h3000. Therefore, pc should be set to 16'h3000 when reset is high. The block diagram of the Fetch module is shown below.
[Figure: Fetch block diagram. A PC register, clocked by clock and reset to 16'h3000, is enabled when state == Update PC and loads either taddr or npc (pc + 1), selected by br_taken. The pc and rd outputs are driven (rd with the value 1) when state != Read Memory && state != Write Memory && state != Indirect Address Read.]
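The Fetch behavior described above can be sketched roughly as follows. The port list is the one given in this section, but the module name FetchSketch, the numeric state encodings, the internal register name ipc, and the choice of a synchronous reset are assumptions made for illustration.

```verilog
// A minimal sketch; state encodings and the synchronous reset are assumptions.
module FetchSketch(clock, reset, state, pc, npc, rd, taddr, br_taken);
  input         clock, reset, br_taken;
  input  [15:0] taddr;
  input  [3:0]  state;
  output [15:0] pc, npc;
  output        rd;

  // Hypothetical state encodings; use the values from your Controller.
  parameter READ_MEM    = 4'd6;
  parameter WRITE_MEM   = 4'd7;
  parameter IND_ADDR_RD = 4'd8;
  parameter UPDATE_PC   = 4'd9;

  reg [15:0] ipc;  // internal PC register (pc itself may be high-impedance)

  // High while the MemAccess block owns the shared bus
  wire mem_state = (state == READ_MEM) || (state == WRITE_MEM) ||
                   (state == IND_ADDR_RD);

  assign npc = ipc + 16'd1;
  assign pc  = mem_state ? 16'hzzzz : ipc;  // release the shared address bus
  assign rd  = mem_state ? 1'bz     : 1'b1; // release the shared read line

  always @(posedge clock)
    if (reset)
      ipc <= 16'h3000;                      // address of the first instruction
    else if (state == UPDATE_PC)
      ipc <= br_taken ? taddr : npc;        // branch target or pc + 1
endmodule
```

Keeping the PC in an internal register means npc stays valid even in the cycles where the pc output is tri-stated.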
The relationship among the Fetch, Controller, and off-chip memory modules is shown below.
[Figure: Connections among the Controller, Fetch, and Memory blocks. Controller: clock, reset, complete, C_Control[5:0] (from Decode), state[3:0]. Fetch: clock, reset, state[3:0], taddr[15:0], br_taken, pc[15:0], npc[15:0] (to Writeback and Execute), rd. Memory: clock, addr[15:0], din[15:0] (from MemAccess), rd, dout[15:0] (to MemAccess and Decode), complete. The clock and reset inputs are the global clock and global reset.]
5.4 Execute
module Execute(E_control, D_data, aluout, pcout, npc);
  input  [5:0]  E_control;  // control signals from Decode
  input  [47:0] D_data;     // data from Decode
  output [15:0] aluout;     // output of ALU
  output [15:0] pcout;      // output of the address computation adder
  input  [15:0] npc;        // next PC from Fetch
The Execute module performs the arithmetic and logical instructions, target PC computation, and memory address computation. The E_Control input is an aggregate of the ALU Operation Sel, OP 2 Sel, PC Sel 1, and PC Sel 2 signals. The D_Data input is an aggregate of the IR, VSR1, and VSR2 signals. The block diagram is given below; it uses the ALU and extension modules from Homework 3. Note that overflow checking is not being done, so the alucarry output of the ALU is ignored.
[Figure: Execute block diagram and field layouts. E_Control bits 5..0 are divided among ALU Operation Sel, PC Sel 1, PC Sel 2, and OP 2 Sel. D_Data is divided into three 16-bit fields (bits 47..32, 31..16, and 15..0) carrying IR, VSR1, and VSR2.]
5.5 MemAccess
module MemAccess(state, M_Control, M_Data, M_Addr, memout, addr, din, dout, rd);
  input  [3:0]  state;      // System State from Controller
  input         M_Control;  // Control Signal to indicate address from dout
  input  [15:0] M_Data;     // Data for store operations
  input  [15:0] M_Addr;     // Address for load/store operations
  output [15:0] addr;       // Address lines to memory
  output [15:0] din;        // Data-in lines to memory
  output        rd;         // Memory signal to indicate read or write
  input  [15:0] dout;       // Data-Out lines from Memory
  output [15:0] memout;     // Data read from Memory to write in register file
The MemAccess block is the master of the shared memory bus during the Read Memory, Write Memory, and Indirect Address Read states. It should set up the memory bus lines as follows:
- Read Memory: rd should be 1 and din doesn't matter. addr should be set to either M_Addr or dout, depending on M_Control. addr should be set to dout in this state only if the opcode shows an LDI operation.
- Write Memory: rd should be 0 and din should be M_Data. addr should be set to either M_Addr or dout, depending on M_Control. addr should be set to dout in this state only if the opcode shows an STI operation.
- Indirect Address Read: rd should be 1 and din doesn't matter. addr should be set to M_Addr.
The memout signal should always pass the value of dout through to the Writeback block.
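The bus rules above map naturally onto a few continuous assignments. The following is a sketch rather than a reference solution: the module name MemAccessSketch and the numeric state encodings are hypothetical, M_Control == 1 is assumed to mean "take the address from dout", and the outputs are assumed to be released to high-impedance in all states where Fetch drives the shared lines.

```verilog
// A minimal sketch; encodings and the M_Control polarity are assumptions.
module MemAccessSketch(state, M_Control, M_Data, M_Addr, memout,
                       addr, din, dout, rd);
  input  [3:0]  state;
  input         M_Control;
  input  [15:0] M_Data, M_Addr, dout;
  output [15:0] addr, din, memout;
  output        rd;

  // Hypothetical state encodings; use the values from your Controller.
  parameter READ_MEM    = 4'd6;
  parameter WRITE_MEM   = 4'd7;
  parameter IND_ADDR_RD = 4'd8;

  wire reading = (state == READ_MEM) || (state == IND_ADDR_RD);
  wire writing = (state == WRITE_MEM);
  wire master  = reading || writing;   // MemAccess owns the shared bus

  // dout supplies the address only in Read/Write Memory (LDI/STI),
  // never in Indirect Address Read.
  wire use_dout = M_Control && (state != IND_ADDR_RD);

  assign addr   = master ? (use_dout ? dout : M_Addr) : 16'hzzzz;
  assign rd     = master ? reading : 1'bz;  // 1 for reads, 0 for writes
  assign din    = M_Data;                   // only matters in Write Memory
  assign memout = dout;                     // always passed to Writeback
endmodule
```

Because din is not a shared line, driving it with M_Data in every state is one simple way to satisfy the "doesn't matter" cases.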
5.6 Writeback
module Writeback(W_Control, aluout, memout, pcout, npc, DR_in);
  input  [15:0] aluout, memout, pcout, npc;  // Possible data to store
  input  [1:0]  W_Control;                   // Control signal to choose what will be written
  output [15:0] DR_in;                       // Data that will be stored in the register-file
The Writeback block should set the DR_in lines to the value to be written into the register-file. This value is selected from the following four choices:
- aluout: The output of the ALU in the Execute block
- pcout: The computed memory address output of the Execute block
- npc: The next value of the program counter from the Fetch block
- memout: The value read from memory, from the MemAccess block
The W_Control signal will be used to select between these possibilities. The schematic is shown below.
[Figure: Writeback schematic. A multiplexer controlled by W_Control[1:0] selects among aluout[15:0], pcout[15:0], npc[15:0], and memout[15:0] to drive DR_in[15:0].]
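The selection described above amounts to a 4-to-1 multiplexer, sketched below. The mapping of W_Control values to inputs (0 = aluout, 1 = pcout, 2 = npc, 3 = memout) is an assumption for illustration; use the encoding from the schematic and your decode tables. The module name WritebackSketch is also hypothetical.

```verilog
// A minimal sketch; the W_Control encoding shown here is an assumption.
module WritebackSketch(W_Control, aluout, memout, pcout, npc, DR_in);
  input  [15:0] aluout, memout, pcout, npc;
  input  [1:0]  W_Control;
  output [15:0] DR_in;
  reg    [15:0] DR_in;

  // Purely combinational 4-to-1 multiplexer
  always @(W_Control or aluout or memout or pcout or npc)
    case (W_Control)
      2'd0:    DR_in = aluout;  // ALU result
      2'd1:    DR_in = pcout;   // computed address (e.g. for LEA)
      2'd2:    DR_in = npc;     // next PC
      default: DR_in = memout;  // 2'd3: value read from memory
    endcase
endmodule
```

Using default for the last choice keeps the block free of inferred latches even if W_Control is unknown in simulation.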
5.7 Decode
module Decode(clock, state, dout, C_Control, E_Control, M_Control, W_Control, F_Control, D_Data, DR_in);
  input         clock;      // Global system clock
  input  [3:0]  state;      // System state from Controller
  input  [15:0] dout;       // Data-out lines from Memory
  input  [15:0] DR_in;      // Data to be written to Register-File
  output        M_Control;  // MemAccess control line
  output [1:0]  W_Control;  // Writeback control lines
  output [5:0]  C_Control;  // Controller control lines
  output [5:0]  E_Control;  // Execute control lines
  output [47:0] D_Data;     // Data for Execute and MemAccess blocks
  output        F_Control;  // Fetch control line
The Decode block contains the logic illustrated in the schematic below. It contains an instruction register (IR) that stores the current instruction during the Decode state. It contains a program status register (PSR) that stores the status of the last value written to the register file (positive, negative, or zero) and is updated only in the Update Register File state. Lastly, it contains a register file that can read two locations and write to one location in the same cycle. However, the register file writes only during the Update Register File state.
Based on the contents of IR and PSR, the Decode block generates all of the control signals for the other blocks (C_Control, M_Control, W_Control, E_Control, and F_Control) as well as the source and destination addresses in the register file (sr1, sr2, and dr). Note, however, that the instruction type field of the C_Control signal must be valid during the Decode state. Because IR is loaded during the Decode state, it does not yet hold the current instruction in that state, so the field would not be valid if it were computed from the contents of IR. Therefore, the instruction type field is computed from the memory output (dout), which makes it valid during the Decode state (but not necessarily in the states after Decode).
[Figure: Decode schematic. IR is loaded from dout[15:0] and enabled when state[3:0] == Decode. The Instruction Type decode logic takes dout and drives C_Control[5:4]. The main decode logic takes IR and PSR and drives C_Control[3:0], E_Control[5:0], M_Control, W_Control[1:0], F_Control, sr1[2:0], sr2[2:0], and dr[2:0]. The RegFile (inputs sr1, sr2, dr, DR_in[15:0], en) outputs VSR1[15:0] and VSR2[15:0]. The PSR is driven by the PSR update logic from DR_in[15:0]. Both the RegFile and the PSR are enabled when state[3:0] == Update Register File. D_Data[47:0] is formed from three 16-bit fields, [47:32], [31:16], and [15:0].]
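One possible shape for the PSR update logic is sketched below. The handout only says that the PSR records whether the last value written was positive, negative, or zero; the 3-bit {N, Z, P} layout, the module name PSRSketch, and the numeric encoding of the Update Register File state are assumptions made for illustration.

```verilog
// A minimal sketch; the {N, Z, P} layout and state encoding are assumptions.
module PSRSketch(clock, state, DR_in, psr);
  input         clock;
  input  [3:0]  state;
  input  [15:0] DR_in;
  output [2:0]  psr;   // assumed layout: {negative, zero, positive}
  reg    [2:0]  psr;

  // Hypothetical encoding of the Update Register File state
  parameter UPDATE_REGF = 4'd5;

  wire n = DR_in[15];             // sign bit set: negative
  wire z = (DR_in == 16'h0000);   // all bits clear: zero
  wire p = ~n & ~z;               // otherwise: positive

  // Update only when a value is actually written to the register file
  always @(posedge clock)
    if (state == UPDATE_REGF)
      psr <= {n, z, p};
endmodule
```

Exactly one of the three bits is set after every update, which makes the branch-condition check in the main decode logic a simple bitwise comparison.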
When writing the Verilog code for the main decode combinational logic, it is helpful to first create a table that has the values needed for each control signal in order to set up the microcontroller to implement each operation. Empty tables are provided below for your convenience.
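(Instr. Type, Store PC, Mem. Access Mode, and load are the C_Control fields.)

Operation | Mode | Instr. Type | Store PC | Mem. Access Mode | load | M_Control
----------+------+-------------+----------+------------------+------+----------
ADD       | 0    |             |          |                  |      |
ADD       | 1    |             |          |                  |      |
AND       | 0    |             |          |                  |      |
AND       | 1    |             |          |                  |      |
NOT       |      |             |          |                  |      |
BR        |      |             |          |                  |      |
JMP/RET   |      |             |          |                  |      |
JSR       |      |             |          |                  |      |
JSRR      |      |             |          |                  |      |
LD        |      |             |          |                  |      |
LDR       |      |             |          |                  |      |
LDI       |      |             |          |                  |      |
LEA       |      |             |          |                  |      |
ST        |      |             |          |                  |      |
STR       |      |             |          |                  |      |
STI       |      |             |          |                  |      |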
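(ALU Op Sel, PC Sel 1, PC Sel 2, and Op 2 Sel are the E_Control fields.)

Operation | Mode | ALU Op Sel | PC Sel 1 | PC Sel 2 | Op 2 Sel | W_Control | F_Control
----------+------+------------+----------+----------+----------+-----------+----------
ADD       | 0    |            |          |          |          |           |
ADD       | 1    |            |          |          |          |           |
AND       | 0    |            |          |          |          |           |
AND       | 1    |            |          |          |          |           |
NOT       |      |            |          |          |          |           |
BR        |      |            |          |          |          |           |
JMP/RET   |      |            |          |          |          |           |
JSR       |      |            |          |          |          |           |
JSRR      |      |            |          |          |          |           |
LD        |      |            |          |          |          |           |
LDR       |      |            |          |          |          |           |
LDI       |      |            |          |          |          |           |
LEA       |      |            |          |          |          |           |
ST        |      |            |          |          |          |           |
STR       |      |            |          |          |          |           |
STI       |      |            |          |          |          |           |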
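Operation | Mode | DR | SR1 | SR2
----------+------+----+-----+----
ADD       | 0    |    |     |
ADD       | 1    |    |     |
AND       | 0    |    |     |
AND       | 1    |    |     |
NOT       |      |    |     |
BR        |      |    |     |
JMP/RET   |      |    |     |
JSR       |      |    |     |
JSRR      |      |    |     |
LD        |      |    |     |
LDR       |      |    |     |
LDI       |      |    |     |
LEA       |      |    |     |
ST        |      |    |     |
STR       |      |    |     |
STI       |      |    |     |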
The RegFile block inside the Decode block can be thought of conceptually as the schematic below. In the schematic, the outputs of eight 16-bit registers fan out to two 8-to-1 MUXes, which determine the VSR1 and VSR2 signals, depending on the sr1 and sr2 select lines. The DR_in input fans out to the data inputs of all eight registers. Each register has an enable input that determines whether it will load the input value, and these enable inputs are connected to a 1-to-8 decoder that passes the master enable signal to one register, depending on the dr input.
You may want to implement the RegFile with a memory, rather than eight individual registers. The code will be much simpler and easier to understand. However, waveform capture formats typically don't store the values in memories. If you want to view the contents of your register file in a waveform viewer, you may want to have a set of eight assign statements, such as the following:
  assign R0 = mem[0];
  assign R1 = mem[1];
  ...
This approach will not affect your synthesis results. You should still get 16 × 8 = 128 flip-flops when synthesizing.
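Putting the pieces of this section together, a memory-based RegFile with the eight viewing wires might look like the sketch below. The port names sr1, sr2, dr, en, DR_in, VSR1, and VSR2 follow the Decode schematic; the module name RegFileSketch, the clock port, the combinational (asynchronous) reads, and the wire names R2 through R7 are assumptions.

```verilog
// A minimal sketch, assuming combinational reads and a clocked write.
module RegFileSketch(clock, en, sr1, sr2, dr, DR_in, VSR1, VSR2);
  input         clock;
  input         en;            // master enable (Update Register File state)
  input  [2:0]  sr1, sr2, dr;  // source and destination register addresses
  input  [15:0] DR_in;         // data to write
  output [15:0] VSR1, VSR2;    // values of the two source registers

  reg [15:0] mem [0:7];        // eight 16-bit registers: 128 flip-flops

  // Two read ports, equivalent to the two 8-to-1 MUXes
  assign VSR1 = mem[sr1];
  assign VSR2 = mem[sr2];

  // One write port, equivalent to the 1-to-8 enable decoder
  always @(posedge clock)
    if (en)
      mem[dr] <= DR_in;

  // Copies of the memory words for viewing in a waveform viewer
  wire [15:0] R0, R1, R2, R3, R4, R5, R6, R7;
  assign R0 = mem[0];
  assign R1 = mem[1];
  assign R2 = mem[2];
  assign R3 = mem[3];
  assign R4 = mem[4];
  assign R5 = mem[5];
  assign R6 = mem[6];
  assign R7 = mem[7];
endmodule
```

The viewing wires have no loads, so a synthesis tool should remove them without changing the flip-flop count.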