MIPS Report File

M.TECH (VLSI DESIGN)

Semester II
EC 728
MINOR PROJECT
Submitted by
MANISH (212VL022)
Under the guidance of

DR. RATHNAMALA RAO
Department of Electronics and Communication Engineering

National Institute of Technology Karnataka, Surathkal,
Karnataka, India – 575025
May 2022
ABSTRACT
In this project, I analyze the MIPS instruction format, instruction datapath,
control module function and design theory based on the RISC CPU instruction
set. I then use a pipelined design to simulate the instruction fetch (IF),
instruction decode (ID), execution (EX), data memory (MEM) and write back (WB)
modules of a 32-bit CPU based on the RISC CPU instruction set. The IF module
fetches the instruction from instruction memory. In the ID stage, the
instruction is sent to the control unit and decoded. The EX stage executes
arithmetic operations; its main component is the ALU. The MEM stage loads data
from memory and stores data to memory; if the instruction is not a memory/IO
instruction, the result is passed on to the WB stage. Finally, the WB stage
writes the result to the destination register in the register file. To resolve
hazards, the design includes forwarding and hazard detection that stalls the
processor. The idea of this project was to create a MIPS processor as a
building block in Verilog. The design was simulated with the Xilinx Vivado
tool.
1 INTRODUCTION
This report describes the design and implementation of a MIPS (simple RISC)
processor. Before explaining the fundamentals of MIPS, this section covers
the basics of the CPU. The CPU, or central processing unit, is the hardware
inside a computer system that executes the instructions given by a computer
program. There are two main types of processor in common use: RISC and CISC.
1.1 RISC and CISC architecture

CISC stands for Complex Instruction Set Computer. As the name suggests, it
has a large number of complex instructions, many of which take multiple clock
cycles. This type of processor emphasizes hardware more than software: LOAD
and STORE operations are incorporated into other instructions, which can
operate directly from memory to memory. A CISC processor spends transistors on
storing complex instructions. CISC processors are relatively slow in
comparison to RISC, but a given program needs fewer instructions.
In contrast to the CISC architecture, RISC processors are faster. A RISC
processor emphasizes software more than hardware. Most instruction sets
introduced in recent decades follow the RISC approach. RISC uses simpler and
faster instructions, each typically a single word in size, so in theory it
needs fewer transistors, which makes a RISC processor easier and less
expensive to design. All operations on data apply to data in registers and
typically change the entire register. The only operations that affect memory
are the load and store instructions, which move data from memory to a register
(ld) and from a register to memory (st) [1].
In this report the default RISC architecture is MIPS. MIPS has three classes
of instructions: ALU instructions, load and store instructions, and branch and
jump instructions. Only the first two classes are implemented here; branch and
jump instructions are not implemented.
1. ALU instructions:
These instructions take either two registers, or a register and a
sign-extended immediate, as operands.
Typical instructions are AND, OR, add and sub.
2. Load and Store instructions:
A base register and an offset are the operands for this type of instruction.
The sum of the base register and the offset is called the 'effective address'
and is used as the memory address.
In a LOAD instruction, a second register operand is the destination register,
while in a STORE instruction the second register operand is the source of the
data to be stored into memory.
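As a minimal sketch of the effective-address calculation described above, the Python fragment below sign-extends a 16-bit offset and adds it to the base register value. The function names and the 32-bit wrap-around are assumptions made for illustration; they are not taken from the project's Verilog source.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(base, offset16):
    """Return base + sign-extended offset, wrapped to 32 bits (assumed width)."""
    return (base + sign_extend16(offset16)) & 0xFFFFFFFF


# Base register holds 10 and the offset field is 4.
print(effective_address(10, 4))       # 14
# 0xFFFC is -4 in 16-bit two's complement, so the address moves backwards.
print(effective_address(10, 0xFFFC))  # 6
```

The same sign extension applies to the immediate operand of a register-immediate ALU instruction.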
1.2 Introduction to single cycle CPU, multi cycle CPU and comparison with pipeline CPU
To understand how the RISC instruction set can be implemented in a pipelined
fashion, we should first understand how it can be implemented without
pipelining, so this section goes through the basics of the multi-cycle CPU
approach. An unpipelined implementation is less economical than a pipelined
CPU structure. This is illustrated with an example later in this section.
In general, every instruction in a RISC architecture can be implemented in at
most 5 clock cycles. The five clock cycles are as follows:
1. Instruction Fetch (IF)
Send the PC to memory and fetch the current instruction from memory, then
update the PC to the next sequential instruction by adding 4 to it (PC =
PC+4).
2. Instruction Decode (ID)
Decode the instruction and read the registers it specifies from the register
file. For a possible branch instruction, do the equality test on the
registers as they are read. Sign-extend the offset field if needed.
Compute the possible branch target address. Decoding can be done in
parallel with reading the registers because the register specifiers are at a
fixed location; this is called 'fixed-field decoding'.
3. Execute (EX)
In this stage, the ALU operates according to the instruction type. For
memory instructions, it adds the base address and the offset to obtain the
effective address. For register-register operations, it performs the
operation selected by the ALU opcode, such as addition or subtraction, on the
two register values. For register-immediate ALU instructions, it performs the
operation on a register value and the sign-extended immediate.
4. Memory Access (MEM)
Load and store instructions are performed in this stage. A load instruction
reads memory at the effective address, and a store instruction writes the
data to memory at the effective address.
5. Write Back (WB)
This is the last stage. For a register-register ALU instruction or a LOAD
instruction, it writes the result into the register file (located in the ID
stage); the result comes from memory for a load instruction and from the ALU
for an ALU instruction.
1.2.1 Basics of Single Cycle CPU
As the name suggests, this category of CPU executes every instruction in
one clock cycle. Each cycle takes a fixed amount of time, which means a
single-cycle CPU spends the same amount of time on every instruction,
namely one cycle, no matter how complex the instruction is. To ensure
correct operation, the slowest instruction, e.g. load (ld), must complete
within one clock tick, which means a single-cycle CPU operates at the speed
of the slowest instruction in the ISA. Another consequence of completing
every instruction in one clock cycle is that each hardware element can be
used only once per instruction, so any element needed more than once must
be duplicated. Different instructions also flow through the datapath
differently, so the alternative connections have to be realized, and this is
done with multiplexers.
1.2.2 Basics of Multi cycle CPU
As the name implies, this kind of CPU requires multiple cycles to execute
each instruction, which means the CPI is greater than one. The advantage
over the single-cycle CPU is that, depending on the complexity of the
instruction, more or fewer clock cycles can be used; e.g. a load instruction
needs 5 cycles compared with 3 cycles for a branch instruction. Since the
sequencing of operations is more complex, there must be a control unit,
which can be developed as a finite state machine, whereas the single-cycle
CPU relies on multiplexers. In this case the cycle time is determined by the
slowest operation unit, e.g. memory, rather than by the slowest instruction.
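To illustrate why the CPI of a multi-cycle CPU exceeds one, the sketch below computes an average CPI. The 5-cycle load and 3-cycle branch counts come from the text above; the 4-cycle counts for store and ALU instructions and the instruction mix are assumptions chosen only for illustration.

```python
# Cycles per instruction class. Load and branch follow the text above;
# the store and ALU counts are assumed values for this sketch.
CYCLES = {"load": 5, "store": 4, "alu": 4, "branch": 3}

# Hypothetical instruction mix (fractions sum to 1).
MIX = {"load": 0.25, "store": 0.10, "alu": 0.50, "branch": 0.15}


def average_cpi(cycles, mix):
    """Weighted average of the cycle counts over the instruction mix."""
    return sum(cycles[name] * fraction for name, fraction in mix.items())


print(round(average_cpi(CYCLES, MIX), 2))  # 4.1
```

With this assumed mix the average CPI is about 4.1, below the 5 cycles that every instruction would need if each one used all five steps.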
1.2.3 Pipeline
“Pipelining is an implementation technique whereby multiple instructions
are overlapped in execution; it takes an advantage of parallelism that
exists among the actions needed to execute an instruction”. All recent
processors incorporate pipelining as a key implementation technique.
MIPS has a five-stage pipeline structure; each stage is responsible for
completing one part of each instruction, as explained in section 1.2. The
five stages are connected through pipeline registers. The throughput of the
pipeline is determined by how often an instruction exits the pipeline. As
all the stages are connected, all of them must be ready to proceed at the
same time. The time required to move an instruction one step down the
pipeline is known as the 'processor cycle'. The slowest pipeline stage
decides the length of the processor cycle. It is the designer's
responsibility to balance the length of each pipeline stage. If the stages
are perfectly balanced, the time per instruction on the pipelined processor
is given by the equation below:
Time per instruction on pipelined machine =
    Time per instruction on unpipelined machine / Number of pipeline stages
Under this condition the speedup from pipelining equals the number of
pipeline stages, so it is five in the case of the MIPS processor. In
reality, the stages are not perfectly balanced and pipelining adds overhead,
mainly pipeline register delay (including setup time) and clock skew. Once
the clock cycle is as small as the pipeline overhead, further pipelining is
no longer useful, which means a very deep pipeline may not pay off. Keep in
mind that pipelining reduces the average execution time per instruction, not
the execution time of an individual instruction.
Average execution time per instruction = CPI * Clock cycle time
The above equation shows that a higher clock rate alone does not mean a
faster processor: a processor with a higher clock rate can still run a
program more slowly if its CPI is higher. One can design a pipelined
processor (MIPS here) from the stages explained in the earlier section by
starting a new instruction on every clock cycle. Each clock cycle then
corresponds to one pipeline stage. The figure below represents the typical
pipeline structure: even though an instruction takes five clock cycles to
complete its execution, the hardware starts a new instruction every cycle
and executes one part of each in-flight instruction at each stage.
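The overlap described above can be made concrete with a small Python sketch that counts clock cycles and draws which stage each instruction occupies in each cycle. It assumes an ideal pipeline with no stalls or hazards, and the function names are hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def unpipelined_cycles(n_instructions, n_stages=len(STAGES)):
    """Each instruction finishes all stages before the next one starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages=len(STAGES)):
    """The first instruction fills the pipeline; each later one adds a cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1


def pipeline_diagram(n_instructions):
    """Return one text row per instruction, one column per clock cycle."""
    total = pipelined_cycles(n_instructions)
    rows = []
    for i in range(n_instructions):
        cells = ["."] * total
        # Instruction i enters IF in cycle i and advances one stage per cycle.
        for s, name in enumerate(STAGES):
            cells[i + s] = name
        rows.append(" ".join(f"{c:<3}" for c in cells).rstrip())
    return rows


for row in pipeline_diagram(3):
    print(row)
print(unpipelined_cycles(3), pipelined_cycles(3))  # 15 7
```

For three instructions the unpipelined machine needs 15 cycles and the ideal pipeline needs 7. As the instruction count grows, the ratio approaches the number of stages, which is the speedup of five stated above.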
3.0 MIPS instruction format

In MIPS, there are three different types of instructions: R-type, I-type and
J-type.
Instruction set definition

Name    Description       Type of instruction
J       Jump              J
Lw      Load word         I
Sw      Store word        I
Bne     Branch not equal  I
Beq     Branch equal      I
Addi    Add immediate     I
Ori     Or immediate      I
Add     Addition          R
Sub     Subtraction       R
Mult    Multiplication    R
Div     Division          R
And     AND               R
Or      OR                R
Nor     NOR               R
The table above shows the instructions supported by this design; the
supporting waveforms are shown in Appendix B. Let us look at the different
instruction types and their fields.
R-type instructions
R-type instructions take three arguments: rs and rt, both source registers,
and rd, the destination register.
For example,
add $r1, $r2, $r3 (instruction rd, rs, rt) adds the values of $r2 and $r3
and stores the result in $r1.
I-type instructions
I-type instructions take two register arguments, rs and rt, and a 16-bit
immediate value. The immediate value is not stored in data memory; it is part
of the instruction. The benefit of an immediate is that no memory access is
needed, so accessing the constant is much faster.
For example,
addi $r1, $r2, 9 (instruction rt, rs, immediate) adds the value 9 to the
value in register $r2 and stores the result in $r1.
J-type instruction
J instructions are written with labels; it is the assembler's or linker's job
to convert the label into a numerical address.
For example,
j label (instruction addr) tells the processor to jump to the instruction
located at address addr.
3.1 A Pipeline Datapath and Control
Pipeline structure for MIPS architecture.

There is a pipeline register between each pair of adjacent pipeline stages.
Each pipeline register is named after the two stages it connects, for
example EX/MEM. The operation of each stage must complete in one clock
cycle. The pipeline registers are needed because any value passed from one
stage to the next has to be stored temporarily in the corresponding pipeline
register. The operations in each stage of the pipeline structure are shown
below.
Let us look at the operations that occur in each pipeline stage. In IF
(Instruction Fetch), the processor fetches the new instruction from
instruction memory and writes the updated PC (Program Counter) both to the
pipeline register and to the PC. In ID (Instruction Decode), it reads the
registers and sign-extends the 16-bit immediate field. In EX (Execution), it
performs all the ALU operations: it adds the offset to the base register to
calculate the effective address, and for register-immediate instructions it
adds the immediate field to the A register. In MEM (Memory), it accesses
memory, updates the Program Counter when needed, and passes the loaded value
on to the WB stage if the instruction was a load. In WB (Write Back), it
updates the register file with either the loaded value or the ALU output.
Note that the first two stages are independent of the current instruction
type, since the instruction is not decoded until the end of the ID stage.
The activity of the first stage (IF) depends on the EX/MEM stage, since it
has to take into account the updated PC for a taken or not-taken branch at
the end of instruction fetch. To control this pipeline structure, we have to
determine where multiplexers are needed to choose among the available
options.
To specify the control of the pipeline structure, each stage of the pipeline
needs to be given its control values. The control signals can be divided
into five groups, since each control line corresponds to a component that is
active in one particular pipeline stage, as shown in the figure above.
The five groups are explained in a little more detail below.
1. Instruction Fetch
- There is nothing to set in this group, because the control signals to
write the PC and read the IM (Instruction Memory) are always asserted.
2. Instruction Decode
- This stage is also independent of the current instruction type, as
explained earlier, so the same operation happens in this stage every time.
3. Execution / ALU operation
- As shown in the figure above, RegDst, ALUOp and ALUSrc are the signals
that need to be set. They select the result register, the ALU operation,
and either the sign-extended immediate field or the register read data as
the ALU operand.
4. Memory
- As shown in the figure above, MemWrite, MemRead and Branch are the
signals that need to be set in this stage. They are set by the store
instruction, the load instruction and the branch-equal instruction,
respectively.
5. Write Back
- There are two control signals: MemtoReg, which decides between sending
the memory value or the ALU result from stage 3 to the register file, and
RegWrite, which enables writing the chosen value.
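The grouping of control signals by stage can be written down as a lookup table. The Python sketch below uses the values of the common textbook control table for a five-stage MIPS pipeline; this is an assumption made for illustration, and the project's Verilog control unit may use different encodings. `None` marks a "don't care" value, and the names `CONTROL` and `control_for` are hypothetical.

```python
# Control values per instruction class, grouped by the pipeline stage that
# consumes them. Values follow the usual textbook table (an assumption here).
CONTROL = {
    "r_type": {
        "EX": {"RegDst": 1, "ALUOp": 0b10, "ALUSrc": 0},
        "MEM": {"Branch": 0, "MemRead": 0, "MemWrite": 0},
        "WB": {"RegWrite": 1, "MemtoReg": 0},
    },
    "load": {
        "EX": {"RegDst": 0, "ALUOp": 0b00, "ALUSrc": 1},
        "MEM": {"Branch": 0, "MemRead": 1, "MemWrite": 0},
        "WB": {"RegWrite": 1, "MemtoReg": 1},
    },
    "store": {
        "EX": {"RegDst": None, "ALUOp": 0b00, "ALUSrc": 1},
        "MEM": {"Branch": 0, "MemRead": 0, "MemWrite": 1},
        "WB": {"RegWrite": 0, "MemtoReg": None},
    },
    "branch_equal": {
        "EX": {"RegDst": None, "ALUOp": 0b01, "ALUSrc": 0},
        "MEM": {"Branch": 1, "MemRead": 0, "MemWrite": 0},
        "WB": {"RegWrite": 0, "MemtoReg": None},
    },
}


def control_for(kind, stage):
    """Return the control values a given stage needs for an instruction class."""
    return CONTROL[kind][stage]


for kind in CONTROL:
    print(kind, control_for(kind, "MEM"))
```

Only the EX, MEM and WB groups appear in the table because, as noted above, the IF and ID stages need no instruction-specific control.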
4.0 Simulation in Xilinx Vivado
Instruction format
Instructions must be loaded into the instruction memory during reset. We
avoided the `$readmemb` and `$readmemh` system tasks to
keep the code synthesizable. The instruction memory cells are 8 bits wide,
whereas each instruction is 32 bits long.
Therefore, each instruction takes up four memory cells, as shown below.
For example, an add-immediate instruction,
`10000000001000000000000000001010` or `Addi r1,r0,10`, needs to be given as
```
instMem[0] <= 8'b10000000;
instMem[1] <= 8'b00100000;
instMem[2] <= 8'b00000000;
instMem[3] <= 8'b00001010;
```
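The byte split shown above can be generated mechanically. The following Python sketch, with the hypothetical helper name `inst_mem_lines`, breaks a 32-bit instruction word into four 8-bit cells, most significant byte first, and prints the corresponding assignments. It assumes the memory array is called `instMem`, as in the snippet above.

```python
def inst_mem_lines(word, start_address=0):
    """Split a 32-bit word into four bytes, most significant byte first."""
    if not 0 <= word < 2**32:
        raise ValueError("instruction must fit in 32 bits")
    lines = []
    for i in range(4):
        byte = (word >> (24 - 8 * i)) & 0xFF
        lines.append(f"instMem[{start_address + i}] <= 8'b{byte:08b};")
    return lines


# Addi r1,r0,10 from the example above.
for line in inst_mem_lines(0x8020000A):
    print(line)
```

For the word 8020000A this reproduces the four assignments listed above; the next instruction would start at `start_address=4`.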
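For R-type:
The first 6 bits contain the opcode, the next five bits contain the address
of the destination register, the next ten bits contain the addresses of the
two source registers, the next five bits are for shamt, and the last 6 bits
are for the function code.
For I-type:
The first 6 bits contain the opcode, the next five bits contain the address
of the destination register, the next five bits contain the address of the
source register, and the last sixteen bits are reserved for the immediate
value.
For Store/Load:
The first 6 bits contain the opcode, the next five bits contain the address
of the destination register (for a store, the register holding the data to be
stored), the next five bits contain the address of the source (base)
register, and the last sixteen bits are reserved for the address offset.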
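The field layouts above can be sketched as two small Python encoders. The opcode values are not stated in the text; they are inferred from the hexadecimal words listed in the instruction memory table of this section, and the function names are hypothetical.

```python
# Opcodes inferred from the hexadecimal words in the instruction memory
# table; treat them as assumptions rather than a full ISA definition.
OPCODES = {
    "add": 0b000001, "sub": 0b000011, "and": 0b000101,
    "addi": 0b100000, "ld": 0b100100, "st": 0b100101,
}


def encode_r(op, rd, rs, rt, shamt=0, funct=0):
    """Pack opcode(6) | rd(5) | rs(5) | rt(5) | shamt(5) | funct(6)."""
    return (OPCODES[op] << 26 | rd << 21 | rs << 16 | rt << 11
            | shamt << 6 | funct)


def encode_i(op, rd, rs, imm):
    """Pack opcode(6) | rd(5) | rs(5) | immediate or offset(16)."""
    return OPCODES[op] << 26 | rd << 21 | rs << 16 | (imm & 0xFFFF)


print(f"{encode_i('addi', 1, 0, 10):08X}")  # 8020000A
print(f"{encode_r('add', 2, 0, 1):08X}")    # 04400800
print(f"{encode_i('st', 3, 1, 4):08X}")     # 94610004
```

Under these assumptions the encoders reproduce the words given in the instruction memory table for Addi r1,r0,10, Add r2,r0,r1 and St r3,r1,4.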
Instruction Memory.

Instruction      Description                                        Hexadecimal
Addi r1,r0,10    Add r0 and the immediate 10; store in r1.          8020000A
Add r2,r0,r1     Add register values of r0 and r1; store in r2.     04400800
Sub r3,r0,r1     Subtract r1 from r0; store in r3.                  0C600800
And r4,r2,r3     Bitwise AND of r2 and r3; store in r4.             14821800
Ld r11,r1,0      Load the value at memory address r1 + offset (0)   91610000
                 into r11.
St r3,r1,4       Store the value of r3 at address r1 + offset (4).  94610004
Add r1,r0,r4     Add register values of r0 and r4; store in r1.     04405800
And r4,r2,r3     Bitwise AND of r2 and r3; store in r4.             14821800
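As a cross-check of the table above, the following Python sketch decodes each hexadecimal word back into its fields. It assumes the field layout described in this section and, as an observation from the listed words rather than a documented rule, that opcodes with the top bit set use the 16-bit immediate form. The names `decode` and `PROGRAM` are hypothetical.

```python
def decode(word):
    """Split a 32-bit word into the fields described in this section."""
    fields = {
        "opcode": (word >> 26) & 0x3F,
        "rd": (word >> 21) & 0x1F,
        "rs": (word >> 16) & 0x1F,
    }
    if fields["opcode"] & 0b100000:
        # Assumption: a set top opcode bit marks the immediate/offset form.
        fields["imm"] = word & 0xFFFF
    else:
        fields["rt"] = (word >> 11) & 0x1F
        fields["shamt"] = (word >> 6) & 0x1F
        fields["funct"] = word & 0x3F
    return fields


# The eight words of the instruction memory table, in order.
PROGRAM = [0x8020000A, 0x04400800, 0x0C600800, 0x14821800,
           0x91610000, 0x94610004, 0x04405800, 0x14821800]

for word in PROGRAM:
    print(f"{word:08X}", decode(word))
```

Under these assumptions the seventh word decodes to destination r2 with sources r0 and r11, whereas the mnemonic listed beside it names r1, r0 and r4, so one of the two is presumably a transcription slip in the table.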
5.0 Simulation Results
6.0 Conclusion
In this project, I have successfully built a MIPS CPU with pipeline
functionality. The design implements a MIPS CPU capable of handling various
R-type, I-type and load/store instructions, each of which has a different
format. The project shows the wide variety of logic that has to be
considered during such a design.
7.0 Future Enhancement
Extending the memory architecture with different cache implementation
techniques could help in understanding advanced computer architecture. Taking
this design through IC Compiler to explore the fundamentals of physical
design would be a good way to learn the whole ASIC flow.
8.0 References
[1] Muskan Saxena, Ojaswini Nimbalkar, Vidhi Jaiswal, Vishakha Pandey, P.
Sanjeevi. The survey of concepts of architecture in RISC and CISC computers,
Volume-4, Issue-6, pp. 146-151, 2018.
[2] P. M. Kogge, The Architecture of Pipelined Computers, McGraw-Hill, 1981.
[3] M Design of 16-bit RISC Processor International Journal of Scientific
Research in Physics and Applied Sciences, 1(1), Feb 2017, pp 25-30.
[4] Mano, Ciletti. Digital Design with an Introduction to the Verilog HDL, 5th
Edition, Pearson Education, 2013.
[5] Palnitkar. Verilog HDL A Guide to Digital Design and Synthesis, 2nd
Edition, Sun Microsystems, 2003.
[6] Mano. Digital Logic and Computer Design, Pearson Education, 1979.
