Computer Organization (17CS34) - Module: V Santhosh Kumar D K
Module V
Basic Processing Unit and Embedded Systems and Large
Computer Systems
SOME FUNDAMENTAL CONCEPTS
To execute an instruction, the processor has to perform the following 3 steps:
1. Fetch the contents of the memory-location pointed to by the PC. The content of this location is
the instruction to be executed. The instruction is loaded into the IR. Symbolically, this
operation can be written as
IR ← [[PC]]
2. Increment PC by 4 (each instruction is assumed to occupy 4 bytes of byte-addressable memory)
PC ← [PC] + 4
3. Carry out the actions specified by the instruction (in the IR).
Note: The first 2 steps are referred to as the fetch phase;
Step 3 is referred to as the execution phase.
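The fetch phase can be sketched in a few lines of Python. This is only an illustration: the dictionary used as memory and the two instructions stored in it are made-up examples, not part of the notes.

```python
def fetch(memory, pc):
    """Perform one fetch phase and return (IR, updated PC)."""
    ir = memory[pc]   # IR <- [[PC]]
    pc = pc + 4       # PC <- [PC] + 4
    return ir, pc

# Hypothetical program image: address -> instruction text.
memory = {1000: "Add (R3),R1", 1004: "Move (R1),R2"}
ir, pc = fetch(memory, 1000)
print(ir, pc)
```

Calling fetch again with the returned PC reads the next instruction, which is what the End signal at the end of every control sequence triggers.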
SINGLE BUS ORGANIZATION
> MDR has 2 inputs and 2 outputs. Data may be loaded into MDR either from
✓ memory-bus (external) or
✓ processor-bus (internal).
> MAR's input is connected to internal-bus, and MAR's output is connected to
external-bus.
> Instruction-decoder & control-unit is responsible for
✓ issuing the signals that control the operation of all the units inside the
processor (and for interacting with memory-bus).
✓ implementing the actions specified by the instruction (loaded in the IR).
> Registers R0 through R(n-1) are provided for general-purpose use by the programmer.
> Three registers Y, Z & TEMP are used by the processor for temporary storage during
execution of some instructions. These are transparent to the programmer, i.e. the
programmer need not be concerned with them because they are never referenced
explicitly by any instruction.
> MUX (Multiplexer) selects either
✓ output of Y or
✓ constant-value 4 (used to increment PC content).
The selected value is provided as input A of the ALU.
> B input of the ALU is obtained directly from processor-bus.
> As instruction execution progresses, data are transferred from one register to another,
often passing through the ALU to perform an arithmetic or logic operation.
> An instruction can be executed by performing one or more of the following operations:
1) Transfer a word of data from one processor-register to another or to the ALU.
2) Perform an arithmetic or a logic operation and store the result in a processor-register.
Dept. of CSE, Canara Engineering College
3) Fetch the contents of a given memory-location and load them into a processor-register.
4) Store a word of data from a processor-register into a given memory-location.
Figure 7.1 Single-bus organization of the datapath inside a processor.
REGISTER TRANSFERS
> Instruction execution involves a sequence of steps in which data are transferred from
one register to another.
> Input & output of register Ri are connected to the bus via switches controlled by 2 control-signals:
Riin & Riout. These are called gating signals.
> When Riin=1, data on the bus are loaded into Ri.
> Similarly, when Riout=1, the content of Ri is placed on the bus.
> When Riout=0, the bus can be used for transferring data from other registers.
> All operations and data transfers within the processor take place within time-periods
defined by the processor-clock.
> When edge-triggered flip-flops are not used, 2 or more clock-signals may be needed to
guarantee proper transfer of data. This is known as multiphase clocking.
Input & Output Gating for one Register Bit
> A 2-input multiplexer is used to select the data applied to the input of an edge-triggered
D flip-flop.
> When Riin=1, mux selects data on bus. This data will be loaded into flip-flop at rising-edge
of clock.
> When Riin=0, mux feeds back the value currently stored in flip-flop.
> Q output of flip-flop is connected to bus via a tri-state gate.
> When Riout=0, gate's output is in the high-impedance state. (This corresponds to the
open-circuit state of a switch).
> When Riout=1, the gate drives the bus to 0 or 1, depending on the value of Q.
Figure 7.2 Input and output gating for the registers in Figure 7.1.
Figure 7.3 Input and output gating for one register bit.
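The behaviour of one register bit can be modelled as follows. This is a behavioural sketch only: the class and method names are illustrative, and the high-impedance state is represented by None as a modelling assumption.

```python
class RegisterBit:
    """One register bit: 2-input mux, edge-triggered D flip-flop, tri-state gate."""

    def __init__(self):
        self.q = 0  # value currently stored in the flip-flop

    def clock_edge(self, bus, r_in):
        """Rising clock edge: mux selects the bus when r_in=1, else feeds back Q."""
        self.q = bus if r_in == 1 else self.q

    def drive(self, r_out):
        """Tri-state gate: Q when r_out=1, None (high impedance) when r_out=0."""
        return self.q if r_out == 1 else None

bit = RegisterBit()
bit.clock_edge(bus=1, r_in=1)   # Riin=1: load 1 from the bus
bit.clock_edge(bus=0, r_in=0)   # Riin=0: the bus changes, the bit holds its value
print(bit.drive(r_out=1), bit.drive(r_out=0))
```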
PERFORMING AN ARITHMETIC OR LOGIC OPERATION
> The ALU performs arithmetic or logic operations on the 2 operands applied to its A and B
inputs.
> One of the operands is the output of the MUX & the other operand is obtained directly from
the bus.
> The result (produced by the ALU) is stored temporarily in register Z.
> The sequence of operations for [R3] ← [R1] + [R2] is as follows:
1) R1out, Yin                  // transfer the contents of R1 to register Y
2) R2out, SelectY, Add, Zin    // R2 contents are transferred directly to B input of ALU.
                               // The numbers are added. Sum is stored in register Z
3) Zout, R3in                  // sum is transferred to register R3
> The signals are activated for the duration of the clock cycle corresponding to that step.
All other signals are inactive.
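The three control steps can be traced with a small Python sketch. The register file is modelled as a dictionary and the variable bus stands for the single processor-bus; both are modelling assumptions, not part of the notes.

```python
def add_r1_r2_into_r3(regs):
    """Apply the three control steps to a copy of the register file."""
    r = dict(regs)
    bus = r["R1"]            # step 1: R1out
    r["Y"] = bus             #         Yin
    bus = r["R2"]            # step 2: R2out
    r["Z"] = r["Y"] + bus    #         SelectY, Add, Zin
    bus = r["Z"]             # step 3: Zout
    r["R3"] = bus            #         R3in
    return r

result = add_r1_r2_into_r3({"R1": 5, "R2": 7, "R3": 0})
print(result["R3"])
```

Only one value is on the bus in each step, which is why the addition needs Y and Z as temporary registers.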
Write the complete control sequence for the instruction: Move (Rs),Rd
> This instruction copies the contents of memory-location pointed to by Rs into Rd. This
is a memory read operation. This requires the following actions:
✓ fetch the instruction
✓ fetch the operand (i.e. the contents of the memory-location pointed to by Rs)
✓ transfer the data to Rd
The control-sequence is written as follows:
1. PCout, MARin, Read, Select4, Add, Zin
2. Zout, PCin, Yin, WMFC
3. MDRout, IRin
4. Rsout, MARin, Read
5. MDRinE, WMFC
6. MDRout, Rdin, End
FETCHING A WORD FROM MEMORY
> To fetch instruction/data from memory, processor transfers required address to MAR
(whose output is connected to address-lines of memory-bus).
At the same time, processor issues Read signal on control-lines of memory-bus.
> When requested-data are received from memory, they are stored in MDR. From MDR,
they are transferred to other registers.
> MFC (Memory Function Completed): Addressed-device sets MFC to 1 to indicate that
the contents of the specified location
✓ have been read &
✓ are available on data-lines of memory-bus.
> Consider the instruction Move (R1),R2. The sequence of steps is:
1) R1out, MARin, Read    // desired address is loaded into MAR & Read command is issued
2) MDRinE, WMFC          // load MDR from memory-bus & wait for MFC response from memory
3) MDRout, R2in          // load R2 from MDR
where WMFC = control signal that causes processor's control circuitry to wait for
arrival of MFC signal.
Figure 7.5 Timing of a memory Read operation.
Storing a Word in Memory
> Consider the instruction Move R2,(R1). This requires the following sequence:
1) R1out, MARin           // desired address is loaded into MAR
2) R2out, MDRin, Write    // data to be written are loaded into MDR & Write command is issued
3) MDRoutE, WMFC          // load data into memory-location pointed to by R1 from MDR
EXECUTION OF A COMPLETE INSTRUCTION
> Consider the instruction Add (R3),R1, which adds the contents of a memory-location pointed
to by R3 to register R1. Executing this instruction requires the following actions:
1) Fetch the instruction.
2) Fetch the first operand.
3) Perform the addition.
4) Load the result into R1.
> Control sequence for execution of this instruction is as follows:
1) PCout, MARin, Read, Select4, Add, Zin
2) Zout, PCin, Yin, WMFC
3) MDRout, IRin
4) R3out, MARin, Read
5) R1out, Yin, WMFC
6) MDRout, SelectY, Add, Zin
7) Zout, R1in, End
> Instruction execution proceeds as follows:
Step 1. The instruction-fetch operation is initiated by loading contents of PC into MAR
& sending a Read request to memory. The Select signal is set to Select4, which causes
the MUX to select constant 4. This value is added to operand at input B (PC's content),
and the result is stored in Z.
Step 2. Updated value in Z is moved to PC (and copied into Y), while the processor
waits for the memory to respond (WMFC).
Step 3. Fetched instruction is moved from MDR into IR.
Step 4. Contents of R3 are loaded into MAR & a memory read signal is issued.
Step 5. Contents of R1 are transferred to Y to prepare for addition.
Step 6. When Read operation is completed, memory-operand is available in MDR, and
the addition is performed.
Step 7. Sum is stored in Z, and then transferred to R1. The End signal causes a new
instruction fetch cycle to begin by returning to step 1.
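The seven steps can be traced in Python as a rough sketch. The dictionaries standing in for the registers and memory, and the sample values in them, are assumptions for illustration; the memory is assumed to respond as soon as WMFC is reached.

```python
def execute_add_indirect(regs, memory):
    """Trace the 7 control steps of Add (R3),R1 on copies of the state."""
    r, mem = dict(regs), dict(memory)
    # 1) PCout, MARin, Read, Select4, Add, Zin
    r["MAR"] = r["PC"]
    r["Z"] = 4 + r["PC"]
    # 2) Zout, PCin, Yin, WMFC
    r["PC"] = r["Y"] = r["Z"]
    r["MDR"] = mem[r["MAR"]]      # memory responds (MFC)
    # 3) MDRout, IRin
    r["IR"] = r["MDR"]
    # 4) R3out, MARin, Read
    r["MAR"] = r["R3"]
    # 5) R1out, Yin, WMFC
    r["Y"] = r["R1"]
    r["MDR"] = mem[r["MAR"]]      # memory responds (MFC)
    # 6) MDRout, SelectY, Add, Zin
    r["Z"] = r["Y"] + r["MDR"]
    # 7) Zout, R1in, End
    r["R1"] = r["Z"]
    return r

regs = {"PC": 1000, "R1": 10, "R3": 2000}
memory = {1000: "Add (R3),R1", 2000: 32}
final = execute_add_indirect(regs, memory)
print(final["R1"], final["PC"])
```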
Branching Instructions
> Control sequence for an unconditional branch instruction is as follows:
1) PCout, MARin, Read, Select4, Add, Zin
2) Zout, PCin, Yin, WMFC
3) MDRout, IRin
4) Offset-field-of-IRout, Add, Zin
5) Zout, PCin, End
> The processing starts, as usual, with the fetch phase, which ends in step 3.
> In step 4, the offset-value is extracted from IR by instruction-decoding circuit.
> Since the updated value of PC is already available in register Y, the offset X is gated
onto the bus, and an addition operation is performed.
> In step 5, the result, which is the branch-address, is loaded into the PC.
> The offset X used in a branch instruction is usually the difference between the branch
target-address and the address immediately following the branch instruction. (For
example, if the branch instruction is at location 1000 and branch target-address is
1200, then the value of X must be 196, since the PC will be containing the address
1004 after fetching the instruction at location 1000).
> In case of conditional branch, we need to check the status of the condition-codes
before loading a new value into the PC.
e.g.: Offset-field-of-IRout, Add, Zin, If N=0 then End
✓ If N=0, processor returns to step 1 immediately after step 4.
✓ If N=1, step 5 is performed to load a new value into PC.
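The offset arithmetic and the N-flag test can be checked with a short sketch. The function names are illustrative, and the 4-byte instruction size is the same assumption used for the PC increment.

```python
def branch_offset(branch_addr, target_addr, instr_size=4):
    """Offset X = target address - address of the instruction after the branch."""
    return target_addr - (branch_addr + instr_size)

def branch_if_negative(updated_pc, offset, n_flag):
    """Conditional branch: PC is loaded with the branch-address only when N=1."""
    return updated_pc + offset if n_flag == 1 else updated_pc

x = branch_offset(1000, 1200)
print(x, branch_if_negative(1004, x, n_flag=1), branch_if_negative(1004, x, n_flag=0))
```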
Note:
To execute instructions, the processor must have some means of generating the control signals
needed in the proper sequence. There are two approaches for this purpose:
1) Hardwired control and
2) Microprogrammed control.
HARDWIRED CONTROL
> Decoder/encoder block is a combinational-circuit that generates required control
outputs depending on state of all its inputs.
> Step-decoder provides a separate signal line for each step in the control sequence.
> Similarly, output of instruction-decoder consists of a separate line for each machine
instruction.
> For any instruction loaded in IR, one of the output-lines INS1 through INSm is set to 1,
and all other lines are set to 0.
> The input signals to encoder-block are combined to generate the individual control
signals Yin, PCout, Add, End and so on.
For example,
Zin = T1 + T6·ADD + T4·BR
This signal is asserted during time-slot T1 for all instructions,
during T6 for an Add instruction and during T4 for an unconditional branch instruction.
> When RUN=1, counter is incremented by 1 at the end of every clock cycle. When
RUN=0, counter stops counting.
> Sequence of operations carried out by this machine is determined by wiring of logic
elements, hence the name "hardwired".
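The encoder logic for Zin is a plain Boolean function of the step-decoder and instruction-decoder outputs. A minimal sketch, covering only the Add and unconditional branch instructions discussed in these notes (the function name and argument style are illustrative):

```python
def z_in(step, add=False, br=False):
    """Zin = T1 + T6.ADD + T4.BR, where step is the number of the active time-slot."""
    t1, t4, t6 = step == 1, step == 4, step == 6
    return t1 or (t6 and add) or (t4 and br)

# Zin is asserted in step 1 for every instruction, step 6 of Add, step 4 of Branch.
print([s for s in range(1, 8) if z_in(s, add=True)])
print([s for s in range(1, 6) if z_in(s, br=True)])
```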
Advantage:
Can operate at high speed.
Disadvantage:
Limited flexibility.
Figure 7.11 Separation of the decoding and encoding functions.
Figure 7.12 Generation of the Zin control signal for the processor in Figure 7.1.
COMPLETE PROCESSOR
> This has separate processing-units to deal with integer data and floating-point data.
> A data-cache is inserted between these processing-units & main-memory.
> Instruction-unit fetches instructions
✓ from an instruction-cache or
✓ from main-memory when desired instructions are not already in cache.
> Processor is connected to system-bus & hence to the rest of the computer by means of
a bus interface.
> Using separate caches for instructions & data is common practice in many processors
today.
> A processor may include several units of each type to increase the potential for
concurrent operations.
Figure 7.14 Block diagram of a complete processor.
MICROPROGRAMMED CONTROL
> Control-signals are generated by a program similar to machine language programs.
> Control Word (CW) is a word whose individual bits represent various control-signals
(like Add, End, Zin). {Each of the control-steps in control sequence of an instruction
defines a unique combination of 1s & 0s in the CW}.
> Individual control-words in micro-routine are referred to as microinstructions.
> A sequence of CWs corresponding to control-sequence of a machine instruction
constitutes the micro-routine.
> The micro-routines for all instructions in the instruction-set of a computer are stored in
a special memory called the Control Store (CS).
> Control-unit generates control-signals for any instruction by sequentially reading CWs
of corresponding micro-routine from CS.
> Micro-program counter (µPC) is used to read CWs sequentially from CS.
> Every time a new instruction is loaded into IR, output of "starting address generator" is
loaded into µPC.
> Then, µPC is automatically incremented by clock, causing successive
microinstructions to be read from CS.
Hence, control-signals are delivered to various parts of processor in correct sequence.
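A control store and µPC can be sketched in Python as follows. The bit order in SIGNALS is an assumption chosen for this example (it is not the bit layout used in the notes), and the control store holds only the three fetch steps.

```python
# Assumed bit order of the control word: bit 0 is PCout, bit 1 is MARin, and so on.
SIGNALS = ["PCout", "MARin", "Read", "Select4", "Add", "Zin",
           "Zout", "PCin", "Yin", "WMFC", "MDRout", "IRin"]

def control_word(*active):
    """Build a CW as an integer with one bit per control-signal."""
    return sum(1 << SIGNALS.index(name) for name in active)

def active_signals(cw):
    """List the control-signals whose bits are 1 in the CW."""
    return [name for i, name in enumerate(SIGNALS) if (cw >> i) & 1]

# Micro-routine for the fetch phase, stored in the control store.
control_store = [
    control_word("PCout", "MARin", "Read", "Select4", "Add", "Zin"),
    control_word("Zout", "PCin", "Yin", "WMFC"),
    control_word("MDRout", "IRin"),
]

def run(store, start=0):
    """Read CWs sequentially; the loop variable plays the role of the uPC."""
    issued = []
    upc = start
    while upc < len(store):
        issued.append(active_signals(store[upc]))
        upc += 1   # incremented by the clock
    return issued

trace = run(control_store)
print(trace)
```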
(Table: example microinstructions, one row per microinstruction and one column per control-signal.)
ORGANIZATION OF MICROPROGRAMMED CONTROL UNIT (TO SUPPORT
CONDITIONAL BRANCHING)
> In case of conditional branching, microinstructions specify which of the external
inputs or condition-codes should be checked as a condition for branching to take place.
> The starting and branch address generator block loads a new address into µPC when
a microinstruction instructs it to do so.
> To allow implementation of a conditional branch, inputs to this block consist of
✓ external inputs and condition-codes
✓ contents of IR
> µPC is incremented every time a new microinstruction is fetched from microprogram
memory except in following situations:
i. When a new instruction is loaded into IR, µPC is loaded with starting-address
of micro-routine for that instruction.
ii. When a Branch microinstruction is encountered and branch condition is
satisfied, µPC is loaded with branch-address.
iii. When an End microinstruction is encountered, µPC is loaded with
address of first CW in micro-routine for instruction fetch cycle.
Figure 7.18 Organization of the control unit to allow
conditional branching in the microprogram.
MICROINSTRUCTIONS
> Drawbacks of micro-programmed control:
1. Assigning individual bits to each control-signal results in long microinstructions
because the number of required signals is usually large.
2. Available bit-space is poorly used because only a few bits are set to 1 in any given
microinstruction.
> Solution: Signals can be grouped because
1. Most signals are not needed simultaneously.
2. Many signals are mutually exclusive.
> Grouping control-signals into fields requires a little more hardware because decoding-circuits
must be used to decode bit patterns of each field into individual control
signals.
> Advantage:
This method results in a smaller control-store (only 20 bits are needed to store the patterns
for the 42 signals).
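The saving comes from encoding each group of mutually exclusive signals in a binary field. A rough sketch of the arithmetic, assuming every field reserves one extra code for "no action"; the group names and sizes below are hypothetical and are not the grouping used in the notes:

```python
def field_bits(num_signals):
    """Bits needed for a field holding codes 0 (no action) to num_signals."""
    return num_signals.bit_length()

# Hypothetical grouping of mutually exclusive control-signals.
groups = {"register-out signals": 10, "memory signals": 2, "End": 1}

encoded = sum(field_bits(n) for n in groups.values())   # 4 + 2 + 1
unencoded = sum(groups.values())                        # one bit per signal
print(encoded, unencoded)
```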
Vertical organization
1. Highly encoded schemes that use compact codes to specify only a small
number of control functions in each microinstruction are referred to as a vertical
organization.
2. This approach results in considerably slower operating speed because more
micro-instructions are needed to perform the desired control functions.
Horizontal organization
1. The minimally encoded scheme in which many resources can be controlled with a
single micro-instruction is called a horizontal organization.
2. This approach is useful when a higher operating speed is desired and when the
machine structure allows parallel use of resources.
Microinstruction fields F1, F2, ...; in the field that controls memory access, 01: Read and 10: Write.
MICROPROGRAM SEQUENCING
> Two major disadvantages of micro-programmed control are:
✓ Having a separate micro-routine for each machine instruction results in a large
total number of microinstructions and a large control-store.
✓ Execution time is longer because it takes more time to carry out the required
branches.
> Consider the instruction Add src,Rdst, which adds the source-operand to the contents
of Rdst and places the sum in Rdst.
> Let the source-operand be specified in the following addressing modes: register,
autoincrement, autodecrement and indexed, as well as the indirect forms of these 4
modes.
> Each box in the chart corresponds to a microinstruction that controls the transfers and
operations indicated within the box.
> The microinstruction is located at the address indicated by the octal number (001, 002).
BRANCH ADDRESS MODIFICATION USING BIT-ORING
> Consider the point labeled α in the figure. At this point, it is necessary to choose
between direct and indirect addressing modes.
> If indirect-mode is specified in the instruction, then the microinstruction in location
170 is performed to fetch the operand from the memory.
> If direct-mode is specified, this fetch must be bypassed by branching immediately to
location 171.
> The most efficient way to bypass microinstruction 170 is to have the preceding branch
microinstructions specify the address 170 and then use an OR gate to change the LSB
of this address to 1 if the direct addressing mode is involved. This is known as the
bit-ORing technique.
WIDE BRANCH ADDRESSING
> The instruction-decoder (InstDec) generates the starting-address of the microroutine
that implements the instruction that has just been loaded into the IR.
> Here, register IR contains the Add instruction, for which the instruction decoder
generates the microinstruction address 101. (However, this address cannot be loaded
as is into the µPC).
> The source-operand can be specified in any of several addressing-modes. The bit-
ORing technique can be used to modify the starting-address generated by the
instruction-decoder to reach the appropriate path.
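The bit-ORing step for the branch to location 170 or 171 amounts to a single OR on the least significant bit. A minimal sketch (the function name is illustrative; addresses are octal, as in the notes):

```python
def next_upc(branch_address, direct_mode):
    """Bit-ORing: force the LSB of the branch address to 1 for the direct mode."""
    return branch_address | (1 if direct_mode else 0)

# Indirect mode goes to 170 (fetch the operand); direct mode skips to 171.
print(oct(next_upc(0o170, direct_mode=False)), oct(next_upc(0o170, direct_mode=True)))
```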
Use of WMFC
> WMFC signal is issued at location 112 which causes a branch to the microinstruction
in location 171.
> WMFC signal means that the microinstruction may take several clock cycles to
complete. If the branch is allowed to happen in the first clock cycle, the
microinstruction at location 171 would be fetched and executed prematurely.
> To avoid this problem, WMFC signal must inhibit any change in the contents of the
µPC during the waiting-period.
Detailed Examination
> Consider Add (Rsrc)+,Rdst, which adds the operand pointed to by Rsrc to the content of Rdst, then stores the
sum in Rdst and finally increments Rsrc by 4 (i.e. auto-increment mode).
> In bit 10 and 9, bit-patterns 11, 10, 01 and 00 denote indexed, auto-decrement,
auto-increment and register modes respectively. For each of these modes, bit 8 is used to
specify the indirect version.
> The processor has 16 registers that can be used for addressing purposes; each is specified
using a 4-bit-code.
> There are 2 stages of decoding:
1. The microinstruction field must be decoded to determine that an Rsrc or Rdst
register is involved.
2. The decoded output is then used to gate the contents of the Rsrc or Rdst fields
in the IR into a second decoder, which produces the gating-signals for the
actual registers R0 to R15.
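The mode bits and the wide-branch address modification can be sketched together. The decoding of bits 10, 9 and 8 follows the text above. The mapping into the µPC (bits 5,4 taken from IR bits 10,9, and bit 3 set only for the register-indirect case) is assumed to be the one used by the branch microinstruction at address 003 of Figure 7.21, and 101 (octal) is the starting address quoted earlier for Add.

```python
MODES = {0b11: "indexed", 0b10: "autodecrement", 0b01: "autoincrement", 0b00: "register"}

def decode_mode(ir):
    """Return (mode name, indirect flag) from IR bits 10, 9 and 8."""
    mode = MODES[(ir >> 9) & 0b11]
    indirect = bool((ir >> 8) & 1)
    return mode, indirect

def start_address(ir, base=0o101):
    """OR IR bits 10,9 into uPC bits 5,4; set uPC bit 3 for register indirect."""
    ir10, ir9, ir8 = (ir >> 10) & 1, (ir >> 9) & 1, (ir >> 8) & 1
    bit3 = (1 - ir10) & (1 - ir9) & ir8
    return base | (ir10 << 5) | (ir9 << 4) | (bit3 << 3)

autoinc = 0b01 << 9   # IR with only the mode field set: autoincrement, direct
print(decode_mode(autoinc), oct(start_address(autoinc)))
```

For the autoincrement mode this yields octal address 121, which is where the microroutine for Add (Rsrc)+,Rdst continues after the branch.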
MICROINSTRUCTIONS WITH NEXT-ADDRESS FIELDS
> The micro-program requires several branch microinstructions which perform no useful
operation. Thus, they detract from the operating speed of the computer.
> Solution: Include an address-field as a part of every microinstruction to indicate the
location of the next microinstruction to be fetched. (This means every microinstruction
becomes a branch microinstruction).
> The flexibility of this approach comes at the expense of additional bits for the address-field.
> Advantage: Separate branch microinstructions are virtually eliminated. There are few
limitations in assigning addresses to microinstructions. There is no need for a counter
to keep track of sequential addresses. Hence, the µPC is replaced with a µAR
(Microinstruction Address Register), which is loaded from the next-address field in
each microinstruction.
Address (octal)   Microinstruction
000   PCout, MARin, Read, Select4, Add, Zin
001   Zout, PCin, Yin, WMFC
002   MDRout, IRin
003   µBranch {µPC ← 101 (from Instruction decoder);
      µPC5,4 ← [IR10,9]; µPC3 ← NOT[IR10] · NOT[IR9] · [IR8]}
121   Rsrcout, MARin, Read, Select4, Add, Zin
122   Zout, Rsrcin
123   µBranch {µPC ← 170; µPC0 ← NOT[IR8]}, WMFC
170   MDRout, MARin, Read, WMFC
171   MDRout, Yin
172   Rdstout, SelectY, Add, Zin
173   Zout, Rdstin, End
Figure 7.21 Microinstruction for Add (Rsrc)+,Rdst
> The next-address bits are fed through the OR gate to the µAR, so that the address can
be modified on the basis of the data in the IR, external inputs and condition-codes.
> The decoding circuits generate the starting-address of a given microroutine on the
basis of the opcode in the IR.
PREFETCHING MICROINSTRUCTIONS
> Drawback of micro-programmed control: Slower operating speed because of the time
it takes to fetch microinstructions from the control-store.
> Solution: Faster operation is achieved if the next microinstruction is pre-fetched while
the current one is being executed.
Emulation
> The main function of micro-programmed control is to provide a means for simple,
flexible and relatively inexpensive execution of machine instructions.
> Its flexibility in using a machine's resources allows diverse classes of instructions to be
implemented.
> Suppose we add to the instruction repertoire of a given computer M1 an entirely new
set of instructions that is in fact the instruction-set of a different computer M2.
> Programs written in the machine language of M2 can then be run on computer M1,
i.e. M1 emulates M2.
> Emulation allows us to replace obsolete equipment with more up-to-date machines.
> If the replacement computer fully emulates the original one, then no software changes
have to be made to run existing programs.
> Emulation is easiest when the machines involved have similar architectures.
Figure: Format for microinstructions in the example of Section 7.5.2.
Figure 7.26 Implementation of the microroutine of Figure 7.21 using a next-microinstruction address field.
EMBEDDED SYSTEMS
Examples of Embedded Systems
Microwave Oven
This appliance is based on a magnetron power unit that generates the microwaves used to heat
food in a confined space. When turned on, the magnetron generates its maximum power
output. Lower power levels are achieved by turning the magnetron on and off for controlled
time intervals. By controlling the power level and the total heating time, it is possible to
realize a variety of user-selectable cooking options.
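The on/off power control can be sketched as a simple duty-cycle calculation. The 10-second switching period and the function name are assumptions for illustration; the notes do not give actual timing values.

```python
def magnetron_schedule(power_level, period_s=10.0):
    """Return (on_time, off_time) in seconds for one switching period.

    power_level is the requested fraction of maximum power, from 0.0 to 1.0.
    """
    if not 0.0 <= power_level <= 1.0:
        raise ValueError("power_level must be between 0.0 and 1.0")
    on_time = power_level * period_s
    return on_time, period_s - on_time

print(magnetron_schedule(0.5))   # half power: on for half of each period
```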
The specification for a microwave oven may include the following cooking options:
> Manual selection of the power level and cooking time
> Manual selection of the sequence of different cooking steps
> Automatic operation, where the user specifies the type of food (for example, meat,
vegetables, or popcorn) and the weight of the food; then an appropriate power
level and time are calculated by the controller
> Automatic defrosting of food by specifying the weight
The oven includes a display that can show:
> Time-of-day clock
> Decrementing clock timer while cooking
> Information messages to the user
An audio alert signal, in the form of a beep tone, is used to indicate the end of a cooking
operation. An exhaust fan and oven light are provided. As a safety measure, a door interlock
turns the magnetron off if the door of the oven is open. All of these functions can be
controlled by a microcontroller.
The input/output capability needed to communicate with the user includes:
> Input keys that comprise a 0 to 9 number pad and function keys such as Reset,
Start, Stop, Power Level, Auto Defrost, Auto Cooking, Clock Set, and Fan Control
> Visual output in the form of a liquid-crystal display, similar to the seven-segment
display
The computational tasks performed by the microcontroller include maintaining the time-of-day clock, determining the actions needed for the
various cooking options, generating the control signals needed to turn on or off devices such
as the magnetron and the fan, and generating display information. The program needed to
implement the desired actions is quite small. It is stored in a nonvolatile read-only memory, so
that it will not be lost when the power is turned off. It is also necessary to have a small RAM
for use during computations and to hold the user-entered data.
> The most significant requirement for the microcontroller is to have sufficient I/O
capability for all of the input keys, displays, and output control signals.
> Parallel I/O ports provide a convenient mechanism for dealing with the external input
and output signals.
Figure 10.1 shows a possible organization of the microwave oven. A simple processor with
small ROM and RAM units is sufficient. Basic input and output interfaces are used to connect
to the rest of the system. It is possible to realize most of this circuitry on a small
microcontroller chip.
Figure 10.1 A block diagram of a microwave oven.
Digital Camera
> In a digital camera, an array of optical sensors is used to capture images. These sensors
convert light into electrical charge. The intensity of light determines the amount of
charge that is generated.
> Two different types of sensors are used in commercial products. One kind is the
charge-coupled device (CCD). This is the type of sensing device used in the earliest
digital cameras. It has since been refined to give high-quality images. More recently,
sensors based on CMOS technology have been developed.
> Each sensing element generates a charge that corresponds to one pixel, which is one
point of a pictorial image. The number of pixels determines the quality of pictures that
can be recorded and displayed.
> The charge is an analog quantity, which is converted into a digital representation using
analog-to-digital (A/D) conversion circuits. A/D conversion produces a digital representation
of the image in which the color and intensity of each pixel are represented by a number of
bits.
> The processor and system controller block in Figure 10.2 includes a variety of
interface circuits needed to connect to other parts of the system.
> The main image storage formats used are TIFF (Tagged Image File Format) for uncompressed
images and JPEG (Joint Photographic Experts Group) for compressed images.
> A captured and processed image can be displayed on a liquid-crystal display
(LCD) screen, which is included in the camera.
> A standard interface provides a mechanism for transferring the images to a
computer or a printer. Typically, this is done using a USB cable. If Flash memory
cards are used, images can also be transferred by physically transferring the card.
> The system controller generates the signals needed to control the operation of the
focusing mechanism and the flash unit. Some of the inputs come from switches
activated by the user.
Figure 10.2 A simplified block diagram of a digital camera.
A digital camera requires a considerably more powerful processor than is needed for the
previously discussed microwave oven application. The processor has to perform complex
signal processing functions.
Home Telemetry
A telephone with an embedded microcontroller can be used to provide remote access to other
devices in the home.
Using the telephone, one can remotely perform functions such as:
> Communicate with a computer-controlled home security system
> Set a desired temperature to be maintained by a furnace or an air conditioner
> Set the start time, the cooking time, and the temperature for food that has been
placed in the oven at some earlier time
> Read the electricity, gas, and water meters, replacing the need for the utility
companies to send an employee to the home to read the meters
All of this is easily implementable if each of these devices is controlled by a microcontroller.
A link, either wired or wireless, is needed between each device
microcontroller and the microprocessor in the telephone. Using signaling from a remote
location to observe and control the state of equipment is often referred to as telemetry.
Microcontroller Chips for Embedded Applications
> A microcontroller chip should be versatile enough to serve a wide variety of
applications. Figure 10.3 shows the block diagram of a typical chip.
> The main part is a processor core, which may be a basic version of a commercially
available microprocessor.
> It is useful to include some memory on the chip, sufficient to satisfy the memory
requirements found in small applications. Some of this memory has to be of RAM type
to hold the data that change during computations. Some should be of the read-only
type to hold the software. To allow cost-effective use in low-volume applications, it is
necessary to have a field-programmable type of ROM storage. Popular choices for
realization of this storage are EEPROM and Flash memory.
Figure 10.3 A block diagram of a microcontroller.
> Several I/O ports are usually provided for both parallel and serial interfaces, which
allow easy implementation of standard I/O connections.
> In many applications, it is necessary to generate control signals at programmable time
intervals. This task is achieved easily if a timer circuit is included in the
microcontroller chip. Since the timer is a circuit that counts clock pulses, it can also be
used for event-counting purposes.
> An embedded system may include some analog devices. To deal with such devices, it
is necessary to be able to convert analog signals into digital representations, and vice
versa. This is conveniently accomplished if the embedded controller includes A/D and
D/A conversion circuits.
> Many embedded processor chips are available commercially. Some of the better
known examples are Freescale's 68HC11 and 68K/ColdFire families, and Intel's 8051 and
MCS-96 families.
A Simple Microcontroller
> The input/output structure of a microcontroller has to be flexible enough to
accommodate the needs of different applications and make good use of the pins
available on the chip.
> Figure 10.4 gives the block diagram of a simple microcontroller. There is a processor core and some on-chip
memory. There are two 8-bit parallel interfaces, called port A and port B, and one
serial interface.
> The microcontroller also contains a 32-bit counter/timer circuit, which can be used to
generate internal interrupts at programmed time intervals, to serve as a system timer,
to count the pulses on an input line, to generate square-wave output signals, and so on.
Figure 10.4 An example microcontroller.
Parallel I/O Interface
> Each parallel port in Figure 10.4 has an associated eight-bit data direction register,
which can be used to configure individual data lines as either input or output.
> Figure 10.5 illustrates the bidirectional control for one bit in port A. Port pin PAi is
treated as an input if the data direction flip-flop contains a 0. In this case, activation of
the control signal Read_Port places the logic value on the port pin onto the data line Di
of the processor bus. The port pin serves as an output if the data direction flip-flop is
set to 1. The value loaded into the output data flip-flop, under control of the
Write_Port signal, is placed on the pin.
> A versatile parallel interface may include two possibilities:
1. Input data are read directly from the pins.
2. Input data are stored in a register.
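The behaviour of one bidirectional port bit can be sketched in software. The following is only an illustrative model, not the actual circuit: the class name PortBit and its attribute names are hypothetical, and the external device is represented by a plain variable.

```python
class PortBit:
    """Software model of one bidirectional port pin (hypothetical names)."""

    def __init__(self):
        self.direction = 0   # data direction flip-flop: 0 = input, 1 = output
        self.out_latch = 0   # output data flip-flop, loaded by Write_Port
        self.external = 0    # level driven by the attached device (assumption)

    def write_port(self, value):
        # Write_Port loads the output data flip-flop.
        self.out_latch = value & 1

    def pin_level(self):
        # The latch drives the pin only when the pin is configured as an output.
        if self.direction == 1:
            return self.out_latch
        return self.external

    def read_port(self):
        # Read_Port places the logic value on the pin onto data line Di.
        return self.pin_level()
```

With direction left at 0, read_port() returns whatever the device drives; after setting direction to 1, the pin follows the value last written with write_port().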
Figure 10.5 Access to one bit in port A in Figure 10.4
Figure 10.6 depicts all registers in the parallel interface, as well as the addresses assigned to
them.
Figure 10.6 Parallel interface registers
> The status register, PSTAT, contains the status flags.
> The PASIN flag is set to 1 when there are new data on port A. It is cleared to 0 when
the processor accepts the data by reading the PAIN register.
> The PASOUT flag is set to 1 when the data in register PAOUT are accepted by the
connected device, to indicate that the processor may now load new data into PAOUT.
The interface uses a separate control line (described below) to signal the availability of
new data to the connected device. The PASOUT flag is cleared to 0 when the
processor writes data into PAOUT.
> The flags PBSIN and PBSOUT perform the same function for port B.
> An interrupt flag, such as IAIN, is set to 1 when that interrupt is enabled and the
corresponding I/O action occurs. The interrupt-enable bits are held in control register
PCONT. An enable bit is set to 1 to enable the corresponding interrupt.
> For example, if ENAIN = 1 and PASIN = 1, then the interrupt flag IAIN is set to 1 and
an interrupt request is raised. Thus, IAIN = ENAIN · PASIN, where · denotes logical AND.
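The relation between the status flag, the enable bit and the interrupt flag for port A input can be illustrated with a small model. This is a sketch under stated assumptions: the class PortAInput and its method names are hypothetical, and the flags are kept as separate attributes because the bit positions inside PSTAT and PCONT are not given here.

```python
class PortAInput:
    """Illustrative model of the port A input flags (hypothetical class)."""

    def __init__(self):
        self.pain = 0    # PAIN input data register
        self.pasin = 0   # PASIN status flag
        self.enain = 0   # ENAIN interrupt-enable bit (held in PCONT)

    @property
    def iain(self):
        # IAIN = ENAIN AND PASIN
        return self.enain & self.pasin

    def device_delivers(self, byte):
        # New data arrive on port A: PASIN is set to 1.
        self.pain = byte & 0xFF
        self.pasin = 1

    def read_pain(self):
        # Reading PAIN clears PASIN, which also removes the interrupt flag.
        self.pasin = 0
        return self.pain
```

An interrupt request is pending exactly while iain is 1, so reading PAIN in the service routine also withdraws the request.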
> Port A has two control lines, CAIN and CAOUT, which can be used to provide an
automatic signaling mechanism between the interface and the attached device, for
devices that have this capability.
> For an input transfer, the device places new data on the port's pins and signifies this
action by activating the CAIN line for one clock cycle. When the interface circuit sees
CAIN = 1, it sets the status bit PASIN to 1. Later, this bit is cleared to 0 when the
processor reads the input data. This action also causes the interface to send a pulse on
the CAOUT line to inform the device that it may send new data to the interface.
> For an output transfer, the processor writes the data into the PAOUT register. The
interface responds by clearing the PASOUT bit to 0 and sending a pulse on the CAOUT
line to inform the device that new data are available. When the device accepts the data,
it sends a pulse on the CAIN line, which in turn sets PASOUT to 1.
> Control register bits PAREG and PBREG are used to select the mode of operation of
inputs to ports A and B, respectively. If set to 1, a register is used to store the input
data; otherwise, a direct path from the pins is used.
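The output handshake described above can be traced with a short simulation. It is only a sketch: the class PortAOutput, the function send and the pulse counter are hypothetical, the device is simulated by a method call, and the initial value PASOUT = 1 is an assumption that is not stated in the text.

```python
class PortAOutput:
    """Illustrative model of the port A output handshake (hypothetical class)."""

    def __init__(self):
        self.paout = 0
        self.pasout = 1        # assumed initial state: ready for new data
        self.caout_pulses = 0  # counts pulses sent on the CAOUT line

    def write_paout(self, byte):
        # Processor writes PAOUT: PASOUT is cleared and CAOUT is pulsed.
        self.paout = byte & 0xFF
        self.pasout = 0
        self.caout_pulses += 1

    def device_accepts(self):
        # Device takes the data and pulses CAIN, which sets PASOUT to 1.
        self.pasout = 1
        return self.paout


def send(port, data):
    """Send each byte only when PASOUT shows that the previous one was accepted."""
    received = []
    for byte in data:
        if port.pasout == 0:
            raise RuntimeError("device has not accepted the previous byte")
        port.write_paout(byte)
        received.append(port.device_accepts())  # simulated device response
    return received
```

In a real program the check on PASOUT would be a polling loop or an interrupt, since the device responds at its own pace.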
Serial I/O Interface
> The serial interface provides the UART (Universal Asynchronous
Receiver/Transmitter) capability to transfer data using an asynchronous serial scheme.
> Double buffering is used in both transmit and receive paths, as shown in Figure 10.7.
Such buffering is needed to handle bursts in I/O transfers correctly.
> Figure 10.8 shows the addressable registers of the serial interface. Input data are read
from the 8-bit Receive buffer, and output data are loaded into the 8-bit Transmit
buffer.
> The status register, SSTAT, provides information about the current status of receive
and transmit units.
> Bit SSTAT0 is set to 1 when there are valid data in the receive buffer; it is cleared to 0
automatically upon a read access to the receive buffer. Bit SSTAT1 is set to 1 when
the transmit buffer is empty and can be loaded with new data.
> Bit SSTAT2 is set to 1 if an error occurs during the receive process. For example, an
error occurs if the character in the receive buffer is overwritten by a subsequently
received character before the first character is read by the processor. The status register
also contains the interrupt flags.
> Bit SSTAT4 is set to 1 when the receive buffer becomes full and the receiver interrupt
is enabled.
> Similarly, SSTAT5 is set to 1 when the transmit buffer becomes empty and the
transmitter interrupt is enabled.
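A program that polls the serial interface typically tests individual bits of SSTAT. The helper below is a minimal sketch of such a test. The bit numbers follow the description above, while the function name decode_sstat and the wording of the descriptions are only illustrative.

```python
# Bit number -> meaning, as described for the SSTAT register above.
SSTAT_BITS = {
    0: "receive buffer holds valid data",
    1: "transmit buffer empty",
    2: "receive error",
    4: "receiver interrupt flag",
    5: "transmitter interrupt flag",
}


def decode_sstat(value):
    """Return the meanings of all SSTAT bits that are set in value."""
    return [meaning for bit, meaning in sorted(SSTAT_BITS.items())
            if value & (1 << bit)]
```

For example, decode_sstat(0b000011) reports that a character is waiting in the receive buffer and that the transmit buffer can be loaded.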
Figure 10.7 Receive and transmit structure of the serial interface
> The control register, SCONT, is used to hold the interrupt-enable bits. Setting bits
SCONT6-4 to 1 or 0 enables or disables the corresponding interrupts, respectively.
This register also indicates how the transmit clock is generated.
> The last register in the serial interface is the clock-divisor register, DIV. This 32-bit
register is associated with a counter circuit that divides down the system clock signal
to generate the serial transmission clock.
Figure 10.8 Serial interface registers
Counter/Timer
A 32-bit down-counter circuit is provided for use as either a counter or a timer. The basic
operation of the circuit involves loading a starting value into the counter, and then
decrementing the counter contents using either the internal system clock or an external clock
signal. The circuit can be programmed to raise an interrupt when the counter contents reach
zero.
Figure 10.9 Counter/Timer registers
Figure 10.9 shows the registers associated with the counter/timer circuit. The counter/timer
register, CNTM, can be loaded with an initial value, which is then transferred into the counter
circuit.
Counter Mode
> The counter mode is selected by setting bit CTCON7 to 0. The starting value is loaded
into the counter by writing it into register CNTM. The counting process begins when
bit CTCON0 is set to 1 by a program instruction.
> Once counting starts, bit CTCON0 is automatically cleared to 0. The counter is
decremented by pulses on the Counter_in line in Figure 10.4. Upon reaching 0, the
counter circuit sets the status flag CTSTAT0 to 1, and raises an interrupt if the
corresponding interrupt-enable bit has been set to 1.
> The next clock pulse causes the counter to reload the starting value, which is held in
register CNTM, and counting continues. The counting process is stopped by setting bit
CTCON1 to 1.
Timer Mode
> The timer mode is selected by setting bit CTCON7 to 1. This mode can be used to
generate periodic interrupts. It is also suitable for generating a square-wave signal
on the output line Timer_out in Figure 10.4.
> The process starts as explained above for the counter mode. As the counter counts
down, the value on the output line is held constant. Upon reaching zero, the
counter is reloaded automatically with the starting value, and the output signal on
the line is inverted. Thus, the period of the output signal is twice the starting
counter value multiplied by the period of the controlling clock pulse. In the timer
mode, the counter is decremented by the system clock.
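The square-wave period stated above can be checked with a small simulation. This is a simplified sketch, not the actual circuit: the function names are hypothetical, and the reload and the output inversion are assumed to take effect in the same clock tick in which the count reaches zero.

```python
def timer_output(start_value, clock_ticks, initial_level=0):
    """Return the Timer_out level sampled at each system clock tick."""
    count = start_value
    level = initial_level
    samples = []
    for _ in range(clock_ticks):
        samples.append(level)
        count -= 1
        if count == 0:
            count = start_value  # automatic reload of the starting value
            level ^= 1           # output signal is inverted
    return samples


def square_wave_period(start_value, clock_period):
    """Period of the output signal: twice the starting value times the clock period."""
    return 2 * start_value * clock_period
```

With a starting value of 3, the output stays low for three ticks and high for three ticks, so one full period spans 6 clock periods, in agreement with square_wave_period(3, 1.0).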
Interrupt-Control Mechanism
The processor in our example microcontroller has two interrupt-request inputs, IRQ and XRQ.
1. The IRQ input is used for interrupts raised by the I/O interfaces within the
microcontroller.
2. The XRQ input is used for interrupts raised by external devices.
> If the IRQ input is asserted and interrupts are enabled, the processor executes an
interrupt-service routine that uses the polling method to determine the source(s) of
the interrupt request. This is done by examining the flags in the status registers
PSTAT, SSTAT, and CTSTAT.
> The XRQ interrupts have higher priority than the IRQ interrupts. The processor
status register, PSR, has two bits for enabling interrupts.
A vectored interrupt scheme is used, with the vectors for IRQ and XRQ interrupts in memory
locations 0x20 and 0x24, respectively. Each vector contains the address of the first instruction
of the corresponding interrupt-service routine. This address is automatically loaded into the
program counter, PC.
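The vectored scheme with its fixed priority can be sketched as follows. The vector locations 0x20 and 0x24 come from the text. The function next_pc, its parameters and the dictionary used as memory are hypothetical, and saving the return address is left out.

```python
IRQ_VECTOR = 0x20  # memory location holding the address of the IRQ routine
XRQ_VECTOR = 0x24  # memory location holding the address of the XRQ routine


def next_pc(memory, pc, irq, xrq, irq_enabled, xrq_enabled):
    """Return the value loaded into PC, giving XRQ priority over IRQ."""
    if xrq and xrq_enabled:
        return memory[XRQ_VECTOR]
    if irq and irq_enabled:
        return memory[IRQ_VECTOR]
    return pc  # no interrupt accepted: execution continues normally
```

For an IRQ interrupt, the routine reached through location 0x20 would then poll PSTAT, SSTAT and CTSTAT to find the requesting interface.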
THE STRUCTURE OF GENERAL-PURPOSE MULTIPROCESSORS
A multiprocessor can be implemented in three possible ways:
1. Uniform Memory Access (UMA) Multiprocessor
2. Non-Uniform Memory Access (NUMA) Multiprocessor
3. A Distributed Memory System
Uniform Memory Access (UMA) Multiprocessor
> This is one of the most obvious schemes, shown in figure 12.2.
> Here an interconnection network permits n processors to access k memories, so
that any of the processors can access any of the memories.
> The interconnection network may introduce a considerable delay between a
processor and memory. Because this delay is the same for all memory accesses, such
a machine is called a Uniform Memory Access (UMA) Multiprocessor.
> An interconnection network that keeps this delay short is costly and complex to build.
Figure 12.2 A UMA multiprocessor
Non-Uniform Memory Access (NUMA) Multiprocessor
> An alternative scheme, which allows a high computation rate to be sustained in all
processors, is to attach the memory modules directly to the processors, as shown in
figure 12.3.
Figure 12.3 A NUMA multiprocessor
> In addition to accessing its local memory, each processor can also access other
memories over the network, but these accesses take considerably longer than accesses
to the local memory.
> Because of this difference in access times, such multiprocessors are called Non-
Uniform Memory Access (NUMA) Multiprocessors.
A Distributed Memory System
> The UMA and NUMA organizations provide a global memory, where any processor can
access any memory module without intervention by another processor. A different
organization, with no global memory, is shown in figure 12.4.
> Here all memory modules serve as private memories for the processors that are directly
connected to them.
> A processor cannot access a remote memory without the cooperation of the remote
processor.
> This cooperation takes place in the form of messages exchanged by processors.
Such systems are called distributed memory systems with a message-passing
protocol.
Figure 12.4 A distributed memory system.
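The idea that a remote memory is reached only through messages can be illustrated with a toy model. Everything in it is hypothetical: the class Node, the message format ("read", address) and the reply format ("reply", value) are chosen for the illustration and do not represent any particular message-passing protocol.

```python
class Node:
    """A processor with a private memory and an inbox for messages."""

    def __init__(self, name, memory):
        self.name = name
        self.memory = dict(memory)  # private: other nodes never touch it directly
        self.inbox = []

    def send(self, other, message):
        # Deliver a message to another node, tagged with the sender.
        other.inbox.append((self, message))

    def serve_one_request(self):
        # The remote processor cooperates by answering a read request.
        requester, (operation, address) = self.inbox.pop(0)
        if operation != "read":
            raise ValueError("unsupported request: " + operation)
        self.send(requester, ("reply", self.memory[address]))
```

A node that wants a remote word sends ("read", address) and later finds ("reply", value) in its own inbox. The data move only because the remote processor executed serve_one_request().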
PIPELINING
Basic Concepts
> Pipelining is a particularly effective way of organizing concurrent activity in a
computer system.
> Let Fi and Ei refer to the fetch and execute steps for instruction Ii.
> Execution of a program consists of a sequence of fetch and execute steps, as
shown in the figure below.
Figure: Sequential execution of instructions I1, I2, I3 (F1, E1, F2, E2, F3, E3)
Hardware Organization
> Consider a computer that has two separate hardware units, one for fetching
instructions and another for executing them, as shown below.
Figure: Instruction fetch unit and execution unit connected through an interstage buffer
Basic Idea of Instruction Pipelining
A 4-Stage Pipeline
F: Fetch instruction
D: Decode instruction and fetch operands
E: Execute operation
W: Write results
Pipeline Performance
> The pipelined processor shown in the previous figure completes the processing of one
instruction in each clock cycle, which means that the rate of instruction
processing is four times that of sequential operation.
> The potential increase in performance resulting from pipelining is proportional
to the number of pipeline stages.
> However, this increase would be achieved only if pipelined operation could be
sustained without interruption throughout program execution.
> Pipelined operation in the accompanying figure is said to have been stalled for two
clock cycles. Any condition that causes the pipeline to stall is called a hazard.
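The claim that the processing rate approaches four times that of sequential operation can be checked with a little arithmetic. The sketch below assumes an ideal pipeline: every stage takes exactly one clock cycle, there are no stalls, and sequential execution spends one cycle per stage on each instruction. The function names are hypothetical.

```python
def sequential_cycles(n_instructions, n_stages):
    """Cycles needed when each instruction finishes before the next one starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages):
    """Cycles needed in an ideal pipeline: fill it once, then one result per cycle."""
    return n_stages + (n_instructions - 1)


def speedup(n_instructions, n_stages):
    """Ratio of sequential to pipelined execution time."""
    return (sequential_cycles(n_instructions, n_stages)
            / pipelined_cycles(n_instructions, n_stages))
```

For 4 instructions on 4 stages the pipeline needs 7 cycles instead of 16. For 1000 instructions the speedup is about 3.99, which approaches the number of stages as the program gets longer.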
Figure: Pipeline timing in which operation is stalled for two clock cycles
Data Hazard and Instruction Hazard
> A data hazard is any condition in which either the source or the destination operands
of an instruction are not available at the time expected in the pipeline. As a result, some
operation has to be delayed, and the pipeline stalls.
> The pipeline may also be stalled because of a delay in the availability of an instruction.
For example, this may be a result of a miss in the cache, requiring the instruction to be
fetched from the main memory. Such hazards are often called control hazards or
instruction hazards.
An Example of Instruction Hazard
Clock cycles: 1 2 3 4 5 6 7 8 9 10
Stage      Activity in successive clock cycles (each row begins when the stage first becomes active)
Fetch      F1   F2   F2   F2   F2   F3
Execute    E1   idle idle idle E2   E3
Write      W1   idle idle idle W2   W3
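The idle periods in this example follow directly from the delayed fetch, which a short sketch can reproduce. It rests on some assumptions: the function stage_rows and its parameters are hypothetical, every stage after Fetch takes one clock cycle, and the fetch times of 1, 4 and 1 cycles are chosen to match a Fetch row in which F2 occupies four cycles.

```python
def stage_rows(fetch_cycles, later_stages=("E", "W")):
    """Return, for each stage, its activity from its first to its last active cycle."""
    fetch_row = []
    done = []  # clock cycle in which each instruction's fetch completes
    for i, n in enumerate(fetch_cycles, start=1):
        fetch_row += [f"F{i}"] * n
        done.append(len(fetch_row))
    rows = {"F": fetch_row}
    for depth, stage in enumerate(later_stages, start=1):
        # Instruction i reaches this stage 'depth' cycles after its fetch completes.
        active = {d + depth: f"{stage}{i}" for i, d in enumerate(done, start=1)}
        first, last = done[0] + depth, done[-1] + depth
        rows[stage] = [active.get(cycle, "idle") for cycle in range(first, last + 1)]
    return rows
```

Calling stage_rows([1, 4, 1]) gives three idle cycles in both the Execute and the Write rows. The bubble created in the Fetch stage moves downstream through every later stage.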
Structural Hazard
> The idle periods shown in the example above are called stalls. They are also often
referred to as bubbles in the pipeline. Once created as a result of a delay in one of
the pipeline stages, a bubble moves downstream until it reaches the last unit.
> In pipelined operation, when two instructions require the use of a given hardware
resource at the same time, the pipeline has a structural hazard.
> The most common case in which this hazard may arise is in access to memory.
One instruction may need to access memory as part of the Execute or Write stage
while another instruction is being fetched.
An Example of a Structural Hazard
> Load X(R1), R2
Question Bank
1. With a diagram, explain typical single bus processor data path and write the sequence of
control steps to execute the instruction, ADD (R3), R1.
2. Explain with neat diagram, the basic organization of a micro-programmed control unit.
3. Differentiate hardwired & micro-programmed control unit.
4. Explain the process of fetching a word from memory along with a timing diagram.
6. Explain the structure of general purpose multiprocessor.
7. Write down the control sequence for the instruction Add R4, R5, R6 for three-bus
organization.
8. With a neat sketch, explain the organization of hardwired control unit.
9. With an example, explain the field coded micro instructions.
10. Explain the architecture of Simple Microcontroller in detail.
11. Write a short note on a. Microwave oven b. Digital camera.
12. Define Hazards. Explain different types of hazards with example.
Reference:
1. Carl Hamacher, Zvonko Vranesic, Safwat Zaky: Computer Organization, 5th Edition,
Tata McGraw Hill, 2002.
For Softcopy of the notes and other study materials visit:
https://sites.google.com/view/dksbin/subjects/computer-organization