Computer Organization and Design Fundamentals
Examining Computer Hardware from the Bottom to the Top
David Tarnoff
This book was written by David L. Tarnoff who is also responsible for
the creation of all figures contained herein.
Printing History:
July 2005: First edition.
January 2006: Minor corrections to first edition.
July 2007: Added text on Gray code, DRAM technologies,
Mealy machines, XOR boolean rules, signed
BCD, and hard drive access times. Also made
minor corrections.
Legal Notice:
The 3Com® name is a registered trademark of the 3Com Corporation.
The Apple® name and iTunes® name are registered trademarks of
Apple Computer, Inc.
The Dell® name is a registered trademark of Dell, Inc.
The Intel® name, Pentium® 4 Processor Extreme Edition, Hyper-
Threading Technology™, and Hyper-Pipelined Technology™ are
registered trademarks of the Intel Corporation.
PowerPC® is a registered trademark of International Business Machines
Corporation.
The Microsoft® name is a registered trademark of the Microsoft
Corporation.
While every precaution has been taken to ensure that the material
contained in this book is accurate, the author assumes no responsibility
for errors or omissions, or for damage incurred as a result of using the
information contained in this book.
Preface................................................................................................ xxi
Chapter One: Digital Signals and Systems ........................................ 1
1.1 Should Software Engineers Worry About Hardware?............... 1
1.2 Non-Digital Signals.................................................................... 3
1.3 Digital Signals............................................................................ 4
1.4 Conversion Systems................................................................... 6
1.5 Representation of Digital Signals .............................................. 7
1.6 Types of Digital Signals............................................................. 9
1.6.1 Edges ................................................................................. 9
1.6.2 Pulses................................................................................. 9
1.6.3 Non-Periodic Pulse Trains .............................................. 10
1.6.4 Periodic Pulse Trains....................................................... 11
1.6.5 Pulse-Width Modulation ................................................. 13
1.7 Unit Prefixes ............................................................................ 15
1.8 What's Next? ............................................................................ 16
Problems......................................................................................... 16
Chapter Two: Numbering Systems .................................................. 17
2.1 Unsigned Binary Counting....................................................... 17
2.2 Binary Terminology................................................................. 20
2.3 Unsigned Binary to Decimal Conversion ................................ 20
2.4 Decimal to Unsigned Binary Conversion ................................ 23
2.5 Binary Representation of Analog Values................................. 25
2.6 Sampling Theory...................................................................... 31
2.7 Hexadecimal Representation.................................................... 34
2.8 Binary Coded Decimal............................................................. 36
2.9 Gray Codes............................................................................... 37
2.10 What's Next? .......................................................................... 40
Problems......................................................................................... 41
Chapter Three: Binary Math and Signed Representations ........... 43
3.1 Binary Addition........................................................................ 43
3.2 Binary Subtraction ................................................................... 45
3.3 Binary Complements................................................................ 46
3.3.1 One's Complement .......................................................... 46
3.3.2 Two's Complement.......................................................... 47
3.3.3 Most Significant Bit as a Sign Indicator ......................... 50
3.3.4 Signed Magnitude ........................................................... 51
TABLE OF FIGURES
1-1 Sample Digital System ............................................................... 3
1-2 Continuous Analog Signal with Infinite Resolution .................. 4
1-3 Sample of Discrete Measurements Taken Every 0.1 Sec........... 4
1-4 Samples Taken of an Analog Signal .......................................... 5
1-5 Slow Sampling Rate Missed an Anomaly.................................. 5
1-6 Poor Resolution Resulting in an Inaccurate Measurement ........ 5
1-7 Block Diagram of a System to Capture Analog Data ................ 6
1-8 Representation of a Single Binary Signal .................................. 8
1-9 Representation of Multiple Digital Signals................................ 8
1-10 Alternate Representation of Multiple Digital Signals ................ 9
1-11 Digital Transition Definitions .................................................. 10
1-12 Pulse Waveforms ..................................................................... 10
1-13 Non-Periodic Pulse Train ......................................................... 10
1-14 Periodic Pulse Train ................................................................. 11
1-15 Periodic Pulse Train with Different Pulse Widths ................... 11
1-16 Periodic Pulse Train with 25% Duty Cycle ............................. 13
2-1 Counting in Decimal ................................................................ 17
7-2 Mapping a 2-Input Truth Table to Its Karnaugh Map ........... 126
7-3 Three-Input Karnaugh Map ................................................... 127
7-4 Four-Input Karnaugh Map ..................................................... 127
7-5 Identifying the Products in a Karnaugh Map ......................... 130
7-6 Karnaugh Map with Four Adjacent Cells Containing '1' ....... 130
7-7 Sample Rectangle in a Three-Input Karnaugh Map............... 133
7-8 Karnaugh Map with a "Don't Care" Elements ....................... 138
7-9 Karnaugh Map with a "Don't Care" Elements Assigned ....... 138
8-1 Four Possible Results of Adding Two Bits............................ 141
8-2 Block Diagram of a Half Adder............................................. 142
8-3 Four Possible States of a Half Adder ..................................... 142
8-4 Logic Circuit for a Half Adder............................................... 143
8-5 Block Diagram of a Multi-bit Adder...................................... 144
8-6 Block Diagram of a Full Adder.............................................. 144
8-7 Sum and Carryout Karnaugh Maps for a Full Adder............. 145
8-8 Logic Circuit for a Full Adder ............................................... 146
8-9 Seven-Segment Display ......................................................... 147
8-10 Displaying a '1' with a 7-Segment Display ............................ 147
8-11 A Seven-Segment Display Displaying a Decimal '2'............. 148
8-12 Block Diagram of a Seven-Segment Display Driver ............. 148
8-13 Segment Patterns for all Hexadecimal Digits ........................ 149
8-14 Seven Segment Display Truth Table ..................................... 149
8-15 Karnaugh Map for Segment 'e'............................................... 150
8-16 Karnaugh Map for Segment 'e' with Rectangles .................... 150
8-17 Logic Circuit for Segment e of 7-Segment Display............... 151
8-18 Labeling Conventions for Active-Low Signals ..................... 152
8-19 Sample Circuit for Enabling a Microwave ............................ 153
8-20 Sample Circuit for Delivering a Soda .................................... 153
8-21 Truth Table to Enable a Device for A=1, B=1, & C=0.......... 154
8-22 Digital Circuit for a 1-of-4 Decoder ...................................... 154
8-23 Digital Circuit for an Active-Low 1-of-4 Decoder ................ 155
8-24 Truth Table for an Active-Low 1-of-8 Decoder .................... 155
8-25 Block Diagram of an Eight Channel Multiplexer .................. 156
8-26 Truth Table for an Eight Channel Multiplexer ...................... 156
8-27 Logic Circuit for a 1-Line-to-4-Line Demultiplexer.............. 158
8-28 Truth Table for a 1-Line-to-4-Line Demultiplexer ................ 159
8-29 Examples of Integrated Circuits............................................. 159
8-30 Pin-out of a Quad Dual-Input NAND Gate IC (7400)........... 160
8-31 Sample Pin 1 Identifications .................................................. 160
12-8 IPv4 Address Divided into Subnet and Host IDs................... 254
12-9 Sample Chip Select Circuit for a Memory Device................. 256
12-10 Some Types of Memory Mapped I/O Configurations ........... 260
12-11 Basic Addressing Process for a DRAM ................................. 264
12-12 Organization of DRAM.......................................................... 265
12-13 Example of an FPM Transfer ................................................. 265
12-14 Example of an EDO Transfer................................................. 266
13-1 Block Diagram of a Standard Memory Hierarchy ................. 269
13-2 Configuration of a Hard Drive Write Head............................ 271
13-3 Sample FM Magnetic Encoding............................................. 273
13-4 Sample MFM Magnetic Encoding ......................................... 274
13-5 RLL Relation between Bit Patterns and Polarity Changes .... 274
13-6 Sample RLL Magnetic Encoding........................................... 275
13-7 Components of Disk Access Time ......................................... 277
13-8 Relation between Read/Write Head and Tracks .................... 279
13-9 Organization of Hard Disk Platter.......................................... 280
13-10 Illustration of a Hard Drive Cylinder ..................................... 281
13-11 Equal Number of Bits per Track versus Equal Sized Bits ..... 282
13-12 Comparison of Sector Organizations ..................................... 282
13-13 Cache Placement between Main Memory and Processor ...... 285
13-14 L1 and L2 Cache Placement................................................... 285
13-15 Split Cache Organization ....................................................... 286
13-16 Organization of Cache into Lines .......................................... 287
13-17 Division of Memory into Blocks............................................ 288
13-18 Organization of Address Identifying Block and Offset ......... 289
13-19 Direct Mapping of Main Memory to Cache........................... 291
13-20 Direct Mapping Partitioning of Memory Address ................. 292
13-21 Fully Associative Partitioning of Memory Address............... 295
13-22 Set Associative Mapping of Main Memory to Cache ............ 297
13-23 Effect of Cache Set Size on Address Partitioning.................. 298
14-1 Sample Protocol Stack using TCP, IP, and Ethernet ............. 307
14-2 Layout of an IEEE 802.3 Ethernet Frame .............................. 308
14-3 Layout of an IP Packet Header............................................... 311
14-4 Layout of a TCP Packet Header............................................. 314
14-5 Position and Purpose of TCP Control Flags .......................... 315
14-6 Layout of a TCP Pseudo Header ............................................ 316
14-7 Simulated Raw Data Capture of an Ethernet Frame .............. 317
15-1 Sample Code Using Conditional Statements ......................... 328
TABLE OF TABLES
1-1 Unit Prefixes............................................................................. 15
2-1 Converting Binary to Decimal and Hexadecimal .................... 35
2-2 Converting BCD to Decimal .................................................... 36
2-3 Derivation of the Four-Bit Gray Code ..................................... 40
3-1 Representation Comparison for 8-bit Binary Numbers ........... 57
3-2 Hexadecimal to Decimal Conversion Table............................. 62
3-3 Multiplying the Binary Value 10012 by Powers of Two.......... 65
8-1 Addition Results Based on Inputs of a Full Adder ................ 144
8-2 Sum and Carryout Truth Tables for a Full Adder .................. 145
9-1 Truth Table for a Two-Input XOR Gate ................................ 172
9-2 Addition and Subtraction Without Carries or Borrows.......... 181
9-3 Reconstructing the Dividend Using XORs ............................ 183
9-4 Second Example of Reconstructing the Dividend.................. 184
9-5 Data Groupings and Parity for the Nibble 10112 ................... 190
9-6 Data Groupings with a Data Bit in Error ............................... 190
9-7 Data Groupings with a Parity Bit in Error ............................. 191
9-8 Identifying Errors in a Nibble with Three Parity Bits............ 191
9-9 Parity Bits Required for a Specific Number of Data Bits ...... 195
9-10 Membership of Data and Parity Bits in Parity Groups .......... 197
11-1 List of States for Push Button Circuit .................................... 230
11-2 Next State Truth Table for Push Button Circuit..................... 231
11-3 Output Truth Table for Push Button Circuit .......................... 231
11-4 Revised List of States for Push Button Circuit ...................... 233
11-5 List of States for Bit Pattern Detection Circuit ...................... 236
12-1 The Allowable Settings of Four Chip Selects ........................ 247
12-2 Sample Memory Sizes versus Required Address Lines......... 251
15-1 Conditional Jumps to be Placed After a Compare ................. 337
15-2 Conditional Jumps to be Placed After an Operation .............. 338
15-3 Numbered Instructions for Imaginary Processor ................... 340
15-4 Assembly Language for Imaginary Processor ....................... 340
15-5 Operand Requirements for Imaginary Processor ................... 341
15-6 A Simple Program Stored at Memory Address 100016 .......... 342
15-7 Signal Values for Sample I/O Device .................................... 351
15-8 Control Signal Levels for I/O and Memory Transactions...... 353
Acknowledgments
I would like to begin by thanking my department chair, Dr. Terry
Countermine, for the support and guidance with which he provided me.
At first I thought that this project would simply be a matter of
converting my existing web notes into a refined manuscript. This was
not the case, and Dr. Countermine's support and understanding were
critical to my success.
I would also like to thank my computer organization students who
tolerated being the test bed of this textbook. Many of them provided
suggestions that strengthened the book, and I am grateful to them all.
Most of all, I would like to thank my wife, Karen, who has always
encouraged and supported me in each of my endeavors. You provide
the foundation of my success.
Lastly, even self-published books cannot be realized without some
support. I would like to thank those who participate as contributors and
moderators on the Lulu.com forums. In addition, I would like to thank
Lulu.com directly for providing me with a quality outlet for my work.
Disclaimer
The information in this book is based on the personal knowledge
collected by David Tarnoff through years of study in the field of
electrical engineering and his work as an embedded system designer.
While he believes this information is correct, he accepts no
responsibility or liability whatsoever with regard to the application of
any of the material presented in this book.
In addition, the design tools presented here are meant to act as a
foundation to future learning. David Tarnoff offers no warranty or
guarantee toward products used or developed with material from this
book. He also denies any liability arising out of the application of any
tool or product discussed in this book. If the reader chooses to use the
material in this book to implement a product, he or she shall indemnify
and hold the author and any party involved in the publication of this
book harmless against all claims, costs, or damages arising out of the
direct or indirect application of the material.
David L. Tarnoff
Johnson City, Tennessee
USA
May 11, 2005
tarnoff@etsu.edu
David L. Tarnoff
July 6, 2007
CHAPTER ONE
Digital Signals and Systems
• System design tools – The same design theories used at the lowest
level of system design are also applied at higher levels. For
example, the same methods a circuit board designer uses to create
the interface between a processor and its memory chips are used to
design the addressing scheme of an IP network.
• Software design tools – The same procedures used to optimize
digital circuits can be used for the logic portions of software.
Complex blocks of if-statements, for example, can be simplified or
made to run faster using these tools.
• Improved troubleshooting skills – A clear understanding of the
inner workings of a computer gives the technician servicing it the
tools to isolate a problem quicker and with greater accuracy.
• Interconnectivity – Hardware is needed to connect the real world to
a computer's inputs and outputs. Writing software to control a
system such as an automotive air bag could produce catastrophic
results without a clear understanding of the architecture and
hardware of a microprocessor.
• Marketability – Embedded system design puts microprocessors into
task-specific applications such as manufacturing, communications,
and automotive control. As processors become cheaper and more
powerful, the same tools used for desktop software design are being
applied to embedded system design. This means that the software
• What physical values do the digital values that are read from the
sensors represent in the real world?
• How can useful information be pulled from the data stream being
received by the processors?
Figure 1-3   Sample of Discrete Measurements Taken Every 0.1 Sec

    Time (seconds)    Measurement
    0.00              0.1987
    0.10              0.2955
    0.20              0.3894
    0.30              0.4794
    0.40              0.5646
such as a sound wave. The computer can only measure the signal at
intervals. Each measurement is called a sample. The rate at which these
samples are taken is called the sampling rate. The X's in Figure 1-4
represent these measurements.
Figure 1-5   Slow Sampling Rate Missed an Anomaly
Second, if the computer does not record with enough accuracy (i.e.,
enough digits after the decimal point) an error may be introduced
between the actual measurement and the recorded value.
Figure 1-6   Poor Resolution Resulting in an Inaccurate Measurement (the accuracy of the computer allows only a limited set of measurement levels for the analog signal)
Figure 1-7   Block Diagram of a System to Capture Analog Data: a sensor produces a weak, noisy analog signal; signal conditioning turns it into a strong, clean analog signal; and an analog-to-digital converter produces digital measurements of the analog signal (e.g., 0.3238, 0.3254, 0.3312, 0.3240, 0.3221)
work together. For the purpose of this discussion, the two values of a
transistor will be referred to as logic 1 and logic 0.
Now let's examine some of the methods used to represent binary
data by first looking at a single binary signal. Assume we are recording
the binary values present on a single wire controlling a light bulb.
Excluding lights controlled by dimmer switches, a light bulb circuit
is a binary system; the light is either on or off, a logic 1 or a logic 0
respectively. Over time, the state of the light bulb changes following
the position of the switch. The top portion of Figure 1-8 represents the
waveform of the binary signal controlling the light bulb based on the
changes in the switch position shown in the lower half of the figure.
Figure 1-9   Representation of Multiple Digital Signals (Switch A, Switch B, Switch C)

Figure 1-10   Alternate Representation of Multiple Digital Signals, using regions marked "data is in transition," "invalid or undefined data," and "no data is available"
Two horizontal lines, one at a logic 1 level and one at a logic 0 level
indicate constant signals from all of the lines represented. A single
horizontal line running approximately between logic 1 and logic 0
means that the signals are not sending any data. This is different from
an "off" or logic 0 in that a logic 0 indicates a number while no data
means that the device transmitting the data is not available. Hash marks
indicate invalid or changing data. This could mean that one or all of the
signals are changing their values, or that due to the nature of the
electronics, the values of the data signals cannot be predicted. In the
latter case, the system may need to wait to allow the signals to stabilize.
1.6.1 Edges
A single binary signal can have one of two possible transitions as
shown in Figure 1-11. The first one, a transition from a logic 0 to a
logic 1, is called a rising edge transition. The second one, a transition
from a logic 1 to a logic 0 is called a falling edge transition.
1.6.2 Pulses
A binary pulse occurs when a signal changes from one value to the
other for a short period, then returns to its original value. Examples of
this type of signal might be the power-on or reset buttons on a
Figure 1-11   Digital Transition Definitions: a.) Rising Edge, b.) Falling Edge

Figure 1-12   Pulse Waveforms: a.) Positive-going, b.) Negative-going
Like music, the duration of the notes or the spaces between the notes
can be longer or shorter. On the page, they do not look meaningful, but
once the reader is given the tools to interpret the signal, the data they
contain becomes clear.
Figure 1-15   Periodic Pulse Train with Different Pulse Widths (waveforms a and b have the same period T)
The pulse width, tw, is measured in seconds. Its value will always be
greater than zero and less than the period. A tw of zero implies the
signal has no pulses, and if tw equaled the period, then the signal
would never go low.
    Frequency = 1 / Period in seconds        (1.1)
Example
If it takes 0.1 seconds for a periodic pulse train to make a complete
cycle or period, what is that waveform's frequency?
Solution
    Frequency = 1 / Period in seconds
    Frequency = 1 / 0.1 seconds
    Frequency = 10 Hz
Example
If a computer’s system clock is 2 Gigahertz (2,000,000,000 Hz),
what is the duration of its system clock’s period?
Solution
Inverting Equation 1.1 gives us the equation used to determine the
period from the frequency.
    Period = 1 / Frequency
    Period = 1 / 2,000,000,000 Hz
    Period = 0.0000000005 seconds = 0.5 nanoseconds
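The period calculation above is simple enough to express in code. The following C fragment is a minimal sketch (the function name is illustrative; the 2 GHz value comes from the example):

    #include <stdio.h>

    /* The period in seconds is the reciprocal of the frequency in Hz. */
    double period_from_frequency(double frequency_hz)
    {
        return 1.0 / frequency_hz;
    }

    int main(void)
    {
        /* 2 GHz system clock from the example above */
        printf("Period = %g seconds\n", period_from_frequency(2000000000.0));
        return 0;   /* prints 5e-10 seconds, i.e., 0.5 nanoseconds */
    }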
Figure 1-16   Periodic Pulse Train with 25% Duty Cycle (the pulse width is one quarter of the period T)
Equation 1.2 represents the formula used to calculate the duty cycle
where both tw and T have units of seconds.

    Duty Cycle = (tw / T) x 100%        (1.2)

Since the range of tw is from 0 to T, the duty cycle has a range
from 0% (a constant logic 0) to 100% (a constant logic 1).
Example
The typical human eye cannot detect a light flashing on and off at
frequencies above 40 Hz. For example, fluorescent lights flicker at
around 60 Hz, a frequency that most people cannot see. (Some
people can detect higher frequencies and are sensitive to what they
correctly perceive as the flashing of fluorescent lights.)
For higher frequencies, a periodic pulse train sent to a light appears
to the human eye to be simply dimmer than the same light driven with a
constant logic 1. This technique can be used to dim light emitting
diodes (LEDs), devices that respond to only logic 1's or logic 0's.
Example
Assume that a 1 kHz (1,000 Hz) periodic pulse train is sent to an
LED. What should the pulse width (tw) be to make the light emitted
from the LED one-third of its full capability?
Solution
Examining equation 1.2 shows that to determine the pulse width, we
must first get the values for the period and the duty cycle.
The duty cycle is equal to the level of intensity that the LED is to be
lit, i.e., one-third or 33%. The period, T, is equal to one over the
frequency.
    Period = 1 / Frequency
    Period = 1 / 1,000 Hz
    Period = 0.001 seconds
To determine the pulse width, solve equation 1.2 for tw, then
substitute the values for the period and the duty cycle.
    Duty Cycle = (tw / T) x 100%

    tw = T x (Duty Cycle) / 100%
    tw = 0.001 seconds x 0.33 = 0.00033 seconds (330 μs)
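The same calculation can be cross-checked with a few lines of C. This is only a sketch; the 1 kHz frequency and 33% duty cycle come from the example, and the variable names are illustrative:

    #include <stdio.h>

    int main(void)
    {
        double frequency  = 1000.0;   /* 1 kHz pulse train driving the LED */
        double duty_cycle = 33.0;     /* percent, i.e., one-third intensity */

        double period = 1.0 / frequency;            /* T = 0.001 seconds */
        double tw = period * duty_cycle / 100.0;    /* Equation 1.2 solved for tw */

        printf("tw = %f seconds\n", tw);            /* prints 0.000330 */
        return 0;
    }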
To use the table, just substitute the prefix for its power of ten. For
example, substitute 10^-6 for the prefix "μ" in the value 15.6 μs. This
would give us 15.6 x 10^-6 seconds, which in turn equals 0.0000156
seconds.
Problems
1. Define the term "sample" as it applies to digital systems.
2. Define the term "sampling rate" as it applies to digital systems.
3. What are the two primary problems that sampling could cause?
4. Name the three parts of the system used to input an analog signal
into a digital system and describe their purpose.
5. Name four benefits of a digital system over an analog system.
6. Name three drawbacks of a digital system over an analog system.
7. True or False: Since non-periodic pulse trains do not have a
predictable format, there are no defining measurements of the
signal.
8. If a computer runs at 12.8 GHz, what is the period of its clock
signal?
9. If the period of a periodic pulse train is 125 nanoseconds, what is
the signal's frequency?
10. If the period of a periodic pulse train is 50 microseconds, what
should the pulse width, tw, be to achieve a duty cycle of 15%?
11. True or False: A signal’s frequency can be calculated from its duty
cycle alone.
CHAPTER TWO
Numbering Systems
would be like saying we could only use the hundreds, tens, and ones
place when counting in decimal.
This has two results. First, it limits the number of values we can
represent. For our example where we are only allowed to count up to
the hundreds place in decimal, we would be limited to the range of
values from 0 to 999.
Second, we need a way to show others that we are limiting the
number of digits. This is usually done by adding leading zeros to the
number to fill up any unused places. For example, a decimal 18 would
be written 018 if we were limited to three decimal digits.
Counting with bits, hereafter referred to as counting in binary, is
subject to these same issues. The only difference is that decimal uses
ten symbols (0, 1, 2, 3, 4, 5, 6, 7, 8, and 9) while binary only uses two
symbols (0 and 1).
To begin with, Figure 2-2 shows that when counting in binary, we
run out of symbols quickly requiring the addition of another "place"
after only the second increment.
If we were counting using four bits, then the sequence would look
like: 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001,
1010, 1011, 1100, 1101, 1110, and 1111. Notice that when restricted to
four bits, we reach our limit at 1111, the sixteenth and largest value.
It should also be noted that we ended up with 2 x 2 x 2 x 2 = 16
different values. With two symbols for each bit, we have 2^n possible
combinations of symbols where n represents the number of bits.
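A short C loop makes the four-bit counting sequence and the 2^n rule concrete. This sketch simply prints all 2^4 = 16 patterns; nothing in it comes from the book's own examples beyond the choice of four bits:

    #include <stdio.h>

    int main(void)
    {
        int n = 4;                /* number of bits */
        int count = 1 << n;       /* 2^n possible combinations of symbols */

        for (int value = 0; value < count; value++) {
            /* print the bits from most significant to least significant */
            for (int bit = n - 1; bit >= 0; bit--)
                putchar(((value >> bit) & 1) ? '1' : '0');
            putchar('\n');
        }
        return 0;
    }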
In decimal, we know what each digit represents: ones, tens,
hundreds, thousands, etc. How do we figure out what the different
digits in binary represent? If we go back to decimal, we see that each
place can contain one of ten digits. After the ones digit counts from 0 to
this book and a subscript "10" is placed at the end of all decimal
numbers. This means a binary 100 should be written as 1002 and a
decimal 100 should be written as 10010.
Bit 12
Nibble 10102
Byte 101001012
Word 10100101111100002
Double Word 101001011111000011001110111011012
    Numbered bit position:              7    6    5    4    3    2    1    0
    Corresponding power of 2:           2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
    Decimal equivalent of power of 2:   128  64   32   16   8    4    2    1
Bit Position 7 6 5 4 3 2 1 0
Binary Value 1 0 1 1 0 1 0 0
    10110100₂ = 2^7 + 2^5 + 2^4 + 2^2
              = 128 + 32 + 16 + 4
              = 180₁₀
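The power-of-two summation used above is easy to automate. The routine below is a sketch, not code from the text; it walks a string of '0' and '1' characters and accumulates the value one bit at a time:

    #include <stdio.h>

    /* Convert a string of '0'/'1' characters to its unsigned decimal value. */
    unsigned long binary_to_decimal(const char *bits)
    {
        unsigned long value = 0;
        for (; *bits != '\0'; bits++)
            value = value * 2 + (*bits - '0');   /* shift in one bit at a time */
        return value;
    }

    int main(void)
    {
        printf("%lu\n", binary_to_decimal("10110100"));   /* prints 180 */
        return 0;
    }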
The largest unsigned eight bit number we can store has a 1 in all
eight positions, i.e., 111111112. This number cannot be incremented
without forcing an overflow to the next highest bit. Therefore, the
largest decimal value that 8 bits can represent in unsigned binary is the
sum of all powers of two from 0 to 7.
    11111111₂ = 2^7 + 2^6 + 2^5 + 2^4 + 2^3 + 2^2 + 2^1 + 2^0
              = 128 + 64 + 32 + 16 + 8 + 4 + 2 + 1
              = 255₁₀

If you add one to this value, the result is 256, which is 2^8, the power
of two for the next bit position. This makes sense because if you add 1
to 111111112, then beginning with the first column, 1 is added to 1
giving us a result of 0 with a 1 carry to the next column. This
propagates to the MSB where a final carry is passed to the ninth bit.
The final value is then 100000000₂ = 256₁₀.
We can look at this another way. Each digit of a binary number can
take on 2 possible values, 0 and 1. Since there are two possible values
for the first digit, two possible values for the second digit, two for the
third, and so on until you reach the n-th bit, then we can find the total
number of possible combinations of 1's and 0's for n-bits by
multiplying 2 n-times, i.e., 2^n.
How does this fit with our upper limit of 2^n – 1? Where does the "-1"
come from? Remember that counting using unsigned binary integers
begins at 0, not 1. Giving 0 one of the bit patterns takes one away from
the maximum value.
Decimal to unsigned binary conversion flowchart: perform a conversion step, then check whether the new decimal value equals zero; if not, repeat; if so, the conversion is complete.
Example
Convert the decimal value 13310 to an 8 bit unsigned binary number.
Solution
Since 133₁₀ is less than 2^8 – 1 = 255, 8 bits will be sufficient for this
conversion. Using Figure 2-4, we see that the largest power of 2 less
than or equal to 133₁₀ is 2^7 = 128. Therefore, we place a 1 in bit
position 7 and subtract 128 from 133.
Bit position 7 6 5 4 3 2 1 0
1
133 – 128 = 5
Our new decimal value is 5. Since this is a non-zero value, our next
step is to find the largest power of 2 less than or equal to 5. That would
be 2^2 = 4. So we place a 1 in the bit position 2 and subtract 4 from 5.
Bit position 7 6 5 4 3 2 1 0
1 1
5–4=1
Our new decimal value is 1, so find the largest power of 2 less than
or equal to 1. That would be 2^0 = 1. So we place a 1 in the bit position 0
and subtract 1 from 1.
Bit position 7 6 5 4 3 2 1 0
1 1 1
1–1=0
Bit position 7 6 5 4 3 2 1 0
1 0 0 0 0 1 0 1
13310 = 100001012
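The subtract-the-largest-power-of-two procedure can be coded directly. The sketch below reproduces the 133 example; the function name and the choice of an 8-bit width are illustrative:

    #include <stdio.h>

    /* Print the n-bit unsigned binary representation of a value using the
       "find the largest power of two" method described in the text. */
    void decimal_to_binary(unsigned int value, int n)
    {
        for (int bit = n - 1; bit >= 0; bit--) {
            unsigned int power = 1u << bit;     /* 2^bit */
            if (value >= power) {               /* place a 1 and subtract */
                putchar('1');
                value -= power;
            } else {
                putchar('0');                   /* otherwise place a 0 */
            }
        }
        putchar('\n');
    }

    int main(void)
    {
        decimal_to_binary(133, 8);              /* prints 10000101 */
        return 0;
    }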
Remember that a single binary bit can be set to only one of two
values: logic 1 or logic 0. Combining many bits together allows for a
range of integers, but these are still discrete values. The real world is
analog, values represented with floating-point measurements capable of
infinite resolution. To use an n-bit binary number to represent analog,
we need to put some restrictions on what is being measured.
First, an n-bit binary number has a limited range. We saw this when
converting unsigned positive integers to binary. In this case, the lower
limit was 0 and the upper limit was 2^n – 1. To use n bits to represent an
analog value, we need to restrict the allowable range of analog
measurements. This doesn't need to be a problem.
For example, does the typical bathroom scale need to measure
values above 400 pounds? If not, then a digital system could use a 10-
bit binary number mapped to a range from zero to 400 pounds. A
binary 00000000002 could represent zero pounds while 11111111112
could represent 400 pounds.
What is needed next is a method to map the values inside the range
zero to 400 pounds to the binary integers in the range 00000000002 to
11111111112. To do this, we need a linear function defining a one-to-
one mapping between each binary integer and the analog value it
represents. To do this, we turn to the basic math expression for a linear
function.
y = mx + b
That means that each time the binary number increments, e.g.,
01101100102 goes to 01101100112, it represents an increment in the
analog value of 0.391 pounds. Since a binary value of 00000000002
represents an analog value of 0 pounds, then 00000000012 represents
0.391 pounds, 00000000102 represents 2 × 0.391 = 0.782 pounds,
00000000112 represents 3 × 0.391 = 1.173 pounds, and so on.
In some cases, the lower limit might be something other than 0. This
is important especially if better accuracy is required. For example, a
kitchen oven may have an upper limit of 600°F. If zero were used as the
lower limit, then the temperature range 600°F – 0°F = 600°F would
need to be mapped to the 2^n possible binary values of an n-bit binary
number. For a 9-bit binary number, this would result in an m of:

    m = (600°F – 0°F) / (2^9 – 1) = 1.1742 degrees/binary increment
If, instead, the lower limit were set to the oven's bottom operating
temperature of 100°F, the same nine bits would cover a smaller range:

    m = (600°F – 100°F) / (2^9 – 1) = 0.9785 degrees/binary increment
The smaller increment means that each binary value will be a more
accurate representation of the analog value.
This non-zero lower limit is realized as a non-zero value for b in the
linear expression y=mx + b. Since y is equal to b when x is equal to
zero, then b must equal the lower limit of the range.
    Acalc = ((Amax – Amin) / (2^n – 1)) × X + Amin        (2.4)

where Acalc is the analog value represented by the binary value, X is
the binary value, Amax and Amin are the upper and lower limits of the
analog range, and n is the number of bits in the binary value.
Example
Assume that the processor monitoring the temperature of an oven
with a temperature range from 100°F to 600°F measures a 9-bit binary
value of 011001010₂. What temperature does this represent?
Solution
Earlier, we calculated that the rate of change, m, for an oven with a
temperature range from 100°F to 600°F is 500°F ÷ 511 binary
increments. Therefore:

    temperature = (500°F / 511 increments) × binary value + 100°F

    011001010₂ = 2^7 + 2^6 + 2^3 + 2^1 = 128 + 64 + 8 + 2 = 202₁₀

    temperature = (500°F / 511) × 202 + 100°F = 297.65°F
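Equation 2.4 and the oven example translate directly into C. This is only a sketch; the function name and parameter names are illustrative:

    #include <stdio.h>

    /* Equation 2.4: Acalc = ((Amax - Amin) / (2^n - 1)) * X + Amin */
    double analog_from_binary(unsigned int x, int n, double a_min, double a_max)
    {
        double increments = (double)((1u << n) - 1);   /* 2^n - 1 */
        return (a_max - a_min) / increments * x + a_min;
    }

    int main(void)
    {
        /* 9-bit reading 011001010 (202) over a 100 to 600 degree range */
        printf("%.2f degrees F\n", analog_from_binary(202, 9, 100.0, 600.0));
        return 0;   /* prints 297.65, matching the worked example */
    }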
The value from the above example is slightly inaccurate. The binary
value 011001010₂ actually represents a range of values 0.9785°F wide
centered around or with a lower limit of 297.65°F. Only a binary value
with an infinite number of bits would be entirely accurate. Since this is
not possible, there will always be a gap or resolution associated with a
digital system due to the quantized nature of binary integer values. That
gap is equivalent to the increment or rate of change of the linear
expression.
    Resolution = Analog range / (2^n – 1)        (2.5)
Example
Assume that the analog range of a system using a 10-bit analog-to-
digital converter goes from a lower limit of 5 ounces to an upper limit
of 11 ounces. What is the resolution of this system?
Solution
To determine the resolution, we begin with the analog range.
    Analog range = 11 ounces – 5 ounces = 6 ounces
Substituting this range into equation 2.5 and using n=10 to represent
the number of bits, we get:
    Resolution = 6 ounces / (2^10 – 1)
               = 6 ounces / 1023 increments
               = 0.005865 oz/inc
Example
How many bits would be needed for the example above to improve
the resolution to better than 0.001 ounces per increment?
Solution
Each time we increase the number of bits in our binary integer by
one, the number of increments in the range is approximately doubled.
For example, going from 10 bits to 11 bits increases the number of
increments in the range from 2^10 – 1 = 1023 to 2^11 – 1 = 2047. The
question is how high do we have to go to get to a specified resolution?
To answer that, let's begin by setting Equation 2.5 to represent the fact
that we want a resolution of better than 0.001 ounces/increment.
    0.001 oz/inc > 6 ounces / (2^n – 1)

    2^n – 1 > 6 ounces / 0.001 oz/inc

    2^n – 1 > 6,000 increments

Since 2^12 – 1 = 4,095 is not greater than 6,000 but 2^13 – 1 = 8,191 is,
n = 13 bits are needed to achieve a resolution better than 0.001 ounces
per increment.
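The search for the required number of bits can also be done with a short loop. The sketch below assumes the 6-ounce range and the 0.001 oz/increment target from the example and simply increases n until Equation 2.5 is satisfied:

    #include <stdio.h>

    int main(void)
    {
        double range  = 6.0;      /* ounces */
        double target = 0.001;    /* desired resolution, oz per increment */
        int n = 1;

        /* grow n until the resolution drops below the target */
        while (range / ((1u << n) - 1) >= target)
            n++;

        printf("%d bits needed\n", n);   /* prints 13 (2^13 - 1 = 8191 increments) */
        return 0;
    }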
To avoid aliasing, the rate at which samples are taken must be more
than twice as fast as the highest frequency you wish to capture. This is
called the Nyquist Theorem. For example, the sampling rate for audio
CDs is 44,100 samples/second. Dividing this number in half gives us
the highest frequency that an audio CD can play back, i.e., 22,050 Hz.
For an analog telephone signal, a single sample is converted to an 8-
bit integer. If these samples are transmitted across a single channel of a
T1 line which has a data rate of 56 Kbps (kilobits per second), then we
can determine the sampling rate.
    Sampling rate = 56,000 bits/second ÷ 8 bits/sample = 7,000 samples/second
This means that the highest analog frequency that can be transmitted
across a telephone line using a single channel of a T1 link is 7,000÷2 =
3,500 Hz. That's why the quality of voices sent over the telephone is
poor when compared to CD quality. Although telephone users can still
recognize the voice of the caller on the opposite end of the line when
the higher frequencies are eliminated, their speech often sounds muted.
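The arithmetic behind these figures fits in a few lines of C; the 56 Kbps channel rate and 8-bit samples come from the text above:

    #include <stdio.h>

    int main(void)
    {
        double bits_per_second = 56000.0;   /* one channel of a T1 line */
        double bits_per_sample = 8.0;

        double sampling_rate = bits_per_second / bits_per_sample;  /* samples/s */
        double max_frequency = sampling_rate / 2.0;                /* Nyquist limit */

        printf("%.0f samples/s, max frequency %.0f Hz\n",
               sampling_rate, max_frequency);   /* 7000 samples/s, 3500 Hz */
        return 0;
    }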
00002 = 010
00012 = 110
00102 = 210
: : :
10002 = 810
10012 = 910
10102 = A
10112 = B
: : :
11112 = F
Table 2-1 presents the mapping between the sixteen patterns of 1's
and 0's in a binary nibble and their corresponding decimal and
hexadecimal (hex) values.
For example, the BCD value 0001 0110 1001 0010 equals 1692₁₀.
As the shaft turns, a sensor can detect which of the shaft's arcs it is
aligned with by reading a digital value and associating it with a specific
arc. By remembering the previous position and timing the changes
between positions, a processor can also compute speed and direction.
Figure 2-10 shows how a shaft's position might be divided into eight
arcs using three bits. This would allow a processor to determine the
shaft's position to within 360°/8 = 45°.
Figure 2-10   A shaft's position divided into eight arcs, each identified with a three-bit pattern (000, 001, 010, 011, 100, 101, 110, 111)
One type of shaft position sensor uses a disk mounted to the shaft
with slots cut into the disk at different radii representing different bits.
Light sources are placed on one side of the disk while sensors on the
other side of the disk detect when a hole is present, i.e., the sensor is
receiving light. Figure 2-11 presents a disk that might be used to
identify the shaft positions of the example from Figure 2-10.
Figure 2-11   A slotted disk mounted to the shaft, with light sources on one side and sensors on the other
In its current position in the figure, the slots in the disk are lined up
between the second and third light sensors, but not the first. This means
that the sensor will read a value of 110 indicating the shaft is in
position number 1102 = 6.
There is a potential problem with this method of encoding. It is
possible to read the sensor at the instant when more than one gap is
opening or closing between its light source and sensor. When this
happens, some of the bit changes may be detected while others are not.
If this happens, an erroneous measurement may occur.
For example, if the shaft shown above turns clockwise toward
position 1012 = 5, but at the instant when the sensor is read, only the
first bit change is detected, then the value read will be 1112 = 7
indicating counter-clockwise rotation.
To solve this problem, alternate counting sequences referred to as
the Gray code are used. These sequences have only one bit change
between values. For example, the values assigned to the arcs of the
above shaft could follow the sequence 000, 001, 011, 010, 110, 111,
101, 100. This sequence is not correct numerically, but as the shaft
turns, only one bit will change as the shaft turns from one position to
the next.
There is an algorithm to convert an n-bit unsigned binary value to its
corresponding n-bit Gray code. Begin by adding a 0 to the most
significant end of the unsigned binary value. There should now be n
boundaries between the n+1 bits. For each boundary, write a 0 if the
adjacent bits are the same and a 1 if the adjacent bits are different. The
resulting value is the corresponding n-bit Gray code value. Figure 2-12
presents an example converting the 6 bit value 1000112 to Gray code.
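The boundary-comparison rule described above is equivalent to XORing the value with a copy of itself shifted right by one bit position (the prepended 0 becomes the shifted-in 0). A minimal C sketch, with an illustrative function name:

    #include <stdio.h>

    /* Convert an unsigned binary value to its Gray code equivalent. */
    unsigned int binary_to_gray(unsigned int value)
    {
        /* each Gray code bit is the XOR of two adjacent binary bits */
        return value ^ (value >> 1);
    }

    int main(void)
    {
        /* 100011 (decimal 35) converts to 110010, as in Figure 2-12 */
        printf("%X\n", binary_to_gray(0x23));   /* prints 32, i.e., 110010 */
        return 0;
    }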
Using this method, the Gray code for any binary value can be
determined. Table 2-3 presents the full Gray code sequence for four
bits. The shaded bits in third column are bits that are different then the
bit immediately to their left. These are the bits that will become ones in
the Gray code sequence while the bits not shaded are the ones that will
be zeros. Notice that exactly one bit changes in the Gray code from one
row to the next and from the bottom row to the top row.
Table 2-3   Derivation of the Four-Bit Gray Code

    Decimal    Binary    Binary w/ starting zero    Gray Code
0 0000 00000 0000
1 0001 00001 0001
2 0010 00010 0011
3 0011 00011 0010
4 0100 00100 0110
5 0101 00101 0111
6 0110 00110 0101
7 0111 00111 0100
8 1000 01000 1100
9 1001 01001 1101
10 1010 01010 1111
11 1011 01011 1110
12 1100 01100 1010
13 1101 01101 1011
14 1110 01110 1001
15 1111 01111 1000
Problems
1. What is the minimum number of bits needed to represent 76810
using unsigned binary representation?
2. What is the largest possible integer that can be represented with a
6-bit unsigned binary number?
3. Convert each of the following values to decimal.
a) 100111012 b) 101012 c) 1110011012 d) 011010012
4. Convert each of the following values to an 8-bit unsigned binary
value.
a) 3510 b) 10010 c) 22210 d) 14510
5. If an 8-bit binary number is used to represent an analog value in
the range from 010 to 10010, what does the binary value 011001002
represent?
6. If an 8-bit binary number is used to represent an analog value in
the range from 32 to 212, what is the accuracy of the system? In
other words, if the binary number is incremented by one, how
much change does it represent in the analog value?
7. Assume a digital to analog conversion system uses a 10-bit integer
to represent an analog temperature over a range of -25°F to 125°F.
If the actual temperature being read was 65.325°F, what would be
the closest possible value that the system could represent?
8. What is the minimum sampling rate needed in order to successfully
capture frequencies up to 155 KHz in an analog signal?
9. Convert the following numbers to hexadecimal.
a) 10101111001011000112
b) 100101010010011010012
c) 011011010010100110012
d) 101011001000102
10. Convert each of the following hexadecimal values to binary.
a) ABCD16 b) 1DEF16 c) 864516 d) 925A16
CHAPTER THREE
Binary Math and Signed Representations
    0     0     1     1
  + 0   + 1   + 0   + 1
    0     1     1    10

Figure 3-1   Four Possible Results of Adding Two Bits

  Previous
  Carry →   1     1     1     1
            0     0     1     1
          + 0   + 1   + 0   + 1
            1    10    10    11

Figure 3-2   Four Possible Results of Adding Two Bits with Carry
The second and third cases are similar to the last case presented in
Figure 3-1 where two 1's are added together to get a result of 0 with a
carry. The last case in Figure 3-2, however, has three 1's added together
which equals 310. Subtracting 2 from this result places a new result of 1
in the current column and sends a carry to the next column. And just as
in decimal addition, the carry in binary is never greater than 1.
Now let's try to add binary numbers with multiple digits. The
example shown below presents the addition of 100101102 and
001010112. The highlighted values are the carries from the previous
column's addition, and just as in decimal addition, they are added to the
next most significant digit/bit.
1 1 1 1 1
1 0 0 1 0 1 1 0
+ 0 0 1 0 1 0 1 1
1 1 0 0 0 0 0 1
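Since unsigned binary addition is exactly what a processor's adder performs, the example can be checked with one line of C; 0x96 is 10010110 and 0x2B is 00101011 written in hexadecimal:

    #include <stdio.h>

    int main(void)
    {
        unsigned int sum = 0x96 + 0x2B;   /* 10010110 + 00101011 */
        printf("%X\n", sum);              /* prints C1, i.e., 11000001 */
        return 0;
    }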
  Minuend →      0     1     1
  Subtrahend →  – 0   – 0   – 1
                 0     1     0
But what happens in the one case when the minuend is less than the
subtrahend? As in decimal, a borrow must be taken from the next most
significant digit. The same is true for binary.
A "borrow" is made from
the next highest bit position
1 0
- 1
1
Pulling 1 from the next highest column in binary allows us to add 102
or a decimal 2 to the current column. For the previous example, 102
added to 0 gives us 102 or a decimal 2. When we subtract 1 from 2, the
result is 1.
Now let's see how this works with a multi-bit example.
    1 0 ¹0 1 1 1 ¹0 1 1
  – 0 0  1 0 1 0  1 0 1
    0 1  1 1 0 0  1 1 0
Starting at the rightmost bit, 1 is subtracted from 1 giving us zero. In
the next column, 0 is subtracted from 1 resulting in 1. We're okay so far
with no borrows required. In the next column, however, 1 is subtracted
from 0. Here we need to borrow from the next highest digit.
The next highest digit is a 1, so we subtract 1 from it and add 10 to
the digit in the 2^2 column. (This appears as a small "1" placed before
the 0 in the minuend's 2^2 position.) This makes our subtraction 10 - 1
which equals 1. Now we go to the 2^3 column. After the borrow, we
have 0 – 0 which equals 0.
We need to make a borrow again in the third column from the left,
the 2^6 position, but the 2^7 position of the minuend is zero and does not
have anything to borrow. Therefore, the next highest digit of the
minuend, the 2^8 position, is borrowed from. The borrow is then
cascaded down until it reaches the 2^6 position so that the subtraction
may be performed.
Previous value 1 0 0 1 0 1 1 1
1's complement 0 1 1 0 1 0 0 0
1 0 0 1 0 1 1 0
+ 0 1 1 0 1 0 0 1
1 1 1 1 1 1 1 1
0 1 1 0 1 0 0 0
+ 1
0 1 1 0 1 0 0 1
1 1 1 1 1 1 1 1
1 0 0 1 0 1 1 1
+ 0 1 1 0 1 0 0 1
0 0 0 0 0 0 0 0
The result is zero! Okay, so most of you caught the fact that I didn't
drop down the last carry which would've made the result 1000000002.
This is not a problem, because in the case of signed arithmetic, the
carry has a purpose other than that of adding an additional digit
representing the next power of two. As long as we make sure that the
two numbers being added have the same number of bits, and that we
keep the result to that same number of bits too, then any carry that goes
beyond that should be discarded.
Actually, discarded is not quite the right term. In some cases we will
use the carry as an indication of a possible mathematical error. It should
not, however, be included in the result of the addition. This is simply
the first of many "anomalies" that must be watched when working with
a limited number of bits.
Two more examples of 2's complements are shown below.
1 1 1 1
0 1 0 1 1 0 0 0
+ 1 1 1 1 0 1 1 0
0 1 0 0 1 1 1 0
Original value = 45 0 0 1 0 1 1 0 1
1's complement of 45 1 1 0 1 0 0 1 0
2's complement of 45 = -45 1 1 0 1 0 0 1 1
1's complement of -45 0 0 1 0 1 1 0 0
2's complement of -45 = 45 0 0 1 0 1 1 0 1
It worked! The second time the 2's complement was taken, the
pattern of ones and zeros returned to their original values. It turns out
that this is true for any binary number of a fixed number of bits.
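The invert-and-add-one recipe, and the fact that applying it twice returns the original pattern, can be demonstrated with a short C sketch restricted to 8 bits (the function name is illustrative):

    #include <stdio.h>
    #include <stdint.h>

    /* Take the 2's complement of an 8-bit pattern: invert, then add one. */
    uint8_t twos_complement(uint8_t value)
    {
        return (uint8_t)(~value + 1);
    }

    int main(void)
    {
        uint8_t pos45 = 0x2D;                     /* 00101101 = +45 */
        uint8_t neg45 = twos_complement(pos45);   /* 11010011 = -45 */
        uint8_t back  = twos_complement(neg45);   /* 00101101 again  */

        printf("%02X %02X %02X\n", pos45, neg45, back);   /* 2D D3 2D */
        return 0;
    }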
used to determine the magnitude since the MSB is used for the sign.
This cuts in half the number of positive integers n bits can represent.
And the special cases? Well, a binary number with all zeros is equal
to a decimal 0. Taking the negative of zero still gives us zero. The other
case is a bit trickier. In the section on minimums and maximums, we
will see that an n-bit value with an MSB equal to one and all other bits
equal to zero is a negative number, specifically, –2^(n-1). The largest
positive number represented in 2's complement has an MSB of 0 with
all the remaining bits set to one. This value equals 2^(n-1) – 1. Therefore,
since 2^(n-1) > 2^(n-1) – 1, we can see that there is no positive equivalent to
the binary number 100…00₂.
+4510 in binary 0 0 1 0 1 1 0 1
–4510 using signed magnitude 1 0 1 0 1 1 0 1
Figure 3-4   Converting a 2's complement value to decimal: if the MSB is 0 (a positive number), convert to decimal using the unsigned integer method; if the MSB is 1 (a negative number), first take the 2's complement, then convert to decimal using the unsigned integer method and insert a negative sign.
Negative value 1 0 1 0 0 1 1 0
1's comp. of negative value 0 1 0 1 1 0 0 1
2's comp. of negative value 0 1 0 1 1 0 1 0
Now that we have the positive counterpart for the 2's complement
value of the negative number 101001102, we convert it to decimal just
as we did with the unsigned binary value.
    01011010₂ = 2^6 + 2^4 + 2^3 + 2^1 = 64 + 16 + 8 + 2 = 90₁₀
Since the original 2's complement value was negative to begin with,
the value 101001102 in 8-bit, 2's complement form is –90.
    0100110₂ = 2^5 + 2^2 + 2^1 = 32 + 4 + 2 = 38₁₀
Since the MSB of the original value equaled 1, the signed magnitude
value was a negative number to begin with, and we need to add a
negative sign. Therefore, 101001102 in 8-bit, signed magnitude
representation equals –3810.
But what if this binary number was actually a 10-bit number and not
an 8 bit number? Well, if it's a 10 bit number (00101001102), the MSB
is 0 and therefore it is a positive number. This makes our conversion
much easier. The method for converting a positive binary value to a
decimal value is the same for all three representations. The conversion
goes something like this:
Next, let's examine the minimum and maximum values for an n-bit
2's complement representation. Unlike the unsigned case, the lowest
decimal value that can be represented with n-bits in 2's complement
representation is not obvious. Remember, 2's complement uses the
MSB as a sign bit. Since the lowest value will be negative, the MSB
should be set to 1 (a negative value). But what is to be done with all of
the remaining bits? A natural inclination is to set all the bits after the
MSB to one. This should be a really big negative number, right? Well,
converting it to decimal results in something like the 8 bit example
below:
This isn't quite what we expected. Using the 2's complement method
to convert 11111111₂ to a decimal number results in –1₁₀. This couldn't
possibly be the lowest value that can be represented with 2's
complement.
It turns out that the lowest possible 2's complement value is an MSB
of 1 followed by all zeros as shown in the 8 bit example below. For the
conversion to work, you must strictly follow the sequence presented in
Figure 3-4 to convert a negative 2's complement value to decimal.
Table 3-1   Representation Comparison for 8-bit Binary Numbers

    Representation      Minimum    Maximum    Number of integers represented
    Unsigned                  0        255           256
    2's Complement         –128        127           256
    Signed Magnitude       –127        127           255
Exponent 3 2 1 0 -1 -2 -3 -4
Position value 1000 100 10 1 0.1 0.01 0.001 0.0001
Sample values 0 0 0 6 5 3 4 2
Therefore, our example has the decimal value 6*1 + 5*0.1 + 3*0.01 +
4*0.001 + 2*0.0001 = 6.5342.
Binary representation of real numbers works the same way except
that each position represents a power of two, not a power of ten. To
convert 10.01101 to decimal for example, use descending negative
powers of two to the right of the decimal point.
Exponent 2 1 0 -1 -2 -3 -4 -5
Position value 4 2 1 0.5 0.25 0.125 0.0625 0.03125
Sample values 0 1 0 0 1 1 0 1
Therefore, our example has the decimal value 0*4 + 1*2 + 0*1
+0*0.5 + 1*0.25 + 1*0.125 + 0*0.0625 + 1*0.03125 = 2.40625. This
means that the method of conversion is the same for real numbers as it
is for integer values; we've simply added positions representing
negative powers of two.
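The positional method for binary values with a fractional part can also be written as a small helper. This sketch is illustrative only; it parses a string containing a binary point and accumulates positive and negative powers of two:

    #include <stdio.h>

    /* Convert a binary string such as "10.01101" to its decimal value. */
    double binary_real_to_decimal(const char *s)
    {
        double value = 0.0;
        double weight = 0.5;      /* weight of the first bit after the point */
        int after_point = 0;

        for (; *s != '\0'; s++) {
            if (*s == '.') {
                after_point = 1;
            } else if (!after_point) {
                value = value * 2 + (*s - '0');   /* integer part */
            } else {
                value += (*s - '0') * weight;     /* fractional part */
                weight /= 2;
            }
        }
        return value;
    }

    int main(void)
    {
        printf("%g\n", binary_real_to_decimal("10.01101"));   /* prints 2.40625 */
        return 0;
    }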
Computers, however, use a form of binary more like scientific
notation to represent floating-point or real numbers. For example, with
scientific notation we can represent the large value 342,370,000 as
3.4237 x 10^8. This representation consists of a decimal component or
mantissa of 3.4237 with an exponent of 8. Both the mantissa and the
exponent are signed values allowing for negative numbers and for
negative exponents respectively.
Binary works the same way using 1's and 0's for the digits of the
mantissa and exponent and using 2 as the multiplier that moves the
decimal point left or right. For example, the binary number
100101101.010110 would be represented as:
    1.00101101010110 * 2^8
The decimal point is moved left for negative exponents of two and right
for positive exponents of two.
The IEEE Standard 754 is used to represent real numbers on the
majority of contemporary computer systems. It utilizes a 32-bit pattern
to represent single-precision numbers and a 64-bit pattern to represent
double-precision numbers. Each of these bit patterns is divided into
three parts, each part representing a different component of the real
number being stored. Figure 3-5 shows this partitioning for both single-
and double-precision numbers.
Figure 3-5   IEEE Standard 754 floating-point formats, each divided into a sign bit, an exponent E, and a fraction F:
  a) Single-Precision (32 bits: 1 sign bit, 8-bit exponent, 23-bit fraction)
  b) Double-Precision (64 bits: 1 sign bit, 11-bit exponent, 52-bit fraction)
Both formats work the same differing only by the number of bits
used to represent each component of the real number. In general, the
components of the single-precision format are substituted into Equation
3.7 where the sign of the value is determined by the sign bit (0 –
positive value, 1 – negative value). Note that E is in unsigned binary
representation.

    ±1.F × 2^(E – 127)        (3.7)
Example
Convert the 32-bit single-precision IEEE Standard 754 number
shown below into its binary equivalent.
11010110101101101011000000000000
Solution
First, break the 32-bit number into its components.
    1   10101101   01101101011000000000000
    Exponent, E = 10101101₂
                = 2^7 + 2^5 + 2^3 + 2^2 + 2^0
                = 128 + 32 + 8 + 4 + 1
                = 173₁₀
Example
Create the 32-bit single-precision IEEE Standard 754 representation
of the binary number 0.000000110110100101
Solution
Begin by putting the binary number above into the binary form of
scientific notation with a single 1 to the left of the decimal point. Note
that this is done by moving the decimal point seven positions to the
right giving us an exponent of –7.
The number is positive, so the sign bit will be 0. The fraction (value
after the decimal point and not including the leading 1) is 10110100101
with 12 zeros added to the end to make it 23 bits. Lastly, the exponent
must satisfy the equation:
E – 127 = –7
E = –7 + 127 = 120
    0   01111000   10110100101000000000000
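On a machine that stores float values in IEEE 754 single-precision format (true of virtually all contemporary systems), the pattern built above can be checked directly. The constant 0x3C5A5000 below is simply the 32-bit pattern 0 01111000 10110100101000000000000 regrouped into hexadecimal; the rest of the sketch extracts the three fields:

    #include <stdio.h>
    #include <string.h>
    #include <stdint.h>

    int main(void)
    {
        uint32_t bits = 0x3C5A5000u;   /* sign 0, exponent 01111000, fraction 1011010010100...0 */

        unsigned int sign     = bits >> 31;            /* 0 -> positive           */
        unsigned int exponent = (bits >> 23) & 0xFF;   /* 120, so E - 127 = -7    */
        unsigned int fraction = bits & 0x7FFFFF;       /* the 23-bit fraction, F  */

        float value;
        memcpy(&value, &bits, sizeof value);           /* reinterpret the pattern */

        printf("sign=%u exponent=%u fraction=0x%06X value=%g\n",
               sign, exponent, fraction, value);
        return 0;
    }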
Therefore, D16 added to 516 equals 216 with a carry to the next column.
Just like decimal and binary, the addition of two hexadecimal digits
never generates a carry greater than 1. The following shows how
adding the largest hexadecimal digit, F16, to itself along with a carry
from the previous column still does not require a carry larger than 1 to
the next highest column.
Example
Add 3DA3216 to 4292F16.
Solution
Just like in binary and decimal, place one of the numbers to be
added on top of the other so that the columns line up.
3 D A 3 2
+ 4 2 9 2 F
Adding 216 to F16 goes beyond the limit of digits hexadecimal can
represent. It is equivalent to 210 + 1510 which equals 1710, a value
greater than 1610. Therefore, we need to subtract 1016 (1610) giving us a
result of 1 with a carry into the next position.
1
3 D A 3 2
+ 4 2 9 2 F
1
1
3 D A 3 2
+ 4 2 9 2 F
6 1
The 16^2 position has A₁₆ + 9₁₆ which in decimal is equivalent to 10₁₀
+ 9₁₀ = 19₁₀. Since this is greater than 16₁₀, we must subtract 16₁₀ to get
the result for the 16^2 column and add a carry in the 16^3 column.
1 1
3 D A 3 2
+ 4 2 9 2 F
3 6 1
For the 16^3 column, we have 1₁₆ + D₁₆ + 2₁₆ which is equivalent to
1₁₀ + 13₁₀ + 2₁₀ = 16₁₀. This gives us a zero for the result in the 16^3
column with a carry.
1 1 1
3 D A 3 2
+ 4 2 9 2 F
0 3 6 1
1 1 1
3 D A 3 2
+ 4 2 9 2 F
8 0 3 6 1
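Because hexadecimal constants are part of the C language, the sum can be confirmed in one line; a quick sketch:

    #include <stdio.h>

    int main(void)
    {
        unsigned int sum = 0x3DA32 + 0x4292F;
        printf("%X\n", sum);     /* prints 80361 */
        return 0;
    }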
BCD Decimal
0011 3
+1000 +8
1011 Invalid
+0110 +6
10001 11
BCD Decimal
1001 9
+1000 +8
10001 Carry
+0110 +6
10111 17
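The add-six correction can be expressed as a small C routine that handles one BCD digit at a time. This is a sketch of the idea, not a full multi-digit BCD adder; the names are illustrative:

    #include <stdio.h>

    /* Add two BCD digits plus a carry in; return the corrected 4-bit sum
       and pass the carry out back through a pointer. */
    unsigned int bcd_add_digit(unsigned int a, unsigned int b,
                               unsigned int carry_in, unsigned int *carry_out)
    {
        unsigned int sum = a + b + carry_in;   /* ordinary binary addition */
        if (sum > 9)                           /* invalid BCD digit or carry */
            sum += 6;                          /* the +0110 correction */
        *carry_out = sum >> 4;                 /* anything above bit 3 is the carry */
        return sum & 0xF;
    }

    int main(void)
    {
        unsigned int carry;
        unsigned int digit = bcd_add_digit(3, 8, 0, &carry);
        printf("%u%u\n", carry, digit);        /* prints 11, i.e., 3 + 8 = 11 */
        return 0;
    }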
Table 3-3   Multiplying the Binary Value 1001₂ by Powers of Two

    Decimal   2^8  2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
9 0 0 0 0 0 1 0 0 1
18 0 0 0 0 1 0 0 1 0
36 0 0 0 1 0 0 1 0 0
72 0 0 1 0 0 1 0 0 0
144 0 1 0 0 1 0 0 0 0
288 1 0 0 1 0 0 0 0 0
Note that multiplying by two has the same effect as shifting all of
the bits one position to the left. Similarly, a division by two is
accomplished by a right shift one position. This is similar to moving a
decimal point right or left when multiplying or dividing a decimal
number by a power of ten.
Since a shift operation is significantly faster than a multiply or
divide operation, compilers will always substitute a shift operation
when a program calls for a multiply or divide by a power of two. For
example, a division by 16₁₀ = 2^4 is equivalent to a right shift by 4 bit
positions.
When shifting right, fill in bits to the left with copies of the MSB:

    Before:   1 0 1 0 0 0 1 0
    After:    1 1 0 1 0 0 0 1

Figure 3-6   Duplicate MSB for Right Shift of 2's Complement Values
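The two C statements discussed in the next paragraph would look something like this (the names iVal and result are taken from that discussion; the initial value is illustrative):

    int iVal = 100;          /* any integer value will do */
    int result;

    result = iVal << 3;      /* shift left 3 bits: multiply by 2^3 = 8         */
    result = iVal >> 4;      /* shift right 4 bits: integer divide by 2^4 = 16 */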
The first line of code shifts iVal left three positions before putting
the new value into result. This is equivalent to multiplying iVal by
2^3 = 8. The second line shifts iVal right 4 positions which has the same
effect as an integer divide by 2^4 = 16.
1
1 1 0 0 1 0 0 0
+ 1 0 1 0 1 1 1 1
0 1 1 1 0 1 1 1
Remember that the result must have the same bit count as the
sources, and in this case, the 8-bit unsigned binary result 01110111₂
equals 119₁₀, not 375₁₀.
When adding unsigned binary values, there is a simple way to
determine if an arithmetic overflow has occurred. In unsigned binary
addition, if a carry is produced from the column representing the MSBs
thereby requiring another bit for the representation, an overflow has
occurred.
In 2's complement addition, there is a different method for
determining when an arithmetic overflow has occurred. To begin with,
remember that an arithmetic overflow occurs when the result falls
outside the minimum and maximum values of the representation. In the
case of 2's complement representation, those limits are defined by
Equations 3.3 and 3.4.
The only way that this can happen is if two numbers with the same
sign are added together. It is impossible for the addition of two numbers
with different signs to result in a value outside of the range of 2's
complement representation.
When two numbers of the same sign are added together, however,
there is a simple way to determine if an error has occurred. If the result
of the addition has the opposite sign of the two numbers being added,
then the result is in error. In other words, if the addition of two positive
numbers resulted in a negative number, or if the addition of two
negative numbers resulted in a positive number, there were not enough
bits in the representation to hold the result. The example below presents
one possible case.
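The operands of that example are not reproduced in this copy, but one
pair consistent with the discussion that follows (an assumed example) is
100₁₀ + 52₁₀:

      0 1 1 0 0 1 0 0     (100₁₀)
    + 0 0 1 1 0 1 0 0     ( 52₁₀)
      1 0 0 1 1 0 0 0

Both operands are positive, yet the 8-bit result 10011000₂ has its MSB
set, i.e., in 2's complement notation it appears to be negative.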
If this had been done assuming unsigned notation, the result of 152₁₀
would have been fine because no carry was generated. From Equation
3.4, however, we see that the largest value that 8-bit 2's complement
representation can hold is 2⁽⁸⁻¹⁾ – 1 = 127₁₀. Since 152₁₀ is greater than
127₁₀, it is outside the range of 8-bit 2's complement representation. In
2's complement representation, the bit pattern 10011000₂ actually
represents –104₁₀.
Problems
1. True or False: 01101011₂ has the same value in both unsigned and
2's complement form.
2. True or False: The single-precision floating-point number
10011011011010011011001011000010 is negative.
3. What is the lowest possible value for an 8-bit signed magnitude
binary number?
4. What is the highest possible value for a 10-bit 2's complement
binary number?
5. Convert each of the following decimal values to 8-bit 2's
complement binary.
a) 54₁₀   b) –49₁₀   c) –128₁₀   d) –66₁₀   e) –98₁₀
6. Convert each of the following 8-bit 2's complement binary
numbers to decimal.
a) 10011101₂   b) 00010101₂   c) 11100110₂   d) 01101001₂
For example, the algorithm for a specific gate may cause a one to be
output if an odd number of ones are present at the gate's input and a
zero to be output if an even number of ones is present.
A number of standard gates exist, each one of which has a specific
symbol that uniquely identifies its function. Figure 4-2 presents the
symbols for the four primary types of gates that are used in digital
circuit design.
A   X
0   1
1   0

Note that with a single input, the NOT gate has only 2 possible states.
4.1.3 OR Gate
An OR gate outputs a logic 1 if any of its inputs are a logic 1. An
OR gate only outputs a logic 0 if all of its inputs are logic 0. The OR
gate in Figure 4-2 has only two inputs, but just like the AND gate, an
OR gate may have as many inputs as the circuit requires. Regardless of
the number of inputs, if any input is a logic 1, the output is a logic 1.
A common example of an OR gate circuit is a security system.
Assume that a room is protected by a system that watches three inputs:
a door open sensor, a glass break sensor, and a motion sensor. If none
of these sensors detects a break-in condition, i.e., they all send a logic 0
to the OR gate, the alarm is off (logic 0). If any of the sensors detects a
break-in, it will send a logic 1 to the OR gate which in turn will output
a logic 1 indicating an alarm condition. It doesn't matter what the other
sensors are reading; if any sensor sends a logic 1 to the gate, the alarm
should be going off. Another way to describe the operation of this
circuit might be to say, "The alarm goes off if the door opens or the
glass breaks or motion is detected." Once again, the use of the word
"or" suggests that this circuit should be implemented with an OR gate.
A=0, B=0, C=0 A=0, B=1, C=0 A=1, B=0, C=0 A=1, B=1, C=0
A=0, B=0, C=1 A=0, B=1, C=1 A=1, B=0, C=1 A=1, B=1, C=1
This means that a truth table representing a circuit with three inputs
would have 8 rows. Figure 4-7 presents a sample truth table for a
digital circuit with three inputs, A, B, and C, and one output, X. Note
that the output X doesn't represent anything in particular. It is just
added to show how the output might appear in a truth table.
A B C X
0 0 0 1
0 0 1 0
0 1 0 1
0 1 1 1
1 0 0 0
1 0 1 1
1 1 0 0
1 1 1 1
For the rest of this book, the inputs to a digital circuit will be labeled
with capital letters, A, B, C, etc., while the output will be labeled X.
For some, the hardest part of creating a truth table is being able to
list all possible patterns of ones and zeros for the inputs. One thing that
can help us is that we know that for n inputs, there must be 2ⁿ different
patterns of inputs. Therefore, if your truth table doesn't have exactly 2ⁿ
rows, then a pattern is either missing or one has been duplicated.
There is also a trick to deriving the combinations. Assume we need
to build a truth table with four inputs, A, B, C, and D. Since 2⁴ = 16, we
know that there will be sixteen possible combinations of ones and
zeros. For half of those combinations, A will equal zero, and for the
other half, A will equal one.
When A equals zero, the remaining three inputs, B, C, and D, will
go through every possible combination of ones and zeros for three
inputs. Three inputs have 2³ = 8 patterns, which is exactly half of 16.
For half of the 8 combinations, B will equal zero, and for the other
half, B will equal one. Repeat this for C and then D.
This gives us a process to create a truth table for four inputs. Begin
with the A column and list eight zeros followed by eight ones. Half of
eight is four, so in the B column write four zeros followed by four ones
in the rows where A equals zero, then write four zeros followed by four
ones in the rows where A equals one. Half of four equals two, so the C
column will have two zeros followed by two ones followed by two
zeros then two ones and so on. The process should end with the last
column having alternating ones and zeros. If done properly, the first
row should have all zeros and the last row should have all ones.
A B C D X
0 0 0 0
0 0 0 1
0 0 1 0
0 0 1 1
0 1 0 0
0 1 0 1
0 1 1 0
0 1 1 1
1 0 0 0
1 0 0 1
1 0 1 0
1 0 1 1
1 1 0 0
1 1 0 1
1 1 1 0
1 1 1 1
Figure 4-8 Listing All Bit Patterns for a Four-Input Truth Table
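As an aside (not part of the original text), the same listing can be
generated by counting from 0 to 2ⁿ – 1 and printing each count in
binary. A short C sketch for the four-input case:

#include <stdio.h>

int main(void)
{
    int row;
    /* For n = 4 inputs there are 2^4 = 16 rows.  Counting from 0 to 15
       and printing each count in binary reproduces Figure 4-8 exactly:
       A changes the slowest, D changes the fastest.                    */
    for (row = 0; row < 16; row++)
        printf("%d %d %d %d\n",
               (row >> 3) & 1,    /* A */
               (row >> 2) & 1,    /* B */
               (row >> 1) & 1,    /* C */
               row & 1);          /* D */
    return 0;
}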
A X
0 1
1 0
A B X
0 0 0
0 1 0
1 0 0
1 1 1
A B X
0 0 0
0 1 1
1 0 1
1 1 1
The XOR gate's output is set to logic 1 if there are an odd number of
ones being input to the circuit. Figure 4-12 below shows that for a two-
input XOR gate, this occurs twice, once for A=0 and B=1 and once for
A=1 and B=0.
A B X
0 0 0
0 1 1
1 0 1
1 1 0
A B C X
0 X X 0
X 0 X 0
X X 0 0
1 1 1 1
Figure 4-13 Three-Input AND Gate Truth Table With Don't Cares
A similar truth table can be made for the OR gate. In this case, if any
input to an OR gate is one, the output is 1. The only time an OR gate
outputs a 0 is when all of the inputs are set to 0.
Figure 4-18 Combinational Logic for a Simple Security System
Figure 4-19 Truth Table for Simple Security System of Figure 4-18
We determined the pattern of ones and zeros for the output column
of the truth table through an understanding of the operation of a
security system. We could have also done this by examining the circuit
itself. Starting at the output side of Figure 4-18 (the right side) the
AND gate will output a one only if both inputs are one, i.e., the system
is armed and the OR gate is outputting a one.
The next step is to see when the OR gate outputs a one. This
happens when any of the inputs, Door, Glass, or Motion, equal one.
From this information, we can determine the truth table. The output of
our circuit is equal to one when Armed=1 AND when either Door OR
Glass OR Motion equal 1. For all other input conditions, a zero should
be in the output column.
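As a software analogy (not from the original text), the same behavior
can be written as a single C expression; the function and variable names
below are chosen only for illustration, and each variable holds a 0 or 1.

/* The alarm sounds only when the system is armed AND at least one
   sensor (door, glass, or motion) reports a break-in.               */
int alarm(int armed, int door, int glass, int motion)
{
    return armed && (door || glass || motion);
}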
There are three combinational logic circuits that are so common that
they are considered gates in themselves. By adding an inverter to the
output of each of the three logic gates, AND, OR, and XOR, three new
combinational logic circuits are created. Figure 4-20 shows the new
logic symbols.
The NAND gate outputs a 1 if any input is a zero. Later in this book,
it will be shown how this gate is in fact a very important gate in the
design of digital circuitry. It has two important characteristics: (1) the
transistor circuit that realizes the NAND gate is typically one of the
fastest circuits and (2) every digital circuit can be realized with
combinational logic made entirely of NAND gates.
The NOR gate outputs a 1 only if all of the inputs are zero. The
Exclusive-NOR gate outputs a 1 as an indication that an even number
of ones is being input to the gate.
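To see why the NAND gate is "universal," it can help to model the gates
in software. The following C sketch (not from the original text) builds
NOT, AND, and OR using nothing but a two-input NAND:

#include <stdio.h>

/* A NAND gate modeled in C: output is 0 only when both inputs are 1. */
static int nand(int a, int b)  { return !(a && b); }

/* NOT, AND, and OR realized entirely from NAND gates.                */
static int not_g(int a)        { return nand(a, a); }
static int and_g(int a, int b) { return nand(nand(a, b), nand(a, b)); }
static int or_g(int a, int b)  { return nand(nand(a, a), nand(b, b)); }

int main(void)
{
    int a, b;
    for (a = 0; a <= 1; a++)
        for (b = 0; b <= 1; b++)
            printf("a=%d b=%d  NOT a=%d  a AND b=%d  a OR b=%d\n",
                   a, b, not_g(a), and_g(a, b), or_g(a, b));
    return 0;
}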
A similar method is used to represent inverted inputs. Instead of
inserting the NOT gate symbol in a line going to the input of a gate, a
circle can be placed at the gate's input to show that the signal is
inverted before entering the gate. An example of this is shown in the
circuit on the right in Figure 4-21.
This process can be rather tedious, especially if there are more than
three inputs to the combinational logic. Note that the bit pattern in
Figure 4-23 represents only one row of a truth table with eight rows.
Add another input and the truth table doubles in size to sixteen rows.
There is another way to determine the truth table. Notice that in
Figure 4-23, we took the inputs through a sequence of steps passing it
first through the inverter connected to the input B, then through the
AND gate, then through the OR gate, and lastly through the inverter
connected to the output of the OR gate. These steps are labeled (a), (b),
(c), and (d) in Figure 4-24.
A B C
0 0 0
0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1
Figure 4-25 All Combinations of Ones and Zeros for Three Inputs
Next, add a column for each layer of logic. Going back to Figure 4-
24, we begin by making a column representing the (a) step. Since (a)
represents the output of an inverter that has B as its input, fill the (a)
column with the opposite or inverse of each condition in the B column.
A B C (a) = NOT of B
0 0 0 1
0 0 1 1
0 1 0 0
0 1 1 0
1 0 0 1
1 0 1 1
1 1 0 0
1 1 1 0
Next, step (b) is the output of an AND gate that takes as its inputs
step (a) and input A. Add another column for step (b) and fill it with the
AND of columns A and (a).
Step (c) is the output from the OR gate that takes as its inputs step
(b) and the input C. Add another column for (c) and fill it with the OR
of column C and column (b). This is shown in Figure 4-28.
Last of all, Figure 4-29 shows the final output is the inverse of the
output of the OR gate of step (c). Make a final column and fill it with
the inverse of column (c). This will be the final output column for the
truth table.
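The same column-by-column evaluation can be checked in software. The
C sketch below (not part of the original text) follows the circuit of
Figure 4-24: an inverter on input B, an AND gate, an OR gate with C, and
an inverter on the output.

#include <stdio.h>

int main(void)
{
    int A, B, C;
    printf("A B C  (a)  (b)  (c)  X\n");
    for (A = 0; A <= 1; A++)
        for (B = 0; B <= 1; B++)
            for (C = 0; C <= 1; C++)
            {
                int a = !B;        /* step (a): inverter on input B    */
                int b = A && a;    /* step (b): AND gate               */
                int c = b || C;    /* step (c): OR gate                */
                int X = !c;        /* step (d): inverter on the output */
                printf("%d %d %d   %d    %d    %d   %d\n",
                       A, B, C, a, b, c, X);
            }
    return 0;
}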
Problems
1. Identify a real-world example for an AND gate and one for an OR
gate other than those presented in this chapter.
2. How many rows does the truth table for a 4-input logic gate have?
3. Construct the truth table for a four-input OR gate.
4. Construct the truth table for a two-input NAND gate.
5. Construct the truth table for a three-input Exclusive-NOR gate.
6. Construct the truth table for a three-input OR gate using don't cares
for the inputs similar to the truth table constructed for the three-
input AND gate shown in Figure 4-13.
7. Draw the output X for the pattern of inputs shown in the figure
below for a three-input NAND gate.
A·B·C
10. Develop the truth table for each of the combinational logic circuits
shown below.
a.) A
B X
C
b.) A
B X
C
c.) A
B
C X
CHAPTER FIVE
Boolean Algebra
A B C X
0 0 0 1
0 0 1 0
0 1 0 1
0 1 1 0
1 0 0 1
1 0 1 0
1 1 0 0
1 1 1 0
• Both schematics and truth tables take too much space to describe
the operation of complex circuits with numerous inputs.
• The truth table "hides" circuit information.
• The schematic diagram is difficult to use when trying to determine
output values for each input combination.
representing the AND, OR, and NOT gates. These boolean expressions
can be used to describe or evaluate the output of a circuit.
There is an additional benefit. Just like algebra, a set of rules exist
that when applied to boolean expressions can dramatically simplify
them. A simpler expression that produces the same output can be
realized with fewer logic gates. A lower gate count results in cheaper
circuitry, smaller circuit boards, and lower power consumption.
If your software uses binary logic, the logic can be represented with
boolean expressions. Applying the rules of simplification can make the
software run faster or allow it to use less memory.
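As a hypothetical software example (not from the original text), the
condition in the first if statement below simplifies to a single test by
the rules presented later in this chapter; the flag names are made up
purely for illustration.

#include <stdio.h>

int main(void)
{
    int isReady = 1, isBusy = 0;   /* hypothetical flags */

    /* Original test: (A AND NOT B) OR (A AND B)                     */
    if ((isReady && !isBusy) || (isReady && isBusy))
        printf("original condition is true\n");

    /* Factoring out isReady leaves (NOT B OR B), which is always 1,
       so the whole condition simplifies to just isReady.            */
    if (isReady)
        printf("simplified condition is true\n");

    return 0;
}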
The next section describes the representation of the three primary
logic functions, NOT, AND, and OR, and how to convert
combinational logic to a boolean expression.
The first three operations match the OR function, and if the last
operation is viewed as having a non-zero result instead of the decimal
result of two, it too can be viewed as operating similar to the OR
function. Therefore, the boolean OR function is analogous to the
mathematical function of addition.
In schematic form, a two-input OR gate with inputs A and B is labeled
with the boolean expression X = A + B.
            _       _
X = A ⊕ B = A·B + A·B
The next section shows how the boolean operators ·, +, ⊕, and the
NOT bar may be combined to represent complex combinational logic.
The AND gate outputs A·B; A·B and C then go through the OR gate,
which outputs (A · B) + C.
    ___
X = A·B        (a NAND gate: the inversion bar covers the entire product)

    _ _
X = A·B        (each input is inverted before entering the AND gate)

___   _ _
A·B ≠ A·B
X = A · D + (A + B + C)
The expression is evaluated in order of precedence: the inversions of
individual signals first, then the expression inside the parentheses,
then the AND, and finally the OR that produces X.
A B A+B A B B+A
0 0 0+0 = 0 0 0 0+0 = 0
0 1 0+1 = 1 0 1 1+0 = 1
1 0 1+0 = 1 1 0 0+1 = 1
1 1 1+1 = 1 1 1 1+1 = 1
Not only does Figure 5-10 show how the commutative law applies
to the OR function, it also shows how truth tables can be used in
boolean algebra to prove laws and rules. If a rule states that two
boolean expressions are equal, then by developing the truth table for
each expression and showing that the output is equal for all
combinations of ones and zeros at the input, then the rule is proven
true.
Below, the three fundamental laws of boolean algebra are given
along with examples.
The next section uses truth tables and laws to prove twelve rules of
boolean algebra.
A    A inverted once    A inverted twice
0          1                   0
1          0                   1
Since the first column and the third column have the same pattern of
ones and zeros, they must be equivalent. Figure 5-11 shows this rule in
schematic form.
5.5.2 OR Rules
If an input to a logic gate is a constant 0 or 1 or if the same signal is
connected to more than one input of a gate, a simplification of the
expression is almost always possible. This is true for the OR gate as is
shown with the following four rules for simplifying the OR function.
First, what happens when one of the inputs to an OR gate is a
constant logic 0? It turns out that the logic 0 input drops out leaving
the remaining inputs to stand on their own. Notice that the two columns
in the truth table below are equivalent thus proving this rule.
                          A     A + 0
Rule:  A + 0 = A          0     0 + 0 = 0
                          1     1 + 0 = 1

                          A     A · 0
Rule:  A · 0 = 0          0     0 · 0 = 0
                          1     1 · 0 = 0

                          A     A · A
Rule:  A · A = A          0     0 · 0 = 0
                          1     1 · 1 = 1

                          A     A ⊕ 0
Rule:  A ⊕ 0 = A          0     0 ⊕ 0 = 0
                          1     1 ⊕ 0 = 1

If one input of a two-input XOR gate is connected to a logic 1, then
the XOR gate acts as an inverter as shown in the table below.

               _
                          A     A ⊕ 1
Rule:  A ⊕ 1 = A          0     0 ⊕ 1 = 1
                          1     1 ⊕ 1 = 0

                          A     A ⊕ A
Rule:  A ⊕ A = 0          0     0 ⊕ 0 = 0
                          1     1 ⊕ 1 = 0
Example
Prove that A + A·B = A
Solution
A + A·B = A·1 + A·B Rule: A · 1 = A
= A·(1 + B) Distributive Law
= A·(B + 1) Commutative Law
= A·1 Rule: A + 1 = 1
=A Rule: A · 1 = A
Remember also that rules of boolean algebra can be proven using a
truth table. The example below uses a truth table to derive another rule.
Example
          _
Prove A + A·B = A + B

Solution
The truth table below goes step-by-step through both sides of the
expression to prove this rule.

          _    _         _
A    B    A    A·B   A + A·B   A + B
0    0    1     0       0        0
0    1    1     1       1        1
1    0    0     0       1        1
1    1    0     0       1        1
Example
Prove (A + B)·(A + C) = A + B·C
Solution
(A + B)·(A + C) = (A + B)·A + (A + B)·C Distributive Law
= A·A + B·A + A·C + B·C Distributive Law
= A + B·A + A·C + B·C Rule: A·A = A
= A + A·B + A·C + B·C Commutative Law
= A + A·C + B·C Rule: A + A·B = A
= A + B·C Rule: A + A·B = A
Now that you have a taste for the manipulation of boolean
expressions, the next section will show examples of how complex
expressions can be simplified.
5.6 Simplification
Many students of algebra are frustrated by problems requiring
simplification. Sometimes it feels as if extrasensory perception is
required to see where the best path to simplification lies. Unfortunately,
boolean algebra is no exception. There is no substitute for practice.
Therefore, this section provides a number of examples of simplification
in the hope that seeing them presented in detail will give you the tools
you need to simplify the problems on your own.
The rules of the previous section are summarized in Figure 5-12.
1. A=A 9. A⋅ A = 0
2. A+0 = A 10. A⊕0 = A
3. A +1 = 1 11. A ⊕1 = A
4. A+ A = A 12. A⊕ A = 0
5. A+ A =1 13. A⊕ A =1
6. A⋅0 = 0 14. A + A⋅ B = A
7. A ⋅1 = A 15. A+ A⋅B = A+ B
8. A⋅ A = A 16. ( A + B) ⋅ ( A + C ) = A + B ⋅ C
Solution
From the rules of boolean algebra, we know that (A + B)·(A + C) =
A + B·C. Substitute A·B for A, C for B, and D for C and we get:

(A·B + C)·(A·B + D) = A·B + C·D
Example
Simplify ( A + B) ⋅ ( B + B)
Solution
Example
         _  _
Simplify B·(A + A·B)
Solution
_ _   _
B·A + B·A·B     Distributive Law
_ _     _
B·A + A·B·B     Commutative Law
_ _
B·A + A·0       Anything AND'ed with its inverse is 0
_ _
B·A + 0         Anything AND'ed with 0 is 0
_ _
B·A             Anything OR'ed with 0 is itself
_ _
A·B             Commutative Law
Example
                  _   _
Simplify (A + B)·(A + B)
Solution
_ _ _ _
A·A + A·B + B·A + B·B Use FOIL to distribute terms
_ _
0 + A·B + B·A + 0 Anything AND'ed with its inverse is 0
_ _
A·B + B·A Anything OR'ed with 0 is itself
Example
         _       _ _     _   _   _ _ _
Simplify A·B·C + A·B·C + A·B·C + A·B·C
Solution
_ _ _ _ _
A·(B·C + B·C + B·C + B·C) Distributive Law
_ _ _ _
A·(B·(C + C) + B·(C + C)) Distributive Law
_ _
A·(B·1 + B·1) Anything OR'ed with its inverse is 1
_ _
A·(B + B) Anything AND'ed with 1 is itself
_
A·1 Anything OR'ed with its inverse is 1
_
A Anything AND'ed with 1 is itself
AND OR
A B X = A·B A B X = A+B
0 0 0 0 0 0
0 1 0 0 1 1
1 0 0 1 0 1
1 1 1 1 1 1
Okay, so maybe they're not exactly the same, but notice that the
output for each gate is the same for three rows and different for the
fourth. For the AND gate, the row that is different occurs when all of
the inputs are ones, and for the OR gate, the different row occurs when
all of the inputs are zeros. What would happen if we inverted the inputs
of the AND truth table?
The two truth tables are still not quite the same, but they are quite
close. The two truth tables are now inverses of one another. Let's take
the inverse of the output of the OR gate and see what happens.
_   _   ___        _____   _ _
A + B = A·B        A + B = A·B
b.) Push the inverter through the AND gate, distributing it to the inputs
Problems
1. List three drawbacks of using truth tables or schematics for
describing a digital circuit.
2. List three benefits of a digital circuit that uses fewer gates.
3. True or False: Boolean expressions can be used to optimize the
logic functions in software.
4. Convert the following boolean expressions to their schematic
equivalents. Do not modify the original expression
___
a.) A·B + C
_ _ ___
b.) A·B·C + A·B + A·C
_________ _
c.) (A + B·C) + A·D
_ _
d.) A·B + A·B
a.) A
X
B
b.) A
X
B
C
c.) A
X
B
6.1 Sum-of-Products
A sum-of-products (SOP) expression is a boolean expression in a
specific format. The term sum-of-products comes from the expression's
form: a sum (OR) of one or more products (AND). As a digital circuit,
an SOP expression takes the output of one or more AND gates and
OR's them together to create the final output.
The inputs to the AND gates are either inverted or non-inverted
input signals. This limits the number of gates that any input signal
passes through before reaching the output to an inverter, an AND gate,
and an OR gate. Since each gate causes a delay in the transition from
input to output, and since the SOP format forces all signals to go
through exactly two gates (not counting the inverters), an SOP
expression gives us predictable performance regardless of which input
in a combinational logic circuit changes.
Below is an example of an SOP expression:
_ _ _ _ _
ABCD + ABD + CD + AD
There are no parentheses in an SOP expression since they would
necessitate additional levels of logic. This also means that an SOP
expression cannot have more than one variable combined in a term with
an inversion bar. The following is not an SOP expression:
__ _ _ _ _
(AB)CD + ABD + CD + AD
This is because the first term has A and B passing through a NAND
gate before being AND'ed with C and D thereby creating a third level
of logic. To fix this problem, we need to break up the NAND using
DeMorgan's Theorem.
__ _ _ _ _
(AB)CD + ABD + CD + AD
_ _ _ _ _ _
(A + B)CD + ABD + CD + AD
_ _ _ _ _ _ _
ACD + BCD + ABD + CD + AD
Placing a one in each row identified above should result in the truth
table for the corresponding SOP expression. Remember to set the
remaining row outputs to zero to complete the table.
    _ _    _
X = ABC + AB + ABC

A B C X
0 0 0 0
0 0 1 0
0 1 0 1
0 1 1 0
1 0 0 1
1 0 1 1
1 1 0 0
1 1 1 1
Figure 6-4 Conversion of an SOP Expression to a Truth Table
Example
Derive the SOP expression for the following truth table.
A B C X
0 0 0 0
0 0 1 1
0 1 0 0
0 1 1 1
1 0 0 1
1 0 1 0
1 1 0 1
1 1 1 0
Solution
First, identify each row that contains a one as the output.
A B C X
0 0 0 0
0 0 1 1 A = 0, B = 0, and C = 1
0 1 0 0
0 1 1 1 A = 0, B = 1, and C = 1
1 0 0 1 A = 1, B = 0, and C = 0
1 0 1 0
1 1 0 1 A = 1, B = 1, and C = 0
1 1 1 0
Now we need to make a product for each of these rows. The product
that outputs a one for the row where A=0, B=0, and C=1 must invert A
and B in order to have a product of 1·1·1 = 1. Therefore, our product is:
_ _
A·B·C
The product that outputs a one for the row where A=0, B=1, and
C=1 must invert A in order to have a product of 1·1·1 = 1. This gives us
our second product:
_
A·B·C
The third product outputs a one for the row where A=1, B=0, and
C=0. Therefore, we must invert B and C in order to have a product of
1·1·1 = 1.
  _ _
A·B·C
The final product outputs a one for the row where A=1, B=1, and
C=0. This time only C must be inverted.
    _
A·B·C

OR'ing the four products together gives the final SOP expression:
    _ _     _         _ _       _
X = A·B·C + A·B·C + A·B·C + A·B·C
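As a quick check (not part of the original text), a short C program can
evaluate this SOP expression for all eight input combinations and
reproduce the truth table above; an apostrophe in the comment marks an
inverted input.

#include <stdio.h>

int main(void)
{
    int A, B, C;
    /* Evaluate X = A'B'C + A'BC + AB'C' + ABC' for every input row. */
    for (A = 0; A <= 1; A++)
        for (B = 0; B <= 1; B++)
            for (C = 0; C <= 1; C++)
            {
                int X = (!A && !B &&  C) ||
                        (!A &&  B &&  C) ||
                        ( A && !B && !C) ||
                        ( A &&  B && !C);
                printf("%d %d %d  %d\n", A, B, C, X);
            }
    return 0;
}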
The next three sections parallel the first three for a second standard
boolean expression format: the product-of-sums.
6.4 Product-of-Sums
The product-of-sums (POS) format of a boolean expression is much
like the SOP format with its two levels of logic (not counting
inverters). The difference is that the outputs of multiple OR gates are
combined with a single AND gate which outputs the final result. The
expression below adheres to the format of a POS expression.
_ _ _ _ _
(A+B+C+D)(A+B+D)(C+D)(A+D)
Example
Convert the following POS expression to a truth table.
_ _ _
(A+B+C)(A+B)(A+B+C)
Solution
The first step is to determine where each sum equals zero.
Beginning with the first term, the three different conditions for a zero
output are listed below.
_ _ _ _
A+B+C = 0 when A=0, B=0, and C=0,
which means when A=1, B=0, and C=1.
_ _
A+B = 0 when A=0, B=0, and C=1 or 0,
which means when A=0, B=1, and C=0 or 1, i.e.,
two rows will have a zero output due to this term.
Placing a zero in each row identified above should result in the truth
table for the corresponding POS expression. Remember to set the
remaining row outputs to zero to complete the table.
_ _ _ A B C X
X = (A+B+C)(A+B)(A+B+C) 0 0 0 0
0 0 1 1
0 1 0 0
0 1 1 0
1 0 0 1
1 0 1 0
1 1 0 1
1 1 1 1
Next, make a sum for each of these rows. Remember that a sum
outputs a zero when all of its inputs equal zero. Therefore, to make a
sum for a row equal to zero, the inputs equal to one must be inverted.
For the first row, a zero is output when A=0, B=0, and C=0. Since for
this case, all of the inputs are already zero, simply OR the non-inverted
inputs together.
A+B+C
The sum that outputs a zero for the row where A=0, B=1, and C=0
must invert B in order to have a sum of 0+0+0=0. This gives us our
second sum: _
A+B+C
The third sum outputs a zero for the row where A=1, B=0, and C=1.
Therefore, we must invert A and C in order to have a sum of 0+0+0=0.
_ _
A+B+C
The final sum outputs a zero for the row where A=1, B=1, and C=1.
In this case, all of the inputs must be inverted to get the sum 0+0+0=0.
_ _ _
A+B+C
_   _   ___
A + B = A·B

_____   _ _
A + B = A·B
Figure 6-10 depicts DeMorgan's Theorem with circuit diagrams.
_____   _ _
A + B = A·B

___   _   _
A·B = A + B
Since an SOP expression can be created for any truth table, then any
truth table can be implemented entirely with NAND gates. This allows
a designer to create an entire digital system from a single type of gate
resulting in a more efficient use of the hardware.
There is an additional benefit to this observation. For most of the
technologies used to implement digital logic, the NAND gate is the
fastest available gate. Therefore, if a circuit can maintain the same
structure while using the fastest possible gate, the overall circuit will be
faster.
Another way to make digital circuits faster is to reduce the number
of gates in the circuit or to reduce the number of inputs to each of the
circuit's gates. This will not only benefit hardware, but also any
software that uses logic processes in its operation. Chapter 7 presents a
graphical method for generating optimally reduced SOP expressions
without using the laws and rules of boolean algebra.
2. If a POS expression uses five input variables and has a sum within
it that uses only three variables, how many rows in the POS
expression's truth table have zeros as a result of that sum?
3. Draw the digital circuit corresponding to the following SOP
expressions.
_ _ _ _ _
a.) A·B·D·E + A·B·C·D
_ _ _ _ _ _
b.) A·B·C + A·B·C + A·B·C + A·B·C
_ _ _
c.) A·C + B·C·E + D·E
4. Draw the NAND-NAND digital logic circuit for each of the SOP
expressions shown in problem 3.
5. List the two reasons why the NAND-NAND implementation of an
SOP expression is preferred over an AND-OR implementation.
6. Put the following boolean expression into the proper SOP format.
___ _ _ ___
A·B·C + A·B·C + A·B·C
7. Which form of boolean expression, SOP or POS, would best be
used to implement the following truth table?
A B C X
0 0 0 1
0 0 1 1
0 1 0 0
0 1 1 1
1 0 0 1
1 0 1 0
1 1 0 1
1 1 1 0
8. Derive the SOP and POS expressions for each of the truth tables
shown below.
a.) b.) c.)
A B C X A B C X A B C X
0 0 0 0 0 0 0 0 0 0 0 1
0 0 1 1 0 0 1 1 0 0 1 0
0 1 0 0 0 1 0 0 0 1 0 1
0 1 1 0 0 1 1 1 0 1 1 1
1 0 0 0 1 0 0 1 1 0 0 1
1 0 1 0 1 0 1 0 1 0 1 0
1 1 0 1 1 1 0 0 1 1 0 1
1 1 1 1 1 1 1 1 1 1 1 0
9. Derive the truth table for each of the following SOP expressions.
_ _ _ _ _
a.) A·B·C + A·B·C + A·B·C
_ _ _ _
b.) A + A·B·C + A·B·C
_ _ _ _ _
c.) A·C + A·B·C + A·B·C
10. Derive the truth table for each of the following POS expressions.
_ _ _ _ _
a.) (A+B+C)·(A+B+C)·(A+B+C)
_ _
b.) (A + B)·(A + C)
_ _ _ _ _
c.) (A+C)·(A+B+C)·(A+B+C)
CHAPTER SEVEN
Karnaugh Maps
A B C X
0 0 0 1 A = 0, B = 0, and C = 0
0 0 1 1 A = 0, B = 0, and C = 1
0 1 0 0
0 1 1 0
1 0 0 1 A = 1, B = 0, and C = 0
1 0 1 0
1 1 0 1 A = 1, B = 1, and C = 0
1 1 1 0
The application of the rule stating that OR'ing anything with its
inverse results in a one (the third line in the above simplification) is the
most common way to simplify an SOP expression. This chapter
presents a graphical method to quickly pair up products where this rule
can be applied in order to drop out as many terms as possible.
Karnaugh Maps are graphical representations of truth tables. They
consist of a grid with one cell for each row of the truth table. The grid
shown below in Figure 7-1 is the two-by-two Karnaugh map used to
represent a truth table with two input variables.
B
A 0 1
0
1
A B X B
0 0 S0 A 0 1
0 1 S1 0 S0 S1
1 0 S2 1 S2 S3
1 1 S3
C
AB 0 1
00
01
11
10
CD
AB 00 01 11 10
00
01
11
10
Example
Convert the three-input truth table below to its corresponding
Karnaugh map.
A B C X
0 0 0 0
0 0 1 1
0 1 0 0
0 1 1 1
1 0 0 1
1 0 1 0
1 1 0 1
1 1 1 1
Solution
The three-input Karnaugh map uses the two-by-four grid shown in
Figure 7-3. It doesn't matter which row is used to begin the transfer.
Typically, we begin with the top row.
A B C X
0 0 0 0
0 0 1 1
0 1 0 0
0 1 1 1
1 0 0 1
1 0 1 0
1 1 0 1
1 1 1 1

       C
AB     0    1
00     0
01
11
10
C
AB 0 1
00 0 1
01 0 1
11 1 1
10 1 0
If this was all that Karnaugh maps could be used for, they wouldn't
be of any use to us. Notice, however, that adjacent cells, either
horizontally or vertically adjacent, differ by only one inversion. For
example, the top right cell and the cell below it are identical except that
B is inverted in the top cell and not inverted in the bottom cell. This
implies that we can combine these two products into a single, simpler
product depending only on A and C.
 _ _     _        _    _        _
(A·B·C)+(A·B·C) = A·C·(B + B) = A·C
The third row in the Karnaugh map in Figure 7-5 has another
adjacent pair of products.
     _                 _
(A·B·C)+(A·B·C) = A·B·(C + C) = A·B
Figure 7-6 Karnaugh Map with Four Adjacent Cells Containing '1'
By applying the rules of boolean algebra, we can see how the
products represented by these four cells reduce to a single product with
only one variable.
    _   _   _           _
X = A·B·C + A·B·C + A·B·C + A·B·C
    _    _             _
X = A·B·(C + C) + A·B·(C + C)
    _
X = A·B + A·B
       _
X = B·(A + A)
X = B
• All cells in a rectangle must contain ones. No zeros are allowed.
CD CD
AB 00 01 11 10 AB 00 01 11 10
00 1 0 0 0 00 0 0 0 0
01 1 0 1 1 01 0 1 1 0
11 1 0 1 1 11 0 1 0 0
10 1 0 0 0 10 0 0 0 0
Right Wrong
• The number of cells in the grouping must equal a power of two, i.e.,
only groups of 1, 2, 4, 8, or 16 are allowed.
CD CD
AB 00 01 11 10 AB 00 01 11 10
00 1 0 1 1 00 1 1 0 0
01 0 0 1 1 01 1 1 0 0
11 0 1 1 1 11 1 1 0 0
10 0 0 1 1 10 0 0 0 0
Right Wrong
C
AB 0 1
00 0 0
01 1 1
11 1 1
10 0 0
The inputs that are the same for all cells in the rectangle are the ones
that will be used to represent the product. For this example, both A and
C are 0 for some cells and 1 for others. That means that these inputs
will drop out leaving only B which remains 1 for all four of the cells.
Therefore, the product for this rectangle will equal 1 when B equals 1
giving us the same expression we got from simplifying the Figure 7-6
equation:
X=B
Example
Determine the minimal SOP expression for the truth table below.
A B C D X
0 0 0 0 1
0 0 0 1 0
0 0 1 0 1
0 0 1 1 0
0 1 0 0 0
0 1 0 1 0
0 1 1 0 0
0 1 1 1 0
1 0 0 0 1
1 0 0 1 1
1 0 1 0 1
1 0 1 1 1
1 1 0 0 0
1 1 0 1 0
1 1 1 0 1
1 1 1 1 1
Solution
First, we need to convert the truth table to a Karnaugh map.
CD
AB 00 01 11 10
00 1 0 0 1
01 0 0 0 0
11 0 0 1 1
10 1 1 1 1
Now that we have the Karnaugh map, it is time to create the
rectangles. Two of them are easy to see: all four cells of the bottom row
make one rectangle and the four cells that make up the lower right
corner quadrant of the map make up a second rectangle.
Less obvious is the third rectangle that takes care of the two cells in
the top row that contain ones. Remember that a rectangle can wrap
from the left side of the map to the right side of the map. That means
that these two cells are adjacent. What's less obvious is that this
rectangle can wrap around from top to bottom too making it a four cell-
rectangle.
CD Rectangle 3
AB 00 01 11 10
00 1 0 0 1
Rectangle 1 01 0 0 0 0 Rectangle 2
11 0 0 1 1
10 1 1 1 1
It's okay to have the bottom right cell covered by three rectangles.
The only requirement is that no rectangle can be fully covered by other
rectangles and that no cell with a 1 be left uncovered.
Now let's figure out the products of each rectangle. Below is a list of
the input values for rectangle 1, the one that makes up the bottom row
of the map.
Rectangle 1: A B C D
1 0 0 0
1 0 0 1
1 0 1 1
1 0 1 0
A and B are the only inputs to remain constant for all four cells: A is
always 1 and B is always 0. This means that for this rectangle, the
product must output a one when A equals one and B equals zero, i.e.,
the inverse of B equals one. This gives us our first product.
_
Product for rectangle 1 = A·B
The product for rectangle 2 is found the same way. Below is a list of
the values for A, B, C, and D for each cell of rectangle 2.
Rectangle 2: A B C D
1 1 1 1
1 1 1 0
1 0 1 1
1 0 1 0
In this rectangle, A and C are the only ones that remain constant
across all of the cells. To make the corresponding product equal to one,
they must both equal one.

Product for rectangle 2 = A·C
Rectangle 3: A B C D
0 0 0 0
0 0 1 0
1 0 0 0
1 0 1 0
In rectangle 3, B and D are the only ones that remain constant across
all of the cells. To make the corresponding product equal to one, they
must both equal 0, i.e., their inverses must equal 1.
                          _ _
Product for rectangle 3 = B·D

Combining the products for the three rectangles gives the minimal SOP
expression:
      _         _ _
X = A·B + A·C + B·D
A♣   A♥   A♠   J♦   2♦
Three aces are pretty good, but since you can change the jack of
diamonds to anything you want, you could make the hand better.
Changing it to a two would give you a full house: three of a kind and a
pair. Changing it to an ace, however, would give you an even better
hand, one beatable by only a straight flush or five of a kind. (Note that
five of a kind is only possible with wild cards.)
If a truth table contains a "don't care" element as its output for one of
the rows, that "don't care" is transferred to the corresponding cell of the
Karnaugh map. The question is, do we include the "don't care" in a
rectangle or not? Well, just like the poker hand, you do what best suits
the situation.
For example, the four-input Karnaugh map shown in Figure 7-8
contains two "don't care" elements: one represented by the X in the far
right cell of the second row and one in the lower left cell.
CD
AB 00 01 11 10
00 1 0 0 0
01 1 0 0 X
11 1 0 1 1
10 X 0 0 0
The final circuit will have a one or a zero in that position depending
on whether or not it was included in a rectangle. Later in this book, we
will examine some cases where we will want to see if those values that
were assigned to "don't cares" by being included or not included in a
rectangle could cause a problem.
Problems
1. How many cells does a 3-input Karnaugh map have?
2. What is the largest number of input variables a Karnaugh map can
handle and still remain two-dimensional?
3. In a 4-variable Karnaugh map, how many input variables (A, B, C,
or D) does a product have if its rectangle of 1's contains 4 cells?
Your answer should be 0, 1, 2, 3, or 4.
4. Identify the problems with each of the three rectangles in the
Karnaugh map below.
CD
AB
Rectangle 1 1 1 0 0
1 1 1 1
Rectangle 2
1 1 0 1
Rectangle 3 0 0 0 0
C CD
AB 0 1 AB 00 01 11 10
00 0 1 00 0 0 0 0
01 1 1 01 1 1 0 X
11 1 1 11 1 X 1 1
10 0 1 10 X 0 0 0
8. Create a Karnaugh map that shows there can be more than one
arrangement for the rectangles of ones in a Karnaugh map.
CHAPTER EIGHT
Combinational Logic Applications
Thus far, our discussion has focused on the theoretical design issues
of computer systems. We have not yet addressed any of the actual
hardware you might find inside a computer. This chapter changes that.
The following sections present different applications used either as
stand-alone circuits or integrated into the circuitry of a processor. Each
section will begin with a definition of a problem to be addressed. From
this, a truth table will be developed which will then be converted into
the corresponding boolean expression and finally a logic diagram.
8.1 Adders
Most mathematical operations can be handled with addition. For
example, subtraction can be performed by taking the two's complement
of a binary value, and then adding it to the binary value from which it
was to be subtracted. Two numbers can be multiplied using multiple
additions. Counting either up or down (incrementing or decrementing)
can be performed with additions of 1 or -1.
Chapter 3 showed that binary addition is performed just like decimal
addition, the only difference being that decimal has 10 numerals while
binary has 2. When adding two digits in binary, a result greater than
one cannot be represented with a single digit, so a one is carried to the
next position: the current column shows a sum of 0 with a carry of 1.
1
0 0 1 1
+ 0 + 1 + 0 + 1
0 1 1 10
A Sum
Inputs Outputs
B Carryout
With two inputs, there are four possible patterns of ones and zeros.
0 A Sum 0 1 A Sum 1
0 B Carryout 0 0 B Carryout 0
0 + 0 = 0 w/no carry 1 + 0 = 1 w/no carry
0 A Sum 1 1 A Sum 0
1 B Carryout 0 1 B Carryout 1
0 + 1 = 1 w/no carry 1 + 1 = 0 w/a carry
Figure 8-3 Four Possible States of a Half Adder
A truth table can be derived from Figure 8-3 from which the boolean
expressions can be developed to realize this system.
A B Sum Carryout
0 0 0 0
0 1 1 0
1 0 1 0
1 1 0 1
Note that the output Sum is equivalent to the output of a 2-input XOR
gate, i.e., Sum = A ⊕ B.
For Carryout, the output equals 1 only when both A and B are equal
to one. This matches the operation of the AND gate.
Carryout = A·B
Figure 8-4 presents the logic circuit for the half adder.
Sum
A
Carryout
B
The half-adder works fine if we're trying to add two bits together, a
situation that typically occurs only in the rightmost column of a multi-
bit addition. The remaining columns have the potential of adding a
third bit, the carry from a previous column.
For example, assume we want to add two four-bit numbers, A = 0110₂
and B = 1011₂. The addition would go something like that shown below.

      1 1 1
      0 1 1 0
    + 1 0 1 1
    1 0 0 0 1
Adding the least significant bits of a multi-bit value uses the half-
adder described above. Each input to the half-adder takes one of the
least significant bits from each number. The outputs are the least
significant digit of the sum and a possible carry to the next column.
What is needed for the remaining columns is an adder similar to the
half-adder that can add two bits along with a carry from the previous
column to produce a Sum and the Carryout to the next column. Figure
8-5 represents this operation where An is the bit in the nth position of A,
Bn is the bit in the nth position of B, and Sn is the bit in the nth position
in the resulting sum, S.
Notice that a Carryout from the addition of a pair of bits goes into the
carry input of the adder for the next bit. We will call the input Carryin.
This implies that we need to create a circuit that can add three bits, An,
Bn, and Carryin from the n-1 position. This adder has two outputs, the
sum and the Carryout to the n+1 position. The resulting circuit is called
a full adder. A block diagram of the full adder is shown in Figure 8-6.
A3 A2 A1 A0
B3 B2 B1
B0
Carryin Carryout
S4 S3 S2 S1 S0
A Sum
Inputs Outputs
B Carryout
Carryin
With three inputs there are 2³ = 8 possible patterns of ones and zeros
that could be input to our full adder. Table 8-1 lists these combinations
along with the results of their addition which range from 0 to 3₁₀.

       Inputs              Result
A   B   Carryin       Decimal   Binary
0   0      0            0₁₀       00₂
0   0      1            1₁₀       01₂
0   1      0            1₁₀       01₂
0   1      1            2₁₀       10₂
1   0      0            1₁₀       01₂
1   0      1            2₁₀       10₂
1   1      0            2₁₀       10₂
1   1      1            3₁₀       11₂
The two-digit binary result in the last column of this table can be
broken into its components, the sum and a carry to the next bit position.
This gives us two truth tables, one for the Sum and one for the Carryout.
Table 8-2 Sum and Carryout Truth Tables for a Full Adder
With three inputs, a Karnaugh map can be used to create the logic
expressions. One Karnaugh map will be needed for each output of the
circuit. Figure 8-7 presents the Karnaugh maps for the Sum and the
Carryout outputs of our full adder where Cin represents the Carryin input.
Sum Carryout
Cin Cin
AB 0 1 AB 0 1
00 0 1 00 0 0
01 1 0 01 0 1
11 0 1 11 1 1
10 1 0 10 0 1
Figure 8-7 Sum and Carryout Karnaugh Maps for a Full Adder
Rectangle 1: A B Cin
0 1 1 B·Cin
1 1 1
Rectangle 2: A B Cin
1 1 0 A·B
1 1 1
Rectangle 3: A B Cin
             1 1 1      A·Cin
             1 0 1

OR'ing the three products together gives Carryout = A·B + A·Cin + B·Cin.
The Sum map contains no adjacent ones, so its SOP expression does not
simplify; it is equivalent to Sum = A ⊕ B ⊕ Cin.
Now we have the building blocks to create an adder of any size. For
example, a 16-bit adder is made by using a half adder for the least
significant bit followed by fifteen full adders daisy-chained through
their carries for the remaining fifteen bits.
This method of creating adders has a slight drawback, however. Just
as with the addition of binary numbers on paper, the sum of the higher-
order bits cannot be determined until the carry from the lower-order
bits has been calculated and propagated through the higher stages.
Modern adders use additional logic to predict whether the higher-order
bits should expect a carry or not well before the sum of the lower-order
bits is calculated. These adders are called carry look ahead adders.
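Although the text is describing hardware, the ripple-carry idea can be
sketched in C (this listing is not from the original text); each stage
adds two bits together with the carry from the stage before it.

#include <stdio.h>

/* One full adder stage: adds bits a and b plus carry_in, returns the
   sum bit, and writes the carry to *carry_out.                       */
static int full_adder(int a, int b, int carry_in, int *carry_out)
{
    int sum = a ^ b ^ carry_in;
    *carry_out = (a & b) | (a & carry_in) | (b & carry_in);
    return sum;
}

int main(void)
{
    /* Ripple-carry addition of two 4-bit values, least significant bit
       first (index 0 is the LSB).                                     */
    int a[4] = {0, 1, 1, 0};   /* 0110 = 6  */
    int b[4] = {1, 1, 0, 1};   /* 1011 = 11 */
    int s[5];
    int carry = 0;             /* the first stage acts as a half adder */
    int n;

    for (n = 0; n < 4; n++)
        s[n] = full_adder(a[n], b[n], carry, &carry);
    s[4] = carry;              /* final carry becomes the fifth sum bit */

    printf("sum = %d%d%d%d%d\n", s[4], s[3], s[2], s[1], s[0]);  /* 10001 */
    return 0;
}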
To make a digit appear, the user must know which segments to turn
on and which to leave off. For example, to display a '1', we need to turn
on segments b and c and leave the other segments off. This means that
the binary circuits driving segments b and c would output 1 while the
binary circuits driving segments a, d, e, f, and g would output 0. If the
binary inputs to the display are set to a=1, b=1, c=0, d=1, e=1, f=0, and
g=1, a '2' would be displayed.
    a
  f   b
    g
  e   c
    d

Figure 8-11 A Seven-Segment Display Displaying a Decimal '2'
One binary nibble (inputs A, B, C, and D) feeds seven digital logic
circuits, one for each segment, producing one output for each of the
seven segments a through g.
To begin with, we need seven truth tables, one for the output of each
circuit. The individual bits of the number to be displayed will be used
for the inputs. Next, we need to know which segments are to be on and
which are to be off for each digit. Figure 8-13 shows the bit patterns for
each hexadecimal digit.
Using the information from Figure 8-13, we can build the seven
truth tables. The truth table in Figure 8-14 combines all seven truth
tables along with a column indicating which digit is displayed for the
corresponding set of inputs. Note that the capital letters denote the
input signals while the lower case letters identify the segments of the
seven-segment display.
Digit Segments Digit Segments
0 a, b, c, d, e, f 1 b, c
2 a, b, d, e, g 3 a, b, c, d, g
4 b, c, f, g 5 a, c, d, f, g
6 a, c, d, e, f, g 7 a, b, c
8 a, b, c, d, e, f, g 9 a, b, c, d, f, g
A a, b, c, e, f, g B c, d, e, f, g
C a, d, e, f D b, c, d, e, g
E a, d, e, f, g F a, e, f, g
CD
AB 00 01 11 10
00 1 0 0 1
01 0 0 0 1
11 1 1 1 1
10 1 0 1 1
Rectangle 1:  A B C D
              0 0 0 0
              1 0 0 0
              0 0 1 0
              1 0 1 0
         _ _
Product: B·D
Rectangle 2: A B C D
1 1 0 0
1 1 0 1 Product: A·B
1 1 1 1
1 1 1 0
Rectangle 3: A B C D
1 1 1 1
1 1 1 0 Product: A·C
1 0 1 1
1 0 1 0
Rectangle 4:  A B C D
              0 0 1 0
              0 1 1 0
              1 1 1 0
              1 0 1 0
           _
Product: C·D

Combining the four products gives the SOP expression for segment e:
    _ _                 _
e = B·D + A·B + A·C + C·D
Figure 8-17 presents the digital logic that would control segment e
of the seven-segment display. The design of the display driver is not
complete, however, as there are six more logic circuits to design.
8.4 Decoders
One application where digital signals are used to enable a device is
to identify the unique conditions to enable an operation. For example,
the magnetron in a microwave is enabled only when the timer is
running and the start button is pushed and the oven door is closed.
This method of enabling a device based on the condition of a
number of inputs is common in digital circuits. One common
application is in the processor’s interface to memory. It is used to
determine which memory device will contain a piece of data.
In the microwave example, the sentence used to describe the
enabling of the magnetron joined each of the inputs with the word
"and". Therefore, the enabling circuit for the magnetron should be
realized with an AND gate as shown in Figure 8-19.
Timer
Start button Enable magnetron
Door closed
There are many other types of digital systems that enable a process
based on a specific combination of ones and zeros from multiple inputs.
For example, an automobile with a manual transmission enables the
starter when the clutch is pressed and the ignition key is turned. A
vending machine delivers a soda when enough money is inserted and a
button is pushed and the machine is not out of the selected soda.
Correct money
Soda is selected Deliver a soda
Soda empty
An AND gate outputs a one only when all of its inputs equal one. If
one or more inputs are inverted, the output of the AND gate is one if
and only if all of the inputs without inverters equal one and all of the
inputs with inverters equal zero.
The truth table for this type of circuit will have exactly one row with
an output of one while all of the other rows output a zero. The row with
the one can change depending on which inputs are inverted. For
example, Figure 8-21 presents the truth table for the circuit that enables
a device when A and B are true but C is false.
When SOP expressions were introduced in Chapter 6, we found that
each row of a truth table with a '1' output corresponded to a unique
product. Therefore, the circuit that is used to enable a device can be
realized with a single AND gate. The conditions that activate that AND
gate are governed by the pattern of inverters at its inputs. When we
apply the tools of Chapter 6 to the truth table in Figure 8-21, we get the
                            _
boolean expression EN = A·B·C.
A B C EN
0 0 0 0
0 0 1 0
0 1 0 0
0 1 1 0
1 0 0 0
1 0 1 0
1 1 0 1
1 1 1 0
Figure 8-21 Truth Table to Enable a Device for A=1, B=1, & C=0
This two-input circuit is called a 1-of-4 decoder due to the fact that
exactly one of its four outputs will be enabled at any one time. A
change at any of the inputs will change which output is enabled, but
never change the fact that only one is enabled. As for the logic circuit,
it has four AND gates, one satisfying each of the above boolean
expressions. Figure 8-22 presents this digital circuit.
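A 1-of-4 decoder can also be modeled in software. In the C sketch below
(not from the original text), A is treated as the most significant
select input, which is an assumption since the figure is not reproduced
here.

#include <stdio.h>

/* Active-high 1-of-4 decoder: for each pattern of the two inputs,
   exactly one of the four outputs is a 1.                            */
int main(void)
{
    int A, B;
    for (A = 0; A <= 1; A++)
        for (B = 0; B <= 1; B++)
        {
            int en0 = !A && !B;
            int en1 = !A &&  B;
            int en2 =  A && !B;
            int en3 =  A &&  B;
            printf("A=%d B=%d  EN0=%d EN1=%d EN2=%d EN3=%d\n",
                   A, B, en0, en1, en2, en3);
        }
    return 0;
}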
8.5 Multiplexers
A multiplexer, sometimes referred to as a MUX, is a device that
uses a set of control inputs to select which of several data inputs is to be
connected to a single data output. With n binary "select lines," one of 2ⁿ
data inputs can be connected to the output. Figure 8-25 presents a block
diagram of a multiplexer with three select lines, S2, S1, and S0, and
eight data lines, D0 through D7.
S2 S1 S0
D0
D1
D2
D3 Y Output
D4
D5
D6
D7
S2 S1 S0 Y
0 0 0 D0
0 0 1 D1
0 1 0 D2
0 1 1 D3
1 0 0 D4
1 0 1 D5
1 1 0 D6
1 1 1 D7
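In software, the same behavior is simply an array index built from the
select bits. The C sketch below (not from the original text) models the
8-to-1 multiplexer; the data values are arbitrary.

#include <stdio.h>

/* The three select bits form an index that picks one of the eight
   data inputs (S2 is the most significant select bit).               */
static int mux8(const int D[8], int S2, int S1, int S0)
{
    int index = (S2 << 2) | (S1 << 1) | S0;
    return D[index];
}

int main(void)
{
    int D[8] = {0, 1, 0, 1, 1, 0, 1, 0};   /* arbitrary data inputs */
    printf("Y = %d\n", mux8(D, 1, 0, 1));  /* selects D5, prints 0  */
    return 0;
}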
Example
For the multiplexer shown below, sketch the output waveform Y for
the inputs S1 and S0 shown in the graph next to it. Assume S1 is the
most significant bit.
Starts as S1
0 D0
logic '0'
1 D1
0 D2 S0
1 D3 Y
Starts as
S1 logic '1'
S0
Y
Solution
The decimal equivalent to the binary value input by the selector
inputs indicates the subscript of the channel being connected to the
output. For example, when S1 equals one and S0 equals zero, then their
decimal equivalent is 10₂ = 2₁₀. Therefore, D2 is connected to the
output. Since D2 equals zero, then Y is outputting a zero.
The graph below shows the values of Y for each of the states of S1
and S0. The labels inserted above the waveform for Y indicate which
channel is connected to Y at that time.
Starts as S1
logic '0'
S0
Starts as D1 D1 D3 D2 D0 D0
logic '1'
Y
8.6 Demultiplexers
The previous section described how multiplexers select one channel
from a group of input channels to be sent to a single output.
Demultiplexers take a single input and select one channel out of a
group of output channels to which it will route the input. It's like having
multiple printers connected to a computer. A document can only be
printed to one of the printers, so the computer selects one out of the
group of printers to which it will send its output.
The design of a demultiplexer is much like the design of a decoder.
The decoder selected one of many outputs to which it would send a
zero. The difference is that the demultiplexer sends data to that output
rather than a zero.
The circuit of a demultiplexer is based on the non-active-low
decoder where each output is connected to an AND gate. An input is
added to each of the AND gates that will contain the demultiplexer's
data input. If the data input equals one, then the output of the AND gate
that is selected by the selector inputs will be a one. If the data input
equals zero, then the output of the selected AND gate will be zero.
Meanwhile, all of the other AND gates output a zero, i.e., no data is
passed to them. Figure 8-27 presents a demultiplexer circuit with two
selector inputs.
D0
S1
D1
S0
D2
D3
Data
In effect, the select lines, S0, S1, … Sn, "turn on" a specific AND
gate that passes the data through to the selected output. In Figure
8-27, if S1=0 and S0=1, then the D1 output will match the input from the
Data line and outputs D0, D2, and D3 will be forced to have an output of
zero. If S1=0, S0=1, and Data=0, then D1=0. If S1=0, S0=1, and Data=1,
then D1=1. Figure 8-28 presents the truth table for the 1-line-to-4-line
demultiplexer shown in Figure 8-27.
S1 S0 Data D0 D1 D2 D3
0 0 0 0 0 0 0
0 0 1 1 0 0 0
0 1 0 0 0 0 0
0 1 1 0 1 0 0
1 0 0 0 0 0 0
1 0 1 0 0 1 0
1 1 0 0 0 0 0
1 1 1 0 0 0 1
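A C sketch of the 1-line-to-4-line demultiplexer (not from the original
text) makes the routing explicit; S1 is taken as the most significant
select bit, matching the truth table above.

#include <stdio.h>

int main(void)
{
    int S1 = 0, S0 = 1, Data = 1;          /* example inputs */
    int D[4] = {0, 0, 0, 0};

    /* Only the selected output can follow Data; the rest stay at 0. */
    D[(S1 << 1) | S0] = Data;

    printf("D0=%d D1=%d D2=%d D3=%d\n", D[0], D[1], D[2], D[3]);
    return 0;
}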
Connecting the metal pins of these chips with other metal pins from
the same chip or additional chips is what allows us to create digital
circuits.
As for what we are connecting to them, the metal pins of the ICs
allow us access to the internal circuitry such as the inputs and outputs
of logic gates. Detailed information is available for all ICs from the
manufacturer allowing designers to understand the internal circuitry.
The documentation defining the purpose of each pin of the IC is usually
referred to as the IC's "pin-out description." It provides information not
only on the digital circuitry, but also any power requirements needed to
operate the IC.
Figure 8-30 presents an example of the pin-out of a quad dual-input
NAND gate chip, commonly referred to as a 7400.
Vcc
14 13 12 11 10 9 8
1 2 3 4 5 6 7
Gnd
Note that the pins are numbered. In order to properly use one of
these ICs, you must be able to identify the pin numbers. To help you do
this, the manufacturers identify the first pin, referred to as "pin 1", on
every IC. Figure 8-31 presents some of the ways this pin is
identified.
The pins are then numbered counter-clockwise around the chip. You
can see this in the numbering of the pins in Figure 8-30.
Many circuits are then built and tested using prototype boards or
protoboards. A protoboard is a long, thin plastic board with small holes
in it that allow ICs and short wire leads to be plugged in. A generic
protoboard is shown in Figure 8-32.
The next step is to add input and output that will allow us to
communicate with our circuit. The simplest output from a digital circuit
is an LED. Figure 8-35 presents the schematic symbol of an LED.
+5 V
+5 V
Pull-up Resistor
Input to
an IC
IC Output
Any local electronics store should carry the protoboards, ICs, input
switches, and output LEDs to create your prototype circuits. By using
some simple circuits for switches and LEDs and the design principles
outlined in this book, you can begin creating digital circuits of your
own.
Problems
1. Design the digital logic for segments c, f, and g of the seven-
segment display driver truth table in Figure 8-14.
2. Draw the decoding logic circuit with an active-low output for the
inputs A = 1, B = 1, C = 0, and D = 1.
3. For the active-low output decoder shown
to the right, fill in the values for the D0
outputs D0 through D3. Assume S1 is the 0 S1 D1
1 S0 D2
most significant bit.
D3
0 S1
1 S0
5. What is the purpose of the resistor in the digital circuit for the LED
shown in Figure 8-36?
6. What is the purpose of the resistor in the digital circuit for the
switch shown in Figure 8-37?
CHAPTER NINE
Binary Operation Applications
if(((iVal/2)*2) == iVal)
{
    // This code is executed for even values
}
else
{
    // This code is executed for odd values
}
35₁₀ = 00100011₂          124₁₀ = 01111100₂
93₁₀ = 01011101₂           30₁₀ = 00011110₂
Value 1 0 1 1 0 1 0 1 1
Value 2 1 1 0 1 1 0 1 0
Resulting AND 0 1 0 0 1 0 1 0
Remember that the output of an AND is one if and only if all of the
inputs are one. In Figure 9-2, we see that ones only appear in the result
in columns where both of the original values equal one. In a C program,
the bitwise AND is identified with the operator '&'. The example in
Figure 9-2 can then be represented in C with the following code.
int iVal1 = 0b01101011;
int iVal2 = 0b11011010;
int result = iVal1 & iVal2;
35₁₀ (odd)             0 0 1 0 0 0 1 1
Odd/Even Mask          0 0 0 0 0 0 0 1
Bitwise AND Result     0 0 0 0 0 0 0 1

93₁₀ (odd)             0 1 0 1 1 1 0 1
Odd/Even Mask          0 0 0 0 0 0 0 1
Bitwise AND Result     0 0 0 0 0 0 0 1

30₁₀ (even)            0 0 0 1 1 1 1 0
Odd/Even Mask          0 0 0 0 0 0 0 1
Bitwise AND Result     0 0 0 0 0 0 0 0
if(!(iVal & 0b00000001))
{
    // This code is executed for even values
}
else
{
    // This code is executed for odd values
}
The bitwise AND can also be used to clear specific bits. For
example, assume we want to separate the nibbles of a byte into two
different variables. The following process can be used to do this:
• Copy the original value to the variable meant to store the lower
nibble, then clear all but the lower four bits
• Copy the original value to the variable meant to store the upper
nibble, then shift the value four bits to the right. (See Section 3.7,
"Multiplication and Division by Powers of Two," to see how to
shift right using C.) Lastly, clear all but the lower four bits.
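A minimal C sketch of this two-step process follows (not from the
original text; the starting value 0x96 is just an example):

#include <stdio.h>

int main(void)
{
    unsigned char value = 0x96;                 /* 1001 0110, for example */

    unsigned char lower = value & 0x0F;         /* clear all but bits 0-3 */
    unsigned char upper = (value >> 4) & 0x0F;  /* shift right 4 bits, then
                                                   clear all but bits 0-3 */

    printf("lower nibble = %X\n", lower);       /* prints 6 */
    printf("upper nibble = %X\n", upper);       /* prints 9 */
    return 0;
}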
Example
Using bitwise operations, write a function in C that determines if an
IPv4 address is a member of the subnet 192.168.12.0 with a subnet
mask 255.255.252.0. Return a true if the IP address is a member and
false otherwise.
Solution
An IPv4 address consists of four bytes or octets separated from one
another with periods or "dots". When converted to binary, an IPv4
address becomes a 32 bit number.
The address is divided into two parts: a subnet id and a host id. All
of the computers that are connected to the same subnet, e.g., a company
or a school network, have the same subnet id. Each computer on a
subnet, however, has a unique host id. The host id allows the computer
to be uniquely identified among all of the computers on the subnet.
The subnet mask identifies the bits that represent the subnet id.
When we convert the subnet mask in this example, 255.255.252.0, to
binary, we get 11111111.11111111.11111100.00000000.
The bits that identify the subnet id of an IP address correspond to the
positions with ones in the subnet mask. The positions with zeros in the
subnet mask identify the host id. In this example, the first 22 bits of any
IPv4 address that is a member of this subnet should be the same,
170 Computer Organization and Design Fundamentals
specifically they should equal the address 192.168.12.0 or in binary
11000000.10101000.00001100.00000000.
So how can we determine if an IPv4 address is a member of this
subnet? If we could clear the bits of the host id, then the remaining bits
should equal 192.168.12.0. This sounds like the bitwise AND. If we
perform a bitwise AND on an IPv4 address of this subnet using the
subnet mask 255.255.252.0, then the result must be 192.168.12.0
because the host id will be cleared. Let's do this by hand for one
address inside the subnet, 192.168.15.23, and one address outside the
subnet, 192.168.31.23. First, convert these two addresses to binary.
192.168.15.23 = 11000000.10101000.00001111.00010111
192.168.31.23 = 11000000.10101000.00011111.00010111
IP Address 11000000.10101000.00001111.00010111
Subnet mask 11111111.11111111.11111100.00000000
Bitwise AND 11000000.10101000.00001100.00000000
IP Address 11000000.10101000.00011111.00010111
Subnet mask 11111111.11111111.11111100.00000000
Bitwise AND 11000000.10101000.00011100.00000000
Notice that the result of the first bitwise AND produces the correct
subnet address while the second bitwise AND does not. Therefore, the
first address is a member of the subnet while the second is not.
The code to do this is shown below. It assumes that the type int is
defined to be at least four bytes long. The left shift operator '<<' used in
the initialization of sbnt_ID and sbnt_mask pushes each octet of
the IP address or subnet mask to the correct position.
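The listing itself is not reproduced here. A sketch consistent with the description above, using the names sbnt_ID and sbnt_mask mentioned in the text (the function name is made up), might look like this:

/* Returns 1 (true) if the IPv4 address given by its four octets is a
   member of the subnet 192.168.12.0 with subnet mask 255.255.252.0,
   and 0 (false) otherwise. */
int subnet_member(unsigned long octet1, unsigned long octet2,
                  unsigned long octet3, unsigned long octet4)
{
    /* The left shifts push each octet to its position in the 32-bit value */
    unsigned long ip_address = (octet1 << 24) | (octet2 << 16) |
                               (octet3 << 8)  |  octet4;
    unsigned long sbnt_ID    = (192ul << 24) | (168ul << 16) | (12ul << 8);
    unsigned long sbnt_mask  = (255ul << 24) | (255ul << 16) | (252ul << 8);

    /* Clearing the host id with the mask should leave only the subnet id */
    return (ip_address & sbnt_mask) == sbnt_ID;
}

Calling subnet_member(192, 168, 15, 23) returns 1 while subnet_member(192, 168, 31, 23) returns 0, matching the two hand calculations above.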
Original value 1 0 0 1 0 1 1 0
Mask 0 0 1 0 1 0 1 0
Bitwise OR 1 0 1 1 1 1 1 0
Example
Assume that a control byte is used to control eight sets of lights in
an auditorium. Each bit controls a set of lights as follows:
For example, if the house lighting, exit lighting, and stage lighting
are all on, the value of the control byte should be 10010100₂. What
mask would be used with the bitwise OR to turn on the aisle lighting
and the emergency lighting?
Solution
The bitwise OR uses a mask where a one is in each position that
needs to be turned on and zeros are placed in the positions meant to be
left alone. To turn on the aisle lighting and emergency lighting, bits 5
and 3 must be turned on while the remaining bits are to be left alone.
This gives us a mask of 00101000₂.
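In C (an illustration, not from the original text, with a made-up variable name), turning on these two sets of lights is a single bitwise OR with that mask:

unsigned char control = 0x94;    /* 10010100: house, exit, and stage lighting on */
control = control | 0x28;        /* OR with 00101000 sets bits 5 and 3           */
                                 /* control now equals 10111100                  */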
A B X
0 0 0
0 1 1
1 0 1
1 1 0
If we cover up the bottom two rows of this truth table leaving only
the rows where A=0 visible, we see that the value of B is passed along
to X, i.e., if A=0, then X equals B. If we cover up the rows where A=0
leaving only the rows where A=1 visible, it looks like the inverse of B
is passed to X, i.e., if A=1, then X equals the inverse of B. This
discussion makes a two-input XOR gate look like a programmable
inverter. If A is zero, B is passed through to the output untouched. If A
is one, B is inverted at the output.
Therefore, if we perform a bitwise XOR, the bit positions in the
mask with zeros will pass the original value through and bit positions in
the mask with ones will invert the original value. The example below
uses the mask 00101110₂ to toggle bits 1, 2, 3, and 5 of a binary value
while leaving the others untouched.
Original value 1 0 0 1 0 1 1 0
Mask 0 0 1 0 1 1 1 0
Bitwise XOR 1 0 1 1 1 0 0 0
Example
Assume a byte is used to control the warning and indicator lights on
an automotive dashboard. The following is a list of the bit positions and
the dashboard lights they control.
Determine the mask to be used with a bitwise XOR that when used
once a second will cause the left and right turn indicators to flash when
the emergency flashers are on.
Solution
The bitwise XOR uses a mask with ones in the positions to be
toggled and zeros in the positions to be left alone. To toggle bits 3 and
2 on and off, the mask should have ones only in those positions.
Therefore, the mask to be used with the bitwise XOR is 00001100₂.
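In C (again an illustration with a made-up variable name), the flashing is produced by executing the XOR below once a second:

unsigned char indicators = 0x00;    /* current state of the dashboard lights  */
indicators = indicators ^ 0x0C;     /* XOR with 00001100 toggles bits 3 and 2 */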
(Figure: two detectors examine the same condition; an XOR of Signal A from detector 1 and Signal B from detector 2 equals 1 when A ≠ B, and this difference indicates that an error has occurred.)
9.3 Parity
One of the most primitive forms of error detection is to add a single
bit called a parity bit to each piece of data to indicate whether the data
has an odd or even number of ones. It is considered a poor method of
error detection as it sometimes doesn't detect multiple errors. When
combined with other methods of error detection, however, it can
improve their overall performance.
There are two primary types of parity: odd and even. Even parity
means that the sum of the ones in the data element and the parity bit is
an even number. With odd parity, the sum of ones in the data element
and the parity bit is an odd number. When designing a digital system
that uses parity, the designers decide in advance which type of parity
they will be using.
Assume that a system uses even parity. If an error has occurred and
one of the bits in either the data element or the parity bit has been
inverted, then counting the number of ones results in an odd number.
From the information available, the digital system cannot determine
which bit was inverted or even if only one bit was inverted. It can only
tell that an error has occurred.
One of the primary problems with parity is that if two bits are
inverted, the parity bit appears to be correct, i.e., it indicates that the
data is error free. Parity can only detect an odd number of bit errors.
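As an aside (not part of the original text), a parity bit can be generated in C by folding a byte onto itself with the XOR; the function name is made up:

/* Returns the even-parity bit for an 8-bit value, i.e., a 1 when the
   value contains an odd number of ones so that the total number of
   ones, data plus parity, becomes even. */
unsigned char even_parity(unsigned char data)
{
    data ^= data >> 4;    /* fold the upper nibble onto the lower nibble */
    data ^= data >> 2;    /* fold four bits down to two                  */
    data ^= data >> 1;    /* fold two bits down to one                   */
    return data & 1;
}

For odd parity, the returned bit would simply be inverted.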
Some systems use a parity bit with each piece of data in memory. If
a parity error occurs, the computer will generate a non-maskable
interrupt, a condition where the operating system immediately
discontinues the execution of the questionable application.
Example
Assume the table below represents bytes stored in memory along
with an associated parity bit. Which of the stored values are in error?
Data Parity
1 0 0 1 0 1 1 0 0
0 0 1 1 1 0 1 0 1
1 0 1 1 0 1 0 1 1
0 1 0 1 1 0 0 1 0
1 1 0 0 0 1 0 1 1
Solution
To determine which data/parity combinations have an error, count
the number of ones in each row. The rows with an odd sum have errors
while the rows with an even sum are assumed to contain valid data.
Data                 Parity
1 0 0 1 0 1 1 0        0      4 ones – even → no error
0 0 1 1 1 0 1 0        1      5 ones – odd  → Error!
1 0 1 1 0 1 0 1        1      6 ones – even → no error
0 1 0 1 1 0 0 1        0      4 ones – even → no error
1 1 0 0 0 1 0 1        1      5 ones – odd  → Error!
9.4 Checksum
For digital systems that store or transfer multiple pieces of data in
blocks, an additional data element is typically added to each block to
provide error detection for the block. This method of error detection is
common, especially for the transmission of data across networks.
One of the simplest implementations of this error detection scheme
is the checksum. As a device transmits data, it takes the sum of all of
the data elements it is transmitting to create an aggregate sum. This
sum is called the datasum. The overflow carries generated by the
additions are either discarded or added back into the datasum. The
transmitting device then sends a form of this datasum appended to the
end of the block. This new form of the datasum is called the checksum.
As the data elements are received, they are added a second time in
order to recreate the datasum. Once all of the data elements have been
received, the receiving device compares its calculated datasum with the
checksum sent by the transmitting device. The data is considered error
free if the receiving device's datasum compares favorably with the
transmitted checksum. Figure 9-7 presents a sample data block and the
datasums generated both by discarding the two carries and by adding
the carries to the datasum.
(Figure 9-7: a sample data block with its datasum computed two ways, once by discarding the carries and once by adding the carries back into the sum.)
Upon receiving this transmission, the datasum for this data block
must be calculated. Begin by taking the sum of all the data elements.
As shown earlier, the basic checksum for the data block in Figure
9-7 is 59₁₆ (01011001₂). The 1's complement checksum for the same
data block is equal to the 1's complement of 59₁₆, i.e., A6₁₆.
The 2's complement checksum for the data block is equal to the 2's
complement of 59₁₆, i.e., A7₁₆.
Example
Determine if the data block and accompanying checksum below are
error free. The data block uses a 1's complement checksum.
Data                                      Checksum
06₁₆  00₁₆  F7₁₆  7E₁₆  01₁₆  52₁₆          31₁₆
Solution
First, calculate the datasum by adding all the data elements in the
data block. Keeping only the lowest eight bits of the sum of 06₁₆, 00₁₆,
F7₁₆, 7E₁₆, 01₁₆, and 52₁₆ gives a datasum of CE₁₆.

CE₁₆ = 11001110₂
1's complement of CE₁₆ = 00110001₂ = 31₁₆

Since this equals the checksum that was sent with the data block, the
block is assumed to be error free.
Example
Write a C program to determine the basic checksum, 1's complement
checksum, and 2's complement checksum for the data block 07₁₆, 01₁₆,
20₁₆, 74₁₆, 65₁₆, 64₁₆, 2E₁₆.
Solution
Before we get started on this code, it is important to know how to
take a 1's complement and a 2's complement in C. The 1's complement
uses a bitwise not operator '~'. By placing a '~' in front of a variable or
constant, the bitwise inverse or 1's complement is returned. Since most
computers represent negative numbers with 2's complement notation,
the 2's complement is calculated by placing a negative sign in front of
the variable or constant.
The code below begins by calculating the datasum. It does this with
a loop that adds each value from the array of data values to a variable
labeled datasum. After each addition, any potential carry is stripped off
using a bitwise AND with 0xff. This returns the byte value.
Once the datasum is calculated, the three possible checksum values
can be calculated. The first one is equal to the datasum, the second is
equal to the bitwise inverse of the datasum, and the third is equal to the
2's complement of the datasum.
int datasum=0;
int block[] = {0x07, 0x01, 0x20, 0x74,
0x65, 0x64, 0x2E};
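The remainder of the listing is not reproduced above. A sketch consistent with the description, i.e., the loop, the carry stripping with 0xff, and the three checksum forms, might look like this:

#include <stdio.h>

int main()
{
    int datasum = 0;
    int block[] = {0x07, 0x01, 0x20, 0x74,
                   0x65, 0x64, 0x2E};
    int i;

    /* Add each data element, stripping off any carry with a bitwise AND */
    for (i = 0; i < 7; i++)
        datasum = (datasum + block[i]) & 0xff;

    printf("Datasum:                 0x%02X\n", datasum);
    printf("Basic checksum:          0x%02X\n", datasum);
    printf("1's complement checksum: 0x%02X\n", (~datasum) & 0xff);
    printf("2's complement checksum: 0x%02X\n", (-datasum) & 0xff);

    return 0;
}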
This is not a robust example due to the fact that 4 bits only have 16
possible bit patterns, but the result is clear. A single bit change in one
of the data elements resulted in a single bit change in the addition
result. The same change, however, resulted in three bits changing in the
division remainder.
The problem is that division in binary is not a quick operation. For
example, Figure 9-9 shows the long division in binary of 31,452₁₀ =
0111101011011100₂ by 9₁₀ = 1001₂. The result is a quotient of
110110100110₂ = 3,494₁₀ with a remainder of 110₂ = 6₁₀.
Remember that the goal is to create a checksum that can be used to
check for errors, not to come up with a mathematically correct result.
Keeping this in mind, the time it takes to perform a long division can be
reduced by removing the need for "borrows". This would be the same
as doing an addition while ignoring the carries. The truth table in Table
9-2 shows the single bit results for both addition and subtraction when
carries and borrows are ignored.
(Figure 9-9: the binary long division of 0111101011011100₂ by 1001₂, giving a quotient of 110110100110₂ with a remainder of 110₂.)
A B A+B A–B
0 0 0 0
0 1 1 1 (no borrow)
1 0 1 1
1 1 0 (no carry) 0
11011010 11011010
+01101100 -01101100
10110110 10110110
(Performing the same division of 0111101011011100₂ by 1001₂ with the borrow-less subtraction, i.e., the XOR, gives a quotient of 111010001010₂ with a remainder of 110₂.)
Example
Perform the long division of 1100110110101011₂ by 1011₂ in binary
using the borrow-less subtraction, i.e., XOR function.
Solution
Using the standard "long-division" procedure with the XOR
subtractions, we divide 10112 into 11001101101010112. Table 9-4
checks our result using the technique shown in Table 9-3. Since we
were able to recreate the original value from the quotient and
remainder, the division must have been successful.
Note that in Table 9-4 we are reconstructing the original value from
the quotient in order to demonstrate the application of the XOR in this
modified division and multiplication. This is not a part of the CRC
implementation. In reality, as long as the sending and receiving devices
use the same divisor, the only result of the division that is of concern is
the remainder. As long as the sending and receiving devices obtain the
same results, the transmission can be considered error free.
(Dividing 1100110110101011₂ by 1011₂ using the XOR in place of subtraction gives a quotient of 1110101001111₂ with a remainder of 010₂.)
Solution
With a 5-bit divisor (11011₂), append 5 – 1 = 4 zeros to the end of
the data, 1011011110010110₂.
(Dividing 10110111100101100000₂ by 11011₂ with the XOR subtraction gives a quotient of 1100001010110010₂ and a 4-bit remainder of 0110₂.)
The data stream sent to the receiving device becomes the original
data stream with the 4-bit remainder appended to it.
If the receiver divides the entire data stream by the same divisor
used by the transmitting device, i.e., 11011₂, the remainder will be zero.
This is shown in the following division. If this process is followed, the
receiving device will calculate a zero remainder any time there is no
error in the data stream.
(Dividing the received stream 10110111100101100110₂ by 11011₂ gives the same quotient, 1100001010110010₂, and a remainder of zero, indicating an error-free transmission.)
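As an aside (not part of the original text), this XOR division can be written in C as a bit-by-bit loop. The function below is only a sketch; the data value, divisor, and bit widths are those of the example above:

#include <stdio.h>

/* Borrow-less (XOR) division: returns the remainder of 'data', padded
   with (div_bits - 1) zeros, divided by 'divisor'. */
unsigned long crc_remainder(unsigned long data, int data_bits,
                            unsigned long divisor, int div_bits)
{
    unsigned long rem = data << (div_bits - 1);             /* append the zeros         */
    unsigned long top = 1ul << (data_bits + div_bits - 2);  /* MSB of the working value */
    int i;

    for (i = 0; i < data_bits; i++)
    {
        if (rem & top)                             /* the divisor "goes into" this bit */
            rem ^= divisor << (data_bits - 1 - i);
        top >>= 1;
    }
    return rem;                       /* only the low (div_bits - 1) bits remain */
}

int main()
{
    /* Data 1011011110010110 and divisor 11011 from the example above */
    printf("remainder = %lX\n", crc_remainder(0xB796ul, 16, 0x1Bul, 5));
    return 0;
}

For these values the program prints a remainder of 6, i.e., 0110₂, matching the division performed by hand.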
Table 9-5 Data Groupings and Parity for the Nibble 1011₂
In memory, the nibble would be stored with its parity bits in an eight-
bit location as 10110100₂.
Now assume that the bit in the D1 position, which was originally a 1,
is flipped to a 0, causing an error. The new value stored in memory
would be 10010100₂. Table 9-6 duplicates the groupings of Table
9-5 with the new value for D1. The table also identifies groups that
incur a parity error with the data change.
Note that parity is now in error for groups A, C, and D. Since the D1
position is the only bit that belongs to all three of these groups, then a
processor checking for errors would not only know that an error had
occurred, but also in which bit it had occurred. Since each bit can only
take on one of two possible values, then we know that flipping the bit
D1 will return the nibble to its original data.
If an error occurs in a parity bit, i.e., if P3 is flipped, then only one
group will have an error. Therefore, when the processor checks the
parity of the four groups, a single group with an error indicates that it is
a parity bit that has been changed and the original data is still valid.
It turns out that not all four data groupings are needed. If we only
use groups A, B, and C, we still have the same level of error detection,
but we do it with one less parity bit. Continuing our example without
Group D, if our data is error-free or if a single bit error has occurred,
one of the following eight situations is true.
(Figure: three overlapping circles, A, B, and C. P0, P1, and P2 lie in the regions belonging only to circles A, B, and C respectively; D1 lies in the overlap of A and B, D2 in the overlap of A and C, D0 in the overlap of B and C, and D3 in the region common to all three circles.)
Figure 9-13a uses this arrangement to insert the nibble 10112 into a
Venn diagram. Figures 9-13b, c, and d show three of the seven possible
error conditions.
(Figures 9-13 and 9-14: Venn diagrams of the three-circle arrangement filled in with the bits of the nibble 1011₂ and its parity bits, showing the error-free case and several single-bit error conditions.)
This can be done by adding one more bit that acts as a parity check
for all seven data and parity bits. Figure 9-15 represents this new bit
using the same example from Figure 9-14.
If a single-bit error occurs, then after we go through the process of
correcting the error, this new parity bit will be correct. If, however,
after we go through the process of correcting the error and the new
parity bit is in error, then it can be assumed that a double-bit error has
occurred and that correction is not possible. This is called Single-Error
Correction/Double-Error Detection.
(Figure 9-15: the Venn diagram with the new overall parity bit added below the three circles; panel a shows the error-free condition and panel b a two-bit error condition.)
account for the condition where there are no errors. If 2ᵖ – 1 is less than
the number of data bits, n, plus the number of parity bits, p, then we
don't have enough parity bits. The required relationship is represented
with equation 9-1.

p + n ≤ 2ᵖ – 1                 (9.1)
Table 9-9 presents a short list of the number of parity bits that are
required for a specific number of data bits. To detect double-bit errors,
an additional bit is needed to check the parity of all of the p + n bits.
Let's develop the error-checking scheme for 8 data bits. Remember
from the four-bit example that there were three parity checks:
• P0 was the parity bit for data bits D1, D2, and D3;
• P1 was the parity bit for data bits D0, D1, and D3; and
• P2 was the parity bit for data bits D0, D2, and D3.
Table 9-9 Parity Bits Required for a Specific Number of Data Bits
Number of        Number of
data bits (n)    parity bits (p)    p + n    2ᵖ – 1
      4                 3              7         7
      8                 4             12        15
     16                 5             21        31
     32                 6             38        63
     64                 7             71       127
    128                 8            136       255
In order to check for a bit error, the sum of ones for each of these
groups is taken. If all three sums result in even values, then the data is
error-free. The implementation of a parity check is done with the XOR
function. Remember that the XOR function counts the number of ones
at the input and outputs a 1 for an odd count and a 0 for an even count.
This means that the three parity checks we use to verify our four data
bits can be performed using the XOR function. Equations 9.2, 9.3, and
9.4 show how these three parity checks can be done. The XOR is
represented here with the symbol ⊕.
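Parity check A = P0 ⊕ D1 ⊕ D2 ⊕ D3                (9.2)
Parity check B = P1 ⊕ D0 ⊕ D1 ⊕ D3                (9.3)
Parity check C = P2 ⊕ D0 ⊕ D2 ⊕ D3                (9.4)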
The single parity bit error reveals itself as a single parity check
outputting a 1. If, however, a data bit changed, then we have more than
one parity check resulting in a 1. Assume, for example, that D1 changed
from a 1 to a 0.
Since D1 is the only bit that belongs to both the parity check of
groups A and B, then D1 must have been the one to have changed.
Using this information, we can go to the eight data bit example.
With four parity bits, we know that there will be four parity check
equations, each of which will have a parity bit that is unique to it.
The next step is to figure out which data bits, D0 through D7, belong
to which groups. Each data bit must have a unique membership pattern
so that if the bit changes, its parity check will result in a unique pattern
of parity check errors. Note that all of the data bits must belong to at
least two groups to avoid an error with that bit looking like an error
with the parity bit.
Table 9-10 shows one way to group the bits in the different parity
check equations or groups. It is not the only way to group them.
By using the grouping presented in Table 9-10, we can complete our
four parity check equations.
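Parity check A = P0 ⊕ D0 ⊕ D1 ⊕ D3 ⊕ D4 ⊕ D6      (9.5)
Parity check B = P1 ⊕ D0 ⊕ D2 ⊕ D3 ⊕ D5 ⊕ D6      (9.6)
Parity check C = P2 ⊕ D1 ⊕ D2 ⊕ D3 ⊕ D7            (9.7)
Parity check D = P3 ⊕ D4 ⊕ D5 ⊕ D6 ⊕ D7            (9.8)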
When it comes time to store the data, we will need 12 bits, eight for
the data and four for the parity bits. But how do we calculate the parity
bits? Remember that the parity check must always equal zero.
Therefore, the sum of the data bits of each parity group with the parity
bit must be an even number. Therefore, if the sum of the data bits by
themselves is an odd number, the parity bit must equal a 1, and if the
sum of the data bits by themselves is an even number, the parity bit
must equal a 0. This sounds just like the XOR function again.
Therefore, we use equations 9.9, 9.10, 9.11, and 9.12 to calculate the
parity bits before storing them.
P0 = D0 ⊕ D1 ⊕ D3 ⊕ D4 ⊕ D6 (9.9)
P1 = D0 ⊕ D2 ⊕ D3 ⊕ D5 ⊕ D6 (9.10)
P2 = D1 ⊕ D2 ⊕ D3 ⊕ D7 (9.11)
P3 = D4 ⊕ D5 ⊕ D6 ⊕ D7 (9.12)
Now let's test the system. Assume we need to store the data
10011100₂. This gives us the following values for our data bits:
D7 = 1 D6 = 0 D5 = 0 D4 = 1 D3 = 1 D2 = 1 D1 = 0 D0 = 0
The first step is to calculate our parity bits. Using equations 9.9,
9.10, 9.11, and 9.12 we get the following values.
P0 = 0 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 0 = 0
P1 = 0 ⊕ 1 ⊕ 1 ⊕ 0 ⊕ 0 = 0
P2 = 0 ⊕ 1 ⊕ 1 ⊕ 1 = 1
P3 = 1 ⊕ 0 ⊕ 0 ⊕ 1 = 0
Once again, the XOR is really just a parity check. Therefore, if there
is an odd number of ones, the result is 1 and if there is an even number
of ones, the result is 0.
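As an aside (not in the original text), equations 9.9 through 9.12 translate directly into C with shifts and XORs; the macro and function names below are made up for the illustration:

/* Bit n of the byte d, either 0 or 1 */
#define BIT(d, n)  (((d) >> (n)) & 1)

/* The parity bits of equations 9.9 through 9.12 for one byte of data */
unsigned char p0(unsigned char d) { return BIT(d,0) ^ BIT(d,1) ^ BIT(d,3) ^ BIT(d,4) ^ BIT(d,6); }
unsigned char p1(unsigned char d) { return BIT(d,0) ^ BIT(d,2) ^ BIT(d,3) ^ BIT(d,5) ^ BIT(d,6); }
unsigned char p2(unsigned char d) { return BIT(d,1) ^ BIT(d,2) ^ BIT(d,3) ^ BIT(d,7); }
unsigned char p3(unsigned char d) { return BIT(d,4) ^ BIT(d,5) ^ BIT(d,6) ^ BIT(d,7); }

For the data value 10011100₂ used above, these functions return 0, 0, 1, and 0, matching the hand calculation.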
Now that the parity bits have been calculated, the data and parity
bits can be stored together. This means that memory will contain the
following value:
D7 D6 D5 D4 D3 D2 D1 D0 P0 P1 P2 P3
1 0 0 1 1 1 0 0 0 0 1 0
If our data is error free, then when we read it and substitute the
values for the data and parity bits into our parity check equations, all
four results should equal zero.
Parity check A = 0 ⊕ 0 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 0 = 0
Parity check B = 0 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 0 ⊕ 0 = 0
Parity check C = 1 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 1 = 0
Parity check D = 0 ⊕ 1 ⊕ 0 ⊕ 0 ⊕ 1 = 0
If, however, while the data was stored in memory, it incurs a single-
bit error, e.g., bit D6 flips from a 0 to a 1, then we should be able to
detect it. If D6 does flip, the value shown below is what will be read
from memory, and until the processor checks the parity, we don't know
that anything is wrong with it.
D7 D6 D5 D4 D3 D2 D1 D0 P0 P1 P2 P3
1 1 0 1 1 1 0 0 0 0 1 0
Start by substituting the values for the data and parity bits read from
memory into our parity check equations. Computing the parity for all
four groups shows that an error has occurred.
Parity check A = 0 ⊕ 0 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 1 = 1
Parity check B = 0 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 0 ⊕ 1 = 1
Parity check C = 1 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 1 = 0
Parity check D = 0 ⊕ 1 ⊕ 0 ⊕ 1 ⊕ 1 = 1
Since we see from Table 9-10 that the only bit that belongs to parity
check groups A, B, and D is D6, then we know that D6 has flipped and
we need to invert it to return to our original value.
The same problem appears here as it did in the nibble case if there
are two bit errors. It is solved the same way as it was for the nibble
application: add a parity bit representing the parity of all twelve data
and parity bits. If one or more of the group parity checks fails but this
overall parity bit is correct, we know that a double-bit error has
occurred and correction is not possible.
Problems
1. Using an original value of 11000011₂ and a mask of 00001111₂,
calculate the results of a bitwise AND, a bitwise OR, and a bitwise
XOR for these values.
2. Assume that the indicators of an automotive dashboard are
controlled by an 8-bit binary value named dash_lights. The table
below describes the function of each bit. Assume that a '1' turns on
the light corresponding to that bit position and a '0' turns it off.
D0 Low fuel D4 Left turn signal
D1 Oil pressure D5 Right turn signal
D2 High temperature D6 Brake light
D3 Check engine D7 Door open
For each of the following situations, write the line of code that uses
a bitwise operation to get the desired outcome.
a.) Turn on the low fuel, oil pressure, high temperature, check
engine, and brake lights without affecting any other lights. This
would be done when the ignition key is turned to start.
b.) Toggle both the right and left turn signals as if the flashers
were on without affecting any other lights.
c.) Turn off the door open light when the door is closed.
18. Identify the error in the parity check equations below. Note that the
expressions are supposed to represent a different grouping than
those in equations 9.2, 9.3, and 9.4. There is still an error though
with these new groupings.
Parity check for group A = P0 ⊕ D0 ⊕ D2 ⊕ D3
Parity check for group B = P1 ⊕ D0 ⊕ D1
Parity check for group C = P2 ⊕ D1 ⊕ D2
CHAPTER TEN
Memory Cells
10.1.1 Edges/Transitions
Many devices use as their input a change in a signal rather than a
level of a signal. For example, when you press the "on" button on a
computer, it isn't a binary one or zero from the switch that turns on the
computer. If this were the case, as soon as you removed your finger, the
machine would power off. Instead, the computer begins its power up
sequence the instant your finger presses the button, i.e., the button
transitions from an off state to an on state.
There are two truth table symbols that represent transitions from one
logic value to another. The first represents a change in a binary signal
from a zero to a one, i.e., a transition from a low to a high. This
transition is called a rising edge and it is represented by the symbol ↑.
The second symbol represents a transition from a one to a zero. This
transition is called a falling edge and it is represented by the symbol ↓.
Figure 10-1 presents a binary signal with the points where transitions
occur identified with these two new symbols.
Figure 10-1 Symbols for Rising Edge and Falling Edge Transitions
The problem with the circuit of Figure 10-3 is that there is no way to
modify the value that is stored. We need to replace either one or both of
the inverters with a device that has more than one input, but one that
can also operate the same way as the inverter during periods when we
want the data to be stable. It turns out that the NAND gate can do this.
Figure 10-4 presents the truth table for the NAND gate where one of
the inputs is always connected to a one.
A A 1 X
X 0 1 1
1
1 1 0
Figure 10-4 Operation of a NAND Gate with One Input Tied High
Notice that the output X is always the inverse of the input A. The
NAND gate operates just like an inverter with a second input. Figure
10-5 replaces the inverters of Figure 10-3 with NAND gates.
As long as the free inputs to the two NAND gates remain equal to
one, the circuit will remain stable since it is acting as a pair of inverters
connected together in series. It is also important to note that if the top
inverter outputs a zero, the bottom inverter outputs a one. Likewise, if a
one is output from the top inverter, then a zero is output from the
bottom one. These two possible states are shown in Figure 10-6.
(Figure 10-6 shows the two stable output combinations of the cross-coupled NAND gates. A second figure then walks through toggling the free input of the top NAND gate in four steps: a.) a zero to the free input of the top NAND gate forces a one to its output; b.) that one passes to the bottom NAND which in turn outputs a zero; c.) the zero from the bottom NAND returns to the lower input of the top NAND; d.) the second zero at the top NAND holds its output even if the free input returns to 1.)
This means that the circuit can be used to store a one in the top
NAND gate and a zero in the bottom NAND gate by toggling the free
input on the top NAND gate from a one to a zero and back to a one.
Figure 10-8 shows what happens when we toggle the free input on the
bottom NAND gate from a one to a zero and back to a one.
(Figure 10-8 repeats the process for the bottom NAND gate, again in four steps: a.) a zero to the free input of the bottom NAND gate forces a one to its output; b.) that one passes to the top NAND which in turn outputs a zero; c.) the zero from the top NAND returns to the lower input of the bottom NAND; d.) the second zero at the bottom NAND holds its output even if the free input returns to 1.)
This NAND gate circuit represents the basic circuit used to store a
single bit using logic gates. Notice that in step d of both figures the
circuit is stable with the opposing NAND gates outputting values that
are inverses of each other. In addition, notice that the circuit's output is
changed by placing a zero on the free input of one of the NAND gates.
Figure 10-9 presents the standard form of this circuit with the inputs
labeled S̄ and R̄ and the outputs labeled Q and Q̄. The bars placed
over the inputs indicate that they are active low inputs while the bar
over one of the outputs indicates that it is an inverted value of Q.
This circuit is referred to as the S-R latch. The output Q is set to a
one if the S̄ input goes low while R̄ stays high. The output Q is reset
to a zero if the R̄ input goes low while S̄ stays high. If both of these
inputs are high, i.e., logic one, then the circuit maintains the current
value of Q. The truth table for the S-R latch is shown in Figure 10-10.
S̄   R̄     Q    Q̄
0   0     U    U
0   1     1    0
1   0     0    1
1   1     Q₀   Q̄₀

Figure 10-9 S-R Latch              Figure 10-10 S-R Latch Truth Table
Notice that the row of the truth table where both inputs equal zero
produces an undefined output. Actually, the output is defined: both Q
and its inverse are equal to one. What makes this case undefined is that
when both of the inputs return to one, the output of the system becomes
unpredictable, and possibly unstable. It is for this reason that the top
row of this truth table is considered illegal and is to be avoided for any
implementation of the S-R latch circuit.
(Figure: a D latch with inputs D and Clock and outputs Q and Q̄, built around an S-R latch.)
(rising edge triggered)                 (falling edge triggered)
D  Clock   Q    Q̄                       D  Clock   Q    Q̄
X    0     Q₀   Q̄₀                      X    0     Q₀   Q̄₀
X    1     Q₀   Q̄₀                      X    1     Q₀   Q̄₀
X    ↓     Q₀   Q̄₀                      X    ↑     Q₀   Q̄₀
0    ↑     0    1                       0    ↓     0    1
1    ↑     1    0                       1    ↓     1    0
Notice that the value on D does not affect the output if the Clock
input is stable, nor does it have an effect during the clock transition
other than the one for which it was defined. During these periods, the
values stored at the latch's outputs remain set to the values stored there
from a previous data capture.
D latches can also be designed to capture data during a specified
level on the Clock signal rather than a transition. These are called
transparent latches. They latch data much like an edge triggered latch,
but while Clock is at the logic level previous to the transition, they pass
all data directly from the D input to the Q output. For example, when a
zero is input to the Clock input of a D latch designed to capture data
when Clock equals zero, the latch appears to vanish, passing the signal
D straight to Q. The last value present on D when the Clock switches
from zero to one is stored on the output until Clock goes back to zero.
Figure 10-13 presents this behavior using truth tables for both the
active low and active high transparent D latches.
(transparent when Clock = 0)            (transparent when Clock = 1)
D  Clock   Q    Q̄                       D  Clock   Q    Q̄
X    1     Q₀   Q̄₀                      X    0     Q₀   Q̄₀
0    0     0    1                       0    1     0    1
1    0     1    0                       1    1     1    0
(Figure 10-16: four divide-by-two circuits cascaded; the Q output of each latch drives the Clock input of the next.)
10.5 Counter
By making a slight modification to the cascaded divide-by-two
circuits of Figure 10-16, we can create a circuit with a new purpose.
Figure 10-17 shows the modified circuit created by using the inverted
outputs of the latches to drive the Clock inputs of the subsequent
latches instead of using the Q outputs to drive them.
(Figure 10-17: the same cascade with each latch's inverted output Q̄ driving the Clock input of the next latch.)
If we draw the outputs of all four latches with respect to each other
for this new circuit, we see that the resulting ones and zeros from their
outputs have a familiar pattern to them, specifically, they are counting
in binary.
If the leftmost latch is considered the LSB of a four-bit binary
number and the rightmost latch is considered the MSB, then a cycle on
the input clock of the leftmost latch will increment the binary number
by one. This means that by connecting the inverted output of a divide-
by-two circuit to the clock input of a subsequent divide-by-two circuit n
times, we can create an n-bit binary counter that counts the pulses on an
incoming frequency.
(Timing diagram: the input clock and the outputs of latches A through D. Reading the latch outputs as a binary number with latch A as the LSB gives the counting sequence below.)

Latch A (LSB):  0 1 0 1 0 1 0 1 0 1 0 1 0
Latch B:        0 0 1 1 0 0 1 1 0 0 1 1 0
Latch C:        0 0 0 0 1 1 1 1 0 0 0 0 1
Latch D (MSB):  0 0 0 0 0 0 0 0 1 1 1 1 1
(Figure: a bank of D latches; the inputs for the data bits drive the D inputs, the Q outputs go to an external device, and a single "Write" line is shared by all of the latches.)
1. For the circuit shown to the right (its inputs are 1 and 0), what
   value does Q have?
2. Describe why the S-R latch has an illegal condition.
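CHAPTER ELEVEN
State Machines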
(Figure: block diagram of a state machine; the system inputs, together with the current state stored in a set of D latches, drive combinational logic that produces the output.)
11.1.1 States
So what is a state? A state defines the current condition of a system.
It was suggested at the end of Chapter 10 that a traffic signal system is
a state machine. The most basic traffic signal controls an intersection
with two directions, North-South and East-West for example. There are
certain combinations of lights (on or off) that describe the intersection's
condition. These are the system's states.
(State diagram: two states, OFF and ON; the transition labeled "Switch goes on" moves from OFF to ON, and a corresponding transition returns from ON to OFF.)
Figure 11-5 Complete State Diagram for Light Bulb State Machine
The upper half of each circle indicates the name of the state. The
lower half indicates the binary output associated with that state. In the
case of the light bulb state machine, a zero is output while we are in the
OFF state and a one is output while we are in the ON state. The arrows
along with the input value say that when we are in state OFF and the
switch input goes to a 1, move to state ON. When we are in state ON
and the switch input goes to a 0, move to state OFF.
Before creating a state diagram, we must define the parameters of
the system. We begin with the inputs and outputs. The inputs are vital
to the design of the state diagram as their values will be used to dictate
state changes. As for the outputs, their values will be determined for
each state as we create them.
It is also important to have an idea of what will define our states. For
example, what happens to the states of our traffic signal system if we
add a crosswalk signal? It turns out that the number of states we have
will increase because there will be a difference between the state when
the crosswalk indicator says "Walk" and when it flashes "Don't Walk"
just before the traffic with the green light gets its yellow.
Here we will introduce an example to illustrate the use of state
diagrams. Chapter 10 presented a simple counter circuit that
incremented a binary value each time a pulse was detected. What if we
wanted to have the option to decrement too? Let's design a state
machine that stores a binary value and has as its input a control that
determines whether we are incrementing that binary value or
decrementing it when a pulse is received.
(State diagram: the eight states 000 through 111 arranged in a ring; the D=1 transitions step clockwise to the next higher count while the D=0 transitions step counter-clockwise to the next lower count.)
Figure 11-7 State Diagram for a 3-Bit Up-Down Binary Counter
In Figure 11-7, the arrows going clockwise around the inside of the
diagram represent the progression through the states at each clock pulse
when direction equals 1. Notice that each pulse from clock should take
us to the next highest three-bit value. The arrows going counter-
clockwise around the outside of the diagram represent the progression
through the states at each clock pulse when direction equals zero.
There is an additional detail that must be represented with a state
diagram. When a system first powers up, it should be initialized to a
reset state. We need to indicate on the diagram which state is defined as
the initial state. For example, the up-down counter may be initialized to
the state 0002 when it is first powered up. The state diagram represents
this by drawing an arrow to the initial state with the word "reset"
printed next to it. A portion of Figure 11-7 is reproduced in Figure 11-8
showing the reset condition.
• Any state other than an initial state that has no transitions going
into it should be removed since it is impossible to reach that state.
• For a system with n inputs, there should be exactly 2ⁿ transitions
coming out of every state, one for each pattern of ones and zeros
for the n inputs. Some transitions may come back to the current
state, but every input must be accounted for. Missing transitions
should be added while duplicates should be removed.
The following example shows how some of these errors might appear.
Example
Identify the errors in the following state diagram.
(State diagram: states A, B, C, D, and E, each with an output of 0, with transitions labeled P=0 and P=1; a Reset arrow points to state A.)
Solution
Error 1 – There is no way to get to state E. It should be removed.
Although state A has no transitions to it, it is not a problem because it
is the initial state.
Error 2 – The transition from state D for P=0 is defined twice while the
transition for P=1 is never defined.
(Figure 11-9: block diagram of state machine hardware. The external inputs and the current state feed the logic determining the next state; the latches output the current state; a second block of logic determines the output from the current state; the clock drives the latches.)
Example
The block diagram below represents a state machine. Answer the
following questions based on the state machine's components and the
digital values present on each of the connections.
(Block diagram: the system's two inputs and the outputs of three D latches, S2, S1, and S0, feed the next state logic; the latch outputs also feed the output logic. The latches currently hold S2 = 0, S1 = 1, S0 = 1, and their D inputs carry 1, 0, and 0. The clock drives all three latches.)
Solution
What is the maximum number of states this system could have?
Since the system has 3 latches, then the numbers 000₂, 001₂, 010₂,
011₂, 100₂, 101₂, 110₂, and 111₂ can be stored. Therefore, this state
machine can have up to eight states.
How many rows are in the truth table defining the output? Since the
output is based on the current state which is represented by the latches,
and since there are three latches, the logic circuit for the output has
three inputs. With three inputs, there are 2³ = 8 possible patterns of
ones and zeros into the circuit, and hence, 8 rows in the truth table.
How many rows are in the truth table defining the next state? Since
the next state of the state machine, i.e., the value on the input lines to
the latches, depends on the current state fed back into the next state
logic and the system inputs, then there are five inputs that determine the
next state. Therefore, the inputs to the next state logic have 2⁵ = 32
possible patterns of ones and zeros. This means that the next state logic
truth table has 32 rows.
What is the current state of this system? The current state equals the
binary value stored in the latches. Remembering that S0 is the LSB
while S2 is the MSB, this means that the current state is 011₂ = 3₁₀.
If the clock were to pulse right now, what would the next state be?
The next state is the binary value that is present at the D inputs to the
latches. Once again, S0 is the LSB and S2 is the MSB. Therefore, the
next state is 100₂ = 4₁₀.
(State diagram: the initial state, state 0, with an output of "off" and a Reset arrow pointing to it.)
The fact that we selected an initial state with the light bulb off might
be clear, but it might not be clear why we added the condition that the
user's finger is not touching the button. As we go through the design,
we will see how the transitions between states depend on whether the
button is currently pressed or released. This means that the condition of
the button directly affects the state of the system.
So where do we go from this initial state? Well, when a clock pulse
occurs, the decision of which state to go to from state 0 depends on the
inputs to the system, namely whether the button is pressed (B=1) or
released (B=0). Since the button has two possible conditions, then there
will be two possible transitions out of state 0. Figure 11-11 shows how
these transitions exit state 0 as arrows.
Each of these transitions must pass to a state, so the next step is to
determine which state each transition goes to. To do this, we either
need to create a new state or have the transition return to state 0.
(Figure 11-11: state 0 with its Reset arrow and two outgoing transitions, one labeled B=0 and one labeled B=1, whose destinations have not yet been assigned.)
If B=0, the button is not pressed and the light should stay off. We
need to pass to a state that represents the condition that the light is off
and the button is released. It just so happens that this is the same as the
initial state, so the transition for B=0 should just return to state 0.
(The B=0 transition now loops back to state 0; the B=1 transition still needs a destination.)
When the button is pressed, the light should come on. Therefore, the
transition for B=1 should pass to a state where the light is on and the
button is pressed. We don't have this state in our diagram, so we need
to add it.
(The B=1 transition now goes to a new state, state 1, in which the
light is on and the button is pressed.)
From state 1, if the button remains pressed (B=1), the light should
stay on, so this transition simply returns to state 1. If the button is
released (B=0), the light should remain on, so we need a new state,
state 2, in which the light is on and the button is released.
(The diagram now shows state 1 looping back to itself on B=1 and a
B=0 transition from state 1 to the new state 2.)
Now that all of the transitions from state 1 have been defined, we
need to begin defining the transitions from state 2. If B=0, the button
has not been pressed and the current state must be maintained. If the
button is pressed, the light is supposed to turn off. Therefore, we need
to pass to a state where the light is off and the button is pressed. This
state doesn't exist, so we need to create state 3.
(The diagram now includes state 3, in which the light is off and the button is pressed; it is reached from state 2 when B=1, while state 2 loops back to itself when B=0.)
As you can see, each time we create a new state, we need to add the
transitions for both B=0 and B=1 to it. This will continue until the
addition of all the transitions does not create any new states. The last
step added state 3 so we need to add the transitions for it. If B=0, then
the button has been released, and we need to move to a state where the
button is released and the light bulb is off. This is state 0. If B=1, then
the button is still pressed and the bulb should remain off. This is state 3.
Since we didn't create any new states, then the state diagram in Figure
11-17 should be the final state diagram for the system.
(Figure 11-17, the final state diagram: state 0 (off) loops on B=0 and goes to state 1 on B=1; state 1 (on) loops on B=1 and goes to state 2 on B=0; state 2 (on) loops on B=0 and goes to state 3 on B=1; state 3 (off) loops on B=1 and returns to state 0 on B=0. The Reset arrow points to state 0.)
At this point, there are a couple of items to note. First, as each state
was created, it was assigned a number beginning with state 0 for the
initial state. The order in which the states are numbered is not important
right now. Advanced topics in state machine design examine how the
numbering affects the performance of the circuit, but this chapter will
not address this issue. It is a good idea not to skip values as doing this
may add latches to your design.
The second item to note regards the operation of the state machine.
The state diagram shows that to know which state we are going to be
transitioning to, we need to know both the current state and the current
values on the inputs.
The next step is a minor one, but it is necessary in order to
determine the number of latches that will be used in the center block of
Figure 11-9. Remember that the latches maintain the current state of the
state machine. Each latch acts as a bit for the binary value of the state.
For example, if the current state of the system is 2₁₀ = 10₂, then the
state machine must have at least two latches, one to hold the '1' and one
to hold the '0'. By examining the largest state number, we can
determine the minimum number of bits it will take to store the current
state. This is why we begin numbering our states at zero.
For our system, the largest state number is 3₁₀ = 11₂. Since 3 takes
two bits to represent, then two latches will be needed to store any of the
states the system could enter. Table 11-1 presents each of the states
along with their numeric value in decimal and binary.
                                      Numeric Value
State                                Decimal   Binary
Bulb off; button released               0        00
Bulb on; button pressed                 1        01
Bulb on; button released                2        10
Bulb off; button pressed                3        11
We will label the two bits used to represent these values S1 and S0
where S1 represents the MSB and S0 represents the LSB. This means,
for example, that when S1 = 0 and S0 = 1, the bulb is on and the button
is pressed. Each of these bits requires a latch. Using this information,
we can begin building the hardware for our state machine.
(Figure 11-18: the input B and the current state bits S1 and S0 enter the next state logic, whose outputs S1' and S0' drive the D inputs of two latches; the latch outputs S1 and S0 feed the output logic, which produces L. The clock drives both latches.)
The next step is to develop the truth tables that will be used to create
the two blocks of logic on either side of the latches in Figure 11-18. We
begin with the "next state logic." The inputs to this logic will be the
system input, B, and the current state, S1 and S0. The outputs represent
the next state that will be loaded into the latches from their D inputs
Chapter 11: State Machines 231
when a clock pulse occurs. These are represented in Figure 11-18 by
the signals S1' and S0'.
The next state truth table lists every possible combination of ones
and zeros for the inputs which means that every possible state along
with every possible system input will be listed. Each one of these rows
represents an arrow or a transition on the state diagram. The output
columns show the state that the system will be going to if a clock pulse
occurs. For example, if the current state of our push button circuit is
state 0 (S1 = 0 and S0 = 0) and the input B equals one, then we are
going to state 1 (S1' = 0 and S0' = 1). If the current state is state 0 and
the input B equals zero, then we are staying in state 0 (S1' = 0 and S0' =
0). Table 11-2 presents the truth table where each transition of the state
diagram in Figure 11-17 has been translated to a row.
We also need to create a truth table for the output logic block of
Figure 11-18. The output logic produces the correct output based on the
current state. This means that the circuit will take as its inputs S1 and S0
and produce the system output, L. The truth table is created by looking
at the output (the lower half of each circle representing a state), and
placing it in the appropriate row of a truth table based on the values of
S1 and S0. Table 11-3 presents the output truth table.
Table 11-2 Next State Truth Table for Push Button Circuit
S1  S0  B   S1'  S0'
 0   0  0    0    0    ← State 0 stays in state 0 when B=0
 0   0  1    0    1    ← State 0 goes to state 1 when B=1
 0   1  0    1    0    ← State 1 goes to state 2 when B=0
 0   1  1    0    1    ← State 1 stays in state 1 when B=1
 1   0  0    1    0    ← State 2 stays in state 2 when B=0
 1   0  1    1    1    ← State 2 goes to state 3 when B=1
 1   1  0    0    0    ← State 3 goes to state 0 when B=0
 1   1  1    1    1    ← State 3 stays in state 3 when B=1
S1  S0  L
 0   0  0    ← State 0: bulb is off
 0   1  1    ← State 1: bulb is on
 1   0  1    ← State 2: bulb is on
 1   1  0    ← State 3: bulb is off
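As an illustration (not part of the original text), Tables 11-2 and 11-3 can also be simulated directly in C by treating their rows as lookup tables; the array names and the sample input sequence below are made up:

#include <stdio.h>

int main()
{
    /* next_state[state][B] comes from Table 11-2, output[state] from Table 11-3 */
    int next_state[4][2] = { {0, 1},      /* state 0: B=0 stays, B=1 -> state 1 */
                             {2, 1},      /* state 1: B=0 -> state 2, B=1 stays */
                             {2, 3},      /* state 2: B=0 stays, B=1 -> state 3 */
                             {0, 3} };    /* state 3: B=0 -> state 0, B=1 stays */
    int output[4] = { 0, 1, 1, 0 };       /* L for states 0 through 3           */
    int button[6] = { 1, 1, 0, 0, 1, 0 }; /* a sample sequence of B values      */
    int state = 0;                        /* reset state                        */
    int i;

    for (i = 0; i < 6; i++)
    {
        state = next_state[state][button[i]];
        printf("B=%d  new state=%d  L=%d\n", button[i], state, output[state]);
    }
    return 0;
}

Pressing, holding, releasing, and pressing the button again steps the simulation through the same states as the diagram: the light turns on with the first press and off with the second.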
Now that we have our system fully defined using truth tables, we
can design the minimum SOP logic using Karnaugh maps. Figure
11-19 presents the Karnaugh maps for the outputs S1', S0', and L.
        S1'                      S0'                     L
         B                        B                          S1
S1S0    0   1          S1S0      0   1          S0          0   1
 00     0   0           00       0   1           0          0   1
 01     1   0           01       0   1           1          1   0
 11     0   1           11       0   1
 10     1   1           10       0   1
Figure 11-19 K-Maps for S1', S0', and L of Push Button Circuit
S1' = S̄1·S0·B̄ + S1·S̄0 + S1·B

S0' = B

L = S̄1·S0 + S1·S̄0 = S1 ⊕ S0
(Figure: the completed push button circuit; the next state logic, driven by B, S1, and S0, feeds the D inputs of the two latches, and the latch outputs S1 and S0 go to the output logic, which produces L. The clock drives both latches.)
Suppose the states had been assigned binary values in a different
order, as shown below.

                                      Numeric Value
State                                Decimal   Binary
Bulb off; button released               0        00
Bulb on; button pressed                 1        01
Bulb on; button released                3        11
Bulb off; button pressed                2        10
This modification affects all of the logic, but let's only look at how it
affects the output logic that drives the signal L. In this case, the light is
on in states 1 and 3, but off for states 0 and 2. Figure 11-21 presents the
new output truth table and the resulting Karnaugh map.
S1  S0  L                           S1
 0   0  0               S0        0   1
 0   1  1                0        0   0
 1   0  0                1        1   1
 1   1  1

Figure 11-21 Revised Truth Table and K Map for Push Button Circuit

L = S0
As another example, suppose we need to detect the bit pattern "101"
whenever it appears in a serial stream of bits such as the one below.

1101001111011001011101010000111101101111001
If a clock can be produced that pulses once for each incoming bit,
then we can develop a state machine that detects this pattern. The state
machine will initially output a zero indicating no pattern match and will
continue to output this zero until the full pattern is received. When the
full pattern is detected, the state machine will output a 1 for one clock
cycle.
The state machine used to detect the bit pattern "101" will have four
states, each state representing the number of bits that we have received
up to this point that match the pattern: 0, 1, 2, or 3. For example, a
string of zeros would indicate that we haven't received any bits for our
sequence. The state machine should remain in the state indicating no
bits have been received.
If, however, a 1 is received, then it is possible we have received the
first bit of the sequence "101". The state machine should move to the
state indicating that we might have the first bit. If we receive another 1
while we are in this new state, then we know that the first 1 was not
part of the pattern for which we are watching. The second 1, however,
might indicate the beginning of the pattern, so we should remain in the
state indicating that we might have received the first bit of the pattern.
This thought process is repeated for each state.
The list below identifies each of the states along with the states they
would transition to based on the input conditions.
(State diagram: four states, the initial state 0 ("no digits"), "1 digit",
"2 digits", and "3 digits", with outputs 0, 0, 0, and 1 respectively and
transitions labeled I=0 and I=1.)

Figure 11-23 State Diagram for Identifying the Bit Pattern "101"
Next, we need to assign binary values to each of the states so that we
know how many latches will be needed to store the current state and
provide the inputs to the next state logic and the output logic. Table 11-
5 presents the list of states along with their decimal and binary values.
From the state diagram and the numbering of the states, we can
create the next state truth table and the output truth table. These are
presented in Figure 11-24 with S1 representing the MSB of the state, S0
representing the LSB of the state, and P representing the output.
                                                   Numeric Value
State                                             Decimal   Binary
No bits of the pattern have been received            0        00
One bit of the pattern has been received             1        01
Two bits of the pattern have been received           2        10
Three bits of the pattern have been received         3        11
Figure 11-24 Next State and Output Truth Tables for Pattern Detect
Figure 11-25 K-Maps for S1', S0', and P of Pattern Detect Circuit
(Figure: the pattern detect circuit; the input I and the current state bits feed the next state logic, whose outputs S1' and S0' drive two D latches, and the latch outputs S1 and S0 feed the output logic, which produces P. The clock drives both latches.)
(State diagram: four states A, B, C, and D; each transition is labeled
with an input/output pair such as 1/0, showing the output produced
when that input causes the transition.)

Figure 11-28 Sample State Diagram of a Mealy Machine
The next state truth table for the Mealy machine is the same as that
for the Moore machine: the current state and the system input govern
the next state. The Mealy machine's output truth table is different,
however, since it now uses the system input as one of the truth table's
inputs. Figure 11-29 presents the output truth table for the state diagram
in Figure 11-28 where state A is S0 = 0, S1 = 0, B is S0 = 0, S1 = 1, C is
S0 = 1, S1 = 0, and D is S0 = 1, S1 = 1.
Problems
1. What is the maximum number of states a state machine with four
latches can have?
2. How many latches will a state machine with 28 states require?
3. Apply the design process presented in Section 11.2 to design a
two-bit up/down counter using the input direction such that when
direction = 0, the system decrements (00 → 11 → 10 → 01 → 00)
and when direction = 1, the system increments (00 → 01 → 10 →
11 → 00).
4. The three Boolean expressions below represent the next state bits,
S1' and S0', and the output bit, X, based on the current state, S1 and
S0, and the input I. Draw the logic circuit for the state machine
including the latches and output circuitry. Label all signals.
_ _
S1' = S1·S0 S0' = S1·S0·I X = S 1 + S0
5. Create the next state truth table and the output truth table for the
following state diagrams. Use the variable names S1 and S0 to
represent the most significant and least significant bits respectively
of the binary number identifying the state.
b.) (a three-state diagram: state 00 with an output of 0 and states 01
    and 10 each with an output of 1, with transitions labeled P=0 and
    P=1)
CHAPTER TWELVE
Memory Organization
The diagonal wires, called sense wires, were used to read data. They
could detect when the polarity on one of the rings was changed. To
read data, therefore, the bit in question would be written to with the
horizontal and vertical wires. If the sense wire detected a change in
polarity, the bit that had been stored there must have been opposite
from the one just written. If no polarity change was detected, the bit
written must have been equal to the one stored in that ring.
Magnetic core memory looks almost like fabric, the visible rings
nestled among a lacework of glistening copper wires. It is for these
reasons, however, that it is also impractical. Since the rings are
enormous relative to the scale of electronics, a memory of 1024 bytes
(referred to as a 1K x 8 or "1K by 8") had physical dimensions of
approximately 8 inches by 8 inches. In addition, the fine copper wires
were very fragile making manufacturing a difficult process. A typical
1K x 8 memory would cost thousands of dollars. Therefore, magnetic
core memory disappeared from use with the advent of transistors and
memory circuits such as the latch presented in Chapter 10.
(Figure: block diagram of a memory device; the address lines feed an address decoder, the chip select, write enable, and read enable signals control the device, and the data lines carry data in and out.)
12.3.1 Buses
In order to communicate with memory, a processor needs three
types of connections: data, address, and control. The data lines are the
electrical connections used to send data to or receive data from
memory. There is an individual connection or wire for each bit of data.
For example, if the memory of a particular system has 8 latches per
memory location, i.e., 8 columns in the memory array, then it can store
8-bit data and has 8 individual wires with which to transfer data.
The address lines are controlled by the processor and are used to
specify which memory location the processor wishes to communicate
with. The address is an unsigned binary integer that identifies a unique
location where data elements are to be stored or retrieved. Since this
unique location could be in any one of the memory devices, the address
lines are also used to specify which memory device is enabled.
The control lines consist of the signals that manage the transfer of
data. At a minimum, they specify the timing and direction of the data
transfer. The processor also controls this group of lines. Figure 12-3
presents the simplest connection of a single memory device to a
processor with n data lines and m address lines.
Unfortunately, the configuration of Figure 12-3 only works with
systems that have a single memory device. This is not very common.
For example, a processor may interface with a BIOS stored in a non-
volatile memory while its programs and data are stored in the volatile
memory of a RAM stick. In addition, it may use the bus to
communicate with devices such as the hard drive or video card. All of
these devices share the data, address, and control lines of the bus.
(BIOS stands for Basic Input/Output System and it is the low-level
code used to start the processor when it is first powered up.)
(Figure: a microprocessor's DATA, ADDRESS, and CONTROL bus connections.)
The numbers along the left side of the memory map represent the
addresses corresponding to each memory resource. The memory map
should represent the full address range of the processor. This full
address range is referred to as the processor's memory space, and its
size is represented by the number of memory locations in the full range,
i.e., 2ᵐ where m equals the number of address lines coming out of the
processor. It is up to the designer whether the addresses go in ascending
or descending order on the memory map.
As an example, let's calculate the memory space of the processor
represented by the memory map in Figure 12-6b. The top address for
this memory map is FFFFF16 = 1111 1111 1111 1111 11112. Since the
processor accesses its highest address by setting all of its address lines
to 1, we know that this particular processor has 20 address lines.
Therefore, its memory space is 220 = 1,048,57610 = 1 Meg. This means
that all of the memory resources for this processor must be able to fit
into 1 Meg without overlapping.
In the next section, we will see how to compute the size of each
partition of memory using the address lines. For now, however, we can
determine the size of a partition in memory by subtracting the low
address from the high address, then adding one to account for the fact
that the low address itself is a memory location. For example, the range
of the BIOS in Figure 12-6a starts at FF00₁₆ = 65,280₁₀ and goes up to
FFFF₁₆ = 65,535₁₀. This means that the BIOS fits into
65,535 – 65,280 + 1 = 256 memory locations.
It is vital to note that there is an exact method to selecting the upper
and lower addresses for each of the ranges in the memory map. Take
for example the memory range for Program A in Figure 12-6b. The
lower address is 20000₁₆ while the upper address is 27FFF₁₆. If we
convert these addresses to binary, we should see a relationship.

20000₁₆ = 0010 0000 0000 0000 0000₂
27FFF₁₆ = 0010 0111 1111 1111 1111₂
It is not a coincidence that the upper five bits of these two addresses
are identical while the remaining bits go from all zeros in the low
address to all ones in the high address. Converting the high and the low
address of any one of the address ranges in Figure 12-6 should reveal
the same characteristic.
The next section shows how these most significant address bits are
used to define which memory device is being selected.
The division of the full address into two groups is done by dividing
the full address into a group of most significant bits and least
significant bits. The block diagram of an m-bit full address in Figure
12-7 shows how this is done. Each bit of the full address is represented
with aₙ where n is the bit position.
Figure 12-7 Full Address with Enable Bits and Device Address Bits
The bits used to enable the memory device are always the most
significant bits while the bits used to access a memory location within
the device are always the least significant bits.
Example
A processor with a 256 Meg address space is using the address
35E3C03₁₆ to access a 16 Meg memory device.
• How many address lines are used to define when the 16 Meg
memory space is enabled?
• What is the bit pattern of these enable bits that enables this
particular 16 Meg memory device?
• What is the address within the 16 Meg memory device that this
address is going to transfer data to or from?
• What is the lowest address in the memory map of the 16 Meg
memory device?
• What is the highest address in the memory map of the 16 Meg
memory device?
Solution
First, we need to determine where the division in the full address is
so that we know which bits go to the enable circuitry and which are
connected directly to the memory device's address lines. From Table
12-2, we see that to access 256 Meg, we need 28 address lines.
Therefore, the processor must have 28 address lines coming out of it.
The memory device is only 16 Meg which means that it requires 24
address lines to uniquely identify all of its addresses.
Therefore, the four most significant address lines are used to enable
the memory device.
By converting 35E3C03₁₆ to binary, we should see the values of
each of these bit positions for this memory location in this memory
device:

35E3C03₁₆ = 0011 0101 1110 0011 1100 0000 0011₂
The four most significant bits of this 28-bit address are 0011₂. This,
therefore, is the bit pattern that will enable this particular 16 Meg
memory device: a₂₇ = 0, a₂₆ = 0, a₂₅ = 1, and a₂₄ = 1. Any other pattern
of bits for these four lines will disable this memory device and disallow
any data transactions between it and the processor.
The 16 Meg memory device never sees the most significant four bits
of this full address. The only address lines it ever sees are the 24 that
are connected directly to its address lines: a₀ through a₂₃. Therefore, the
address the memory device sees is the lower 24 bits: 5E3C03₁₆.
As for the highest and lowest values of the full address for this
memory device, we need to examine what the memory device interprets
as its highest and lowest addresses. The lowest address occurs when all
of the address lines to the memory device are set to 0. The highest
address occurs when all of the address lines to the memory device are
set to 1. Note that this does not include the four most significant bits of
the full address which should stay the same in order for the memory
device to be active. Therefore, from the standpoint of the memory map
which uses the full address, the lowest address is the four enable bits
set to 0011₂ followed by 24 zeros. The highest address is the four
enable bits set to 0011₂ followed by 24 ones.
Therefore, from the perspective of the memory map, the lowest and
highest addresses of this memory device are:
Lowest address  = 0011 0000 0000 0000 0000 0000 0000₂ = 3000000₁₆
Highest address = 0011 1111 1111 1111 1111 1111 1111₂ = 3FFFFFF₁₆

(Figure: the memory map showing the 16 Meg memory device occupying
addresses 3000000₁₆ through 3FFFFFF₁₆ within the processor's full
memory space.)
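The split between the enable bits and the device address bits can also
be expressed with shifts and masks. The following C sketch is ours, not
part of the original example; it simply reproduces the numbers derived
above.

#include <stdio.h>

/* Split a 28-bit full address into 4 enable bits and a 24-bit device
   address, then rebuild the device's low and high memory map addresses. */
int main(void)
{
    unsigned long full_address = 0x35E3C03;               /* 28-bit address  */
    unsigned long enable_bits  = full_address >> 24;      /* a27 through a24 */
    unsigned long device_addr  = full_address & 0xFFFFFF; /* a23 through a0  */

    printf("Enable bits:    %lX\n", enable_bits);          /* 3 = 0011 */
    printf("Device address: %06lX\n", device_addr);        /* 5E3C03   */
    printf("Low address:    %07lX\n", enable_bits << 24);  /* 3000000  */
    printf("High address:   %07lX\n",
           (enable_bits << 24) | 0xFFFFFF);                /* 3FFFFFF  */
    return 0;
}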
(Figure 12-8 IPv4 Address Divided into Subnet and Host IDs: the 32-bit
IP address is split into a network address, the bits used to identify the
subnet, and a host or local address, the bits used to identify the host
within the subnet.)
Example
The IPv4 address 202.54.151.45 belongs to a Class C network. What
are the subnet and the host ids of this address?
Solution
First, IPv4 addresses are represented as four bytes represented in
decimal notation. Therefore, let's convert the IP address above into its
32-bit binary equivalent.
202₁₀ = 11001010₂
 54₁₀ = 00110110₂
151₁₀ = 10010111₂
 45₁₀ = 00101101₂
Remember that the Class C network uses the first twenty-four bits
for the subnet id. This gives us the following value for the subnet id:
110010100011011010010111₂ (202.54.151). Any IPv4 address with the
first 24 bits equal to this identifies a host in this subnet.
The host id is taken from the remaining eight bits: 00101101₂ = 45₁₀.
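For readers who prefer to see the split in code, the C sketch below
(ours; the variable names are illustrative only) extracts the subnet and
host ids with a shift and a mask.

#include <stdio.h>

/* Pack the four bytes of 202.54.151.45 into one 32-bit value, then split
   it into the 24-bit Class C subnet id and the 8-bit host id. */
int main(void)
{
    unsigned long ip = (202UL << 24) | (54UL << 16) | (151UL << 8) | 45UL;

    unsigned long subnet_id = ip >> 8;     /* upper 24 bits */
    unsigned long host_id   = ip & 0xFF;   /* lower 8 bits  */

    printf("Subnet id: %06lX\n", subnet_id);   /* CA3697 */
    printf("Host id:   %lu\n", host_id);       /* 45     */
    return 0;
}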
(Figure: a chip select circuit whose inputs are taken from the most
significant address lines, a₂₇ through a₂₄.)
• Using the memory space of the processor and the size of the
memory device, determine the number of bits of the full address
that will be used for the chip select.
• Using the base address where the memory device is to be located,
determine the values that the address lines used for the chip select
are to have.
• Create a circuit with the address lines for the chip select going into
the inputs of a NAND gate with the bits that are to be zero inverted.
Example
Using logic gates, design an active low chip select for a 1 Meg
BIOS to be placed in the 1 Gig memory space of a processor. The
BIOS needs to have a starting address of 1E0000016.
Solution
First of all, let's determine how many bits are required by the 1 Meg
BIOS. We see from Table 12-2 that a 1 Meg memory device requires
20 bits for addressing. This means that the lower 20 address lines
coming from the processor must be connected to the BIOS address
lines. Since a 1 Gig memory space has 30 address lines (2^30 = 1 Gig),
then 30 – 20 = 10 address lines are left to determine the chip select.
Next, we figure out what the values of those ten lines are supposed
to be. If we convert the starting address to binary, we get:

1E00000₁₆ = 00 0001 1110 0000 0000 0000 0000 0000₂
Notice that enough leading zeros were added to make the address 30
bits long, the appropriate length in a 1 Gig memory space.
We need to assign each bit a label. We do this by labeling the least
significant bit a0, then incrementing the subscript for each subsequent
position to the left. This gives us the following values for each address
bit. (a₁₇ through a₂ have been deleted in the interest of space.)

a₂₉ a₂₈ a₂₇ a₂₆ a₂₅ a₂₄ a₂₃ a₂₂ a₂₁ a₂₀ a₁₉ a₁₈ … a₁ a₀
 0   0   0   0   0   1   1   1   1   0   0   0  …  0  0

The ten most significant bits, a₂₉ through a₂₀, are the chip select bits.
The active low chip select is therefore a ten-input NAND gate with the
lines that must equal zero, a₂₉, a₂₈, a₂₇, a₂₆, a₂₅, and a₂₀, passed through
inverters and the lines that must equal one, a₂₄, a₂₃, a₂₂, and a₂₁,
connected directly.
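The same values can be produced programmatically. The C sketch below
is ours and is not part of the original example; it derives the number of
chip select lines and their required values from the base address and the
two sizes involved.

#include <stdio.h>

/* Derive the chip select for a 1 Meg device (20 address bits) placed at
   base address 1E00000 in a 1 Gig (30 address bit) memory space. */
int main(void)
{
    unsigned long base_address = 0x1E00000;  /* starting address of the BIOS */
    int device_bits = 20;                    /* 1 Meg needs 20 address lines */
    int space_bits  = 30;                    /* 1 Gig memory space           */

    int select_bits = space_bits - device_bits;             /* a29 through a20 */
    unsigned long select_value = base_address >> device_bits;

    printf("%d chip select lines with the pattern ", select_bits);
    for (int n = select_bits - 1; n >= 0; n--)
        printf("%lu", (select_value >> n) & 1);             /* 0000011110 */
    printf("\n");
    return 0;
}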
Example
What is the largest memory device that can have a starting address of
A40000₁₆?
Solution
This may seem like a rather odd question, but it actually deals with
an important aspect of creating chip selects. Notice that for every one
of our starting addresses, the bits that go to the chip select circuitry can
be ones or zeros. The bits that go to the address lines of the memory
device, however, must all be zero. This is because the first address in
any memory device is 0₁₀. The ending or highest address will have all
ones going to the address lines of the memory device.
Let's begin by converting the address A40000₁₆ to binary:

A40000₁₆ = 1010 0100 0000 0000 0000 0000₂
If we count the zeros starting with the least significant bit and
moving left, we see that there are 18 zeros before we get to our first
one. This means that the largest memory device we can place at this
starting address has 18 address lines. Therefore, the largest memory
device we can start at this address has 2^18 = 256 K memory locations.
Example
True or False: B000₁₆ to CFFF₁₆ is a valid range for a single
memory device.
Solution
This is much like the previous example in that it requires an
understanding of how the address lines going to the chip select circuitry
and the memory device are required to behave. The previous example
showed that the address lines going to the memory device must be all
zero for the starting or low address and all ones for the ending or high
address. The address lines going to the chip select, however, must all
remain constant.
Let's begin by converting the low and the high addresses to binary.
      a₁₅ a₁₄ a₁₃ a₁₂ a₁₁ a₁₀ a₉ a₈ a₇ a₆ a₅ a₄ a₃ a₂ a₁ a₀
Low    1   0   1   1   0   0  0  0  0  0  0  0  0  0  0  0
High   1   1   0   0   1   1  1  1  1  1  1  1  1  1  1  1

The only upper address bit that remains constant between the low and
the high address is a₁₅, so at most a₁₅ could serve as a chip select bit.
The remaining bits, however, are not all zeros in the low address and
not all ones in the high address. Therefore, the statement is false;
B000₁₆ to CFFF₁₆ is not a valid range for a single memory device.
Example
What is the address range of the memory device that is enabled with
the chip select shown? (Figure: an active low chip select built from a
NAND gate whose inputs are taken from address lines a₂₇ through a₂₃,
with a₂₇, a₂₅, and a₂₄ inverted.)
Solution
To begin with, the addressing can be determined from the subscripts
of the address lines identified in the figure. The address lines coming
out of the processor go from a₀ (always assumed to be the least
significant bit of the address) to a₂₇. This means that the processor has
28 address lines and can access a memory space of 2^28 = 256 Meg.
The chip select only goes low when all of the inputs to the NAND
gate (after the inverters) equal 1. This means that a₂₇ = 0, a₂₆ = 1,
a₂₅ = 0, a₂₄ = 0, and a₂₃ = 1. We find the lowest address by setting all of
the remaining bits, a₂₂ through a₀, to zero and we find the highest
address by setting all of the remaining bits to 1. This gives us the
following binary addresses.
High address = 0100 1111 1111 1111 1111 1111 1111₂ = 4FFFFFF₁₆
Low address  = 0100 1000 0000 0000 0000 0000 0000₂ = 4800000₁₆
• have a much higher capacity due to the smaller size of the capacitor
(The RAM sticks of your computer's main memory are DRAMs.);
• will "leak" charge due to the nature of capacitors eventually causing
the data to disappear unless it is refreshed periodically;
• are much cheaper than SRAM; and
• are volatile, meaning that the data remains stored only as long as
power is available.
(Figure: DRAM access timing across three cycles. During cycle 1 the
first half of the address is placed on the address lines, during cycle 2
the second half of the address is presented, and valid data does not
appear on the data lines until cycle 3.)
Now let's get back to the issue of the delay added by a second
address cycle. Most data transfers to and from main memory take the
form of block moves where a series of instructions or data words are
pulled from memory as a group. (For more information on memory
blocks, see the section on caches in Chapter 13.)
If the processor needs a block from memory, the first half of the
address should be the same for all the items of the block. Because of
this, the memory access process begins with the row address then uses
only the column address for subsequent retrievals. This is called Fast
Page Mode (FPM), the data of a single row being referred to as a page.
The RAS line is held low as long as the row address is valid. Figure
12-13 presents an example of FPM for a memory block of size four.
(Figure 12-13: an FPM transfer of a block of four words. The row
address is presented once, the four column addresses follow, and a data
word appears on the data lines after each column address.)

(Figure: a second timing diagram of a four word transfer in which data
words 0 through 3 appear on the data lines while the following column
addresses are being presented.)
Problems
1. What is the largest memory that can have a starting or lowest
address of 160000₁₆?
2. What are the high and low addresses of the memory ranges defined
by each of the chip selects shown below?
(Figures for a, b, and c: chip select circuits with inputs taken from
address lines a₂₇ through a₂₃, a₃₁ through a₂₇, and a₁₅ through a₁₂
respectively.)
CHAPTER THIRTEEN
Memory Hierarchy

(Figure: the memory hierarchy. From the top down it consists of
registers, cache RAM(s), main memory, and long term storage such as a
hard drive. Capacity increases toward the bottom of the hierarchy while
speed increases toward the top.)
Hard drives are the most cost-effective method of storing data. In
the mid-1980's, a 30 Megabyte hard drive could be purchased for
around $300 or about $10 per MB. In 2007, retailers advertised a 320
Gigabyte SATA Hard drive for around $80 or about $0.00025 per MB.
In other words, the cost to store a byte of data today is roughly
1/40,000th of what it was a little over two decades ago.
Hard drives store data in well-organized patterns of ones and zeros
across a thin sheet of magnetic material. This magnetic material is
spread either on one or both sides of a lightweight, rigid disk called a
substrate. The substrate needs to be lightweight because it is meant to
spin at very high speeds. The combination of magnetic material and
substrate is called a platter.
The more rigid the substrate is, the better the reliability of the disk.
This was especially true when the mechanisms that were used to read
and write data from and to the disks were fixed making them prone to
scraping across the substrate's surface if the substrate was not perfectly
flat. The condition where the read-write mechanism comes in contact
with the disk is called a "crash" which results in magnetic material
being scraped away from the disk.
Substrates used to be made from aluminum. Unfortunately, extreme
heat sometimes warped the aluminum disk. Now glass is used as a
substrate. It improves on aluminum by adding:
(Figure: a write head positioned over the platter's magnetic coating,
with the write current flowing through the head.)
It is possible to use the same head to read data back from the disk. If
a magnetized material moves past a coil of wire, it produces a small
current. This is the same principle that allows the alternator in your car
to produce electricity. The direction of the current generated by the
disk's motion changes if the direction of the magnetization changes. In
this way, the same coil that is used to write the data can be used to read
it. Just like the alternator in your car though, if the disk is not spinning,
no current is generated that can be used to read the data.
Newer hard drives use two heads, one for reading and one for
writing. The newer read heads are made of a material that changes its
resistance depending on the magnetic field that is passing under it.
These changes in resistance affect a current that the hard drive
controller is passing through the read head during the read operation. In
this way, the hard drive controller can detect changes in the magnetic
polarization of the material directly under the read head.
There is another characteristic of the read/write head that is
important to the physical operation of the hard drive. As was stated
earlier, the area that is polarized by the head is equal to the gap in the
write head. To polarize a smaller area thereby increasing the data
density, the gap must be made smaller. To do this, the distance between
the head and the platter must be reduced. Current technology allows
heads to "fly" at less than three microinches above the platter surface.
When the magnetic material is deposited on a flexible substrate such
as a floppy diskette or a cassette tape, the flex in the material makes it
possible for the head to come in contact with the substrate without
experiencing reliability problems. This is not true for hard disks. Since
the platters are rigid and because the platters spin at thousands of
rotations per minute, any contact that the head makes with the platter
will result in magnetic material being scraped off. In addition, the heat
from the friction will eventually cause the head to fail.
These two issues indicate that the read/write head should come as
close to the platters as possible without touching. Originally, this was
done by making the platter as flat as possible while mounting the head
to a rigid arm. The gap would hopefully stay constant. Any defects or
warpage in the platter, however, would cause the head to crash onto the
platter resulting in damaged data.
A third type of head, the Winchester head or "flying head," is
designed to float on a cushion of air that keeps it a fixed distance from
the spinning platter. This is done by shaping the head into an airfoil that
takes advantage of the air current generated by the spinning platter.
This means that the head can operate much closer to the surface of the
platter and avoid crashing even if there are imperfections.
(Figure 13-3 Sample FM Magnetic Encoding: the polarity reversals
recorded across each single bit time for the bit pattern 1 0 0 1 0 1 1 1 0.)
(Figure: the same bit pattern 1 0 0 1 0 1 1 1 0 encoded with MFM,
which requires fewer polarity reversals per bit time than FM.)
(Figure 13-5 RLL Relation between Bit Patterns and Polarity Changes:
each of the bit patterns 10, 11, 000, 010, 011, 0010, and 0011 is mapped
to its own pattern of polarity changes across the single bit times.)
Now the shortest period between polarity changes is one and a half
bit periods producing a 50% increased density over MFM encoding.
Figure 13-6 presents the same sample data with RLL encoding.
(Figure 13-6: the same sample bit pattern 1 0 0 1 0 1 1 1 0 encoded
with RLL, requiring still fewer polarity reversals.)
(Figure: the components of a disk access. The seek time moves the
head to the desired track, the rotational latency is the wait for the
desired sector to rotate beneath the head, and the transfer time is the
time during which the sector passes beneath the head.)
Next, let's determine the transfer time for a single sector. If it takes
8.3 milliseconds for a complete revolution of the spindle, then during
that time 500 sectors pass beneath the head. This means that a sector
passes beneath the head every 8.3 ms/rotation ÷ 500 sectors/rotation =
16.7 microseconds/sector. This can also be calculated using the
expression presented above.
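The same arithmetic can be collected into a few lines of C. The sketch
below is ours; it assumes a spindle speed of 7,200 RPM, which
corresponds to the 8.3 millisecond rotation used in this example, and
computes the rotation period, the average rotational latency of half a
rotation, and the per-sector transfer time.

#include <stdio.h>

/* Rotation period, average rotational latency, and sector transfer time
   for a drive spinning at 7,200 RPM with 500 sectors per track. */
int main(void)
{
    double rpm = 7200.0;
    double sectors_per_track = 500.0;

    double period_ms   = 60000.0 / rpm;            /* one rotation: ~8.3 ms   */
    double latency_ms  = period_ms / 2.0;          /* on average, half a turn */
    double transfer_us = (period_ms * 1000.0) / sectors_per_track;

    printf("Rotation period:    %.1f ms\n", period_ms);   /* 8.3 ms  */
    printf("Rotational latency: %.2f ms\n", latency_ms);  /* 4.17 ms */
    printf("Sector transfer:    %.1f us\n", transfer_us); /* 16.7 us */
    return 0;
}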
(Figures: the organization of data on the platters. Each surface is
divided into concentric tracks, the tracks are divided into sectors
separated by intersector gaps and from each other by intertrack gaps,
and the corresponding tracks of all of the platters form a cylinder.)
The problem with doing this is that as the read-write head moves to
the outer tracks, the rate at which the bits pass under the head increases
dramatically over that for the smallest track. This contrasts with the
fixed number of bits per track which has the same data rate regardless
of the position of the read-write head. This means that the hard drive
with the equal sized bits requires a more complex controller.
Regardless of how the bits are arranged on the platters, the number
of bits per sector must remain constant for all tracks. Since partial
sectors are not allowed, additional bits cannot be added to tracks further
from the spindle until a full sector's worth of bits can be added. This
creates "zones" where groups of neighboring tracks have the same
number of sectors, and therefore, the same number of bits. This method
is called Zone Bit Recording (ZBR). Figure 13-12 compares CAV with
ZBR.
The reason the SRAM of the cache needs to be small is that larger
address decoder circuits are slower than small address decoder circuits.
The larger the memory is, the more complex the address decoder
circuit. The more complex the address decoder circuit is, the longer it
takes to select a memory location based on the address it received.
Therefore, making a memory smaller makes it faster.
It is possible to take this concept a step further by placing an even
smaller SRAM between the cache and the processor thereby creating
two levels of cache. This new cache is typically contained inside of the
processor. By placing the new cache inside the processor, the wires that
connect the two become very short, and the interface circuitry becomes
more closely integrated with that of the processor. Both of these
conditions along with the smaller decoder circuit result in even faster
data access. When two caches are present, the one inside the processor
is referred to as a level 1 or L1 cache while the one between the L1
cache and memory is referred to as a level 2 or L2 cache.
(Figures: a processor containing an L1 cache connected through an
external L2 cache to the main memory (DRAM), and a processor with
separate code and data caches, each connected to the main memory
(DRAM).)
Example
How many blocks of 8 words are there in a 1 Gig memory space?
Solution
Eight words require three bits to uniquely identify their position
within a block. Therefore, the last three bits of the address represent the
word's offset into the block. Since a 1 Gig (2^30) address space uses 30
address lines, there are 30 – 3 = 27 remaining bits in the address. These
bits are used to identify the block, so the memory space contains
2^27 = 128 Meg blocks. The logical organization of the address is
therefore a 27-bit block id followed by a 3-bit word offset.
(Figure: direct mapping of main memory blocks to the lines of a 512
line cache. Each line holds a tag and the block stored for that tag;
blocks 0, 512, 1024, 1536, and so on map to line 0, blocks 1, 513, 1025,
1537, and so on map to line 1, and so forth through line 511.)
As with locating a word within a block, bits are taken from the main
memory address to uniquely define the line in the cache where a block
should be stored. For example, if a cache has 2^9 = 512 lines, then a line
would need 9 bits to be uniquely identified. Therefore, the nine bits of
the address immediately to the left of the word identification bits would
identify the line in the cache where the block is to be stored. The bits of
the address not used for the word offset or the cache line would be used
for the tag. Figure 13-20 presents this partitioning of the bits.
Once the block is stored in the line of the cache, the tag is copied to
the tag location of the line. From the cache line number, the tag, and the
word position within the block, the original address of the word can be
reconstructed.
Example
Assume a cache system has been designed such that each block
contains 4 words and the cache has 1024 lines, i.e., the cache can store
up to 1024 blocks. What line of the cache is supposed to hold the block
that contains the word from the twenty-bit address 3A456₁₆? In
addition, what is the tag number that will be stored with the block?
Solution
Start by dividing the address into its word id, line id, and tag bits.
Since 4 = 2^2, then the two least significant bits identify the word, i.e.,
w = 2. Since the cache has 1024 = 2^10 lines, then the next 10 bits identify
the line number where the data is supposed to be stored in the cache,
i.e., l = 10. The remaining t = 20 – w – l = 8 bits are the tag bits. This
partitions the address 3A456₁₆ = 0011 1010 0100 0101 0110₂ as follows:

    00111010      0100010101      10
    tag bits      line id bits    word id bits
Therefore, the block from address 3A454₁₆ to 3A457₁₆ will be stored
in line 0100010101₂ = 277₁₀ of the cache with the tag 00111010₂.
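The partitioning in this example can be carried out with shifts and
masks. The following C sketch is ours, not the author's; it reproduces
the word id, line id, and tag calculated above.

#include <stdio.h>

/* Partition a 20-bit address for a direct mapped cache with 4 words per
   block (w = 2 bits) and 1024 lines (l = 10 bits); the rest is the tag. */
int main(void)
{
    unsigned long address = 0x3A456;   /* twenty-bit address         */
    int w = 2;                         /* word id bits (4 = 2^2)     */
    int l = 10;                        /* line id bits (1024 = 2^10) */

    unsigned long word = address & ((1UL << w) - 1);
    unsigned long line = (address >> w) & ((1UL << l) - 1);
    unsigned long tag  = address >> (w + l);

    printf("word = %lu, line = %lu, tag = %02lX\n", word, line, tag);
    /* prints: word = 2, line = 277, tag = 3A */
    return 0;
}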
Example
The first 10 lines of a 256 line cache are shown in the table below.
Identify the address of the data that is shaded (D8₁₆). For this cache, a
block contains 4 words. The tags are given in binary in the table.
Solution
Start by finding the number of bits that represent each part of the
address, i.e., the word id, the line id, and the tag. From the table, we can
see that 2 bits represent the positions of each of the four words in a
block and that 6 bits are used to represent the tag.
Since the cache has 256 = 2^8 lines, then the line number in the cache
is represented with 8 bits, and the 16-bit address is partitioned into 6
tag bits, 8 line id bits, and 2 word id bits.
The shaded cell in the table has a tag number of 110011₂. The line
number is 4, which in 8 bit binary is 00000100₂. Last of all, the word is
in the third column which means that it is the 10₂ word within the
block. (Remember to start counting from 00₂.) Putting the tag, line id,
and word id bits together gives us:
    110011      00000100      10
    tag bits    line id bits  word id bits
Therefore, the address that the shaded cell containing D8₁₆ came from
is 1100 1100 0001 0010₂ = CC12₁₆.
Example
Using the table from the previous example, determine if the data
stored in main memory at address 101C₁₆ is contained in this cache,
and if it is, retrieve the data.
Solution
Converting 101C₁₆ to binary gives us 0001 0000 0001 1100₂. By
using the breakdown of bits for the tag, line id, and word id, the binary
value can be divided into its components.
    000100      00000111      00
    tag bits    line id bits  word id bits
From this we see that the line in the cache where this data should be
stored is 00000111₂ = 7₁₀. The tag currently stored in this line is
000100₂ which equals the tag from the above partitioned address.
Therefore, the data from main memory address 101C₁₆ is stored in this
cache. If the stored tag did not match the tag pulled from the address,
we would have known that the cache did not contain our address.
Lastly, we can find the data by looking at the offset 00₂ into the
block at line 7. This gives us the value FE₁₆.
Example
Using the same table from the previous two examples, determine if
the data from address 9827₁₆ is in the cache.
Solution
Converting the hexadecimal address 9827₁₆ to binary gives us
9827₁₆ = 1001 1000 0010 0111₂. By using the breakdown of bits for the
tag, line id, and word id, we can divide this value into its components.
    100110      00001001      11
    tag bits    line id bits  word id bits
From this we see that the tag is 100110₂, the line number is
00001001₂ = 9₁₀, and the word offset into the block is 11₂. Looking at
line number 9 we see that the tag stored there equals 101000₂. Since
this does not equal 100110₂, the data from that address is not contained
in this cache, and we will have to get it from the main memory.
(Figure: with fully associative mapping, the address is divided into a
tag of t bits and w bits identifying the word offset into the block.)
• Least Recently Used (LRU) – This method replaces the block that
hasn't been read by the processor in the longest period of time.
• First In First Out (FIFO) – This method replaces the block that
has been in cache the longest.
• Least Frequently Used (LFU) – This method replaces the block
which has had fewest hits since being loaded into the cache.
• Random – This method randomly selects a block to be replaced. It
has only slightly lower performance than LRU, FIFO, or LFU.
Example
The table below represents five lines from a cache that uses fully
associative mapping with a block size of eight. Identify the address of
the shaded data (C9₁₆).
Solution
The tag for C9₁₆ is 0100011010101₂. When combining this with the
word id of 001₂, the address in main memory from which C9₁₆ was
retrieved is 0100011010101001₂ = 46A9₁₆.
Example
Is the data from memory address 1E65₁₆ contained in the table from
the previous example?
Solution
For this cache, the last three bits identify the word and the rest of the
bits act as the tag. Since 1E65₁₆ = 0001111001100101₂, then 101₂ is the
word id and 0001111001100₂ is the tag. Scanning the rows shows that
the fourth row contains this tag, and therefore the table contains the
data in which we are interested. The word identified by 101₂ is 9E₁₆.
Tag bits    Set ID bits    Word ID bits
18 bits     9 bits         3 bits        Direct mapping (1 line/set)
19 bits     8 bits         3 bits        2-way set associative (2^1 lines/set)
20 bits     7 bits         3 bits        4-way set associative (2^2 lines/set)
21 bits     6 bits         3 bits        8-way set associative (2^3 lines/set)
Example
Identify the set number where the block containing the address
29ABCDE8₁₆ will be stored. In addition, identify the tag and the lower
and upper addresses of the block. Assume the cache is a 4-way set
associative cache with 4K lines, each block containing 16 words, with
a main memory having a 1 Gig memory space.
Solution
First, we need to identify the partitioning of the bits in the memory
address. A 1 Gig memory space requires 30 address lines. Four of those
address lines will be used to identify one out of the 16 words within the
block. Since the cache is a 4-way set associative cache, the number of
sets equals 4K lines divided by four lines per set, i.e., 1K = 2^10.
Therefore, ten address lines will be needed to identify the set, and the
remaining 30 – 4 – 10 = 16 bits of the address make up the tag.
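The remaining values can be computed directly from this partitioning.
The C sketch below is ours and is not part of the original solution; it
derives the set number, the tag, and the lower and upper addresses of
the block.

#include <stdio.h>

/* Partition the 30-bit address for a 4-way set associative cache with 4K
   lines: 16 words per block (4 offset bits) and 1K sets (10 set bits). */
int main(void)
{
    unsigned long address = 0x29ABCDE8;  /* 30-bit full address      */
    int word_bits = 4;                   /* 16 = 2^4 words per block */
    int set_bits  = 10;                  /* 1K = 2^10 sets           */

    unsigned long set = (address >> word_bits) & ((1UL << set_bits) - 1);
    unsigned long tag = address >> (word_bits + set_bits);
    unsigned long block_low  = address & ~((1UL << word_bits) - 1);
    unsigned long block_high = address | ((1UL << word_bits) - 1);

    printf("set = %lu, tag = %lX\n", set, tag);
    printf("block range = %lX to %lX\n", block_low, block_high);
    /* prints: set = 222, tag = A6AF, block range = 29ABCDE0 to 29ABCDEF */
    return 0;
}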
13.5 Registers
At the top of the memory hierarchy is a set of memory cells called
registers. A register is a group of latches that have been combined in
order to perform a special purpose. This group of latches may be used
to store an integer, store an address pointing to memory, configure an
I/O device, or indicate the status of a process. Whatever the purpose of
the register is, all of the bits are treated as a unit.
Registers are contained inside the processor and are integrated with
the circuitry used to perform the processor's internal operations. This
integration places registers within millionths of a meter of the action
resulting in very quick access times. In addition, the typical processor
contains fewer than a hundred registers making decoding very simple
and very fast. These two features combine to make registers by far the
fastest memory unit in the memory hierarchy.
Because of the integral part they play in computer architecture, the
details and applications of registers are presented in Chapter 15.
Problems
1. Why is it important for hard drive substrates to be rigid?
2. Why is it important for hard drive substrates to be lightweight?
3. What is the advantage of a Winchester head, and how is it
achieved?
4. Sketch the pattern of magnetic polarity found using the RLL
encoding of Figure 13-5 for the bit pattern 0110100110100110101.
5. Calculate the amount of time it would take to read a 2 Mbyte file
from a 15,000 RPM drive with a typical 4 ms seek time that has
500 sectors per track each of which contains 512 bytes. Assume
the file is stored sequentially and take into account the delays
incurred each time the drive must switch tracks.
6. Repeat the previous problem assuming the sectors of the file are
scattered randomly across the tracks of the platters.
7. How many blocks of 16 words are there in a 256 Gig memory
space? Draw the logical organization of the full address
identifying the block ID portion and the word offset portion.
8. Identify the line number, tag, and word position for each of the 30-
bit addresses shown below if they are stored in a cache using the
direct mapping method.
a.) Address: 23D94EA6₁₆   Lines in cache: 4K    Block size: 2
b.) Address: 1A54387F₁₆   Lines in cache: 8K    Block size: 4
c.) Address: 3FE9704A₁₆   Lines in cache: 16K   Block size: 16
d.) Address: 54381A5₁₆    Lines in cache: 1K    Block size: 8
13. Using the table from the previous problem, identify the data value
represented by each of the following addresses.
a.) 76359₁₆        b.) 386AF₁₆        c.) BC5CC₁₆
14. Identify the set number, tag, and word position for each of the 30-
bit addresses stored in an 8K line set associative cache.
a.) Address: 23D94EA6₁₆   2-way cache   Block size: 2
b.) Address: 1A54387F₁₆   2-way cache   Block size: 4
c.) Address: 3FE9704A₁₆   8-way cache   Block size: 16
d.) Address: 54381A5₁₆    4-way cache   Block size: 8
15. Using the C declarations below of a simulated 256 line cache and a
64K memory, create two functions. The first function, bool
requestMemoryAddress(unsigned int address), takes as its
parameter a 16-bit value and checks to see if it exists in the cache.
If it does, simply return a value of TRUE. If it doesn't, load the
appropriate line of the cache with the requested block from
memory[] and return a FALSE. The second function, unsigned int
getPercentageOfHits(void), should return an integer from 0 to 100
representing the percentage of successful hits in the cache.
typedef struct {
    int tag;               /* tag identifying which block is in this line */
    char block[4];         /* the four words (bytes) of the cached block  */
} cache_line;

cache_line cache[256];     /* the simulated 256 line cache */
char memory[65536];        /* the simulated 64K memory     */
CHAPTER FOURTEEN
Serial Protocol Basics
for the extra wires. At the system level, i.e., for signals going outside of
the computer, each additional data line requires an additional wire in
the connecting cable. This results in:
The data may be either the raw data sent from one device to another
or it may be an encapsulation of another packet. The datalink layer is
not concerned with the pattern of bits within the data portion of the
packet. This is to be interpreted at a higher layer in the network model.
The frame's only purpose is to get data successfully from one device to
another within a network.
By embedding packets from upper layers of the OSI network model
within packets or frames from lower layers, the implementations of the
different layers can be swapped as needed. As long as a packet from the
network layer gets from one logical address to another, it doesn't matter
whether it was carried on a datalink layer implemented with Ethernet,
dial-up, or carrier pigeon for that matter. The practice of embedding
packets and frames is called a protocol stack.
There are a number of examples of a packet being encapsulated
within the packet of another frame. Once again, this is the result of the
implementation of the layers of the OSI network model. For example, a
typical way to send a web page from a web server to a client is to begin
by partitioning the file into smaller packets that are ordered, verified,
and acknowledged using the transmission control protocol (TCP).
These TCP packets are then encapsulated into network layer packets
used to transport the message from one host to another. A common
protocol for this is the internet protocol (IP). The network layer
packets must then be encapsulated into packets used to transfer the data
across a specific network such as Ethernet. This places the TCP/IP
packets inside the packet of an Ethernet frame. Figure 14-1 shows how
the TCP packet is embedded within the IP packet which is in turn
embedded into the Ethernet frame.
(Figure 14-1 Sample Protocol Stack using TCP, IP, and Ethernet: the
TCP packet is carried in the data field of the network layer's IP packet,
which follows the IP header, and the IP packet is in turn carried in the
data field of the datalink layer's Ethernet frame.)
The final part of the frame, the trailer, usually serves two purposes.
The first is to provide error detection to verify that a frame has not been
corrupted during transmission. Typically, this is a CRC using a
predefined polynomial. (See Chapter 9) A second purpose of the trailer
may be to include a special bit sequence defining the end of the frame.
The frame starts with the preamble, the start delimiter, the
destination and source addresses, and the length. The preamble and
start delimiter are used to tell the receiving devices when the message
is going to start and to provide synchronization for the receive circuitry.
The preamble is 7 bytes (56 bits) of alternating ones and zeros
starting with a one. This bit pattern creates a "square wave" which acts
like a clock ensuring that all of the receiving devices are synchronized
and will read the bits of the frame at the same point in each bit time.
The preamble is immediately followed by a single byte equal to
10101011₂ called the start delimiter. The first seven bits of the start
delimiter continue the square wave pattern set up by the preamble. The
last bit, which follows a one in the bit sequence, is also equal to a one.
This pair of ones indicates to the receiving devices that the next portion
of the frame, i.e., the destination address, will start at the next bit time.
The source and destination addresses come next in the frame. These
are each 6 bytes long and they identify the hardware involved in the
message transaction. It is important not to confuse these addresses with
IP addresses which are assigned to a computer by a network
administrator. The Ethernet addresses are hardwired into the physical
hardware of the network interface card (NIC) and are unique to each
device. They are referred to as Medium Access Control (MAC)
addresses and are loaded into the NIC by the manufacturer. They can
not be modified. The first three bytes of the MAC address identify the
manufacturer. If the destination address is all ones, then the message is
meant to be a broadcast and all devices are to receive it.
The next field in the frame is the 2-byte length field. The value in
this field represents the number of data bytes in the data field. With two
bytes, it is possible to represent a value from 0 to 65,535. The
definition of the IEEE 802.3 Ethernet, however, specifies that only
values from 0 to 1500 are allowed in this field. Since 1500₁₀ =
10111011100₂, a value which uses only 11 bits, 5 bits are left over for
other purposes. Ethernet uses these bits for special features.
The length field is followed by the data field which contains the
transmitted data. Although the number of data bytes is identified by the
two bytes in the length field, the definition of IEEE 802.3 Ethernet
requires that the minimum length of the data field be 46 bytes. This
ensures that the shortest Ethernet frame, not including the preamble and
start delimiter, is 64 bytes long. If fewer than 46 bytes are being sent,
additional bytes are used as padding to expand this field to 46 bytes.
These filler bytes are added after the valid data bytes. The value in the
length field represents only the valid data bytes, not the padding bytes.
The trailer of the Ethernet frame contains only an error detection
checksum. This checksum is a 4-byte CRC. The polynomial for this
CRC has bits 32, 26, 23, 22, 16, 12, 11, 10, 8, 7, 5, 4, 2, 1, and 0 set
resulting in the 33-bit value 100000100110000010001110110110111₂.
Remember from Chapter 9 that the polynomial for a CRC is one bit
longer than the final CRC checksum.
Since any device can transmit data at any time, the Ethernet network
must have a method for resolving when two devices attempt to send
data at the same time. Simultaneous transmissions are called collisions,
and each device, even the ones that are transmitting data, can detect
when a collision occurs. It is assumed that no message survives a
collision and both devices are required to retransmit.
In the event of a collision, an algorithm is executed at each of the
devices that attempted to transmit data. The outcome of this algorithm
is a pseudo-random number specifying how long the device is to wait
before attempting to retransmit. The random numbers should be
different making it so that one of the devices will begin transmitting
before the other forcing the second device to wait until the end of the
first device's transmission before sending its own message.
(Figure 14-3: the layout of the IP header, drawn 8 bits (one byte) to a
row, including the version, type of service, identification, fragment
offset, protocol, source address, and options fields.)
The 8-bit field following the length field identifies the type of
service. In large networks, there are usually many paths that a packet
can take between two hosts. Each of these paths may have different
degrees of speed, reliability, security, and so forth. The type of service
field is used to specify the type of link across which a message is to be
sent.
The next two bytes define the total length of the message. It is
calculated by adding the number of bytes in the IP header to the
number of bytes contained in the packet that follows it. This value does
not count any bytes from the frame in which the IP packet may be
contained.
Sixteen bits would suggest that an IP packet can be up to
65,535 bytes in length. Note, however, that many networks will not
allow packets of this size. For example, for an IP packet to be contained
within an Ethernet frame, the packet can be at most 1500 bytes long.
The two bytes following the total length field are referred to as the
identification field. They are used by the destination host in order to
group the fragments contained in multiple IP packets so that they can
be reassembled into a single datagram.
The next byte is divided into two parts. The first three bits represent
the flag field while the last five bits are coupled with the following byte
to create the fragment offset field. The first bit of the flag field is
reserved. The second bit indicates whether the datagram may be
partitioned into fragments. The last bit is used to identify the current
packet as the last fragment.
When a datagram has been partitioned into fragments, a mechanism
must be in place to reorder the fragments. This is due to the nature of
large networks in which different paths of varying duration can be
taken by different packets. This makes it impossible for the receiving
device to determine the order in which the packets were sent merely by
relying on the order in which they were received.
The fragment offset field contains thirteen bits which are used to
identify the starting position of a fragment within the full datagram.
Because the partitioning of datagrams must occur on 64-bit boundaries,
the value in the fragment offset field is multiplied by eight to determine
the fragment's offset in bytes. An offset of zero identifies the fragment
as the first fragment within the datagram.
It is possible for a packet to become lost or undeliverable within a
large network. Therefore, each packet is sent with a field identifying
how long the packet is allowed to remain on the network. This field is
referred to as the time to live field. Every time the packet encounters a
module on the network, the value in the time to live field is
decremented. If the value in this field is decremented to zero before
reaching its destination, the packet is discarded.
The next eight bits of the IP header identify the protocol of the
packet contained in the data field.
In order to verify the integrity of the IP header, a sixteen bit header
checksum is calculated and inserted as the next field. IP uses the one's
complement of the one's complement datasum discussed in Chapter 9.
Remember that a checksum was identified as being less reliable than a
CRC. The one's complement checksum, however, has proven adequate
for use with IP headers.
Since the checksum is part of the IP header, this field must be filled
with zeros during the checksum calculation to avoid a recursive
condition. In addition, since the time to live field changes as the packet
is routed across the network, the header checksum will need to be
recalculated as it traverses the network.
(Figure: the layout of the TCP packet header, including the source
port, destination port, sequence number, acknowledgement number,
control bits, checksum, and options and padding fields.)
The first two of the ten fields of the TCP packet header identify the
service that sent the message (source port) and the service that is to
receive it (destination port) respectively.
The next field is a 32 bit sequence number. The sequence number
identifies the position of each packet within a block of data using its offset
with respect to the beginning of the block. The receiver uses sequence
numbers to order the received packets. If packets are received out of
order, the sequence number indicates how many bytes to reserve for the
missing packet(s). Both the sender and the receiver keep track of the
sequence numbers in order to maintain the integrity of the data block.
(Figure: the control bits field of the TCP header; the remaining bits of
the field are reserved.)
FIN – Identifies last packet
SYN – Used to synchronize sequence numbers
RST – Used to reset connection
PSH – Request to sender to send all packets
ACK – Indicates use of acknowledgement scheme
URG – Indicates that urgent pointer field is valid
The next field, the window field, is used by the device requesting
data to indicate how much data it or its network has the capacity to
receive. A number of things affect the value in this field including the
amount of available buffer space the receiver has or the available
network bandwidth.
A sixteen bit checksum field is next. Like IP, it contains the one's
complement of the one's complement datasum. The difference is that
the sum is computed across three groups of bytes:
(Figure: the bytes included in the TCP checksum. In addition to the
TCP header and its data, the sum covers values taken from the IP
header, among them the protocol field and the destination address.)
One of the flags in the control block is the urgent flag (URG). By
setting this flag to a 1, the sending device is indicating that the data
block contains urgent data. When this happens, the sixteen bit field
following the checksum is used to identify where that data is contained
within the data block. This field is referred to as the urgent pointer
field. The value contained in this field is added to the sequence number
to identify the position of the urgent data within the packet.
As with the IP protocol, some packets have special requirements.
These requirements are identified in a variable length list of options.
These options occur in the header immediately after the urgent pointer
field. The options identify packet requirements such as maximum receive
segment size.
Because the option field is variable in length, an additional field
referred to as the padding field must be added to ensure the length of
the header is a multiple of four bytes. (Remember that the data offset
field identifies the length of the header using integer multiples of 32
bits.) The padding field appends zeros to the header after the options
field to do this.
offset data
0000: 00 04 76 48 35 AD 00 B0 D0 C1 6B 31 08 53 45 00
0010: 00 53 6D F4 40 00 80 06 CC 3C C5 A8 1A 8C C5 A8
0020: 1A 97 17 0C 0D BE DE B1 57 C5 79 59 3E D4 50 18
0030: 42 18 B6 3E 00 00 00 B4 00 30 00 22 00 0E 00 00
0040: 00 05 1A 99 D6 04 DA DE 00 07 FC FF 20 DD 00 00
0050: 08 00 DA DE 09 04 02 FC FF 0E 00 00 FC FF 01 00
0060: 0D
Note that the captured data does not include the preamble, start
delimiter, or CRC of the Ethernet frame. In general, this information is
used by the network interface card for synchronization of the
electronics and error checking, but is not made available to the user.
Therefore, the frame shown above starts with the destination and source
MAC addresses of the Ethernet frame and ends with its data field.
From Figure 14-2, we see that the first six bytes after the start
delimiter represent the destination address. Therefore, the MAC
address of the destination card is 00:04:76:48:35:AD. Remember that
the first three bytes represent the manufacturer. The three bytes
00:04:76 represent 3Com® Corporation.
The next six bytes represent the source address, i.e., the device
sending the frame. In this case, the MAC address of the source is
00:B0:D0:C1:6B:31. 00:B0:D0 identifies the card as a NIC from Dell®
Computer Corporation.
The next two bytes, 08:53, identify both the frame type and the
length of the frame. Converting this value to a sixteen-bit binary value
gives us 0853₁₆ = 0000100001010011₂. The most significant 5 bits
represent the type, and in this case (00001₂) it is an IP version 4 type.
The least significant 11 bits represent the length of the data in the
frame. In this case, 00001010011₂ = 83₁₀, indicating that the data field
of the Ethernet frame contains 83 bytes.
Immediately after the length field of the Ethernet frame is the start
of the IP header. By using Figure 14-3 as a reference, the details of the
IP packet can also be revealed.
The first four bits identify the IP version being used. In this case,
the first four bits of the byte 45₁₆ equal 4, i.e., IP version 4. The next
four bits in this byte equal 5. Multiplying this value by four gives us the
length of the IP header: 20 bytes.
Next comes one byte identifying the type of service, i.e., the special
requirements of this packet. Zeros in this field indicate that this packet
has no special needs.
The next two bytes identify the total length of the IP packet, i.e., the
number of bytes in the IP header plus the number of bytes of data. A
53₁₆ indicates that header and data together total 83 bytes. Subsequent
fields are:
To verify the checksum, divide the IP header into words (byte pairs),
and add the words together. Doing this for our message gives us:
4500
0053
6DF4
4000
8006
CC3C
C5A8
1A8C
C5A8
+1A97
3FFFC

Adding the carries (the 3 in the most significant digit) back into the
lower sixteen bits gives FFFC₁₆ + 3₁₆ = FFFF₁₆. A one's complement
sum of all ones verifies that the IP header was received without error.
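The same verification can be written as a short C routine. The sketch
below is ours; it adds the header words, folds the carries back into the
lower sixteen bits, and checks for a result of all ones.

#include <stdio.h>

/* One's complement verification of the IP header words listed above. */
int main(void)
{
    unsigned int words[] = { 0x4500, 0x0053, 0x6DF4, 0x4000, 0x8006,
                             0xCC3C, 0xC5A8, 0x1A8C, 0xC5A8, 0x1A97 };
    unsigned long sum = 0;

    for (int i = 0; i < 10; i++)
        sum += words[i];                     /* 3FFFC before the carries */

    while (sum > 0xFFFF)
        sum = (sum & 0xFFFF) + (sum >> 16);  /* add the carries back in  */

    printf("One's complement sum = %lX\n", sum);  /* FFFF: header is good */
    return 0;
}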
two RFCs used for this chapter are RFC 791 (the standard for IP) and
RFC 793 (the standard for TCP). These can be found on the web at the
following locations:
In the discussion of the Ethernet frame, it was shown how the first
three bytes of the MAC address identify the manufacturer. A list of
these manufacturer codes, called the Organization Unique Identifiers
(OUI), is also maintained by the Institute of Electrical and Electronics
Engineers (IEEE). It can be found at:
Problems
1. List the two primary causes for reduced reliability in a parallel
communication scheme.
2. In the IEEE 802.3 Ethernet frame format, what is the purpose of
the 7 byte preamble of alternating ones and zeros sent at the
beginning of the frame?
3. What is the binary value of the start delimiter of the IEEE 802.3
Ethernet frame?
4. True or false: The two-byte length field of the IEEE 802.3
Ethernet frame contains the length of the entire message including
preamble and start delimiter.
5. What are the minimum and maximum values that can be contained
in the length field of an IEEE 802.3 Ethernet frame?
6. What are the minimum and maximum lengths of the data field of
an IEEE 802.3 Ethernet frame?
15.2 Components
Before going into detail on how the processor operates, we need to
discuss some of its sub-assemblies. The following sections discuss
some of the general components upon which the processor is built.
15.2.1 Bus
As shown in Chapter 12, a bus is a bundle of wires grouped together
to serve a single purpose. The main application of the bus is to transfer
data from one device to another. The processor's interface to the bus
includes connections used to pass data, connections to represent the
address in which the processor is interested, and control lines to
manage and synchronize the transaction. These lines are "daisy-
chained" from one device to the next.
The concept of a bus is repeated here because the memory bus is not
the only bus used by the processor. There are internal buses that the
processor uses to move data, instructions, configuration, and status
between its subsystems. They typically use the same number of data
lines found in the memory bus, but the addressing is usually simpler.
This is because there are only a handful of devices between which the
data is passed.
In this chapter we will introduce new control lines that go beyond
the read control, write control, and timing signals discussed in Chapter
12. These new lines are needed by the processor in order to service
external devices and include interrupt and device status lines.
15.2.2 Registers
As stated when they were introduced in Chapter 13, a register stores
a binary value using a group of latches. For example, if the processor
wishes to add two integers, it may place one of the integers in a register
labeled A and the second in a register labeled B. The contents of the
latches can then be added by connecting their Q outputs to the addition
circuitry described in Chapter 8. The output of the addition circuitry is
then directed to another register in order to store the result. Typically,
this third register is one of the original two registers, e.g., A = A + B.
Although variables and pointers used in a program are all stored in
memory, they are moved to registers during periods in which they are
the focus of operation. This is so that they can be manipulated quickly.
Once the processor shifts its focus, it stores the values it doesn't need
any longer back in memory.
The individual bit positions of the register are identified by the
power of two that the position represents as an integer. In other words,
the least significant bit is bit 0, the next position to the left is bit 1, the
next is bit 2, and so on.
For the purpose of our discussion, registers may be used for one of
four types of operations.
15.2.3 Flags
Picture the instrumentation on the dash board of a car. Beside the
speedometer, tachometer, fuel gauge, and such are a number of lights
unofficially referred to as "idiot lights". Each of these lights has a
unique purpose. One comes on when the fuel is low; another indicates
when the high beams are on; a third warns the driver of low coolant.
There are many more lights, and depending on the type of car you
drive, some lights may even replace a gauge such as oil pressure.
How is this analogous to the processor's operation? There are a
number of indicators that reveal the processor's status much like the
car's idiot lights. Most of these indicators represent the results of the
last operation. For example, the addition of two numbers might produce
a negative sign, an erroneous overflow, a carry, or a value of zero.
Well, that would be four idiot lights: sign, overflow, carry, and zero.
These indicators, otherwise known as flags, are each represented
with a single bit. Going back to our example, if the result of an addition
is negative, the sign flag would equal 1. If the result was not a negative
number, (zero or greater than zero) the sign flag would equal 0.
For the sake of organization, these flags are grouped together into a
single register called the flags register or the processor status register.
Since the values contained in its bits are typically based on the outcome
of an arithmetic or logical operation, the flags register is connected to
the mathematical unit of the processor.
One of the primary uses of the flags is to remember the results of the
previous operation. It is the processor's short term memory. This
function is necessary for conditional branching, a function that allows
the processor to decide whether or not to execute a section of code
based on the results of a condition statement such as "if".
The piece of code shown in Figure 15-1 calls different functions
based on the relative values of var1 and var2, i.e., the flow of the
program changes depending on whether var1 equals var2, var1 is
greater than var2, or var1 is less than var2. So how does the processor
determine whether one variable is less than or greater than another?
if (var1 == var2)
    equalFunction();
else if (var1 > var2)
    greaterThanFunction();
else
    lessThanFunction();
15.2.4 Buffers
Rarely does a processor operate in isolation. Typically there are
multiple processors supporting the operation of the main processor.
These include video processors, the keyboard and mouse interface
processor, and the processors providing data from hard drives and
CDROMs. There are also processors to control communication
(Figure: a buffer or "memory queue" placed between processor A and
processor B. Instead of passing data directly to processor B, processor
A stores the data in the buffer, and processor B reads data from the
buffer as needed. The effects of unbalanced throughput are eased with
the buffer.)
back up where it left off when the subroutine is completed. The return
address is stored in this temporary memory.
The stack is a block of memory locations reserved to function as
temporary memory. It operates much like the stack of plates at the start
of a restaurant buffet line. When a plate is put on top of an existing
stack of plates, the plate that was on top is now hidden, one position
lower in the stack. It is not accessible until the top plate is removed.
The processor's stack works in the same way. When a processor puts
a piece of data, a plate, on the top of the stack, the data below it is
hidden and cannot be removed until the data above it is removed. This
type of buffer is referred to as a "last-in-first-out" or LIFO buffer.
There are two main operations that the processor can perform on the
stack: it can either store the value of a register to the top of the stack or
remove the top piece of data from the stack and place it in a register.
Storing data to the stack is referred to as "pushing" while removing the
top piece of data is called "pulling" or "popping".
The LIFO nature of the stack makes it so that applications must
remove data items in the opposite order from which they were placed
on the stack. For example, assume that a processor needs to store
values from registers A, B, and C onto the stack. If it pushes register A
first, B second, and C last, then to restore the registers it must pull in
order C, then B, then A.
Example
Assume registers A, B, and C of a processor contain 25, 83, and 74
respectively. If the processor pushes them onto the stack in the order A,
then B, then C then pulls them off the stack in the order B, then A, then
C, what values do the registers contain afterwards?
Solution
First, let's see what the stack looks like after the values from
registers A, B, and C have been pushed. The data from register A is
pushed first placing it at the bottom of the stack of three data items. B
is pushed next followed by C which sits at the top of the stack. In the
stack, there is no reference identifying which register each piece of data
came from.
(Figure: the registers and the stack after the pushes. Registers A, B,
and C still contain 25, 83, and 74. The stack holds 74 on top, 83
beneath it, and 25 on the bottom; before the pushes, the top of the stack
was just below the position now occupied by the 25.)
When the values are pulled from the stack, B is pulled first and it
receives the value from the top of the stack, i.e., 74. Next, A is pulled.
Since the 74 was removed and placed in B, A gets the next piece of
data, 83. Last, 25 is placed in register C.
(Figure: the registers and the stack after the pulls. Register A contains
83, register B contains 74, and register C contains 25. The top of the
stack has moved back to where it was before the pushes.)
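A stack of this sort is easy to model in C with an array and an index
acting as the stack pointer. The sketch below is ours; push and pull are
hypothetical helper names, not processor instructions, and the code
simply reproduces the register values from this example.

#include <stdio.h>

static int stack[8];
static int sp = 0;                 /* index of the next free position */

static void push(int value) { stack[sp++] = value; }
static int  pull(void)      { return stack[--sp]; }

int main(void)
{
    int A = 25, B = 83, C = 74;

    push(A); push(B); push(C);     /* pushed in the order A, B, C */

    B = pull();                    /* receives 74 */
    A = pull();                    /* receives 83 */
    C = pull();                    /* receives 25 */

    printf("A = %d, B = %d, C = %d\n", A, B, C);  /* A = 83, B = 74, C = 25 */
    return 0;
}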
(Figures: the processor connects to the outside world through its
DATA, ADDRESS, and CONTROL buses. Inside the processor, the
CPU with its registers and instruction decoder, the internal memory,
the configuration registers, and the I/O ports share an internal data bus;
a data buffer connects this bus to the external data bus, an address
latch drives the external address bus, and the I/O ports connect to
external devices.)
The first thing a compiler might do to create executable code for the
processor is to determine how it is going to use its internal registers. It
needs to decide which pieces of data require frequent and fast
operations and which pieces can be kept in the slower main memory.
First, the index i is accessed repeatedly throughout the block of
code, so the compiler would assign one of the data registers inside the
CPU to contain i. Depending on the size of the registers provided by
the CPU, it would only need to be an 8-bit register.
Second only to i in the frequency of their use are the values sum and
max. They too would be assigned to registers assuming that enough
registers existed in the CPU to support three variables. Since sum and
max are defined as integers, they would need to be assigned to registers
equivalent to the size of an integer as defined for this CPU. In the
Pentium processor, this would be a 32-bit register.
The data contained in array would not be loaded into a register, at
least not all at once. First of all, each element of array is accessed only
once, and it isn't even modified during that access. Second, and more
important, only a few special application processors have enough
registers to hold 100 data elements.
There is one element of array that will be stored in a register, and
that is the pointer or address that identifies where array is stored in
memory. Each time the code needs to access an element of array, it
multiplies the index i by the size of an integer, then adds it to the base
address of array. This provides a pointer to the specific element of
array in which the CPU is interested.
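This address calculation is exactly what C's own array indexing
performs. The sketch below is ours; it computes the element address by
hand and confirms that it matches &array[i].

#include <stdio.h>

/* base address of array + (index * size of an integer) */
int main(void)
{
    int array[100];
    int i = 7;

    int *element = (int *)((char *)array + i * sizeof(int));

    printf("%s\n", (element == &array[i]) ? "same address" : "different");
    return 0;
}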
There are two things to notice about these steps. First, the steps are
very minimal. The instruction set that a CPU uses for its operation is
made from short, simple commands. The typical instruction for a CPU
involves either a single transaction of data (movement from a register
to a register, from memory to a register, or from a register to memory),
or a simple operation such as the addition of two registers.
The second thing to notice is that this simple sequence uses a two-
step process to handle program flow control. In section 15.2.3, it was
shown how a "virtual subtraction" is performed to compare two values.
This operation sets or clears the zero flag, the sign flag, the carry flag,
and the overflow flag depending on the relationship of the magnitude of
the two values. For our example, this virtual subtraction occurs in Step
5 where max is compared to the next value retrieved from array and in
Step 9 where i is compared to the constant 100.
Every compare is followed immediately by a conditional jump that
checks the flags to see if the flow of the program needs to be shifted to
a new address or if it can just continue to the next address in the
sequence. There are many more options for conditional jumps than
were presented in the processor flags section.
The only reason there are two different commands is to assist the
programmer by creating syntax that makes more sense linguistically.
Instruction       Machine code
LOADA VAR1        02 5E 00
LOADB VAR2        05 5E 01
ADDAB             08
STORA RESULT      01 5E 02

In memory, the instructions are stored sequentially as: 02 5E 00 05 5E 01 08 01 5E 02
skip over the instruction at 1008₁₆ and execute the CNSTB 5 at address 1009₁₆. In a high-level language, the code above might look like the following two instructions where the address of VAR is 123E₁₆.
• Register-to-register transfers
• Register-to-memory or port transfers
• Memory or port-to-register transfers
• Memory or port-to-memory or port transfers
instructions are used to assign new values to this register so that control
can jump to a new position in the program. Some of the program
control instructions use the CPU's flags to determine whether a jump in
the code will be performed or not. These are the conditional jumps
described earlier. The following is a short list of some of the major
program control instructions:
15.7 Big-Endian/Little-Endian
In the previous section, some of the operands were 16-bits in length
and had to be broken into 8-bit values in order to be stored in memory.
It is not much of a problem to store numbers larger than the width of
the data bus in memory. By partitioning the value to be stored into
chunks that are the size of the data bus, the processor simply uses
sequential memory locations to store large values. For example, if a
processor with an 8-bit data bus needs to store the 32-bit value
3A2B48CA₁₆, it uses four memory locations: one to store 3A₁₆, one for 2B₁₆, one for 48₁₆, and one for CA₁₆. When it retrieves the data, it reads
all four values and reconstructs the data in one of its registers. The
processor designer must ensure that the order in which the smaller
chunks are stored remains consistent for both reading and writing, or
the value will become corrupted. This should not be a problem.
It can become a problem, however, when data is being transferred
between processors that use different orders. Big-endian and little-
endian are terms used to identify the order in which the smaller words
or bytes are stored. Big-endian means that the first byte or word stored
is the most significant byte or word. Little-endian means that the first
byte or word stored is the least significant byte or word. The method
selected does not affect the starting address, nor does it affect the
ordering of items in a data structure.
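The two byte orders can be sketched in C. The fragment below is only an illustration of the ordering, not code from the text; it splits the 32-bit value from the example above into four bytes both ways:

#include <stdio.h>

int main()
{
    unsigned long value = 0x3A2B48CA;   /* the 32-bit value from the example  */
    unsigned char memory[4];            /* four 8-bit memory locations        */
    int i;

    /* Big-endian: the most significant byte (3A) is stored first.            */
    memory[0] = (value >> 24) & 0xFF;
    memory[1] = (value >> 16) & 0xFF;
    memory[2] = (value >>  8) & 0xFF;
    memory[3] =  value        & 0xFF;
    printf("big-endian:    ");
    for (i = 0; i < 4; i++) printf("%02X ", (unsigned)memory[i]);  /* 3A 2B 48 CA */
    printf("\n");

    /* Little-endian: the least significant byte (CA) is stored first.        */
    memory[0] =  value        & 0xFF;
    memory[1] = (value >>  8) & 0xFF;
    memory[2] = (value >> 16) & 0xFF;
    memory[3] = (value >> 24) & 0xFF;
    printf("little-endian: ");
    for (i = 0; i < 4; i++) printf("%02X ", (unsigned)memory[i]);  /* CA 48 2B 3A */
    printf("\n");
    return 0;
}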
• The internal data bus and the instruction pointer perform the fetch.
• The instruction decoder performs the decode cycle.
• The ALU and CPU registers are responsible for the execute cycle.
Once the logic that controls the internal data bus is done fetching the
current instruction, what's to keep it from fetching the next instruction?
It may have to guess what the next instruction is, but if it guesses right,
then a new instruction will be available to the instruction decoder
immediately after it finishes decoding the previous one.
Once the instruction decoder has finished telling the ALU what to
do to execute the current instruction, what's to keep it from decoding
the next instruction while it's waiting for the ALU to finish? If the
internal data bus logic guessed right about what the next instruction is,
then the ALU won't have to wait for a fetch and subsequent decode in
order to execute the next instruction.
This process of creating a queue of fetched, decoded, and executed
instructions is called pipelining, and it is a common method for
improving the performance of a processor.
Figure 15-7 shows the time-line sequence of the execution of five
instructions on a non-pipelined processor. Notice how a full fetch-
decode-execute cycle must be performed on instruction 1 before
instruction 2 can be fetched. This sequential execution of instructions
allows for a very simple CPU hardware, but it leaves each portion of
the CPU idle for 2 out of every 3 cycles. During the fetch cycle, the
instruction decoder and ALU are idle; during the decode cycle, the bus
interface and the ALU are idle; and during the execute cycle, the bus
interface and the instruction decoder are idle.
Figure 15-8 on the other hand shows the time-line sequence for the
execution of five instructions using a pipelined processor. Once the bus
interface has fetched instruction 1 and passed it to the instruction
decoder for decoding, it can begin its fetch of instruction 2. Notice that
the first cycle in the figure only has the fetch operation. The second
cycle has both the fetch and the decode cycle happening at the same
time. By the third cycle, all three operations are happening in parallel.
(Figure 15-7: execution of five instructions on a non-pipelined processor — each instruction's fetch (F), decode (D), and execute (E) cycles complete before the next instruction begins, for a total of 15 cycles)
(Figure 15-8: execution of five instructions on a pipelined processor — the F, D, and E cycles of consecutive instructions overlap, for a total of 7 cycles)
For the pipelined architecture, it takes two cycles to "fill the pipe" so
that all three CPU components are fully occupied. Once this occurs, an instruction finishes executing on every cycle.
Example
Compare the number of cycles required to execute 50 instructions
between a non-pipelined processor and a pipelined processor.
Solution
Using equations 15.1 and 15.2, we can determine the number of
cycles necessary for both the non-pipelined and the pipelined CPUs.
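Equations 15.1 and 15.2 are not reproduced on this page, but the 15- and 7-cycle figures above are consistent with a cost of 3 × n cycles for the non-pipelined processor and n + 2 cycles for the pipelined one. Under that assumption, a quick calculation for 50 instructions gives 150 cycles versus 52 cycles:

#include <stdio.h>

int main()
{
    int n = 50;                    /* number of instructions                   */

    /* Assumed forms of equations 15.1 and 15.2: three cycles per instruction
       when not pipelined; two cycles to fill the pipe, then one instruction
       completed per cycle when pipelined.                                     */
    int non_pipelined = 3 * n;     /* 150 cycles                               */
    int pipelined     = n + 2;     /*  52 cycles                               */

    printf("non-pipelined: %d cycles\n", non_pipelined);
    printf("pipelined:     %d cycles\n", pipelined);
    return 0;
}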
logic needs to load the instruction pointer with the two bytes following
the JMP = 11₁₆ machine code to point to the next instruction to fetch.
There is one group of instructions for which there is no method to
reliably predict where to find the next instruction in memory:
conditional jumps. For our mock processor, this group of instructions
includes "Jump if equal" (JEQU), "Jump if first value is greater than
second value" (JGT), and "Jump if first value is less than second value"
(JLT). Each of these instructions has two possible outcomes: either
control is passed to the next instruction or the processor jumps to a new
address. The decision, however, cannot be made until after the
instruction is executed, the last cycle of the sequence. This is because
the flags from the previous instruction must be evaluated before the
processor knows which address to load into the instruction pointer.
There are a number of methods used to predict what the next
instruction will be, but if this prediction fails, the pipeline must be
flushed of all instructions fetched after the conditional jump. The bus
interface logic then starts with a new fetch from the address determined
by the execution of the conditional jump. Each time the pipeline is
flushed, two cycles are added to the execution time of the code.
A0 R W Function
0 0 1 Reading from device's status register
1 0 1 Reading from device's data register
0 1 0 Writing to device's configuration register
1 1 0 Writing to device's data register
X 1 1 No data transaction
By using the remaining address lines for the chip select, this I/O
device can be inserted into the memory map of the processor using the
processor's memory bus. This method of interfacing an I/O device to a
processor is called memory mapping. Figure 15-9 shows a basic
memory mapped device circuit that uses four addresses.
(Figure 15-9: a memory-mapped I/O device — one or two low-order address lines (A0, A1) select among the device's internal registers, data lines D0 through D7 pass the data, the majority of the address lines are decoded to form the chip select, and R and W are the read and write controls)
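In software, a memory-mapped register is read and written like any other memory location. The sketch below is only an illustration; the base address is invented, the register layout follows the A0 selection shown in the table above, and the volatile keyword keeps the compiler from optimizing the accesses away:

#include <stdint.h>

/* Hypothetical base address; on a real system it comes from the hardware design. */
#define DEVICE_BASE ((uintptr_t)0x8000)

/* A0 = 0: reads return the status register, writes go to the configuration
   register.  A0 = 1: reads and writes use the data register.                     */
#define STATUS_OR_CONFIG (*(volatile uint8_t *)(DEVICE_BASE + 0))
#define DATA_REG         (*(volatile uint8_t *)(DEVICE_BASE + 1))

uint8_t read_status(void)
{
    /* An ordinary load from this address becomes a read bus cycle that the
       chip select logic steers to the device instead of to a memory device.      */
    return STATUS_OR_CONFIG;
}

void write_config(uint8_t value)
{
    /* An ordinary store becomes a write bus cycle directed at the device.        */
    STATUS_OR_CONFIG = value;
}

void send_data(uint8_t value)
{
    DATA_REG = value;
}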
Some processors add a second read control line and a second write
control line specifically for I/O devices. These new lines operate
independently of the read and write control lines set up for memory.
This does two things for the system. First, it allows the I/O devices to
be added to the main processor bus without stealing memory addresses
from the memory devices. Second, it makes it so that the I/O devices
are not subject to the memory handling scheme of the operating system.
Typically, there is a different set of assembly language instructions
that goes along with these new control lines. This is done to distinguish
a read or write with a memory device from a read or write with an I/O
device. Table 15-8 summarizes how the processor uses the different
read and write control lines to distinguish between an I/O device
transaction and a memory transaction.
Table 15-8 Control Signal Levels for I/O and Memory Transactions
15.9.2 Polling
The method used by the operating system and its software
applications to communicate with I/O devices directly affects the
performance of the processor. This is due to the asynchronous nature of
I/O. In other words, the I/O device is never ready exactly when the
processor needs it to be. For example, the processor cannot predict
when a user might press a key, a network connection is not as fast as
the processor that's trying to send data down it, and the mechanical
nature of a hard drive means that the processor will have to wait for the
data it requested. If an I/O interface is not designed properly, the
processor will be stalled as it waits for access to the I/O device.
There are four basic methods used for communicating with an I/O
device: polling, interrupts, direct memory access, and I/O channels. The
first of these, polling, is by far the lowest performer, but it is presented
here due to its simplicity.
When an I/O device needs attention from the processor, it usually
indicates this by changing a flag in one of its status registers. For
example, a network interface may have a bit in one of its status
registers that is set to a one when its receive buffer is full. If the
processor does not attend to this situation immediately, new incoming
data may overwrite the buffer causing the old data to be lost.
In the polling method, the processor continually reads the status
registers of the I/O device to see if it needs attention. There are two
problems with this method. First, data might be missed if the register is
not read often enough. Second, by forcing the processor to spend its time repeatedly checking the status registers of devices that usually do not need attention, polling takes processing time away from the applications the processor should be running.
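In code, polling amounts to a busy-wait loop. The sketch below uses a hypothetical register address and a hypothetical status bit; the structure is the point — the processor reads the status register over and over until the device signals that it needs attention:

#include <stdint.h>

#define DEVICE_BASE ((uintptr_t)0x8000)                   /* hypothetical address  */
#define STATUS_REG  (*(volatile uint8_t *)(DEVICE_BASE + 0))
#define DATA_REG    (*(volatile uint8_t *)(DEVICE_BASE + 1))
#define RX_FULL     0x01         /* hypothetical "receive buffer full" status bit  */

uint8_t wait_for_data(void)
{
    /* The processor does no useful work here; it simply re-reads the status
       register until the device sets the flag.  If this loop is not reached
       often enough, new incoming data can overwrite the buffer and be lost.       */
    while ((STATUS_REG & RX_FULL) == 0)
        ;                                                 /* busy-wait             */

    return DATA_REG;             /* the data that is now waiting in the device     */
}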
15.9.3 Interrupts
The problems caused by using the polling method of communication
with an I/O device can be solved if a mechanism is added to the system
whereby each I/O device could "call" the processor when it needed
attention. This way the processor could tend to its more pressing duties
and communicate with the I/O device only when it is asked to. If each
call was handled with enough priority, the chance of losing data would
be greatly reduced.
This system of calling the processor is called interrupt driven I/O.
Each device is given a software or hardware interface that allows it to
request the processor's attention. This request might be to tell the
processor that new data is available to be read, that the device is ready
to receive data, or that a process has completed. The call to the
processor requesting service is called an interrupt.
It is as if someone was reading a book when the telephone rings.
The reader, concerned about keeping her place in the book, places a bookmark to indicate where she left off. She then answers the phone
and carries on a conversation while the book "waits" for her attention to
return. While chatting on the phone, the person notices the dog standing
at the door waiting to be let out. She tells the person on the other end of
the line, "Hold that thought, I'll be right back." After she lets out the
dog, she returns to the phone call and picks up where she left off. When
she finishes talking on the phone, she hangs up and returns to her
reading exactly where she left off.
The processor handles devices that need service in a similar way.
When the processor receives a device interrupt, it needs to remember
exactly what it was doing when it was interrupted. This includes the
current condition of its registers, the address of the line of code it was
about to execute, and the settings of all of its flags. It does this by
storing its registers and instruction pointer to the stack using pushes.
Once its current status is stored, the processor executes a function to
handle the device's request. This function is called an interrupt service
routine (ISR). There could be a single ISR for a group of devices or a
different ISR for each device. By using interrupts and ISRs, the
data goes through a two-step process, a read from the device then a
store to memory, in order to complete a transfer.
It would be far more efficient for the data to be transferred directly
from the I/O device to memory. A process such as this would not need
to involve the processor at all. If the processor could remain off of the
bus long enough for the device to perform the transfer, the processor
would only need to be told when the transfer was completed. It could
even continue to perform functions that did not require bus access.
This type of data transfer is called direct memory access (DMA),
and although it still requires an interrupt, it is far more efficient since
the processor does not need to perform the data transfer. The typical
system uses a device called a DMA controller that is used to take over
the bus when the device needs to make a transfer to or from memory.
The controller either waits for a time when the processor does not need
the bus or it sends the processor a signal asking it to suspend its bus
access for one cycle while the I/O device makes a transfer.
A DMA transaction involves a three-step process. In the first step,
the processor sets up the transfer by telling the DMA controller the
direction of the transfer (read or write), which I/O device is to perform
the transfer, the address of the memory location where the data will be
stored to or read from, and the amount of data to be transferred.
Once the processor has set up the transfer, it relinquishes control to
the DMA controller. As the I/O device receives or requires data, it
communicates directly with memory under the supervision of the DMA
controller. The last step comes when the transfer is complete. At this
point, the DMA controller interrupts the processor to tell it that the
transfer is complete.
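From the processor's point of view, the setup might look something like the sketch below. The DMA controller's register names and addresses are entirely hypothetical; the comments mark the three steps:

#include <stdint.h>

/* Hypothetical memory-mapped registers of a DMA controller.                      */
#define DMA_DIRECTION (*(volatile uint8_t  *)(uintptr_t)0x9000)  /* 0 = to memory  */
#define DMA_DEVICE    (*(volatile uint8_t  *)(uintptr_t)0x9001)  /* device number  */
#define DMA_ADDRESS   (*(volatile uint32_t *)(uintptr_t)0x9004)  /* memory address */
#define DMA_COUNT     (*(volatile uint16_t *)(uintptr_t)0x9008)  /* bytes to move  */
#define DMA_START     (*(volatile uint8_t  *)(uintptr_t)0x900A)  /* begin transfer */

void dma_read_from_device(uint8_t device, uint32_t dest, uint16_t count)
{
    /* Step 1: the processor sets up the transfer for the DMA controller.         */
    DMA_DIRECTION = 0;            /* data moves from the I/O device to memory     */
    DMA_DEVICE    = device;       /* which I/O device performs the transfer       */
    DMA_ADDRESS   = dest;         /* where in memory the data will be stored      */
    DMA_COUNT     = count;        /* how much data to move                        */

    /* Step 2: start the transfer and go do other work; the controller uses the
       bus when the processor does not need it or asks the processor to pause.    */
    DMA_START = 1;

    /* Step 3: when the transfer finishes, the DMA controller interrupts the
       processor; that interrupt is handled elsewhere, not in this function.      */
}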
Problems
1. List the types of registers utilized by the processor and describe
their operation.
2. Determine the settings of the zero flag, the carry flag, the overflow
flag, and the sign flag for each of the following 8-bit operations.
11. What type of instruction might force the processor to flush the
pipeline?
12. List the two benefits of using separate read/write control lines for
I/O devices instead of using memory mapped I/O.
13. What two problems does the polling method to monitor the I/O
devices have that are solved by interrupt-driven I/O?
14. What problem does non-DMA interrupt-driven I/O have that is
solved by DMA?
15. How would the 32-bit value 1A2B3C4D₁₆ be stored in an 8-bit
memory with a processor that used big-endian? Little-endian?
CHAPTER SIXTEEN
Intel 80x86 Base Architecture
(Figure: the 80x86 Execution Unit (EU) — the general registers AH/AL, BH/BL, CH/CL, and DH/DL, the pointer and index registers SP, BP, SI, DI, and IP, the ALU, the flags, and the EU control system; the EU connects to the instruction queue and the data bus of the BIU shown in Figure 16-2)
Example
If CX contains the binary value 0110110101101011₂, what value
does CH have?
Solution
Since the register CH holds the most significant 8 bits of CX, CH contains the upper eight bits of the value, i.e., 01101101₂.
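The same split can be illustrated in C by shifting the 16-bit value right eight bits; this is only a sketch of the relationship between CX and its two halves:

#include <stdio.h>

int main()
{
    unsigned int cx = 0x6D6B;               /* 0110 1101 0110 1011 in binary */
    unsigned int ch = (cx >> 8) & 0xFF;     /* upper byte: 0110 1101 = 6D    */
    unsigned int cl =  cx       & 0xFF;     /* lower byte: 0110 1011 = 6B    */

    printf("CH = %02X, CL = %02X\n", ch, cl);
    return 0;
}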
16.2.3 Flags
The flags of the 80x86 processor are contained in a 16-bit register.
Not all 16 bits are used, and it isn't important to remember the exact bit
positions of each of the flags inside the register. The important thing is
to understand the purpose of each of the flags.
Remember from Chapter 15 that the flags indicate the current status
of the processor. Of these, the majority report the results of the last
executed instruction to affect the flags. (Not all instructions affect all
the flags.) These flags are then used by a set of instructions that test
their state and alter the flow of the software based on the result.
The flags of the 80x86 processor are divided into two categories:
control flags and status flags. The control flags are modified by the
software to change how the processor operates. There are three of
them: trap, direction, and interrupt.
The trap flag (TF) is used for debugging purposes and allows code
to be executed one instruction at a time. This allows the programmer to
step through code address-by-address so that the results of each
instruction can be inspected for proper operation.
The direction flag (DF) is associated with string operations. In
particular, DF dictates whether a string is to be examined by
incrementing through the characters or decrementing. This flag is used
by the 80x86 instructions that automate string operations.
Chapter 15 introduced us to the concept of interrupts by showing
how devices that need the processor's attention can send a signal
interrupting the processor's operation in order to avoid missing critical
data. The interrupt flag (IF) is used to enable or disable this function.
When this flag contains a one, any interrupt that occurs is serviced by
the processor. When this flag contains a zero, the maskable interrupts
are ignored by the processor, their requests for service remaining in a
queue waiting for the flag to return to a one.
Example
How would the status flags be set after the processor performed the
8-bit addition of 10110101₂ and 10010110₂?
Solution
This problem assumes that the addition affects all of the flags. This
is not true for all assembly language instructions; a logical OR, for example, does not affect AF.
Let's begin by adding the two numbers to see what the result is.
   carries      1 1   1
              1 0 1 1 0 1 0 1
            + 1 0 0 1 0 1 1 0
            -----------------
carry out 1   0 1 0 0 1 0 1 1
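The rest of the solution does not appear on this page, but the flag settings follow directly from this arithmetic: CF = 1, ZF = 0, SF = 0, OF = 1, and AF = 0. The C sketch below works them out; the checks mirror the usual definitions of the carry, zero, sign, overflow, and auxiliary carry flags and are only an illustration, not code from the text:

#include <stdio.h>

int main()
{
    unsigned int a = 0xB5;                   /* 1011 0101                     */
    unsigned int b = 0x96;                   /* 1001 0110                     */
    unsigned int sum = a + b;                /* nine-bit result: 1 0100 1011  */
    unsigned int result = sum & 0xFF;        /* the 8-bit result: 0100 1011   */

    int cf = (sum > 0xFF);                             /* carry flag: 1       */
    int zf = (result == 0);                            /* zero flag: 0        */
    int sf = (result & 0x80) != 0;                     /* sign flag: 0        */
    /* Overflow: both addends have the same sign, but the result's sign differs. */
    int of = (~(a ^ b) & (a ^ result) & 0x80) != 0;    /* overflow flag: 1    */
    int af = ((a ^ b ^ result) & 0x10) != 0;           /* auxiliary carry: 0  */

    printf("result=%02X CF=%d ZF=%d SF=%d OF=%d AF=%d\n",
           result, cf, zf, sf, of, af);
    return 0;
}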
The main purpose of the BIU is to take the 16-bit pointers of the EU
and modify them so that they can point to data in the 20-bit address
space. This is done using the four registers CS, DS, SS, and ES. These
are the segment registers.
For the rest of this book, we will use the following terminology to
represent these three values.
(Figure: a segment register points to the base of a 64K block of memory)
• Code Segment (CS) – This register contains the base address of the
segment assigned to contain the code of an application. It is paired
with the Instruction Pointer (IP) to point to the next instruction to
load into the instruction decoder for execution.
• Data Segment (DS) – This register contains the base address of the
segment assigned to contain the data used by an application. It is
typically associated with the SI register.
• Stack Segment (SS) – This register contains the base address of the
stack segment. Remember that there are two pointer registers that
use the stack. The first is the stack pointer, and the combination of
SS and SP points to the last value stored in this temporary memory.
The other register is the base pointer which is used to point to the
block of data elements passed to a function.
• Extra Segment (ES) – Like DS, this register points to the data
segment assigned to an application. Where DS is associated with
the SI register, ES is associated with the DI register.
Example
If CS contains A487₁₆ and IP contains 1436₁₆, then what is the
physical address of the next instruction in memory to be executed?
Solution
The physical address is found by shifting A487₁₆ left four bits and adding 1436₁₆ to the result.
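The arithmetic is easy to verify in C; shifting the segment left four bits is the same as appending a hexadecimal zero, and the sum works out to A5CA6₁₆:

#include <stdio.h>

int main()
{
    unsigned long cs = 0xA487;     /* segment register                         */
    unsigned long ip = 0x1436;     /* offset (instruction pointer)             */

    /* Shift the segment left four bits (one hex digit), then add the offset.  */
    unsigned long physical = (cs << 4) + ip;

    printf("physical address = %lX\n", physical);        /* prints A5CA6       */
    return 0;
}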
Table 16-2 Summary of the 80x86 Read and Write Control Signals
Even though they use the same address and data lines, there are
slight differences between the use of memory and the use of I/O ports.
First, regardless of the generation of the 80x86 processor, only the
lowest 16 address lines are used for I/O ports. This means that even if
the memory space of an 80x86 processor goes to 4 Gig, the I/O port
address space will always be 2¹⁶ = 65,536 = 64K. This is not a problem
as the demand on the number of external devices that a processor needs
to communicate with has not grown nearly at the rate of demand on
memory space.
The second difference between the memory space and the I/O port
address space is the requirement placed on the programmer. Although
we have not yet discussed the 80x86 assembly language instruction set,
the assembly language commands for transferring data between the
registers and memory are of the form MOV. This command cannot be
used for input or output to the I/O ports because it uses MRDC and
MWTC for bus commands. To send data to the I/O ports, the assembly
language commands OUT and OUTS are used while the commands for
reading data from the I/O ports are IN and INS.
Problems
Answer problems 1 through 7 using the following settings of the
80x86 processor registers.
In other words, the programmer must take the place of the compiler by
converting abstract processes to the step-by-step processor instructions.
As with the compiler, the output of the assembler is an object file.
The format and addressing information of the assembler's object file
should mimic that of the compiler making it possible for the same
linker to be used to generate the final application. This means that as
long as the assembly language programmer follows certain rules when
identifying shared addressing, the object file from an assembler should
be capable of being linked to the object files of a high-level language
compiler.
The format of an assembly language program depends on the
assembler being used. There are, however, some general formatting
patterns that are typically followed. This section presents some of those
standards.
Like most programming languages, assembly language source code
must follow a well-defined syntax and structure. Unlike most
programming languages, the lines of assembly language are not
structurally interrelated. In a language such as C, for example,
components such as functions, if-statements, loops, and switch/case
blocks utilize syntax to indicate the beginning and end of a block of
code that is to be treated as a unit. Blocks of code may be contained
within larger blocks of code producing a hierarchy of execution. In
assembly language, there is no syntax to define blocks of code;
formatting only applies to a single line of code. It is the execution of
the code itself that is used to logically define blocks within the
program.
.MODEL memory_model
Table 17-1 presents the different types of memory models that can
be used with the directive. The memory models LARGE and HUGE
are the same except that HUGE may contain single variables that use
more than 64K of memory.
There are three more directives that can be used to simplify the
definition of the segments. They are .STACK, .DATA, and .CODE.
When the assembler encounters one of these directives, it assumes that
it is the beginning of a new segment, the type being defined by the
specific directive used (stack, data, or code). It includes everything that
follows the directive in the same segment until a different segment
directive is encountered.
The .STACK directive takes an integer as its operand allowing the
programmer to define the size of the segment reserved for the stack.
label ENDP
As with the SEGMENT directive, the labels for the PROC directive
and the ENDP directive must match. The attribute for PROC is either
NEAR or FAR. A procedure that has been defined as NEAR uses only
an offset within the segment for addressing. Procedures defined as FAR
need both the segment and offset for addressing.
Both the label and the expression are required fields with the EQU
directive. The label, which also is to follow the formatting guidelines of
the label field, is made equivalent to the expression. This means that
whenever the assembler comes across the label later in the file, the
expression is substituted for it. Figure 17-8 presents two sections of
code that are equivalent because of the use of the EQU directive.
ARRAY DB 12 DUP(?)
COUNT EQU 12
ARRAY DB COUNT DUP(?)
Figure 17-8 Sample Code with and without the EQU Directive
Both dest and src may refer to registers or memory locations. The
operand src may also specify a constant. These operands may be of
either byte or word length, but regardless of what they are specifying,
the sizes of src and dest must match for a single MOV opcode. The
assembler will generate an error if they do not.
Section 16.4 showed how the Intel 80x86 uses separate control lines
for transferring data to and from its I/O ports. To do this, it uses a pair
of special data transfer opcodes: IN and OUT. The opcode IN reads
data from an I/O port address placing the result in either AL or AX
depending on whether a byte or a word is being read. The OUT opcode
writes data from AL or AX to an I/O port address. Figure 17-10 shows
the format of these two instructions using the operand accum to identify
either AL or AX and port to identify the I/O port address of the device.
IN accum, port
OUT port, accum
The ADD opcode modifies the processor's flags including the carry
flag (CF), the overflow flag (OF), the sign flag (SF), and the zero flag
(ZF). This means that any of the Intel 80x86 conditional jumps can be
used after an ADD opcode for program flow control.
Many of the other data manipulation opcodes operate the same way.
These include logic operations such as AND, OR, and XOR and
mathematical operations such as SUB (subtraction) and ADC (add with
carry). MUL (multiplication) and DIV (division) are different in that
they each use a single operand, but since two pieces of data are needed
to perform these operations, the AX or AL registers are implied.
Some operations by nature only require a single piece of data. For
example, NEG takes the 2's-complement of a value and stores it back
in the same location. The same is true for NOT (bit-wise inverse),
DEC (decrement), and INC (increment). These commands all use a
single operand identified as dest.
Figure 17-12 Format and Parameters of NEG, NOT, DEC, and INC
Figure 17-13 Format and Parameters of SAR, SHR, SAL, and SHL
        .
        .
        JMP  LBL01      ;Always jump to LBL01
        .
        .
There is one last set of instructions used to control the flow of the
program, and although they were not mentioned in Chapter 15, they are
common to all processors. These instructions are used to call and return
from a procedure or function.
The CALL opcode is used to call a procedure. It uses the stack to
store the address of the instruction immediately after the CALL opcode.
This address is referred to as the return address. This is the address
that the processor will jump back to after the procedure is complete.
The CALL instruction takes as its operand the address of the
procedure that it is calling. After the return address is stored to the
stack, the address of the procedure is loaded into the instruction pointer.
To return from a procedure, the instruction RET is executed. The
only function of the RET instruction is to pull the return address from
the stack and load it into the instruction pointer. This brings control
PROC01:                 ;Beginning of procedure
        .
        .
Mnemonic Meaning
CLC Clear Carry Flag
CLD Clear Direction Flag
CLI Clear Interrupt Flag (disables maskable interrupts)
CMC Complement Carry Flag
STC Set Carry Flag
STD Set Direction Flag
STI Set Interrupt Flag (enables maskable interrupts)
The next two special instructions are PUSH and POP. They perform the stack push and pull operations described in Chapters 15 and 16.
The Intel 80x86 processor's stack is referred to as a post-increment/
pre-decrement stack. This means that the address in the stack pointer is
decremented before data is stored to the stack and incremented after
data is retrieved from the stack.
There are also some special instructions that are used to support the
operation of the Intel 80x86 interrupts. IRET, for example, is the
instruction used to return from an interrupt service routine. It is used in
the same manner as the RET instruction in a procedure. IRET,
however, is required for interrupts because an interrupt on the 80x86
pushes not only the return address onto the stack, but also the code
segment and processor flags. IRET is needed to pull these two
additional elements off of the stack before returning to the code being
executed before the interrupt.
Another special instruction is the software interrupt, INT. It is a
non-maskable interrupt that calls an interrupt routine just like any
hardware interrupt. In a standard PC BIOS, this interrupt has a full
array of functions ranging from keyboard input and video output to file
storage and retrieval.
The last instruction presented here may not make sense to the novice
assembly language programmer. The NOP instruction has no operation
and it does not affect any flags. Typically, it is used to delete a machine
code by replacing it and its operands with this non-executing opcode.
In addition, a sequence of NOPs can be inserted to allow a programmer
to write over them later with new machine code. This is only necessary
under special circumstances.
the address 1000 was coming from the data segment, the operand
would be identified as [DS:1000H].
The processor can also use the contents of a register as a pointer to
an address. In this case, the register name is enclosed in brackets to
identify it as a pointer. For example, if the contents of BX are being
used as an address pointing to memory, the operand should be entered
as [BX] or [DS:BX].
A constant offset can be added to the pointer if necessary by adding
a constant within the brackets. For example, if the address of interest is
4 memory locations past the address pointed to by the contents of BX,
the operand should be entered as [BX+4] or [DS:BX+4].
While this is not a comprehensive list of the methods for using a
memory address as an operand, it should be a sufficient introduction.
Figure 17-19 presents some examples of using addresses for operands.
.MODEL SMALL
.STACK 100H
.DATA
.CODE
MAIN PROC FAR
MAIN ENDP
END MAIN
• The first line contains the string ".MODEL SMALL". We see from
Table 17-1 that this tells the assembler to use one code segment less
than or equal to 64K and one data segment less than or equal to
64K. The program we are writing here is quite small and will easily
fit in this memory model.
• The next line, ".STACK 100H", tells the assembler to reserve 256
bytes (hexadecimal 100) for the stack.
• The next line, ".DATA", denotes the beginning of the data segment.
All of the data for the application will be defined between the
.DATA and .CODE directives.
• The next line, ".CODE", denotes the beginning of the code
segment. All of the code will be defined after this directive.
• "MAIN PROC FAR" identifies a block of code named main that
will use both the segment and offset for addressing.
• "MAIN ENDP" identifies the end of the block of code named
MAIN.
• "END MAIN" tells the assembler when it has reached the end of all
of the code.
The next step is to insert the data definitions and code that go after
the .DATA and .CODE directives respectively.
The first piece of code we need to write will handle some operating
system housekeeping. First, we need to start the program by retrieving
the address that the operating system has assigned to the data segment.
This value needs to be copied to the DS register. We do this with the
two lines of code presented in Figure 17-21. These lines need to be
placed immediately after the MAIN PROC FAR line.
the O/S receives this interrupt, it knows that the application is finished
and can be removed from memory. Placing the lines from Figure 17-22
immediately before the line MAIN ENDP in the code will do this.
At this point, our skeleton code should look like that shown in
Figure 17-23.
.MODEL SMALL
.STACK 100H
.DATA
.CODE
MAIN PROC FAR
MOV AX,@DATA ;Load DS with assigned
MOV DS,AX ; data segment address

MOV AX,4C00H ;Use software interrupt
INT 21H ; to terminate program
MAIN ENDP
END MAIN
Figure 17-23 Skeleton Code with Code Added for O/S Support
RESULT = (A÷8) + B – C
Let's begin by defining what the data segment is going to look like.
Each of the variables, A, B, C, and RESULT, need to have a word-
sized location reserved in memory for them. Since the first three will be
used as inputs to the expression, they will also need to be initialized.
For the sake of this example, let's initialize them to 104₁₀, 100₁₀, and 52₁₀ respectively. Since RESULT is where the calculated result will be
stored, we may leave that location undefined. Figure 17-24 presents the
four lines of directives used to define this memory.
A DW 104
B DW 100
C DW 52
RESULT DW ?
This code will be inserted between the .DATA and .CODE directives of
the code in Figure 17-23.
The next step is to write the code to compute the expression. Begin
by assuming the computation will occur in the accumulator register,
AX. The process will go something like this.
The last step is to insert this code after the two lines of code that
load the data segment register but before the two lines of code that terminate the program.
.MODEL SMALL
.STACK 100H
.DATA
A DW 104
B DW 100
C DW 52
RESULT DW ?
.CODE
MAIN PROC FAR
MOV AX,@DATA ;Load DS with assigned
MOV DS,AX ; data segment address
MOV AX,A ;Load A from memory
SAR AX,3 ;Divide A by 8
ADD AX,B ;Add B to (A/8)
SUB AX,C ;Subtract C from (A/8)+B
MOV RESULT,AX ;Store A/8+B-C to RESULT
MOV AX,4C00H ;Use software interrupt
INT 21H ; to terminate program
MAIN ENDP
END MAIN
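For comparison, the same computation written in C (not part of the original program) shows what the assembly code is doing; with A = 104, B = 100, and C = 52, RESULT ends up holding 61:

#include <stdio.h>

int main()
{
    short a = 104, b = 100, c = 52;     /* the word-sized variables A, B, and C */
    short result;

    result = (a >> 3) + b - c;          /* the shift right by 3 divides A by 8  */
    printf("RESULT = %d\n", result);    /* prints 61                            */
    return 0;
}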
Problems
1. What character/symbol is used to indicate the start of a comment in
assembly language for the assembler we used in class?
2. Which of the following four strings would make valid assembly
language labels? Explain why the invalid ones are not allowed.
ABC123 123ABC
JUMP HERE LOOP
7. Assume the register BX contains the value 2000h and the table below represents the contents of a short portion of memory. Indicate what value AL contains after each of the following MOV instructions.

Address    Value
DS:2000    17h
DS:2001    28h
DS:2002    39h
DS:2003    4Ah
DS:2004    5Bh
DS:2005    6Ch

mov al, ds:[bx]
mov al, ds:[bx+1]
mov ax, bx
mov ax, 2003
Index

K
Karnaugh map, 126
Karnaugh map rules, 131

L
latches
  D latch, 209, 223, 242, 262
  edge-triggered, 210
  S-R latch, 209
  transparent latches, 211
leakage current, 263
least significant bit, 20, 34, 165
LED. See light emitting diode
LIFO, 330
light emitting diode, 13, 147, 162
linker, 375
logic gates, 71
low level formatting, 283
LSB. See least significant bit

M
MAC address, 309, 321
machine code, 338
maximum, 55
Mealy machine, 237
memory
  address, 242
  asynchronous, 266
  cell, 203
  hierarchy, 269
  magnetic core, 241
  map, 248, 259, 352
  model, 380
  processor, 332
  space, 249
  synchronous, 267
  volatile, 245
minimum, 55
modified frequency modulation, 273
Moore machine, 237
most significant bit, 20
MP3, 7
MSB. See most significant bit
multiplexer, 156

N
NAND gate, 120, 160, 205, 256
NAND-NAND Logic, 119
negative-going pulse, 10
network interface card, 309
network layer, 304, 310, 313
next state truth table, 231
nibble, 20, 34
NIC. See network interface card
noise, 6
non-periodic pulse trains, 10
NOT gate. See inverter
NOT rule, 96
Nyquist Theorem, 33

O
object file, 375
OF. See overflow flag
offset address, 367
one's complement, 46
one's complement checksum/datasum, 176, 312, 319
Open Systems Interconnection Model, 303, 307
OR gate, 73, 74, 90, 109, 114
OR rules, 96
O/S level formatting, 283
OSI model. See Open Systems Interconnection Model
output truth table, 231