CA Classes-166-170

Computer Architecture Unit 7
One unusual characteristic, from the desktop point of view, is that the
programmer can specify five independent operations to be issued
simultaneously. If five independent operations are not available (because
the remaining ones depend on earlier results), the unused slots are filled
with no-operation instructions (NOPs). This style of instruction coding is
called VLIW (Very Long Instruction Word).
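The slot-filling rule described above can be sketched in a few lines of Python. This is only an illustrative model, not the TM32 scheduler: the operation tuples, the register names, and the function `pack_vliw` are all hypothetical, and the only dependences checked are reads or writes of a register already written in the same word.

```python
from typing import List, Optional, Set, Tuple

SLOTS = 5    # five operations per instruction word, as stated in the text
NOP = "nop"

# Hypothetical operation format: (text, destination register, source registers)
Op = Tuple[str, Optional[str], Tuple[str, ...]]


def pack_vliw(ops: List[Op], slots: int = SLOTS) -> List[List[str]]:
    """Greedily pack operations, in program order, into fixed-width words.

    An operation joins the current word only if it neither reads nor writes
    a register written by an operation already in that word. Otherwise the
    current word is padded with NOPs and a new word is started.
    """
    words: List[List[str]] = []
    current: List[str] = []
    written: Set[str] = set()
    for text, dest, srcs in ops:
        depends = any(s in written for s in srcs) or (dest in written)
        if depends or len(current) == slots:
            # Close the current word, filling unused slots with NOPs.
            words.append(current + [NOP] * (slots - len(current)))
            current, written = [], set()
        current.append(text)
        if dest is not None:
            written.add(dest)
    if current:
        words.append(current + [NOP] * (slots - len(current)))
    return words


example_ops: List[Op] = [
    ("add r1, r2, r3", "r1", ("r2", "r3")),
    ("mul r4, r5, r6", "r4", ("r5", "r6")),
    ("sub r7, r1, r4", "r7", ("r1", "r4")),  # needs r1 and r4 from above
    ("ld r8, [r9]", "r8", ("r9",)),
]
words = pack_vliw(example_ops)
for word in words:
    print(" | ".join(word))
```

In this example the first two operations share a word, while the dependent `sub` is pushed into a second word, so both words carry three NOPs each.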
Because the Trimedia TM32 CPU has long instruction words that often
contain NOPs, Trimedia instructions are compressed in memory and decoded
to their full size when they are loaded into the cache. Figure 7.8 shows
the TM32 CPU instruction mix for the EEMBC benchmarks.
Figure 7.8: TM32 CPU Instruction Mix for EEMBC Customer Benchmark
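The idea of compressing NOP-heavy instruction words in memory and expanding them on a cache fill can be illustrated with a small sketch. The encoding used here (a bit mask of occupied slots followed by the non-NOP operations) is an assumption chosen for clarity. It is not the actual Trimedia encoding, and the names `compress` and `decompress` are hypothetical.

```python
from typing import List, Tuple

SLOTS = 5
NOP = "nop"


def compress(word: List[str]) -> Tuple[int, List[str]]:
    """Keep a bit mask of occupied slots plus only the non-NOP operations."""
    mask = 0
    kept: List[str] = []
    for i, op in enumerate(word):
        if op != NOP:
            mask |= 1 << i      # bit i set means slot i holds a real operation
            kept.append(op)
    return mask, kept


def decompress(mask: int, kept: List[str], slots: int = SLOTS) -> List[str]:
    """Rebuild the full-width word, as would happen when a line is cached."""
    ops = iter(kept)
    return [next(ops) if mask & (1 << i) else NOP for i in range(slots)]


full_word = ["add", NOP, "ld", NOP, NOP]
mask, kept = compress(full_word)
restored = decompress(mask, kept)
print(bin(mask), kept, restored)
```

Under this assumed scheme a word with two real operations is stored as one mask and two operations instead of five slots, and the round trip restores the original word exactly.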
Manipal University of Jaipur B1648 Page No. 166

With unmodified source code, the instruction mix is similar to that of
other processors, although there are more byte data transfers. In other
words, the mix for “out-of-the-box” C code resembles that of
general-purpose computers, with a higher emphasis on byte data transfers.
Hand-optimised C code, in contrast, uses the single instruction, multiple
data (SIMD) instructions, together with a large number of pack and merge
instructions that align the data for the SIMD instructions.
In the figure, the middle column shows the relative instruction mix for
unmodified kernels, while the right column allows modification at the C
level. These columns list all operations that accounted for at least 1% of
the total in at least one of the mixes.
Self Assessment Questions
13. Trimedia processor may be the closest existing processor to a
____________________________.
14. State two uses of Trimedia TM32 CPU.
7.9 Summary
 Branching is implemented with a branch instruction, which includes the
address of the target instruction.
 The branch penalty can be reduced to one cycle. Its effect can be
reduced further by means of delayed branch execution.
 Effective processing of branches has become a cornerstone of
increased performance in ILP-processors.
 Branch prediction is a method for handling the problems caused by
branches. The strategies of branch prediction include:
 Fixed branch prediction
 Static branch prediction
 Dynamic branch prediction
 The new architecture developed jointly by Hewlett-Packard and Intel is
known as IA-64.
 The IA-64 model is also known as Explicitly Parallel Instruction
Computing (EPIC).

 Itanium is a family of 64-bit Intel microprocessors that implement the
Intel Itanium architecture, which was initially known as IA-64.
 The Trimedia and Crusoe chips represent interesting approaches to
applying Very Long Instruction Word (VLIW) concepts in the embedded
space. The Trimedia processor may be the closest existing processor to
a "classic" VLIW processor.
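The effect of the branch penalty and of delayed branch execution, mentioned in the summary above, can be estimated with the usual first-order model: each branch adds its penalty to an otherwise ideal pipeline, and a usefully filled delay slot hides that penalty. The sketch below assumes a one-cycle penalty and a single delay slot. The function name `effective_cpi` and all the numbers are illustrative assumptions, not measurements.

```python
def effective_cpi(base_cpi: float,
                  branch_fraction: float,
                  penalty_cycles: float,
                  slots_filled: float = 0.0) -> float:
    """Cycles per instruction once branch stalls are included.

    branch_fraction: share of executed instructions that are branches.
    penalty_cycles:  cycles lost per branch without delayed branching.
    slots_filled:    share of delay slots holding a useful instruction
                     (0.0 models no delayed branching at all).
    """
    stall_per_branch = penalty_cycles * (1.0 - slots_filled)
    return base_cpi + branch_fraction * stall_per_branch


# Illustrative values only: 20% branches, one-cycle penalty.
without_delay_slots = effective_cpi(1.0, 0.20, 1.0)
with_delay_slots = effective_cpi(1.0, 0.20, 1.0, slots_filled=0.6)
print(without_delay_slots, with_delay_slots)
```

With these assumed values, filling 60% of the delay slots removes 60% of the stall cycles attributed to branches.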
7.10 Glossary
 Branch penalty: The cycles a pipeline wastes on work that must be
discarded when a branch changes the flow of control.
 Condition code registers: Registers used to communicate the result of
a condition between the instruction that evaluates it and the branch
instruction that tests it.
 EPIC: Explicitly Parallel Instruction Computing.
 ILP: Instruction level parallelism.
 Merced: A dual-mode processor capable of executing both IA-32 and
IA-64 programs.
 VLIW: Very Long Instruction Word.
7.11 Terminal Questions

1. Differentiate between unconditional and conditional branch.
2. Explain the concept of branch handling.
3. What do you understand by delayed branching?
4. Define branch processing.
5. What do you mean by branch prediction?
6. Write short notes on:
a) Fixed Branch Prediction
b) Intel IA-64 architecture
c) Static Branch Prediction
d) Itanium processor
e) Dynamic Branch Prediction
7. Explain the concept of Trimedia TM32 Architecture.

7.12 Answers
Self Assessment Questions
1. Jump instruction
2. False
3. Branch
4. Branch penalty
5. SPARC, MIPS
6. No operation (NOP)
7. Layout, micro-architectural implementation
8. a) Detecting branches
b) Handling of unresolved conditional branches during instruction
decoding.
c) Accessing the branch target path
9. Never taken, always taken
10. Instruction opcode
11. Addresses, registers
12. Explicitly Parallel Instruction Computing (EPIC)
13. "Classic" VLIW processor.
14. Set top boxes and advanced televisions.
Terminal Questions
1. An unconditional branch is the simplest type of branch: it always
transfers control to a particular target. A conditional branch performs
the jump only if a specified condition is satisfied. Refer to
Section 7.2.
2. Branch handling is required whenever the flow of control is altered.
For example, a branch requires special handling in pipelined
processors. Refer to Section 7.3.
3. Delayed branching is a technique for reducing the effective branch
penalty: the instruction that follows the branch is executed regardless
of the branch outcome, so the cycle is not wasted. Refer to Section 7.4.
4. Branch processing receives branch instructions and resolves the
conditional branches as early as possible. Refer to Section 7.5.
5. Branch prediction predicts the outcome of a branch before it is
resolved. Refer to Section 7.6.
6. a) In Fixed Branch Prediction, prediction is fixed. Refer Section 7.6.1.
b) The new architecture developed jointly by Hewlett-Packard and Intel
is known as IA-64. Refer to Section 7.7.1.
c) This approach uses the instruction opcode to predict whether the
branch is taken. Refer to Section 7.6.2.
d) Itanium is a family of 64-bit Intel microprocessors that implement
the Intel Itanium architecture, which was initially known as IA-64.
Refer to Section 7.7.2.
e) To make more accurate predictions, this approach considers run-time
history: the outcomes of the last n executions of the branch are
recorded and used to predict the next one. Refer to Section 7.6.3.
7. The TM32 CPU is an example of a processor designed for multimedia
applications, which exhibit significant parallelism in the way they
manage data streams. Refer to Section 7.8.
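The difference between the fixed and dynamic strategies described in answers 6(a) and 6(e) can be seen in a short simulation. The dynamic predictor below is a two-bit saturating counter, which is one common history-based scheme and is used here as an assumption. The text does not say which dynamic scheme it has in mind, and the branch traces are made up for illustration.

```python
from typing import List


def always_taken_accuracy(outcomes: List[bool]) -> float:
    """Fixed prediction: every branch is predicted as taken."""
    return sum(outcomes) / len(outcomes)


class TwoBitPredictor:
    """Two-bit saturating counter: states 0-1 predict not taken, 2-3 taken."""

    def __init__(self, state: int = 2) -> None:
        self.state = state

    def predict(self) -> bool:
        return self.state >= 2

    def update(self, taken: bool) -> None:
        # Move one step toward the observed outcome, saturating at 0 and 3.
        self.state = min(3, self.state + 1) if taken else max(0, self.state - 1)


def dynamic_accuracy(outcomes: List[bool]) -> float:
    """Fraction of outcomes the two-bit counter predicts correctly."""
    predictor = TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct / len(outcomes)


# Hypothetical traces: a loop branch (mostly taken) and a rarely taken branch.
loop_trace = ([True] * 9 + [False]) * 3
rare_trace = ([False] * 9 + [True]) * 3

for name, trace in (("loop", loop_trace), ("rare", rare_trace)):
    print(name, always_taken_accuracy(trace), dynamic_accuracy(trace))
```

On the loop trace both strategies do equally well, but on the rarely taken branch the fixed "always taken" guess is wrong most of the time while the counter adapts after the first misprediction.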
References:
 Hwang, K. Advanced Computer Architecture. McGraw-Hill.
 Godse, D. A. & Godse, A. P. Computer Organization. Technical
Publications.
 Hennessy, John L., Patterson, David A. & Goldberg David. Computer
Architecture: A Quantitative Approach, Morgan Kaufmann.
 Sima, Dezsö, Fountain, Terry J. & Kacsuk, Péter, Advanced computer
architectures - a design space approach. Addison-Wesley-Longman.
E-references:
 http://www.scribd.com/doc/46312470/37/Branch-processing
 http://www.scribd.com/doc/60519412/15/Another-View-The-Trimedia-TM32-CPU-151
