Please Write Your Answers in Above Answer Table
I. There are 40 questions or incomplete statements in this section. Beneath each one there are several phrases or
statements marked A, B, C and D. Choose the statement that correctly answers the question, or the phrase that best
completes the sentence. (40) (Please write your answers in the answer table above.)
1. Amdahl’s Law states that the ____________ performance improvement to be gained from using some faster mode of
execution is limited by the fraction of the time the ____________ can be used. Amdahl’s Law defines the speedup that can be
gained by using a particular feature. The maximum overall speedup is limited by ______________.
A. overall \ enhanced mode \ 1/(1-F) B. enhanced mode \ overall \ 1/(1-F)
C. faster mode \ overall \ 1/(1-F) D. overall \ 1/(1-F) \ enhanced mode
4. The goal is to provide a memory system with cost almost ______ level of memory and speed
almost ______ level. The levels of the hierarchy usually subset one another. All data in one level is
also found in the level below, and all data in that lower level is found in the one below it, and so on
until we reach the bottom of the hierarchy.
A. as low as the cheapest \ as large as the fastest
B. as low as the fastest \ as fast as the cheapest
C. as low as the cheapest \ as fast as the fastest
D. as low as the fastest \ as low as the cheapest
B. The first-level cache should be large enough to obtain a small miss rate.
C. The second-level cache generally uses a bigger block size than that of the first-level cache.
D. The second-level cache should be large enough and use higher associativity to catch almost all
memory accesses in the second level.
9. According to the structure of recent compilers, loop transformations belong to which of the following parts?
A. front end per language B. code generator C. global optimizer D. high-level optimization
13. Compared with the memory-memory architecture, the register-register architecture has
A. Higher code density. B. Fewer instructions to complete a function.
C. Lower CPI. D. Large variation in instruction size.
14. Which of the following is NOT a technique for reducing the cache miss rate?
A. Higher associativity B. Pseudo-associative C. Victim cache D. Write buffer
17. While a process is running, not all objects referenced by a program need to reside in main memory. If
the computer has ______, then some objects may reside on ______. The address space is usually broken
into fixed-size blocks, called ______. At any time, each ______ resides either in main memory or on ______.
A. cache memory \ main memory \ blocks \ block \ main memory
B. cache memory \ disk \ blocks \ block \ disk
C. virtual memory \ main memory \ pages \ page \ main memory
D. virtual memory \ disk \ pages \ page \ disk
18. To reduce control hazards, the calculation of the branch destination is moved from ____ to ____.
A. EX, ID B. MEM, EX C. EX, IF D. MEM, ID
21. The destination address of a control flow instruction must be specified explicitly in the vast majority of cases. Which of the
following instructions is the major exception?
A. procedure call B. jump C. procedure return D. branch
23. The extension of MIPS pipeline to handle multi-cycle operation will bring about______ hazard.
A. WAW B. WAR C. RAW D. RAR
25. Which of the following policies will NOT improve virtual memory performance?
A. Write through B. Fully associative mapping C. TLB cache D. LRU replacement
26. Which of the following is NOT a technique for resolving control hazards?
A. Calculating the branch destination address as early in the pipeline as possible.
B. Delayed branch.
C. Predicting the branch as not taken, which avoids stalls when the branch is indeed not taken.
D. Double bump.
27. Which of the following descriptions about the Average Selling Price (ASP) is NOT true?
A. The ASP is the component cost plus direct costs and gross margin.
B. Subtracting the average discount from the list price leaves the ASP.
C. The average selling price is simply the list price.
D. The ASP is the money that comes directly to the company for each product sold.
28. To resolve the control hazard in the following instructions, ____ is the best choice to put into the delay slot.
ADD R3, R1, R2 ------------------------------ ①
BNEZ R1, DES
< Delay Slot >
SUB R5, R4, R6 ------------------------------ ②
DES: SUB R7, R9, R8 ------------------------------ ③
A. It depends B. ③ C. ② D. ① E. None
30. If one functional unit is not fully pipelined, it will lead to a ______ hazard. The separation of instruction memory and
data memory is aimed at solving the ______ hazard.
A. Data, Structural B. Structural, Structural C. Control, Control D. Structural, Control
31. Which of the following statements about the causes of cache misses is correct?
A. The obvious way to reduce capacity misses is to increase the capacity of the cache, at the risk
of longer hit time and higher cost.
B. The larger the block size, the fewer the conflict misses, because larger blocks take
better advantage of spatial locality.
C. Using a larger block size can increase compulsory misses.
D. Higher associativity can be used to reduce conflict misses, and at the same time it decreases the
average memory access time too.
32. Which method can NOT be used to reduce cache miss penalty?
A. Multi-level caches B. Victim cache C. Pipelined cache access D. Nonblocking cache
33. To solve the data hazard in the following instructions, the bypassing from ______ to ______ is needed.
ADD R2, R3, R5: IF ID ① EX ② MEM ③ WB
SUB R1, R4, R2: IF ④ ID ⑤ EX ⑥ MEM WB
A. ② , ⑤ B. ① , ⑥ C. ③ , ⑥ D. ① , ⑤ E. ②, ⑥
37. In a cache-memory hierarchy, assume that the memory size is 256MB, with a 4KB, 2-way set-associative write-back
cache. The block size is 32B. Then the size of the index field of the physical memory address is
A. 5 bits B. 6 bits C. 7 bits D. 11 bits E. 17 bits
38. Assume there are M blocks in a cache, and every K blocks are grouped in one set. Which of the
following descriptions is NOT correct?
A. If K=1, then it’s a direct-mapped cache.
B. If K=1, then it’s a one-way set-associative cache.
C. If K=M, then it’s a fully associative cache.
D. If K>1 and K<M, then it’s an M/K-way set-associative cache.
39. Computer pioneers correctly predicted that programmers would want unlimited amounts of fast
memory. An economical and palmary solution to that desire is ______, which takes advantage of
40. Consider the following code segment. The array elements are stored in row-major order.
for (j = 0; j < 100; j = j+1)
    for (i = 0; i < 5000; i = i+1)
        x[i][j] = x[i][j] + C; /* C is a constant. */
Someone suggests optimizing the above code by interchanging the loop nesting as follows:
for (i = 0; i < 5000; i = i+1)
    for (j = 0; j < 100; j = j+1)
        x[i][j] = x[i][j] + C; /* C is a constant. */
Which of the following statements is correct?
A. The optimization can decrease cache misses by improving the spatial locality.
B. The optimization can decrease cache misses by improving the temporal locality.
C. The optimization can decrease cache misses by improving both the temporal locality and spatial
locality.
D. This optimization cannot decrease cache misses at all.
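The effect of the loop interchange in question 40 can be explored with a small simulation. The sketch below is illustrative only: the function name `count_misses`, the fully associative LRU cache, the 8-word block size and the 64-block capacity are all assumptions that do not come from the question, and the array is scaled down to 500 × 100 to keep the run short.

```python
from collections import OrderedDict


def count_misses(rows, cols, j_outer, words_per_block=8, cache_blocks=64):
    """Count cache misses for one sweep over a row-major rows x cols array.

    Assumed model (not from the exam): fully associative cache, LRU
    replacement, one-word elements, array starting at a block boundary.
    """
    cache = OrderedDict()  # block number -> None, ordered oldest to newest
    misses = 0
    outer, inner = (cols, rows) if j_outer else (rows, cols)
    for a in range(outer):
        for b in range(inner):
            # Map the loop counters back to the (i, j) indices of x[i][j].
            i, j = (b, a) if j_outer else (a, b)
            block = (i * cols + j) // words_per_block  # row-major address
            if block in cache:
                cache.move_to_end(block)  # hit: mark as most recently used
            else:
                misses += 1
                cache[block] = None
                if len(cache) > cache_blocks:
                    cache.popitem(last=False)  # evict least recently used
    return misses


original = count_misses(500, 100, j_outer=True)       # j outer, i inner
interchanged = count_misses(500, 100, j_outer=False)  # i outer, j inner
print("original loop order:", original)
print("interchanged loop order:", interchanged)
```

Under these assumed parameters the interchanged order walks through memory sequentially and misses once per block, while the original order strides a full row (100 words) between accesses and finds nothing it needs still in the cache.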
2. Suppose the hardware implementation is the classic 5-stage RISC pipeline. An unconditional branch is resolved at the end of
the ID stage, and the branch-target address is also known at ID. The branch condition, however, is not evaluated until the end
of the EX stage. The branch strategy is predict-taken. How many stall cycles does each type of instruction incur?
3. A cache has 64-KB capacity, 128-bytes/line, and is 4-way set-associative. The system containing the cache uses 32-bit
addresses.
The cache has 512 lines. The cache has 128 sets.
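The line and set counts in question 3 follow from simple division, and the address field widths follow from base-2 logarithms. A minimal sketch, assuming byte addressing and power-of-two sizes; the function name `cache_geometry` and its dictionary keys are hypothetical.

```python
def cache_geometry(capacity_bytes, line_bytes, ways, address_bits):
    """Derive line count, set count and address field widths.

    Assumes byte addressing and that every size is a power of two.
    """
    lines = capacity_bytes // line_bytes
    sets = lines // ways
    # For a power of two n, (n - 1).bit_length() equals log2(n).
    offset_bits = (line_bytes - 1).bit_length()
    index_bits = (sets - 1).bit_length()
    tag_bits = address_bits - index_bits - offset_bits
    return {
        "lines": lines,
        "sets": sets,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
        "tag_bits": tag_bits,
    }


# Parameters from question 3: 64 KB, 128-byte lines, 4-way, 32-bit addresses.
geometry = cache_geometry(64 * 1024, 128, 4, 32)
print(geometry)
```

The result reproduces the 512 lines and 128 sets stated above, and splits the 32-bit address into a 7-bit offset, a 7-bit index and an 18-bit tag.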
4. In a 2-way set-associative cache, assume the cache has 4 blocks, each block is 1 word, and there are 2 blocks per set. For the
instruction LOAD R1, 0x18, does the memory access miss? If the access misses, will a replacement occur? Where will the newly
loaded block be placed?
Block A .
III. Calculations (36)
1. (10) Your company is developing a computation-intensive program. You asked your R&D department to
improve its execution time. After several months, they give you two solutions. The first one is to use a new
hardware technology, by which 40% of the computation can be accelerated by a factor of 10. The other solution focuses on
algorithm design, which can speed up 60% and 10% of the total computation by factors of 2 and 20 respectively.
Question:
a) What is the overall speedup of the hardware solution?
b) What is the overall speedup of the software solution?
c) Which one will you choose?
Answer:
a):
b):
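One way to check answers to this question is to evaluate Amdahl's Law numerically. The sketch below is an illustration rather than the official answer key; it assumes that the 60% and 10% portions in the software solution do not overlap, and the helper name `amdahl_speedup` is hypothetical.

```python
def amdahl_speedup(enhancements):
    """Overall speedup for a list of (fraction, speedup) pairs.

    Assumes the fractions refer to disjoint parts of the original
    execution time and sum to at most 1.
    """
    enhanced_fraction = sum(fraction for fraction, _ in enhancements)
    # New time = untouched part + each enhanced part divided by its speedup.
    new_time = (1.0 - enhanced_fraction) + sum(
        fraction / speedup for fraction, speedup in enhancements
    )
    return 1.0 / new_time


hardware = amdahl_speedup([(0.40, 10)])             # 40% made 10x faster
software = amdahl_speedup([(0.60, 2), (0.10, 20)])  # 60% at 2x, 10% at 20x
print(f"hardware: {hardware:.4f}")  # 1.5625
print(f"software: {software:.4f}")  # 1.6529
```

Under the disjointness assumption the software solution comes out slightly ahead, because it touches 70% of the computation while the hardware solution leaves 60% untouched.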
2. (13) In a memory/cache hierarchy, each block holds 2 words. The cache access time is 8 ns and the
main-memory miss penalty is 70 ns. For the C code below, assume that each array element A[i] is one word and that
all variables other than the array have already been loaded into registers. For the execution of this code, answer the questions below:
for (i = 0; i < 100; i++)
    s = s + A[i];
(1) What is the miss rate for data accesses?
(2) What is the average memory access time for data reads?
(3) What is the overall CPI including memory accesses? Assume the processor runs at 1.1 GHz and has a CPI of 1 excluding
memory accesses. Ignore instruction misses, data hazards and control hazards. Assume the assembly code is as follows:
………………
LOOP: LOAD R2, 0(R1)
ADD R5,R1,#4
Answer:
Assumed conditions:
Block size                               2 words/block
Cache access time (hit time)             8 ns
Memory access time (miss penalty)        70 ns
CPU clock rate                           1.1 GHz
Ideal CPI                                1
Total data memory accesses               100
Miss penalty in clock cycles: time / T = time × f = 70 ns × 1.1 GHz = 77
(1) For data accesses
Misses occur on the even-indexed elements: A[0], A[2], …
There are 50 misses.
Miss rate for data accesses is 50/100 = 50%
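Part (2) can be checked with the standard relation average memory access time = hit time + miss rate × miss penalty. The sketch below is a hedged illustration, not part of the original answer: it assumes the cache starts empty, A[0] sits at a block boundary, and each block holds 2 words as stated in the question. The function names are hypothetical.

```python
def sequential_miss_rate(num_elements, words_per_block):
    """Miss rate for one sequential pass over a one-word-per-element array.

    Assumes a cold cache and an array aligned to a block boundary, so the
    first access to each block misses and the rest of the block hits.
    """
    misses = 0
    current_block = None
    for i in range(num_elements):
        block = i // words_per_block
        if block != current_block:
            misses += 1  # first touch of this block
            current_block = block
    return misses / num_elements


def amat_ns(hit_time_ns, miss_rate, miss_penalty_ns):
    """Average memory access time in nanoseconds."""
    return hit_time_ns + miss_rate * miss_penalty_ns


miss_rate = sequential_miss_rate(100, 2)  # 100 elements, 2 words per block
average_ns = amat_ns(8, miss_rate, 70)    # 8 ns hit time, 70 ns miss penalty
print(f"miss rate: {miss_rate:.0%}")
print(f"average memory access time: {average_ns} ns")
```

With these inputs the simulated miss rate matches the 50% derived above, and the average data read time works out to 43 ns.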
(Question 2) If the pipeline has one delay slot, how should the code segment be adjusted? Draw the pipeline stage diagram.
Answer 1:
Answer 2:
ADDI R3, R3, #4
Loop: LW R1, 4(R2)
ADD R2, (R1)+
SUB R4, R1, R2
SW R2, 4(R4)+
BNEZ R3, Loop
ADDI R3, R3, #4 ; delay slot
ADDI R3, R3, #-4