## Fall 2003

### Homeworks

HW#1 due Thursday, 9/4
• problems in text:
• 1.3
• 1.10a,b -- use computer A as the reference for the geometric mean
• 1.11
• 1.12
• 1.21a (use the Dell Precision Workstation 340 with the P4 at 1.5, 1.7, and 1.8 GHz; typo in the text: should be SPECfloat_base2000)
• 1.1a (ECE629 only)
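For the geometric-mean problem above, the normalization step can be sketched as follows. This is only an illustration under stated assumptions: the execution times and the names `times_a`, `times_b`, and `normalized_geomean` are hypothetical and do not come from the text.

```python
import math

def geometric_mean(values):
    """Geometric mean of positive numbers, computed through logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def normalized_geomean(times, reference_times):
    """Divide each program's time by the reference machine's time,
    then take the geometric mean of the resulting ratios."""
    ratios = [t / r for t, r in zip(times, reference_times)]
    return geometric_mean(ratios)

# Hypothetical execution times in seconds for two programs (not from the text).
times_a = [2.0, 8.0]   # computer A, used as the reference
times_b = [4.0, 4.0]   # computer B

print(normalized_geomean(times_a, times_a))  # the reference normalizes to 1
print(normalized_geomean(times_b, times_a))
```

Normalizing the reference machine to itself always gives 1, which is a quick sanity check on the setup.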
HW#2 due Thursday, 9/11
• Suppose you have a program with three non-overlapping parts, i.e., E = t + u + v, where E is the total execution time and t, u, and v are the execution times of the three parts.  Two of these parts, u and v, are sped up by factors s_u and s_v, respectively.
• (a)  Write the equation for the total speedup as a function of the two individual speedups.
• (b)  Suppose t, u, and v take up 20%, 60%, and 20%, respectively, of the processor time before speedup.  Suppose u is sped up by 10x and v by 20x.  What is the total speedup?
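A minimal sketch of the multi-part form of Amdahl's law that this problem relies on: the new execution time is the sum of each part's time divided by its own speedup, and the total speedup is old time over new time. The function name `overall_speedup` and the numbers in the example are hypothetical and deliberately differ from the values in part (b).

```python
def overall_speedup(fractions, speedups):
    """Total speedup when each part of the execution time is sped up
    by its own factor (a factor of 1 means the part is unchanged)."""
    old_time = sum(fractions)
    new_time = sum(f / s for f, s in zip(fractions, speedups))
    return old_time / new_time

# Hypothetical example: parts take 50%, 25%, 25% of the time;
# the last two are each sped up by 5x, the first is left alone.
example = overall_speedup([0.5, 0.25, 0.25], [1, 5, 5])
print(round(example, 4))
```

The unimproved part bounds the result: with 50% of the time untouched, the total speedup in this example can never reach 2.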
HW#3 due Thursday, 9/18
• problems in text:
• 2.3
• 2.6
• 2.11 (use gcc instruction mix -- do NOT average gap and gcc; do NOT worry that percentages do not add to 100%); categories:
• conditional branch (cond branch),
• jumps (jump, call, return),
• ALU (everything else:  add, sub, mul, compare, load imm, cond move, shift, and, or, xor, other logical)
HW#4 due Thursday, 9/25
• Translate the following C code to MIPS assembly language and machine code:
    while (i != j) {
        j = j+i;
        i++;
    }

Assume R6 contains i and R7 contains j.  Make the loop as tight as possible.  Hint:  The opcode for bne is 5.
• We wish to augment the single-cycle datapath of Figure 5.29 with the addi (add immediate) instruction.  Describe the additional control lines, if any, that would be needed to support this instruction, and give the truth table for all control lines for the instruction.
• Same as above but augment with bne (branch if not equal).
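For the machine-code part of the translation exercise above, the I-type layout used by bne and addi packs a 6-bit opcode, two 5-bit register fields, and a 16-bit immediate into one 32-bit word. The helper below is a sketch, not a full assembler: the name `encode_i_type` is hypothetical, and the branch offset of -3 is an arbitrary illustrative value, not the answer for the loop.

```python
def encode_i_type(opcode, rs, rt, immediate):
    """Pack a MIPS I-type instruction: opcode[31:26] rs[25:21] rt[20:16] imm[15:0]."""
    if not (0 <= opcode < 64 and 0 <= rs < 32 and 0 <= rt < 32):
        raise ValueError("field out of range")
    if not (-32768 <= immediate <= 65535):
        raise ValueError("immediate does not fit in 16 bits")
    # Mask to 16 bits so negative offsets are stored in two's complement.
    return (opcode << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)

# bne with opcode 5 (from the hint), comparing R6 and R7,
# with an illustrative (assumed) word offset of -3.
word = encode_i_type(5, 6, 7, -3)
print(f"0x{word:08X}")
```

The branch offset counts instructions, not bytes, and is relative to the instruction following the branch, so the value for the actual loop depends on how many instructions the loop body contains.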
HW#5 due Thursday, 10/16
• problems in text:
• A.1 (use Figure A.5 instead of Figure A.6, i.e., don't draw a diagram;
use the abbreviations F, D, X, M, W for stages and S for stalls;
note:  the question is, "how many total cycles does this code take to execute?")
• A.3
• A.4
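When counting total cycles for the pipeline problems above, a common bookkeeping rule is: the first instruction needs one cycle per stage, each later instruction adds one cycle, and every stall adds one more. The sketch below assumes a five-stage F, D, X, M, W pipeline that issues one instruction per cycle; the function name `pipeline_cycles` and the example counts are hypothetical.

```python
def pipeline_cycles(num_instructions, stall_cycles=0, depth=5):
    """Total cycles for a simple in-order pipeline issuing one instruction
    per cycle, with stall cycles counted separately."""
    if num_instructions == 0:
        return 0
    return depth + (num_instructions - 1) + stall_cycles

# Hypothetical example: 4 instructions with no stalls, then with 2 stall cycles.
print(pipeline_cycles(4))
print(pipeline_cycles(4, stall_cycles=2))
```

The stall count itself still has to come from tracing the hazards in the code by hand; this only turns that count into a total.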
HW#6 due Thursday, 10/30
• problems in text:
• A.2  (note:  the question is, "how many total cycles does this code take to execute?")
HW#7 due Tuesday, 11/4
• handout:
• Assume add takes 2 clock cycles, multiply takes 10, and divide takes 40.  For the following code,
    LD      F2, 21(R1)
    MULTD   F8, F2, F3
    DIVD    F3, F4, F2
    SUBD    F9, F1, F2
• Fill in the scoreboard instruction status with the clock cycle number
• Fill in the scoreboard functional unit status and register result status with their values at clock cycle 5
• [optional] midterm #2 makeup due Tuesday, 11/18
• From MyCLE, get diagram (Fig. 6.51) and two problems.  Work one or both problems.  State on your paper the number of points you lost on the exam on each problem.
HW#8 due Tuesday, 11/25
• problems in text:
• 3.1a,b
• 4.8a

