CS1Bh Lecture Note 7 Compilation I: Java Byte Code
Computer Science 1 Bh
relatively small when compared to the compiled representations of programs which do not use an intermediate byte code format. This is because many of the library methods which need to be included in the executable form of non-Java programs do not need to be included in Java byte code. They are provided only once in the Java Virtual Machine itself, not once in every compiled program. Thus the size of a Java application executable might more typically be measured in kilobytes whereas the size of a non-Java application executable is often measured in megabytes.

The size of an executable file is an issue for the modern, networked applications of today. If part or all of the program's code will need to be downloaded over a computer network before it is executed then keeping the sizes of compiled programs small will provide a significant benefit. Communication of information over a network is frequently the bottleneck in a computer system, taking much longer to perform than disk input/output or memory loads and stores.

Against its benefits, the use of byte code brings a disadvantage. It is often argued that interpretation of byte code programs is much slower than execution of native code, compiled only for the machine which we are actually using. Users of computer programs want efficient products: it is frustrating to use a computer program which pauses during execution or which cannot keep up with the speed of user input. The approach used to combine the usefulness of byte code with the efficiency of native code is called just-in-time compilation.
In general we think of a while loop and a for loop as being equivalent in that any iteration which can be coded using one of the loop constructs can also be coded using the other. In the case of the particular two loops above we consider them to be equivalent in that both initialise the loop control variable to zero and then go up in steps of one until the variable reaches ninety-nine.

The Java byte code language has neither a for loop nor a while loop. It encodes iteration using conditional and unconditional jumps (by jump we mean the infamous goto statement which Dijkstra so disliked). The two forms of loops are equivalent in another sense in that after compilation they produce identical sequences of byte code instructions. Below we show the two byte code programs which are produced from these methods, inspected by the disassembler. (In an attempt to avoid confusion between the two languages, we use different type faces for them. We use typewriter font for Java, as always, and italic for Java byte code.)

    Method void for99()
       0 iconst_0
       1 istore_1
       2 goto 8
       5 iinc 1 1
       8 iload_1
       9 bipush 99
      11 if_icmplt 5
      14 return

    Method void while99()
       0 iconst_0
       1 istore_1
       2 goto 8
       5 iinc 1 1
       8 iload_1
       9 bipush 99
      11 if_icmplt 5
      14 return
These byte code instruction sequences manipulate a stack of operands and the memory where the values of variables are stored. The instructions iconst, iload and bipush push operands on top of the operand stack. The istore instruction and the iinc instruction update the memory. The instructions goto and if_icmplt (if integer compare less than) cause transfers of control to the numbered line. There are six integer comparison instructions in Java byte code, if_icmpeq, if_icmpne, if_icmplt, if_icmple, if_icmpgt and if_icmpge (for equal, not equal, less than, less than or equal to, greater than, and greater than or equal to). The return instruction is just like its Java counterpart.

We now trace through the byte code program, line by line.

    Line 0    The integer constant zero is pushed on top of the stack.
    Line 1    The top of the stack is stored into variable number one (the variable i).
    Line 2    Jump to line 8, avoiding incrementing i before the first comparison.
    Line 5    Increment local variable one by 1 (i++).
    Line 8    Read the current value of i and push it on top of the stack.
    Line 9    Push the integer constant 99 on top of the stack.
    Line 11   Compare the top two items on the stack and jump if need be (i < 99).
    Line 14   Return void when the end of the method is reached.
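As a rough sketch of this trace, we can simulate the eight instructions by hand in Java. The class name LoopTrace, the method run() and the idea of switching on a program counter are our own illustrative choices and are not part of the Java Virtual Machine; a real interpreter is considerably more involved.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class LoopTrace {
    // Hand simulation of the for99/while99 byte code.
    // Returns the final value of local variable 1 (the variable i).
    static int run() {
        Deque<Integer> stack = new ArrayDeque<>(); // the operand stack
        int[] locals = new int[2];                 // local variables 0 and 1
        int pc = 0;                                // current byte code line
        while (true) {
            switch (pc) {
                case 0:  stack.push(0);             pc = 1;  break; // iconst_0
                case 1:  locals[1] = stack.pop();   pc = 2;  break; // istore_1
                case 2:                             pc = 8;  break; // goto 8
                case 5:  locals[1] += 1;            pc = 8;  break; // iinc 1 1
                case 8:  stack.push(locals[1]);     pc = 9;  break; // iload_1
                case 9:  stack.push(99);            pc = 11; break; // bipush 99
                case 11: {                                          // if_icmplt 5
                    int right = stack.pop();  // pushed last, so popped first
                    int left = stack.pop();
                    pc = (left < right) ? 5 : 14;
                    break;
                }
                case 14: return locals[1];                          // return
                default: throw new IllegalStateException("bad pc " + pc);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The loop leaves through line 14 only when the comparison at line 11 fails, that is, when i has reached 99.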
7.2.2
Of course different programs, even if they achieve the same effect, will usually give rise to different byte code sequences when compiled. We consider now two different ways of implementing a method to compute the absolute value of an integer (the absolute value of a negative integer n is -n whereas the absolute value of a positive integer n is n itself). We write two versions of a method to compute the absolute value of n. These versions are called absFirst() and absSecond(). The difference between them is whether we test for being negative or test for being positive. In the first case we place the value -n on the true limb of the conditional and the value n on the false limb. In the second case we instead place n on the true limb of the conditional and -n on the false limb of the conditional.

    int absFirst(int n) {
        if (n < 0)
            return -n;
        else
            return n;
    }

    int absSecond(int n) {
        if (n > 0)
            return n;
        else
            return -n;
    }
These give rise to the following byte code sequences when compiled.

    Method int absFirst(int)
       0 iload_1
       1 ifge 7
       4 iload_1
       5 ineg
       6 ireturn
       7 iload_1
       8 ireturn

    Method int absSecond(int)
       0 iload_1
       1 ifle 6
       4 iload_1
       5 ireturn
       6 iload_1
       7 ineg
       8 ireturn
The case of comparing with zero occurs so commonly in programs that specialised versions of the comparison operators are provided for this: ifeq, ifne, iflt, ifge, ifgt and ifle. The instruction ineg negates the integer on the top of the stack. The instruction ireturn returns the integer result on top of the stack.
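Notice that the compiled test is the negation of the source test: n < 0 in absFirst() becomes ifge, which jumps to the false limb when n is greater than or equal to zero. The following sketch writes absFirst() in that jump-shaped form alongside the two originals. The class name AbsCheck and the method absFirstAsJumps() are hypothetical names introduced here for illustration only.

```java
class AbsCheck {
    static int absFirst(int n) {
        if (n < 0) return -n; else return n;
    }

    static int absSecond(int n) {
        if (n > 0) return n; else return -n;
    }

    // absFirst() rearranged to follow the shape of its byte code.
    static int absFirstAsJumps(int n) {
        if (n >= 0) {      // ifge 7: skip the negation when n >= 0
            return n;      // lines 7-8: iload_1, ireturn
        }
        return -n;         // lines 4-6: iload_1, ineg, ireturn
    }

    public static void main(String[] args) {
        int[] samples = { -5, 0, 5 };
        for (int n : samples) {
            System.out.println(n + " " + absFirst(n) + " "
                    + absSecond(n) + " " + absFirstAsJumps(n));
        }
    }
}
```

All three agree on every int. One caveat worth knowing: negating Integer.MIN_VALUE overflows and yields Integer.MIN_VALUE again, so none of these versions returns a positive result for that one input.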
In the method plusPlusX(), x is first incremented (line 0) and its new value is then loaded onto the operand stack (line 3). The integer which remains on the top of the stack when the method returns is the new value of x. In the method xPlusPlus(), x is first loaded onto the operand stack (line 0) and its old value remains there while it is incremented (line 4). The integer which remains on the top of the stack when the method returns is the old value of x.
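The difference can be observed from Java itself. The sketch below is only a guess at the shape of the two methods, assuming purely for illustration that x is an integer parameter; the class name Increments is hypothetical, and the byte code line numbers produced by compiling it need not match those quoted above.

```java
class Increments {
    // Pre-increment: x is incremented first, then its new value is returned.
    static int plusPlusX(int x) {
        return ++x;
    }

    // Post-increment: the old value of x is saved as the result,
    // then x is incremented, and the saved old value is returned.
    static int xPlusPlus(int x) {
        return x++;
    }

    public static void main(String[] args) {
        System.out.println(plusPlusX(5));
        System.out.println(xPlusPlus(5));
    }
}
```

Called with 5, plusPlusX() returns 6 and xPlusPlus() returns 5.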
7.2.4
The control operators to break out of a loop or to continue with the next iteration have simple translations in Java byte code instructions. Each causes a transfer of control, the break to the next statement after the loop and the continue to the update operation which precedes the loop condition evaluation. The following example illustrates this process.

    void breakContinue() {
        for (int i = 0; i < 99; i++) {
            if (i < 90)
                continue;
            else
                break;
        }
    }

    Method void breakContinue()
       0 iconst_0
       1 istore_1
       2 goto 17
       5 iload_1
       6 bipush 90
       8 if_icmpge 23
      11 goto 14
      14 iinc 1 1
      17 iload_1
      18 bipush 99
      20 if_icmplt 5
      23 return
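To see where control ends up, we can instrument the same loop. This is a sketch only: the class BreakContinueTrace and its two methods are hypothetical names of our own, and the loop variable is declared outside the loop in finalIndex() so that its final value can be returned.

```java
class BreakContinueTrace {
    // The value of i at the moment the loop is left.
    static int finalIndex() {
        int i;
        for (i = 0; i < 99; i++) {
            if (i < 90)
                continue;   // transfer to the update i++ (line 14)
            else
                break;      // transfer past the loop (line 23)
        }
        return i;
    }

    // How many times the loop body is entered.
    static int bodyExecutions() {
        int count = 0;
        for (int i = 0; i < 99; i++) {
            count++;
            if (i < 90)
                continue;
            else
                break;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(finalIndex());
        System.out.println(bodyExecutions());
    }
}
```

The continue is taken for i from 0 to 89 and the break is taken when i is 90, so the loop is left through the break and never through the failing test i < 99.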
Exercise: Compile the following Java program and then disassemble it and see if you can understand the workings of the bytecodes for the methods strict() and shortCircuit().
    class BooleanExpressions {
        boolean strict(int x, int y) {
            return (x == 0) & (y == 0);
        }

        boolean shortCircuit(int x, int y) {
            return (x == 0) && (y == 0);
        }
    }
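Before disassembling, it may help to observe the difference between the two operators from Java itself. The sketch below is not a solution to the exercise; it replaces each comparison with a call to a helper which counts how often it is evaluated. The names EvaluationOrder, isZero(), strictCalls() and shortCircuitCalls() are hypothetical and introduced only for this illustration.

```java
class EvaluationOrder {
    static int calls = 0;

    // Stands in for the comparison (v == 0) and records that it was evaluated.
    static boolean isZero(int v) {
        calls++;
        return v == 0;
    }

    // Number of comparisons evaluated by the strict operator &.
    static int strictCalls(int x, int y) {
        calls = 0;
        boolean result = isZero(x) & isZero(y);
        return calls;
    }

    // Number of comparisons evaluated by the short-circuit operator &&.
    static int shortCircuitCalls(int x, int y) {
        calls = 0;
        boolean result = isZero(x) && isZero(y);
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(strictCalls(1, 0));
        System.out.println(shortCircuitCalls(1, 0));
    }
}
```

With x equal to 1 the strict form still evaluates both operands whereas the short-circuit form stops after the first. A difference of this kind should be visible as a difference in the jumps of the two byte code sequences.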
7.2.5
The simplest type of method to invoke in Java is a static method with no parameters. Below we show the Java source code for a class with three static methods and the relevant part of the compiled byte code for this class. The methods of the class are referred to by number so that method first() is #1, method second() is #2 and method third() is #3. Returning an integer result from an integer method is achieved by leaving the integer result on top of the operand stack.

    class Methods {
        static int first() {
            return second();
        }

        static int second() {
            return third();
        }

        static int third() {
            return 3;
        }
    }

    class Methods

    Method int first()
       0 invokestatic #2
       3 ireturn

    Method int second()
       0 invokestatic #3
       3 ireturn

    Method int third()
       0 iconst_3
       1 ireturn
The operation of the method third() is the easiest to understand here. It simply places the integer constant 3 on top of the stack (using the iconst instruction) and returns it as its integer result (using ireturn). The first() method simply invokes the second() method (using the invokestatic instruction) and then returns the result of that. The method second() is similar.
7.2.6
More often, we use methods when we have some data from which we want to compute a result. We invoke the method and pass the data as a parameter to the method. This means that in the byte code we first see the parameter to the method being evaluated (Java calls by value) and then the method is invoked. In method add2() below the expression i+1 is first evaluated and then the method add1() is invoked.

    class Parameters {
        static int seven() {
            return add2(5);
        }

        static int add2(int i) {
            return add1(i + 1);
        }

        static int add1(int i) {
            return i + 1;
        }
    }

    class Parameters

    Method int seven()
       0 iconst_5
       1 invokestatic #2
       4 ireturn

    Method int add2(int)
       0 iload_0
       1 iconst_1
       2 iadd
       3 invokestatic #3
       6 ireturn

    Method int add1(int)
       0 iload_0
       1 iconst_1
       2 iadd
       3 ireturn
Seen from inside the method, formal parameters are simply numbered, just as local variables are. Thus the methods add2() and add1() refer to the integer variable numbered zero (using iload_0).
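The passing of a parameter can be sketched as a hand simulation in which the argument is taken from the caller's operand stack and becomes local variable zero of the method invoked, and the result is left on the caller's stack. The class ParameterPassing and the convention of handing each method its caller's stack are our own simplifying assumptions, not a description of how the Java Virtual Machine is implemented.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ParameterPassing {
    // Simulates add1: iload_0, iconst_1, iadd, ireturn
    static void add1(Deque<Integer> caller) {
        int[] locals = { caller.pop() };        // the argument becomes variable 0
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(locals[0]);                  // iload_0
        stack.push(1);                          // iconst_1
        stack.push(stack.pop() + stack.pop());  // iadd
        caller.push(stack.pop());               // ireturn
    }

    // Simulates add2: iload_0, iconst_1, iadd, invokestatic #3, ireturn
    static void add2(Deque<Integer> caller) {
        int[] locals = { caller.pop() };        // the argument becomes variable 0
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(locals[0]);                  // iload_0
        stack.push(1);                          // iconst_1
        stack.push(stack.pop() + stack.pop());  // iadd: i + 1 is evaluated first
        add1(stack);                            // invokestatic #3
        caller.push(stack.pop());               // ireturn
    }

    // Simulates seven: iconst_5, invokestatic #2, ireturn
    static int seven() {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(5);                          // iconst_5
        add2(stack);                            // invokestatic #2
        return stack.pop();                     // ireturn
    }

    public static void main(String[] args) {
        System.out.println(seven());
    }
}
```

In both add2() and add1() the argument is read as variable zero, which mirrors the use of iload_0 in the two byte code sequences.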
7.2.7
Something very significant is missing from our view of Java byte code to this point, namely objects. Objects are manipulated in the Java Virtual Machine as addresses, thus instead of iload we find aload and similar instructions. If we do not mark the method same() below as being static then it is a non-static (or virtual) method which can refer to the object with which it is associated using this.

    class Virtual {
        int same(Object o) {
            if (this.equals(o))
                return 1;
            else
                return 0;
        }
    }

    Method int same(java.lang.Object)
       0 aload_0
       1 aload_1
       2 invokevirtual #4
       5 ifeq 10
       8 iconst_1
       9 ireturn
      10 iconst_0
      11 ireturn

This method loads two addresses onto the operand stack, address zero corresponding to this and address one corresponding to the formal parameter, o. It invokes the equals() method on these objects and tests the result. It returns 1 if the objects are equal and 0 if they are not.
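The word virtual matters here: which equals() method runs is decided by the class of the object referred to by this, not by the class in which same() was compiled. The following sketch illustrates this with a subclass. The class AlwaysEqual is a hypothetical example of our own; Virtual is repeated so that the example is self-contained.

```java
class Virtual {
    int same(Object o) {
        if (this.equals(o))
            return 1;
        else
            return 0;
    }
}

// A hypothetical subclass which overrides equals(); same() is inherited unchanged.
class AlwaysEqual extends Virtual {
    @Override
    public boolean equals(Object o) {
        return true;
    }

    @Override
    public int hashCode() {
        return 0;   // kept consistent with equals()
    }
}

class VirtualDemo {
    public static void main(String[] args) {
        Virtual v = new Virtual();
        System.out.println(v.same(v));                           // same object
        System.out.println(v.same(new Object()));                // different object
        System.out.println(new AlwaysEqual().same(new Object())); // overridden equals()
    }
}
```

A Virtual object inherits equals() from Object, which compares addresses, so it is equal only to itself. The same compiled same() method gives a different answer for an AlwaysEqual object because the invocation of equals() is dispatched on the receiving object.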
References
The Java Virtual Machine is described in the book The Java Virtual Machine Specification by Tim Lindholm and Frank Yellin, Addison-Wesley, Second edition, 1999.
Stephen Gilmore. Javier Esparza, February 6, 2003.