Java VM Architecture
INTRODUCTION
A Java virtual machine (JVM) is a program that executes other programs, namely those containing Java bytecode instructions. A JVM provides an environment in which Java bytecode can be executed, enabling features such as automated exception handling. Java bytecode is an intermediate language that is typically compiled from Java, but it can also be compiled from other programming languages. JVMs are available for many platforms, and a .class file compiled on one platform will execute in a JVM on another platform. This makes Java platform-independent. The Java virtual machine knows nothing of the Java programming language, only of a particular binary format, the class file format. A class file contains Java virtual machine instructions (or bytecodes) and a symbol table.
The Java virtual machine operates on two kinds of types: primitive types and reference types. There are, correspondingly, two kinds of values that can be stored in variables, passed as arguments, returned by methods, and operated upon: primitive values and reference values.
The values of the returnAddress type are pointers to the opcodes of Java virtual machine instructions.
Reference type
- A reference to an object is considered to have Java virtual machine type reference.
- Values of type reference can be thought of as pointers to objects.
- More than one reference to an object may exist.
- Objects are always operated on, passed, and tested via values of type reference.
Objects and Arrays
- Carry data declared by the class.
- Are composed of primitive data or references to other objects.
- An array is a special object with ISA support.
- The elements of an array must all be of the same primitive type or must all be references. If they are references, then they must all point to objects of the same type.
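The reference behavior described above can be sketched in Java source. The class name ReferenceDemo and its Point helper are illustrative names, not part of the original material.

```java
final class ReferenceDemo {
    /** Hypothetical object type holding one primitive field. */
    static final class Point {
        int x;
    }

    /** Two references to one object observe the same field update. */
    static int sharedUpdate() {
        Point a = new Point();   // the object lives on the heap; 'a' holds a reference
        Point b = a;             // a second reference to the same object
        b.x = 7;                 // an update through one reference...
        return a.x;              // ...is visible through the other
    }

    /** Arrays are objects too: the callee receives a reference, not a copy. */
    static void setFirst(int[] values, int v) {
        values[0] = v;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};
        setFirst(data, 42);
        System.out.println(sharedUpdate() + " " + data[0]);
    }
}
```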
Data Storage
Global storage is main memory, where globally declared variables reside. Local storage is temporary storage for variables that are local to a method. Operand storage holds variables while they are being operated on by the functional instructions (arithmetic, logical, shifts).
All storage is divided into cells or slots, where a cell or slot can usually hold a single data item.
1.STACK
The JVM specification permits Java virtual machine stacks either to be of a fixed size or to dynamically expand and contract as required by the computation. Local and operand storage are allocated on the stack, and procedure arguments are passed on the stack as well. Instructions can never place arrays and objects on the stack; only references and individual array elements can be placed on the stack. As each method is called, a stack frame is allocated, with arguments, local storage, and operand storage being allocated in that order. Local storage for a given method is of fixed size; the amount of stack space required for local storage can be determined at compile time.
Figure: stack frame layout, in order: arguments, local variables, frame data, operand stack.
- Arguments and local variables are numbered from 0.
- Frame data holds the return address, data for constant pool (CP) resolution, and the exception table.
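As a rough illustration of the frame layout, the following toy model keeps arguments and locals in one numbered array and the operand stack in another. The class FrameSketch is hypothetical, handles only int values, and omits frame data; it is a sketch, not how any particular JVM lays out frames.

```java
final class FrameSketch {
    final int[] locals;        // arguments first, then other locals, numbered from 0
    final int[] operandStack;  // fixed maximum depth, assumed known at compile time
    int top = 0;               // number of values currently on the operand stack

    FrameSketch(int maxLocals, int maxStack, int... args) {
        locals = new int[maxLocals];
        operandStack = new int[maxStack];
        System.arraycopy(args, 0, locals, 0, args.length);
    }

    void push(int v) { operandStack[top++] = v; }
    int pop() { return operandStack[--top]; }

    /** Copies a local slot onto the operand stack. */
    void load(int slot) { push(locals[slot]); }

    /** Pops the top of the operand stack into a local slot. */
    void store(int slot) { locals[slot] = pop(); }

    /** Adds two arguments (slots 0 and 1) and stores the sum in local slot 2. */
    static int demo() {
        FrameSketch f = new FrameSketch(3, 2, 10, 20);
        f.load(0);
        f.load(1);
        int right = f.pop();
        int left = f.pop();
        f.push(left + right);
        f.store(2);
        return f.locals[2];
    }
}
```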
3.GLOBAL MEMORY
The logical main memory architecture in Java contains a method area, shared among all Java virtual machine threads, for holding code, and a global storage area for holding arrays and objects. The global memory area is managed as a heap of unspecified size with respect to the JVM architecture. The heap is the runtime data area from which memory for all class instances and arrays is allocated. When an object is dynamically created on the heap, a reference is generated to point to it at that time. Objects in the heap can only be accessed via a reference having a type that matches the type of the object.
2.Constant Pool
Instructions often use constant values, as integer operands or as addresses in local memory, for example. The ISA allows some constant values to be placed in the instruction stream as immediate operands. But in general the constants have a range of lengths, and some of them are used by a number of different instructions. So to make the Java ISA a little more compact and uniform, constant data associated with a program is placed in a block known as the constant pool. It contains several kinds of constants, ranging from numeric literals known at compile time to method and field references that must be resolved at run time. Any instruction that needs a constant value can then index into the constant pool to retrieve it.
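The idea of indexing into a shared block of constants can be sketched with a toy table. The class ConstantPoolSketch and its contents are hypothetical; a real constant pool holds typed, tagged entries rather than Java objects.

```java
final class ConstantPoolSketch {
    // Entry 0 is left unused, mirroring the class file convention that
    // constant pool indices start at 1.
    static final Object[] POOL = { null, 100000, "hello", 3.5 };

    /** Returns the constant at the given index, as an instruction operand would select it. */
    static Object loadConstant(int index) {
        if (index < 1 || index >= POOL.length) {
            throw new IllegalArgumentException("bad constant pool index: " + index);
        }
        return POOL[index];
    }

    public static void main(String[] args) {
        // Several instructions can share one entry by carrying the same index.
        System.out.println(loadConstant(1));
        System.out.println(loadConstant(2));
    }
}
```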
Constant Pool
Memory Hierarchy
In the figure, an array has been allocated on the heap, and a reference to the array is included as part of another object.
In the lower left corner is an object with no references pointing to it. This is an object ready to be garbage collected.
A specific field within a referenced object is accessed via an offset contained in the constant pool.
Java ISA
The Java ISA is stack based and includes the instruction definitions (bytecodes).
Instruction Set
Bytecodes
- single byte opcode
- zero or more operands
The index bytes are used as indices into the constant pool or into local storage locations.
Figure: bytecode instruction formats, with fields labeled opcode, index, data1, and data2.
A basic property of the Java instruction set is that each of the primitive types has specific bytecode instructions that can operate on them.
For example, the iadd opcode (integer add) is defined to operate only on integer operands on the stack. In a legal Java bytecode program, the operand types must match those required by the opcode.
For example, the pop instruction pops the top element from the stack and discards it. The swap instruction swaps the positions of the top two stack elements, and dup duplicates the top stack element so that the top two stack elements are copies of it.
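The effect of these three instructions can be mimicked on an ordinary Java stack. StackOps is a hypothetical helper class that models operand stack entries as Integer values; it is a sketch of the semantics only.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class StackOps {
    /** pop: discard the top element. */
    static void pop(Deque<Integer> s) {
        s.pop();
    }

    /** swap: exchange the top two elements. */
    static void swap(Deque<Integer> s) {
        int top = s.pop();
        int next = s.pop();
        s.push(top);
        s.push(next);
    }

    /** dup: push a copy of the top element. */
    static void dup(Deque<Integer> s) {
        s.push(s.peek());
    }

    /** Runs push 1, push 2, swap, dup, pop and returns the stack, top first. */
    static String demo() {
        Deque<Integer> s = new ArrayDeque<>();
        s.push(1);
        s.push(2);   // stack (top first): 2, 1
        swap(s);     // 1, 2
        dup(s);      // 1, 1, 2
        pop(s);      // 1, 2
        return s.toString();
    }
}
```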
The third set of data-movement instructions moves values between the local storage area of the current stack frame and the operand stack.
These instructions specify the local storage slot via a constant, either implicitly in the opcode itself or as an index operand that follows the opcode.
For example, the iload_1 instruction takes the integer from local storage slot 1 and pushes it onto the stack. The iload index instruction specifies the local storage slot number directly via the index byte that follows the opcode.
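To relate these instructions to source code, the sketch below annotates a small method with the bytecode a typical javac emits for it. LocalsDemo is an illustrative name, and the exact instruction sequence can differ between compilers; running javap -c on the compiled class shows the actual output.

```java
final class LocalsDemo {
    /** A static method has no 'this', so its locals start at slot 0. */
    static int sum() {
        int a = 1;      // iconst_1, istore_0
        int b = 2;      // iconst_2, istore_1
        return a + b;   // iload_0, iload_1, iadd, ireturn
    }

    public static void main(String[] args) {
        // Slots 0 to 3 have dedicated one-byte load/store opcodes; higher
        // slots use the form with an explicit index operand (e.g. iload 4).
        System.out.println(sum());
    }
}
```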
The final set of data-movement instructions deals with global memory data, either objects or arrays.
- An object is created via the new index1 index2 instruction, which concatenates two bytes to form an index into the constant pool.
- The constant pool entry essentially specifies the class, and a new instance of that class is created on the heap and initialized.
- A reference to the object is pushed onto the stack.
- Similarly, the newarray type instruction creates an array containing elements of a specified primitive type.
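The allocation instructions correspond to ordinary Java allocation expressions. In this sketch the class AllocationDemo is an illustrative name and the bytecode in the comments is what javac typically produces.

```java
final class AllocationDemo {
    /** Object creation: new (with a constant pool index), dup, invokespecial <init>. */
    static Object makeObject() {
        return new Object();
    }

    /** Primitive array creation: iconst_3, newarray int. */
    static int[] makeIntArray() {
        return new int[3];
    }

    /** Arrays of references use a separate instruction, anewarray. */
    static String[] makeStringArray() {
        return new String[2];
    }

    public static void main(String[] args) {
        // Each call leaves a reference on the operand stack, which is then
        // returned to the caller; the object itself stays on the heap.
        System.out.println(makeIntArray().length + " " + makeStringArray().length);
    }
}
```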
There are instructions that can perform run-time checks to see what type of object is being pointed to by a reference.
For example, the checkcast index1 index2 instruction indexes into the constant pool to find the specification for a specific class or interface. Then the object pointed to by a reference on the top of the stack is checked to see if it is an instance of the type specified by the constant pool entry. If not, a ClassCastException is thrown.
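A Java cast expression is the usual source of a checkcast instruction. The sketch below, with the illustrative class name CastDemo, shows both the passing and the failing case.

```java
final class CastDemo {
    /** The cast compiles to checkcast with a constant pool entry for java/lang/String. */
    static String describe(Object o) {
        try {
            String s = (String) o;
            return "String: " + s;
        } catch (ClassCastException e) {
            // Thrown at run time when the referenced object is not a String.
            return "not a String";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("hi"));
        System.out.println(describe(Integer.valueOf(1)));
    }
}
```

A null reference passes the check, so describe(null) takes the first branch.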
Object fields are read and written with the getfield and putfield instructions. As with all data-movement instructions, these move data between the addressed data item and the operand stack.
The getstatic and putstatic instructions are similar, except they deal with static (class) fields rather than fields of individual objects. There are similar instructions that move data to and from arrays.
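The following sketch shows Java source that gives rise to these field and array access instructions. FieldAccessDemo is an illustrative class, and the instruction names in the comments indicate what javac typically emits rather than a complete listing.

```java
final class FieldAccessDemo {
    static int counter;   // static field: accessed with getstatic / putstatic
    int value;            // instance field: accessed with getfield / putfield

    void set(int v) {
        value = v;        // pushes 'this' and v, then putfield
    }

    int get() {
        return value;     // pushes 'this', then getfield
    }

    static int next() {
        return ++counter; // reads with getstatic, writes back with putstatic
    }

    static int firstOf(int[] a) {
        return a[0];      // array element load (iaload for int arrays)
    }
}
```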
Type Conversion
- Some instructions convert one type of data item on the stack to another
- Example: i2f (int to float)
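An implicit int-to-float widening in Java source is one place where i2f appears. ConversionDemo is an illustrative name; the second call in main shows that the conversion can lose precision for large values.

```java
final class ConversionDemo {
    /** The widening conversion compiles to iload_0, i2f, freturn. */
    static float toFloat(int i) {
        return i;
    }

    public static void main(String[] args) {
        System.out.println(toFloat(3));
        // 16777217 is 2^24 + 1, which a float cannot represent exactly.
        System.out.println(toFloat(16777217) == 16777216.0f);
    }
}
```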
There are a number of instructions that take input operands, perform operations on them, and produce a result. For the most part, these instructions consist of a single byte. Operands are always taken from the stack and results are placed on the stack. For example: iadd, iand, ishl.
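Each of the three example opcodes corresponds to a Java int operator. The class ArithmeticDemo below is an illustrative sketch; the comments name the functional instruction javac typically uses for each expression.

```java
final class ArithmeticDemo {
    static int add(int a, int b) {
        return a + b;      // iload_0, iload_1, iadd, ireturn
    }

    static int and(int a, int b) {
        return a & b;      // iand
    }

    static int shiftLeft(int a, int n) {
        return a << n;     // ishl
    }

    public static void main(String[] args) {
        // In each case both operands are pushed first, the one-byte
        // instruction pops them, and the result is pushed back.
        System.out.println(add(2, 3) + " " + and(12, 10) + " " + shiftLeft(1, 4));
    }
}
```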
Figure: layout of the lookupswitch instruction.
lookupswitch
default1 default2 default3 default4
npairs1 npairs2 npairs3 npairs4
match1_1 match1_2 match1_3 match1_4
offset1_1 offset1_2 offset1_3 offset1_4
match2_1 match2_2 match2_3 match2_4
offset2_1 offset2_2 offset2_3 offset2_4
(additional n-2 match/offset pairs)
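The operand layout above can be decoded with a few big-endian 4-byte reads. This sketch assumes the byte array starts at default1, so it ignores the opcode and the alignment padding that precedes the operands in a real class file, and it uses a linear scan for simplicity. LookupSwitchSketch and its demo values are hypothetical.

```java
import java.nio.ByteBuffer;

final class LookupSwitchSketch {
    /** Returns the branch offset selected by key, or the default offset if no pair matches. */
    static int branchOffset(byte[] operands, int key) {
        ByteBuffer buf = ByteBuffer.wrap(operands);   // ByteBuffer is big-endian by default
        int defaultOffset = buf.getInt();             // default1..default4
        int npairs = buf.getInt();                    // npairs1..npairs4
        for (int i = 0; i < npairs; i++) {
            int match = buf.getInt();                 // match_i bytes
            int offset = buf.getInt();                // offset_i bytes
            if (match == key) {
                return offset;
            }
        }
        return defaultOffset;
    }

    /** Builds operands for: default 99, pairs (10 -> 40) and (20 -> 80). */
    static int demo(int key) {
        ByteBuffer b = ByteBuffer.allocate(4 * (2 + 2 * 2));
        b.putInt(99).putInt(2).putInt(10).putInt(40).putInt(20).putInt(80);
        return branchOffset(b.array(), key);
    }
}
```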
Methods are called via one of the invoke instructions, which take a statically defined set of arguments. An invoke instruction begins by indexing into a constant pool location that contains a description of the called method.
This description includes the address of the method, the number and types of arguments it takes, the number of locals it uses, and its maximum operand stack depth.
Arguments on the stack are checked to make sure they match the specified argument types. If they do, a stack frame of the appropriate size is allocated, and the arguments are pushed as locals onto the stack. Then there is a jump to the called method. The return PC is saved on a stack, but the return PC value is not accessible, other than indirectly through a return instruction.
A typical return instruction is ireturn, which begins by popping an integer from the current stack frame before removing the current stack frame. The integer is then pushed onto the operand stack of the calling method. Finally, there is a return jump to the calling method. The simple return instruction is used when the method's return type is void.
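A static method call is a compact way to see an invoke instruction paired with ireturn. InvokeDemo is an illustrative class, and the bytecode in the comments is typical javac output rather than a guaranteed sequence.

```java
final class InvokeDemo {
    /** Callee: the argument arrives as local slot 0 of the new frame. */
    static int square(int x) {
        return x * x;           // iload_0, iload_0, imul, ireturn
    }

    /** Caller: pushes the argument, invokes, then uses the returned int. */
    static int caller() {
        return square(7) + 1;   // bipush 7, invokestatic square, iconst_1, iadd, ireturn
    }

    public static void main(String[] args) {
        System.out.println(caller());
    }
}
```

The operand of invokestatic is a constant pool index whose entry describes square; ireturn leaves the result on the caller's operand stack, where iadd consumes it.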
Because each opcode fixes the types of its operands, the loader can analyze the program prior to execution in order to check the types that are being moved to or from memory; i.e., the types of the values on the operand stack can be tracked through static program analysis.
This property also means that the maximum depth of the operand stack can be determined for each method at the time it is compiled.
Consider a first example code sequence. It first pushes an integer A onto the stack. Then it tests B; if B is equal to zero, it pushes a second integer, C, onto the stack. Otherwise, it pushes the integer F onto the stack. At that point the two control paths reconverge and the top two stack elements are added and then stored to D. The key point is that when the two control paths reconverge, the operand stack has two integers regardless of the path taken.
In a second example code sequence, if B is equal to zero, the integer C is pushed onto the operand stack; then integer D is pushed onto the stack regardless of the value of B. Next, the code sequence tests E. If E is zero, the top two stack elements are added; otherwise nothing is done. Finally, the top of the stack is stored to local memory location F. This code has the property that if B is equal to zero, then the stack has two elements when the branch on the value of E is performed; otherwise the stack has only one element at the same point in the code. Now, it may be the case that E is zero whenever B is zero, and vice versa, so that the program would in fact behave correctly at run time. However, this cannot be established by static analysis of the operand stack, so the second code sequence is not a legal one.
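The two code sequences can be checked with a toy static analysis that follows every control path and records the stack depth at each instruction. This is only a sketch of the idea: StackDepthCheck, its six-instruction set, and the encodings of the two sequences are assumptions made for illustration, and a real verifier also tracks the type of every stack entry.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

final class StackDepthCheck {
    enum Op { PUSH, ADD, STORE, BRANCH, GOTO, RETURN }

    static final class Insn {
        final Op op;
        final int target;   // branch target, or -1 when unused
        Insn(Op op, int target) { this.op = op; this.target = target; }
    }

    static Insn i(Op op) { return new Insn(op, -1); }
    static Insn i(Op op, int target) { return new Insn(op, target); }

    /** True if every instruction is reached with one depth and no pop underflows. */
    static boolean isConsistent(Insn[] code) {
        int[] depthAt = new int[code.length];
        Arrays.fill(depthAt, -1);                      // -1 means not yet visited
        Deque<int[]> work = new ArrayDeque<>();
        work.push(new int[] {0, 0});                   // {pc, stack depth on entry}
        while (!work.isEmpty()) {
            int[] item = work.pop();
            int pc = item[0];
            int depth = item[1];
            if (pc >= code.length) return false;       // control fell off the end
            if (depthAt[pc] != -1) {
                if (depthAt[pc] != depth) return false; // paths disagree at a merge point
                continue;
            }
            depthAt[pc] = depth;
            Insn insn = code[pc];
            switch (insn.op) {
                case PUSH:
                    work.push(new int[] {pc + 1, depth + 1});
                    break;
                case ADD:                              // pops two, pushes one
                    if (depth < 2) return false;
                    work.push(new int[] {pc + 1, depth - 1});
                    break;
                case STORE:                            // pops one
                    if (depth < 1) return false;
                    work.push(new int[] {pc + 1, depth - 1});
                    break;
                case BRANCH:                           // conditional: both successors
                    work.push(new int[] {pc + 1, depth});
                    work.push(new int[] {insn.target, depth});
                    break;
                case GOTO:
                    work.push(new int[] {insn.target, depth});
                    break;
                case RETURN:
                    break;
            }
        }
        return true;
    }

    /** First sequence: push A; if B != 0 goto 4; push C; goto 5; push F; add; store D. */
    static Insn[] first() {
        return new Insn[] { i(Op.PUSH), i(Op.BRANCH, 4), i(Op.PUSH), i(Op.GOTO, 5),
                            i(Op.PUSH), i(Op.ADD), i(Op.STORE), i(Op.RETURN) };
    }

    /** Second sequence: if B != 0 skip push C; push D; if E != 0 skip add; store F. */
    static Insn[] second() {
        return new Insn[] { i(Op.BRANCH, 2), i(Op.PUSH), i(Op.PUSH), i(Op.BRANCH, 5),
                            i(Op.ADD), i(Op.STORE), i(Op.RETURN) };
    }
}
```

The first sequence passes because both paths reach the add with depth two. The second is rejected because the path that skips the first push reaches later instructions with a different depth than the other path, which shows up as either a disagreement at a merge point or a pop from a too-shallow stack.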
Example (figure): a Java program and the corresponding Java bytecode.
Exceptions
Errors may be caused by limitations of the VM implementation or VM bugs. Exceptions, on the other hand, are caused by program behavior that occurs dynamically, as the program executes. Static checking catches many programming mistakes and oversights, but some types of behavior cannot be caught until run time. An example of an error is StackOverflowError, which indicates that the available stack space is exhausted. Two common exceptions are the NullPointerException and the ArrayIndexOutOfBoundsException, which are clearly described by their names.
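Both exceptions named above can be provoked with a single array access, since neither condition can be ruled out by static checking in general. ExceptionDemo is an illustrative class name.

```java
final class ExceptionDemo {
    /** Reads data[index], reporting the two run-time failures the access can raise. */
    static String readElement(int[] data, int index) {
        try {
            return "value " + data[index];
        } catch (NullPointerException e) {
            return "null array";          // data was a null reference
        } catch (ArrayIndexOutOfBoundsException e) {
            return "bad index";           // index outside 0..length-1
        }
    }

    public static void main(String[] args) {
        System.out.println(readElement(new int[] {5}, 0));
        System.out.println(readElement(null, 0));
        System.out.println(readElement(new int[1], 3));
    }
}
```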
Exception Table
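Exception handlers are identified by a table in the class file. Each entry gives:
- the address range where checking is in effect
- the target to jump to if an exception is thrown
When control transfers to a handler, the operand stack is emptied.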
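Handler selection from such a table can be sketched as a linear search over the entries. ExceptionTableSketch and its demo table are hypothetical; in a real class file the exception type is given by a constant pool index rather than a Class object, and the range end is exclusive as modeled here.

```java
final class ExceptionTableSketch {
    static final class Entry {
        final int startPc;     // first covered address (inclusive)
        final int endPc;       // end of covered range (exclusive)
        final int handlerPc;   // target if a matching exception is thrown
        final Class<? extends Throwable> catchType;

        Entry(int startPc, int endPc, int handlerPc, Class<? extends Throwable> catchType) {
            this.startPc = startPc;
            this.endPc = endPc;
            this.handlerPc = handlerPc;
            this.catchType = catchType;
        }
    }

    /** Returns the handler address of the first matching entry, or -1 if none matches. */
    static int findHandler(Entry[] table, int pc, Throwable thrown) {
        for (Entry e : table) {
            boolean inRange = pc >= e.startPc && pc < e.endPc;
            if (inRange && e.catchType.isInstance(thrown)) {
                return e.handlerPc;
            }
        }
        return -1;   // no handler here: the exception propagates to the caller
    }

    /** Demo table: a narrow ArithmeticException handler, then a wider RuntimeException one. */
    static int demo(int pc, Throwable thrown) {
        Entry[] table = {
            new Entry(0, 10, 20, ArithmeticException.class),
            new Entry(0, 30, 40, RuntimeException.class),
        };
        return findHandler(table, pc, thrown);
    }
}
```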
Binary Classes
The combination of code plus metadata is a binary class that is typically included in a class file. The Java binary classes that form a complete program do not have to be loaded when a program is started. Rather, the binary classes can be loaded on demand, at the time they are needed by the program. Among other things, this saves bandwidth for loading binary classes that are never used, and it allows a Java program to start up quickly, using only some of its classes. For efficiency, the first time a binary class is used, it is parsed and placed into method memory, a part of the VM implementation. Then on subsequent calls, it is much more efficient to consult the preprocessed method information in the method memory. Either each component of a binary class is of a fixed size or the size is explicitly given immediately before the component contents.
Header Information
A binary class consists first of some header information, beginning with a magic number that identifies this block of data as a binary class. The magic number is simply a fixed byte sequence (0xCAFEBABE) that is the same for all Java binary classes. Following the header information is a sequence of larger structures, each preceded by a size indication or the number of contained elements:
- The constant pool
- Access flags
- This_class
- Super_class
- Interfaces
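Reading the start of a class file takes only a few big-endian reads. The sketch below assumes the standard class file header order (magic number, minor version, major version, constant pool count); ClassHeaderSketch and the hand-built demo bytes are illustrative.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

final class ClassHeaderSketch {
    /** Returns {minor version, major version, constant pool count}. */
    static int[] readHeader(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);   // reads big-endian values
        int magic = data.readInt();
        if (magic != 0xCAFEBABE) {
            throw new IOException("not a Java binary class");
        }
        int minor = data.readUnsignedShort();
        int major = data.readUnsignedShort();
        int constantPoolCount = data.readUnsignedShort(); // size given before the contents
        return new int[] {minor, major, constantPoolCount};
    }

    /** Parses a hand-built header: version 52.0 with a constant pool count of 16. */
    static int[] demo() {
        byte[] bytes = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
                        0, 0, 0, 52, 0, 16};
        try {
            return readHeader(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The same readHeader method can be pointed at a FileInputStream for any compiled .class file.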
Fields: This component contains the specification of the fields declared for this class. This information is included in a small table for each field; the table includes access information (public, private, protected), a name index (an index of the constant pool entry that contains the name of this field), a descriptor index (an index into the constant pool where the descriptor for this field can be found), and attribute information.
Methods: This component contains information regarding each method, e.g., the name and descriptor, as well as the method itself, encoded as a bytecode instruction stream. Each method can also have attribute tables, for example, giving the maximum operand stack depth for the method and the number of locals.
Attributes.
On the Java side, the code consists of standard binary classes; on the native side, it consists of program binaries in the native platform's machine code. Data on the Java side exists as objects and arrays on the heap and variables on the stack. On the native side, data is organized in whatever way the compiler happens to lay it out; i.e., it is compiler dependent.
The Java Native Interface (JNI) provides an interface for a Java method to call a native method. To do this, the native method must be declared as native by the calling Java class. After compiling the Java class that declares the native method call, it can be given to the javah program (replaced by javac -h in recent JDKs), which produces a header file for the native method. Then the header and native method code can be compiled to form the callable native method. The JNI specification allows control to be transferred back and forth between Java code and native methods; arguments can be passed and values can be returned. Furthermore, exceptions can be caught and thrown in the native code for handling in the Java application.
The JNI also provides a number of functions to native code; for example, GetArrayLength obtains the length of a Java array, and GetIntArrayElements allows the native code to obtain a pointer to a Java array's elements. Similarly, code on the native side can get and put object field data through JNI functions.
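The Java side of a native call is just a method declared with the native modifier. In this sketch the class NativeSum, its sum method, and the library name "nativesum" are all hypothetical; no native library is shipped with the example, so it only inspects the declaration and never calls the method.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class NativeSum {
    /**
     * Loads the (hypothetical) native library. Kept out of a static initializer
     * so that the class can be loaded without the library being present.
     */
    static void loadLibrary() {
        System.loadLibrary("nativesum");
    }

    /**
     * Declared native: there is no Java body. For a class in the default package,
     * the generated header would declare a C function named Java_NativeSum_sum,
     * which could use GetArrayLength and GetIntArrayElements to read 'values'.
     */
    native int sum(int[] values);

    /** Confirms via reflection that sum carries the native modifier. */
    static boolean isDeclaredNative() {
        try {
            Method m = NativeSum.class.getDeclaredMethod("sum", int[].class);
            return Modifier.isNative(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling sum without first loading a library that implements it would fail with an UnsatisfiedLinkError.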