
THE JAVA PROCESSOR

Chandana Datta Uppalapati


Department of Computer Science and Engineering
Southern Methodist University
Dallas, Texas, USA

ABSTRACT
Java is one of the most widely used programming languages, powering applications on devices ranging from personal computers to set-top boxes. The Java Virtual Machine (JVM) is at the heart of Java execution, and the JVM combined with a JIT (Just-In-Time) compiler is an effective pairing for PC-class Java applications. The JIT compiler's memory requirements, however, are too high for embedded systems such as Internet televisions and digital set-top boxes (STBs), where memory is extremely precious. This paper presents a new stack-less Java processor architecture for embedded applications that is capable of executing Java bytecodes directly in hardware. The processor gains its advantages by replacing the stack with a Way Predictive Java Look Aside Buffer (WAY JLAB). The paper introduces this way-predictive JLAB, which provides faster access to constant pool references for JVM bytecode than either a direct-mapped buffer or a set-associative buffer.
1. INTRODUCTION
Java owes much of its success to its support for security and portability. Although there are many other reasons for its success [1], security and portability were the key factors. An applet is a kind of Java program that can be transmitted over the Internet and automatically executed by a Java-compatible web browser. A downloadable application is prone to carrying viruses that may gather private information such as credit card details by gaining unauthorized access to system resources. Java provides security by confining the applet to the Java execution environment and denying it access to other parts of the computer. Java's magic, the bytecode, is what lets Java handle both security and portability. Unlike the executable code produced for other programs, the output of the Java compiler is bytecode: a set of highly optimized instructions executed by the Java run-time system, the so-called Java Virtual Machine (JVM). Only the Java Virtual Machine needs to be implemented for each platform; any Java program can then run on it. This is how portability is obtained. A JIT (Just-In-Time) compiler for the bytecode is used to boost performance. Features such as robustness, simplicity, dynamism, multithreading, and high performance [1] also make Java a unique language.
The JVM is responsible for features such as portability and security. The key internal components of the JVM are the stack, non-heap memory, and heap memory. A thread is a thread of execution in the program; each thread has its own stack, and each stack consists of a frame for each method executing on that thread. Each frame contains fields such as the return value, the operand stack, and a reference to the run-time constant pool of the current class. The heap section of the JVM is used to allocate arrays and class instances at run time. Because the size of a frame is fixed after it is created, objects and arrays cannot be stored in it; frames store only references that map to the arrays and objects on the heap. Arrays and objects are never explicitly de-allocated; instead the garbage collector reclaims them automatically. Non-heap memory consists of the code cache and the method area. The method area holds the run-time constant pool, field data, method data, and so on. Bytecodes require data, and since this data is too big to store directly in the bytecodes, it is stored in the run-time constant pool, with the bytecodes containing references into that pool. Different types of data are stored in the constant pool, including class references, method references, field references, and numeric and string literals. The code cache stores methods that are compiled to native code by the Just-In-Time compiler: the JIT compiles regularly executed regions of bytecode to native code, which is stored in the code cache.
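The interaction between bytecode, the constant pool, and the heap can be observed from ordinary Java code. The following sketch (class and field names are my own, purely for illustration) shows that two identical string literals are backed by the same constant pool string and resolve to one interned object, while a string built at run time is a distinct heap object:

```java
public class ConstantPoolDemo {
    // Both literals are entries in this class's constant pool; at run time
    // they resolve to the same interned String object.
    static final String A = "embedded";
    static final String B = "embedded";

    public static void main(String[] args) {
        // Identity comparison succeeds because both references point to
        // the single interned constant-pool string.
        System.out.println(A == B);                        // true
        // A string assembled at run time is a distinct object on the heap...
        String c = new StringBuilder("embed").append("ded").toString();
        System.out.println(A == c);                        // false
        // ...unless it is explicitly interned back into the pool.
        System.out.println(A == c.intern());               // true
    }
}
```

The same mechanism applies to class, method, and field entries: the bytecode carries a small constant pool index rather than the (possibly large) datum itself.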
The JIT compilation technique requires more memory than embedded implementations can support. So, one solution [2] for improving execution performance is to use a Java processor, in which the JVM is implemented in hardware.
In recent years, many researchers have focused on developing an efficient Java processor for embedded applications. Section 2 of the paper reviews related work on Java processors. Section 3 introduces the proposed Java processor, and Section 4 concludes the work by discussing the advantages and disadvantages of the proposed processor along with areas for future work.


2. RELATED WORK
Many approaches have been proposed for implementing a Java processor. Sun Microsystems developed the PicoJava I [3] and PicoJava II [4] processors, and aJile Systems Inc. developed the aJ-100 [5]; all three share the same basic design principle of replacing the JVM, which is a software-controlled stack machine, with a hardware stack machine. From [3-5] we can see that most of the simple Java bytecodes are executed directly by the hardware, while the rest of the instructions are handled by microcode or software traps. Aurora VLSI Inc. developed a processor [6] in which a Java processor is attached to the host processor core as a coprocessor, so that the system can execute programs written in other programming languages on the host processor and Java programs on the attached Java coprocessor. Another notable contribution to the Java processor field is M. Schoeberl's design of JOP (Java Optimized Processor) [9], in which the JVM is implemented in hardware for time-predictable execution of real-time tasks.
Stack-dependent JVMs cannot support instruction-level parallelism because they impose data dependencies among consecutive instructions [2]. M. Watheq El-Kharashi et al. [7] proposed a method in which stack dependency is eliminated with the help of a hardware bytecode folding algorithm combined with Tomasulo's scheduling algorithm, and based on that they designed the JAFARDD processor [8], which dynamically translates stack-based bytecodes into stack-independent RISC-style instructions. Even though coprocessors support Java applications without affecting the host processor, they consume more chip area and more power, both of which are important factors for embedded devices [2].

So, to overcome these stack-related issues, the stack is replaced by a WAY JLAB (Way Predictive Java Look Aside Buffer); the basic idea appears in [10, 11]. The design of the WAY JLAB is largely similar to the design proposed in [11]; the main difference is the use of a way-predictive buffer instead of the direct-mapped buffer used in [11]. This way-predictive buffer provides several advantages over the direct-mapped one.

3. THE PROPOSED JAVA PROCESSOR
The basic structure of the microarchitecture is similar to the one proposed in [12], but with different functionality for the method cache and the stack. Fig. 1 shows the architecture of the whole hardware JVM design, referred to here as the microarchitecture. Instead of a plain method cache, I used the two-block variable method cache proposed in [13]; the main purpose of this choice is to make the instruction cache time-predictable. For faster access to the constant pool, and also to reduce the effective object access time, a WAY JLAB is used.

Fig. 1 Micro architecture for proposed hardware based JVM
3.1 Core
The core can directly execute Java bytecode, and exceptions raised by the user or by the system are also handled by the core itself. One of the main tasks of the core is to ensure either that all Java bytecodes execute in constant time or that their execution time is known from the available information. In some cases, however, such as when a bytecode like INVOKEVIRTUAL is encountered, the called method may be unknown and hence its execution time cannot be determined. The core also provides an interface to the garbage collector.
3.2 WAY JLAB
The constant pool is a part of the .class file (which contains the Java bytecode) that holds the constants needed to run the code of that class. These constants include symbolic references generated by the compiler and literals specified by the programmer, and the constant pool can be viewed as a table of variable-length structures. Symbolic references identify the methods, classes, and fields referenced from the code; the Java Virtual Machine uses them to link the code to the other classes it depends on. To speed up constant pool references, resolved information is stored in the WAY JLAB. At run time, the symbolic representation of a reference in the constant pool is used to calculate the location of the actual referenced entity, a process referred to as constant pool resolution [11].
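As a rough software analogy (the data structures and names below are hypothetical, not the JVM's internal ones), constant pool resolution can be pictured as a one-time symbolic lookup whose result is cached for all later uses of the same constant pool index:

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of constant pool resolution: a symbolic reference such as
 *  "Point.x" is resolved once to a concrete field offset, then cached. */
public class ConstantPoolResolver {
    final Map<String, Integer> layout = new HashMap<>();   // "Class.field" -> offset
    final Map<Integer, Integer> resolvedCache = new HashMap<>(); // CP index -> offset
    final String[] constantPool;                           // symbolic CP entries

    ConstantPoolResolver(String[] pool) { this.constantPool = pool; }

    void defineField(String symbolic, int offset) { layout.put(symbolic, offset); }

    /** Resolves the constant pool entry at cpIndex to a field offset. */
    int resolve(int cpIndex) {
        Integer cached = resolvedCache.get(cpIndex);
        if (cached != null) return cached;                 // fast path: already resolved
        int off = layout.get(constantPool[cpIndex]);       // slow symbolic lookup
        resolvedCache.put(cpIndex, off);                   // remember for next time
        return off;
    }
}
```

In the proposed processor, the role of the resolved-reference cache is played in hardware by the WAY JLAB.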
The Sun JVM [14] and the Kaffe JVM use different methods to store resolved information. The former rewrites bytecodes into _quick instructions, in which the offset field holds the field offset within a specific object. The latter takes a different approach: it changes the constant pool entry's tag to record whether it has been resolved, and updates the entry with the resolved reference. But these strategies are not efficient for small embedded systems, where memory is very precious. So an associative buffer would be a better solution for resolving references. A way-predictive buffer is better still, since it has a lower hit time than an associative buffer; hence this buffer is named the WAY JLAB (Way Predictive Java Look Aside Buffer). Fig. 2 shows the design of an associative Java Look Aside Buffer.

Fig. 2 Design of Associative Java Look Aside Buffer

In Fig. 2, the tag field of the class instance is compared against the tags present in the buffer. The data at the index location of every entry is read and fed into a three-state buffer, and the compared tag lines serve as the control inputs driving the three-state buffers; the three-state buffer that produces valid data output is the one with a valid control input, that is, the one whose tag comparison hit. The access time here is high because all tags must be compared, all data must be read from the buffer, and the compared tag lines must then be used to select the correct output. To reduce the access time, the WAY JLAB is used. Fig. 3 shows the design of the WAY JLAB.

Fig. 3 Design of WAY JLAB
In the WAY JLAB, the access time is lower because a prediction is used to obtain the output. The predicted way's tag is then compared; if it is a HIT, the data is placed on the data bus. If it is a MISS, we simply go back, select the correct way, and fetch the data from it. The prediction from the look-up table drives MUX1 (a multiplexer) to select the predicted way; the data is then passed to a second MUX, driven by the offset, to select a particular datum. By using this way-predicted JLAB, the access time is reduced to a great extent.
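The lookup sequence described above can be sketched in software. In the following simulation (the class, sizes, and field layout are my own assumptions, not from the paper), a per-set look-up table predicts the way: a correct prediction costs a single tag comparison, while a misprediction falls back to probing the remaining way and retraining the predictor:

```java
/** Software sketch of a two-way, way-predictive look-aside buffer. */
public class WayJlab {
    static final int WAYS = 2, SETS = 4;             // illustrative sizes
    final long[][] tags = new long[SETS][WAYS];
    final long[][] data = new long[SETS][WAYS];
    final boolean[][] valid = new boolean[SETS][WAYS];
    final int[] predictedWay = new int[SETS];        // look-up table driving MUX1
    int probes;                                      // tag comparisons performed

    void fill(int set, int way, long tag, long value) {
        tags[set][way] = tag; data[set][way] = value; valid[set][way] = true;
    }

    /** Returns the cached value, or -1 on a miss. */
    long lookup(int set, long tag) {
        int p = predictedWay[set];
        probes++;                                    // fast path: one comparison
        if (valid[set][p] && tags[set][p] == tag) return data[set][p];
        for (int w = 0; w < WAYS; w++) {             // slow path: check other ways
            if (w == p) continue;
            probes++;
            if (valid[set][w] && tags[set][w] == tag) {
                predictedWay[set] = w;               // retrain the predictor
                return data[set][w];
            }
        }
        return -1;                                   // miss: fall back to resolution
    }
}
```

A correctly predicted access costs one comparison, and a misprediction costs extra probes, which mirrors the hardware trade-off against the fully associative design of Fig. 2, where all comparators are driven on every access.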
3.3 Variable Two Block Method Cache
The concept of the variable two-block method cache was introduced by M. Schoeberl in [13] to make the instruction cache time-predictable. According to [13], Java programs typically consist of short methods, and there are no branches out of a method. This cache is filled only on calls and returns. To see why a single-method cache is inadequate, consider:

xyz() {
    a();
    b();
}

Since the cache is accessed on calls and returns, method xyz() may be filled multiple times, whenever methods a() and b() return, under an LRU replacement policy. First xyz() is invoked and cached; then method a() is invoked and xyz() is replaced by a() in the cache; after a() returns, xyz() is cached again; and the same process repeats when b() is invoked. This issue can be solved by caching two or more methods, so we will consider a cache capable of storing two methods.
Two replacement policies can be used: next-block and stack-oriented replacement. Consider the following example:

x() {
    for (;;) {
        y();
        z();
    }
}

Let the block sizes of x(), y(), and z() be 2, 1, and 2, and let the cache consist of four blocks. The next-block replacement policy uses a next pointer, which initially points to the first block. When a method of length l is loaded at block n, next is updated to (n + l) % block_count. The stack-oriented block replacement policy updates the next pointer in the same way on a method load, but also updates it on a method return, so that it points to the first block of the leaving method.
Instruction   X()    Y()    Ret    Z()    Ret    Y()    Ret    Z()    Ret    Y()    Ret    Z()    Ret
Block 1       X      x      x      Z      z      -->-   -->-   Z      -->z   Y      Y      y      X
Block 2       X      x      x      -->-   X      x      x      Z      z      -->-   -->-   Z      -->z
Block 3       -->-   Y      y      y      X      x      x      -->-   X      x      x      Z      z
Block 4       -      -->-   -->-   Z      -->z   Y      y      y      X      x      x      -->-   X
Table. 1 Next block replacement policy

Instruction   X()    Y()    Ret    Z()    Ret    Y()    Ret    Z()    Ret    Y()    Ret    Z()    Ret
Block 1       X      x      x      Z      X      x      -->x   Z      -->z   Y      Y      y      y
Block 2       X      x      -->x   Z      -->-   -      -      Z      z      -->-   -      -->-   X
Block 3       -->-   Y      y      y      y      -->y   y      -->y   X      x      x      Z      X
Block 4       -      -->-   -      -->-   X      x      x      -      X      x      -->x   Z      -->-
Table. 2 Stack oriented block replacement policy
The two tables above show the cache contents during program execution under both replacement policies. X, Y, Z indicate a method loaded into the cache for the first time; x, y, z indicate a method already present in the cache. The pointer --> marks the block that can be replaced on a method call or return, according to the replacement strategy in use. Next-block replacement is used in this cache design because, if the block size of method z() is reduced to one block, all three methods fit in the cache under next-block replacement, while under the stack-oriented policy the methods would still keep displacing one another. When all three methods fit in the cache, there are no placement conflicts, so the next-block replacement policy is the better option.
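The next-pointer arithmetic above can be sketched as a small simulation (the class and method names are illustrative; note that a real cache must also invalidate the remaining blocks of a partially overwritten method, which the sketch models explicitly):

```java
/** Sketch of the next-block replacement policy for the method cache:
 *  a method of length len loaded at block n advances next to
 *  (n + len) % blockCount, and a partially overwritten method is
 *  evicted entirely. */
public class NextBlockCache {
    final String[] block;          // each slot names the method occupying it
    int next = 0;                  // first block to replace on the next load

    NextBlockCache(int blocks) { block = new String[blocks]; }

    /** Models a call or return; returns true on a cache hit. */
    boolean call(String method, int len) {
        for (String b : block)
            if (method.equals(b)) return true;            // already cached
        for (int i = 0; i < len; i++) {
            int slot = (next + i) % block.length;
            String victim = block[slot];
            if (victim != null)                           // evict victim entirely
                for (int j = 0; j < block.length; j++)
                    if (victim.equals(block[j])) block[j] = null;
            block[slot] = method;                         // fill this block
        }
        next = (next + len) % block.length;               // advance next pointer
        return false;
    }
}
```

Replaying the example with x(), y(), and z() of sizes 2, 1, and 2 in a four-block cache reproduces Table 1: z() overwrites x()'s blocks, so the return into x() misses and x() is reloaded.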
3.4 Memory Manager with Garbage Collector
Operations performed by the memory manager execute in parallel with the CPU in constant time. The memory manager is responsible for managing the Java heap: it allocates objects, performs read and write operations on them, and frees the memory used by unreferenced objects. Memory is divided into equal-sized segments to achieve constant allocation time. One of the segments, called the current allocation segment, is used for allocating all new objects; when the space left in the current allocation segment is smaller than the new object, a new allocation segment is selected. This memory management with garbage collection is similar to the design presented in [12].
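The constant-time allocation scheme can be sketched as a bump-pointer allocator over equal-sized segments (the segment size, handle encoding, and names are my own assumptions; garbage collection, which would recycle freed segments, is omitted):

```java
/** Sketch of constant-time, segment-based allocation. */
public class SegmentAllocator {
    static final int SEGMENT_SIZE = 1024;  // equal-sized segments (words), assumed
    final int segmentCount;
    int currentSegment = 0;                // the current allocation segment
    int offset = 0;                        // bump pointer within that segment

    SegmentAllocator(int segments) { this.segmentCount = segments; }

    /** Allocates size words; returns a (segment, offset) handle, or -1 if full. */
    long allocate(int size) {
        if (size > SEGMENT_SIZE) return -1;        // larger than any segment
        if (SEGMENT_SIZE - offset < size) {        // does not fit: new segment
            if (++currentSegment >= segmentCount) return -1;  // heap exhausted
            offset = 0;                            // (a GC would recycle segments
        }                                          //  here instead of failing)
        long handle = ((long) currentSegment << 32) | offset;
        offset += size;                            // bump-pointer allocation: O(1)
        return handle;
    }
}
```

Every allocation performs a bounded amount of work regardless of heap state, which is the property the proposed memory manager needs so that it can run in parallel with the CPU in constant time.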
4. CONCLUSION
A Java processor with a hardware-based Java Virtual Machine has been presented in this paper. The model takes advantage of a stack-less Java processor design and reduces the constant pool access time by using a way-predictive JLAB, and it is well suited for embedded applications because of its simple design and modest memory usage.

Besides these advantages, there are areas where more work can be done: the method cache can be configured so that far fewer misses occur, and for the way-prediction buffer the prediction calculation can be made more effective than present methods, such as using the PC to index the prediction table or calculating the address by XORing a register with an offset.


REFERENCES
[1] Herbert Schildt, Java: The Complete Reference, 7th ed. New York, USA: McGraw-Hill, 2006.
[2] Yi-Yu Tan; Yau, C. H.; Lo, K. M.; Yu, W. S.; Pak-Lun Mok; Shi-Sheung Fong, A.,
"Design and implementation of a Java processor," Computers and Digital Techniques,
IEE Proceedings - , vol.153, no.1, pp.20,30, 10 Jan. 2006
[3] O'Connor, J.M., and Tremblay, M.: PicoJava-I: the Java virtual machine in hardware, IEEE Micro, March 1997, 17, (2), pp. 45-53
[4] Sun Microsystems: PicoJava-II: Java processor core. Sun Microsystems data sheet,
April 1998.
[5] aJile Systems, Inc.: aJ-100 real-time low power Java processor, aJ-100 reference manual. Version 2.1, December 2001
[6] Aurora VLSI Inc.: AU-J2000: super high performance Java processor core (data sheet).
Aurora VLSI Inc., 2000
[7] M. Watheq El-Kharashi, Fayez Elguibaly, and Kin F. Li. 2001. Adapting Tomasulo's
algorithm for bytecode folding based Java processors. SIGARCH Comput. Archit.
News 29, 5 (December 2001), 1-8
[8] El-Kharashi, M.W.; Gebali, F.; Li, K.F.; Fang Zhang, "The JAFARDD processor: a Java
architecture based on a Folding Algorithm, with Reservation stations, Dynamic
translation, and Dual processing," Consumer Electronics, IEEE Transactions on , vol.48,
no.4, pp.1004,1015, Nov 2002
[9] Martin Schoeberl, JOP: A Java Optimized Processor for Embedded Real-Time Systems, Ph.D. Thesis, Technische Universität Wien, Jan 2005.
[10] N. Shimizu, M. Naito, Dual Issue Queued Pipelined Java Processor TRAJA, Toward an Open Source Processor Project, Proceedings of the First IEEE Asia Pacific Conference on ASICs, pp. 213-216, 1999
[11] Naohiko Shimizu and Chiaki Kon. 2003. "Java object look aside buffer for embedded
applications". SIGARCH Comput. Archit. News 32, 3 (September 2003), 43-49.
[12] Zabel, M.; Preusser, T.B.; Reichel, P.; Spallek, R.G., "Secure, Real-Time and Multi-
Threaded General-Purpose Embedded Java Microarchitecture," Digital System Design
Architectures, Methods and Tools, 2007. DSD 2007. 10th Euromicro Conference on, vol.,
no., pp.59,62, 29-31 Aug. 2007
[13] Martin Schoeberl, "Time-predictable cache organization," in Proceedings of the First International Workshop on Software Technologies for Future Dependable Distributed Systems (STFSSD 2009), Tokyo, Japan.
[14] T. Lindholm, F. Yellin, The Java Virtual Machine Specification, Addison-Wesley, 1997
