IMPLEMENTATION OF JVM TOOL INTERFACE ON DALVIK VIRTUAL MACHINE
CHAPTER 1
INTRODUCTION
1.1 Background
Android is a software platform and operating system for mobile devices, based on the Linux operating system and developed by Google and the Open Handset Alliance. Applications are primarily written as managed code in the Java language using Google-developed Java libraries; native code is supported only as a supplement, through the Android NDK.
The Android platform consists of a Linux operating system, core libraries for the Java programming language, a virtual machine for Java applications (the Dalvik virtual machine), and some key applications (e.g. browser, maps). The Dalvik virtual machine (DVM) is a major component of Google’s Android platform. It is optimized for low memory requirements and is designed to allow multiple VM instances to run at the same time. The virtual machine runs Java applications, but it differs from a standard Java virtual machine in two major ways. First, most virtual machines, including the standard Java virtual machine, use a stack-based architecture, whereas Dalvik uses a register-based architecture. Second, Dalvik runs Java applications that have been transformed into the Dalvik Executable (.dex) format. Because of these differences, tools developed for standard Java virtual machines, such as profiling tools, cannot be ported to the DVM without modification.
In this report, we discuss the development of a profiling tool interface, JVM TI, on Android. With this tool interface, developers can profile their Java code running on Dalvik.
1.2 Objective
The Java virtual machine tool interface (JVM TI) is a native programming interface of the Java virtual machine. The interface provides functions to inspect the state of a virtual machine, gather information at run time, and control the execution of applications running on the Java virtual machine. JVM TI has been used by many profiling tools such as HProf, YAJP, and JBoss. However, JVM TI is not available in all implementations of the Java virtual machine, and Dalvik is one of the virtual machines that do not support it.
In this work, we discuss the implementation of JVM TI on the Dalvik virtual machine. With the tool interface, programmers can easily develop their own tools or use available JVM TI-based profiling tools to profile applications on the Dalvik virtual machine.
1.3 Organization of the report
This report is divided into six chapters. The first chapter describes the need for the work and its objective. The second chapter describes the structure of the JVM and its operating principle. The third chapter describes JVM TI and its operation. The fourth chapter deals with the Dalvik virtual machine and its architecture. The fifth chapter describes the implementation of the JVM TI interface in the DVM. The report is summarized and concluded in chapter six.
CHAPTER 2
STUDY OF JAVA VIRTUAL MACHINE
The JVM is a platform-independent execution environment that converts Java bytecode into machine language and executes it. Compilers for most programming languages translate source code directly into machine code designed to run on a specific microprocessor architecture or operating system, such as Windows or UNIX. A JVM, a machine within a machine, mimics a real Java processor, enabling Java bytecode to be executed as actions or operating system calls on any processor regardless of the operating system. For example, establishing a socket connection from a workstation to a remote machine involves an operating system call. Since different operating systems handle sockets in different ways, the JVM translates the programming code so that the two machines, which may be on different platforms, are able to connect.
2.1 Architecture of Java virtual machine

Programs intended to run on a JVM must be compiled into Java bytecode, a standardized portable binary format which typically comes in the form of .class files (Java class files). A program may consist of many classes in different files. For easier distribution of large programs, multiple class files may be packaged together in a .jar file (short for Java archive). The Java application launcher, java, offers a standard way of executing Java code.
The JVM runtime executes .class or .jar files, emulating the JVM instruction set by interpreting it or by using a just-in-time (JIT) compiler such as the one in Oracle's HotSpot. JIT compilation, rather than pure interpretation, is used in most JVMs today to achieve greater speed. There are also ahead-of-time compilers that enable developers to precompile class files into native code for particular platforms.
Like most virtual machines, the Java virtual machine has a stack-based architecture akin to a microcontroller/microprocessor. However, the JVM also has low-level support for Java-like classes and methods, which amounts to a highly idiosyncratic memory model and capability-based architecture.
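To make the stack-based model concrete, the following is a minimal sketch in C. It is not JVM or Dalvik code: the opcodes, struct names, and the register variant shown for contrast are all invented for illustration. It evaluates c = a + b once on a toy stack machine (four instructions) and once on a toy register machine (one instruction).

```c
#include <assert.h>
#include <stddef.h>

/* Toy stack machine: operands are pushed and popped implicitly. */
enum StackOp { S_LOAD, S_ADD, S_STORE };
typedef struct { enum StackOp op; int slot; } StackInsn;

/* Runs a stack program over the local-variable array `locals`. */
static void run_stack(const StackInsn *code, size_t n, int *locals) {
    int stack[16];
    int sp = 0;
    for (size_t i = 0; i < n; i++) {
        switch (code[i].op) {
        case S_LOAD:  assert(sp < 16); stack[sp++] = locals[code[i].slot]; break;
        case S_ADD:   assert(sp >= 2); sp--; stack[sp - 1] += stack[sp];   break;
        case S_STORE: assert(sp >= 1); locals[code[i].slot] = stack[--sp]; break;
        }
    }
}

/* Toy register machine: every instruction names its operands explicitly. */
enum RegOp { R_ADD };
typedef struct { enum RegOp op; int dst, src1, src2; } RegInsn;

static void run_reg(const RegInsn *code, size_t n, int *regs) {
    for (size_t i = 0; i < n; i++) {
        if (code[i].op == R_ADD)
            regs[code[i].dst] = regs[code[i].src1] + regs[code[i].src2];
    }
}

/* Computes a + b on both machines; returns 1 when both give a + b. */
int demo_add(int a, int b) {
    int locals[3] = { a, b, 0 };
    const StackInsn sprog[] = {
        { S_LOAD, 0 }, { S_LOAD, 1 }, { S_ADD, 0 }, { S_STORE, 2 }
    };
    run_stack(sprog, 4, locals);

    int regs[3] = { a, b, 0 };
    const RegInsn rprog[] = { { R_ADD, 2, 0, 1 } };
    run_reg(rprog, 1, regs);

    return locals[2] == regs[2] && regs[2] == a + b;
}
```

The stack program is longer but each instruction is small; the register program is shorter but each instruction must carry its operand numbers.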
CHAPTER 3
JAVA VIRTUAL MACHINE TOOL INTERFACE
The JVM™ Tool Interface (JVM TI) is a programming interface used by development and monitoring tools. It provides a way both to inspect the state and to control the execution of applications running in the Java virtual machine (VM). JVM TI is intended to provide a VM interface for the full breadth of tools that need access to VM state, including but not limited to: profiling, debugging, monitoring, thread analysis, and coverage analysis tools.
JVM TI may not be available in all implementations of the Java virtual machine. JVM TI is a two-way interface. A client of JVM TI, hereafter called an agent, can be notified of interesting occurrences through events. The agent can query and control the application through many functions, either in response to events or independent of them.
Agents run in the same process with and communicate directly with the virtual machine executing the application being examined. This communication is through a native interface (JVM TI). The native in-process interface allows maximal control with minimal intrusion on the part of a tool. Typically, agents are relatively compact. They can be controlled by a separate process which implements the bulk of a tool's function without interfering with the target application's normal execution.


Tools can be written directly to JVM TI or indirectly through higher level interfaces. The Java Platform Debugger Architecture includes JVM TI, but also contains higher-level, out-of-process debugger interfaces. The higher-level interfaces are more appropriate than JVM TI for many tools. For more information on the Java Platform Debugger Architecture, see the Java Platform Debugger Architecture website.
· Writing Agents
Agents can be written in any native language that supports C language calling conventions and C or C++ definitions.
The function, event, data type, and constant definitions needed for using JVM TI are defined in the include file jvmti.h. To use these definitions, add the J2SE™ include directory to your include path and add #include <jvmti.h> to your source code.
· Deploying Agents
An agent is deployed in a platform specific manner but is typically the platform equivalent of a dynamic library. On the Windows operating system, for example, an agent library is a "Dynamic Linked Library" (DLL). On the Solaris Operating Environment, an agent library is a shared object (.so file).
An agent may be started at VM startup by specifying the agent library name using a command line option. Some implementations may support a mechanism to start agents in the live phase. The details of how this is initiated are implementation specific.
CHAPTER 4
DALVIK VIRTUAL MACHINE
The Dalvik virtual machine is implemented by Google for the Android OS and functions as the interpreter for Java code running on Android devices. It is a process virtual machine: each application process in Android runs its own Dalvik VM instance on top of the underlying Linux kernel. This reduces the chance of multi-application failure if one Dalvik VM crashes. Dalvik implements the register machine model and, unlike standard Java bytecode (which consists of 8-bit stack instructions executed on the stack-based JVM), uses an instruction set built from 16-bit code units. Register operands are encoded in the instructions, in many cases as 4-bit fields. To look more deeply into how each process gets an instance of the Dalvik VM, we have to go back to the beginning, to where the Linux kernel of the Android OS boots up:

When the system boots up, the boot loader loads the kernel into memory and initializes system parameters. Soon after this,
- The kernel runs the Init program, which is the parent process for all processes in the system.
- The Init program starts system daemons and the very important ‘Zygote’ service.
- The Zygote process creates a Dalvik instance which will be the parent Dalvik process for all Dalvik VM instances in the system.
- The Zygote process also sets up a socket and listens on it for incoming requests.
- When a new request for a Dalvik VM instance is received, the Zygote process forks the parent Dalvik VM process, and the child process becomes the process of the requesting application.
This is, in essence, how the Dalvik virtual machine is created and used in the Android system.
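The fork step can be sketched in a few lines of POSIX C. This is only an illustration of the idea, not Android source code: the function names and the preloaded_classes counter are hypothetical, the request socket is left out, and the child simply reports the inherited state instead of running an application.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* State that the parent ("zygote") prepares once, before any fork. */
static int preloaded_classes = 0;

/* Hypothetical stand-in for VM start-up and class preloading. */
void zygote_init(void) {
    preloaded_classes = 42;
}

/*
 * Forks one child for one request. The child inherits the already
 * initialized state and reports it through its exit status; a real
 * child would go on to run the application instead of exiting.
 * Returns the child's exit status, or -1 on failure.
 */
int zygote_fork_for_request(void) {
    assert(preloaded_classes != 0);     /* zygote_init() must run first */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(preloaded_classes);       /* child: no re-initialization needed */

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Because the child starts as a copy of an initialized parent, it does not pay the start-up cost again, which is the point of forking from a single parent VM.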
Coming back to the topic of virtual machines, Dalvik differs from the Java virtual machine in that it executes Dalvik byte code rather than traditional Java byte code. An intermediate step between the Java compiler and the Dalvik VM converts the Java byte code to Dalvik byte code; this step is performed by the DEX compiler (the dx tool).
CHAPTER 5
IMPLEMENTATION OF JVM TOOL INTERFACE ON DALVIK VIRTUAL MACHINE
5.1 Design Philosophy
JVM TI is not supported on the original Dalvik virtual machine, so we need to design a new subsystem on Dalvik for JVM TI. In this report, we propose an implementation of JVM TI on Dalvik. Since the Dalvik virtual machine is small, simple, and memory-efficient, we follow some basic principles in designing our Java virtual machine tool interface on it.
• Low overhead: As Dalvik is a simple virtual machine, the tool interface that runs on it should not cause too much overhead.
• Memory efficiency: Since Dalvik targets mobile devices, which have memory constraints, JVM TI should not take too much memory space.
• Easy to upgrade: As Android is upgraded, our interface should be easy to port to the latest version. For this reason, we keep our code separate from the Dalvik source code.
5.2 Architecture of JVM TI in DVM
Our implementation of JVM TI consists of two parts: “virtual machine events and handlers” and “functions provided by the tool interface”. Figure 1 shows the architecture of our implementation. We add two subsystems to Dalvik: the event handlers and the implementation of each function. The grey blocks are the source code of each Dalvik subsystem. For example, the block “Thread” represents the implementation of Java threads in Dalvik, including the thread structure and the functions that maintain threads, such as creation and stopping. The white block inside each grey block represents what we add to or modify in that subsystem, including events and additional functions or structures to support JVM TI.

5.3 JVM TI Events Implementation
We use a separate event handler for each event instead of a single handler for all events. This lowers the overhead of event handling, because no dispatch is needed to switch between different events. More importantly, the variables that must be passed to each callback differ considerably, so this design suits that requirement better. The design also makes it convenient to apply our JVM TI to newer versions of the Dalvik virtual machine when Android is upgraded: users simply insert the calls to the separate event handler functions at the locations we define, and get the same result.
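A minimal sketch of this per-event design in C follows. All names here are hypothetical and greatly simplified (real JVM TI agents register callbacks through the jvmtiEventCallbacks structure, and thread and method handles are not plain integers). The point is only that each event has its own posting function with exactly the arguments that event needs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified callback table with one slot per event. */
typedef struct {
    void (*thread_start)(int thread_id);
    void (*method_entry)(int thread_id, int method_id);
} AgentCallbacks;

static AgentCallbacks g_callbacks;   /* zero-initialized: nothing registered */

void set_event_callbacks(const AgentCallbacks *cb) {
    assert(cb != NULL);
    g_callbacks = *cb;
}

/* One posting function per event; the VM calls these at fixed locations. */
void post_thread_start(int thread_id) {
    if (g_callbacks.thread_start != NULL)
        g_callbacks.thread_start(thread_id);
}

void post_method_entry(int thread_id, int method_id) {
    if (g_callbacks.method_entry != NULL)
        g_callbacks.method_entry(thread_id, method_id);
}

/* Example agent side: counts method entries. */
static int entries_seen = 0;

static void on_method_entry(int thread_id, int method_id) {
    (void)thread_id;
    (void)method_id;
    entries_seen++;
}

/* Registers one callback, posts three events, returns the entry count. */
int demo_events(void) {
    AgentCallbacks cb = { NULL, on_method_entry };
    entries_seen = 0;
    set_event_callbacks(&cb);
    post_thread_start(1);       /* no callback registered: cheap no-op */
    post_method_entry(1, 7);
    post_method_entry(1, 8);
    return entries_seen;
}
```

Porting to a newer VM version then amounts to placing calls such as post_method_entry at the matching points in the new source tree.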
5.4 Challenge and Contribution of Implementation
· Class File Load Hook, Byte-code Instrumentation
The class file load hook event is triggered when the virtual machine obtains class file data, but before it constructs the in-memory representation of that class. At this event, agents are permitted to modify the original class file, which is called “load-time byte-code instrumentation”. [13]
There are some problems in implementing this event and performing load-time byte-code instrumentation. First, the way Dalvik loads a class is different from that of a standard virtual machine. Dalvik maps a single dex file into read-only memory for a Java application when Dalvik is initialized. When a class is loaded into Dalvik, its different pieces of data point to different locations in the dex file.
To solve this problem, we modify the way Dalvik loads methods. If an agent has registered for the class file load hook event, we make a copy of every method when it is loaded. We leave the class pointing to the original location of these methods and then invoke the agent's callback for this event. The agent can modify the method because the copies reside in writable space. After the callback returns, we check whether the method was modified. If the agent did not modify the methods, we free the copies to reclaim the memory. If the agent modified the methods, we make the class point to the copies. This approach causes some memory overhead.
This approach also has its limitation. Dalvik is a register-based virtual machine; when class files are transformed into a single dex file, the dex tool assigns a number of registers to every method. This number is kept in the dex file and used during verification. The verifier checks the dex file directly, and we cannot modify it. So if the byte-code to be injected needs more registers than the dex tool assigned to the method, the instrumentation cannot be done. This limits byte-code instrumentation in Dalvik. A better solution would be to modify the dex tool to reserve registers for byte-code instrumentation.
Second, Dalvik executes a file format different from the standard “class” file, and Dalvik runs its own byte-code. This means we cannot use the library that Sun provides for byte-code instrumentation. We solve this problem as follows: we compile the tracker class (whose method call is to be injected into the selected class) together with the user program, and use the “dexdump” tool to find the position of the tracker class and thus the byte-codes of a method call to it. The position of the tracker class differs between applications after the transformation by the dex tool, so the byte-codes for the method call differ from application to application. As a result, for now we need to adjust the injected method call for each application.
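The copy, callback, compare sequence can be sketched as follows. This is an illustration under stated assumptions, not the actual patch: the Method struct, the LoadHook type, and the function names are hypothetical, and the hook is assumed to edit the code in place without changing its length.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified method record; not Dalvik's real Method struct. */
typedef struct {
    const uint16_t *insns;      /* points into read-only dex data, or to a copy */
    size_t          count;      /* number of 16-bit code units */
    int             owns_copy;  /* 1 once insns points to heap memory */
} Method;

typedef void (*LoadHook)(uint16_t *code, size_t count);

/*
 * Hands the agent a writable copy of the method's code and keeps the
 * copy only if the agent changed it. Returns 1 if the method was
 * modified, 0 if not, -1 on allocation failure.
 */
int run_load_hook(Method *m, LoadHook hook) {
    assert(m != NULL && hook != NULL && m->count > 0);
    size_t bytes = m->count * sizeof(uint16_t);
    uint16_t *copy = malloc(bytes);
    if (copy == NULL)
        return -1;
    memcpy(copy, m->insns, bytes);

    hook(copy, m->count);                       /* agent may edit the copy */

    if (memcmp(copy, m->insns, bytes) == 0) {   /* untouched: reclaim memory */
        free(copy);
        return 0;
    }
    m->insns = copy;                            /* modified: use the copy */
    m->owns_copy = 1;
    return 1;
}

static void hook_noop(uint16_t *code, size_t count) { (void)code; (void)count; }
static void hook_patch(uint16_t *code, size_t count) { (void)count; code[0] = 0; }

/* Runs one hook that changes nothing and one that patches the code. */
int demo_load_hook(void) {
    static const uint16_t dex_code[] = { 0x1234 };  /* arbitrary stand-in data */
    Method m = { dex_code, 1, 0 };
    int unchanged = run_load_hook(&m, hook_noop);
    int changed = run_load_hook(&m, hook_patch);
    int ok = (unchanged == 0 && changed == 1 && m.insns != dex_code);
    if (m.owns_copy)
        free((void *)m.insns);
    return ok;
}
```

The memory overhead mentioned in the text corresponds to the temporary copy made for every method, even for methods the agent leaves untouched.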
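The register constraint amounts to a simple check before injecting code. The sketch below is hypothetical (CodeInfo and can_instrument are illustrative names); it only assumes, as described in the text, that the dex file records a fixed register count per method.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified view of the per-method register count stored in the dex file. */
typedef struct {
    uint16_t registers_size;   /* registers the dex tool assigned to the method */
} CodeInfo;

/*
 * Returns 1 if code that needs `needed` registers can be injected
 * without exceeding the count recorded in the dex file, else 0.
 */
int can_instrument(const CodeInfo *info, uint16_t needed) {
    assert(info != NULL);
    return needed <= info->registers_size;
}
```

For a method that was assigned three registers, injected code that needs two passes this check, while code that needs four is rejected.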
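To show why the injected bytes differ per application, the sketch below encodes a zero-argument invoke-static instruction in Dalvik's three-unit 35c format. The helper name is hypothetical, and the method index would in practice be read from the dexdump output for each application; it is not a fixed value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OP_INVOKE_STATIC 0x71   /* Dalvik opcode for invoke-static */

/*
 * Encodes "invoke-static {}, method@method_idx" with no arguments.
 * Format 35c uses three 16-bit code units: A|G|op, BBBB, F|E|D|C.
 */
void encode_tracker_call(uint16_t method_idx, uint16_t out[3]) {
    assert(out != NULL);
    out[0] = OP_INVOKE_STATIC;  /* A = 0 arguments, G = 0 */
    out[1] = method_idx;        /* index into this dex file's method table */
    out[2] = 0;                 /* no argument registers */
}
```

Only the middle code unit changes between applications, since it is the index of the tracker method in that application's own dex file.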
• Agent Data Management
Agents can attach data to a target object or thread as a tracker. Dalvik's original structures for objects and threads do not reserve space for such information, so we need to modify them. In the thread and object structures, we reserve four bytes for a pointer to agent data and another four bytes for additional data that agents need to keep.
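A sketch of the extended object structure might look like this. The layout and names are hypothetical, not Dalvik's real Object definition; note also that the pointer field is four bytes only on a 32-bit target, which is what the text assumes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified object header; not Dalvik's real Object struct. */
typedef struct {
    void    *clazz;         /* existing field: the object's class */
    uint32_t lock;          /* existing field: lock word */
    void    *agent_data;    /* added: pointer to agent-owned data */
    uint32_t agent_extra;   /* added: extra word reserved for the agent */
} Object;

void set_agent_data(Object *obj, void *data) {
    assert(obj != NULL);
    obj->agent_data = data;
}

void *get_agent_data(const Object *obj) {
    assert(obj != NULL);
    return obj->agent_data;
}

/* Attaches a tracker value to an object and reads it back. */
int demo_tag(void) {
    static int tracker = 123;
    Object obj = { NULL, 0, NULL, 0 };
    set_agent_data(&obj, &tracker);
    return *(int *)get_agent_data(&obj);
}
```

Keeping the data inside the object avoids a separate lookup table, at the cost of eight extra bytes per object on a 32-bit device.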
• Object Monitor Information
Another Dalvik structure we modify is the object monitor. Java object monitors in Dalvik are implemented with pthreads. The monitor structure is therefore simple and does not keep the information that JVM TI exposes. For example, the threads waiting to own the object are unknown, and the threads waiting to be notified by the monitor are not recorded in Dalvik either.
We modify the object monitor structure of Dalvik. We add two lists, one for the threads waiting to own the object and one for the threads waiting to be notified by the monitor, from which the number of threads in each group can be calculated. We also modify the Dalvik functions that provide monitor functionality so that this information stays accurate. To lower the overall execution-time overhead, the added information is maintained only if an agent has registered for the monitor events.
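The bookkeeping can be sketched as below. Everything here is a simplified assumption: threads are plain integer ids, the lists are small fixed arrays, and the locking itself is omitted. The two lists correspond to the waiters and notify waiters that JVM TI reports through GetObjectMonitorUsage.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WAITERS 8   /* arbitrary bound for this sketch */

typedef struct {
    int ids[MAX_WAITERS];
    int count;
} ThreadList;

/* Hypothetical, simplified monitor; the real one also wraps a pthread mutex. */
typedef struct {
    int        tracking;        /* set only when an agent enabled monitor events */
    ThreadList entry_waiters;   /* threads blocked trying to own the monitor */
    ThreadList notify_waiters;  /* threads waiting to be notified */
} Monitor;

static void list_add(ThreadList *l, int tid) {
    assert(l->count < MAX_WAITERS);
    l->ids[l->count++] = tid;
}

static void list_remove(ThreadList *l, int tid) {
    for (int i = 0; i < l->count; i++) {
        if (l->ids[i] == tid) {
            l->ids[i] = l->ids[--l->count];   /* order is not preserved */
            return;
        }
    }
}

/* Hooks called from the monitor functions; no-ops unless tracking is on. */
void on_contended_enter(Monitor *m, int tid) {
    if (m->tracking) list_add(&m->entry_waiters, tid);
}

void on_entered(Monitor *m, int tid) {
    if (m->tracking) list_remove(&m->entry_waiters, tid);
}

void on_wait(Monitor *m, int tid) {
    if (m->tracking) list_add(&m->notify_waiters, tid);
}

void on_waited(Monitor *m, int tid) {
    if (m->tracking) list_remove(&m->notify_waiters, tid);
}
```

When tracking is off, each hook reduces to a single flag test, which keeps the cost low for applications that run without an agent.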
• Event Callbacks Scheduling
JVM TI agents need to synchronize the execution of event callbacks, because callbacks often modify global data. JVM TI provides raw monitors for this purpose. Java threads in Dalvik are implemented with pthreads, and so are agent threads. We therefore create a structure to maintain the raw monitor and implement it with the pthread library. Since a raw monitor, unlike an object monitor, does not need to provide any information to the agent, we simplify the raw monitor structure and use the functions that the thread library provides to implement functionality such as notify and wait.
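A raw monitor built directly on pthreads can be as small as a mutex plus a condition variable. The sketch below is one possible shape, with hypothetical names; it is simplified compared with the JVM TI functions it mirrors (for example, it is not recursive and its wait has no timeout).

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical raw monitor: a mutex plus one condition variable. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
} RawMonitor;

/* Allocates and initializes a monitor; returns NULL on failure. */
RawMonitor *raw_monitor_create(void) {
    RawMonitor *m = malloc(sizeof *m);
    if (m == NULL)
        return NULL;
    if (pthread_mutex_init(&m->lock, NULL) != 0) {
        free(m);
        return NULL;
    }
    if (pthread_cond_init(&m->cond, NULL) != 0) {
        pthread_mutex_destroy(&m->lock);
        free(m);
        return NULL;
    }
    return m;
}

/* Each function returns 0 on success, like the pthread call it wraps. */
int raw_monitor_enter(RawMonitor *m) { return pthread_mutex_lock(&m->lock); }
int raw_monitor_exit(RawMonitor *m)  { return pthread_mutex_unlock(&m->lock); }

/* The caller must own the monitor; the mutex is released while waiting. */
int raw_monitor_wait(RawMonitor *m)       { return pthread_cond_wait(&m->cond, &m->lock); }
int raw_monitor_notify(RawMonitor *m)     { return pthread_cond_signal(&m->cond); }
int raw_monitor_notify_all(RawMonitor *m) { return pthread_cond_broadcast(&m->cond); }

void raw_monitor_destroy(RawMonitor *m) {
    assert(m != NULL);
    pthread_cond_destroy(&m->cond);
    pthread_mutex_destroy(&m->lock);
    free(m);
}
```

A callback that updates shared agent data would wrap the update between raw_monitor_enter and raw_monitor_exit, so that callbacks running on different threads do not interleave.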
CHAPTER 6
CONCLUSION
With good profiling methods, developers of embedded systems can reduce the effort needed to identify design and development problems and can optimize the design. A general-purpose tool interface helps developers achieve this goal.
With our implementation of JVM TI on Dalvik, profiling tools for Java based on JVM TI can be used on Dalvik. Furthermore, with the events and functions that JVM TI supports, users can develop their own tools on Dalvik.
REFERENCES
[3] Java Virtual Machine Tool Interface Spec: http://java.sun.com/javase/6/docs/platform/jvmti/jvmti.html