Virtualizing Java in Java

How?
Alexey Ragozin

jug.ru
12 Dec 2013
Virtualizing Java in Java
Single JVM hosting multiple “pseudo JVMs”, which
• Have independent system properties
• Have independent statics
• May have different classpath
• May be forcefully terminated or suspended/resumed

You can deploy whole distributed application
topology inside of JUnit test
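
One way to picture "independent system properties" inside a single JVM is a Properties object that routes each call to a per-thread copy. The sketch below is only an illustration under that assumption, not the GridKit implementation; the class names, the enterNode() method and the "sketch.node.name" key are all hypothetical.

```java
import java.util.Properties;

public class MultiplexedPropertiesSketch {

    /** Hypothetical: routes property reads and writes to the calling thread's "node". */
    static class MultiplexedProperties extends Properties {
        private final Properties shared;
        // Child threads inherit the value, so threads spawned by a node stay in that node.
        private final InheritableThreadLocal<Properties> perNode = new InheritableThreadLocal<>();

        MultiplexedProperties(Properties shared) {
            this.shared = shared;
        }

        /** Gives the current thread (and threads it starts later) a private copy. */
        void enterNode() {
            Properties copy = new Properties();
            copy.putAll(shared);
            perNode.set(copy);
        }

        private Properties current() {
            Properties own = perNode.get();
            return own != null ? own : shared;
        }

        @Override
        public String getProperty(String key) {
            return current().getProperty(key);
        }

        @Override
        public Object setProperty(String key, String value) {
            return current().setProperty(key, value);
        }
    }

    /** Runs a "node" thread that sets a property, then reads it back from a child thread. */
    static String runInNode(MultiplexedProperties props, String nodeName) throws InterruptedException {
        String[] seen = new String[1];
        Thread nodeThread = new Thread(() -> {
            props.enterNode();
            System.setProperty("sketch.node.name", nodeName);
            Thread child = new Thread(() -> seen[0] = System.getProperty("sketch.node.name"));
            child.start();
            try {
                child.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        nodeThread.start();
        nodeThread.join();
        return seen[0];
    }

    /** Returns what node A, node B and the master thread see for the same key. */
    static String[] demo() throws InterruptedException {
        Properties original = System.getProperties();
        MultiplexedProperties multiplexed = new MultiplexedProperties(original);
        System.setProperties(multiplexed);
        try {
            String a = runInNode(multiplexed, "node-A");
            String b = runInNode(multiplexed, "node-B");
            String master = System.getProperty("sketch.node.name");
            return new String[] {a, b, master};
        } finally {
            System.setProperties(original); // always restore the real properties
        }
    }

    public static void main(String[] args) throws InterruptedException {
        for (String value : demo()) {
            System.out.println(value);
        }
    }
}
```

Each node thread sees only its own value, and the master thread never sees the key at all. Code that iterates over System.getProperties() directly would bypass this simple override, so a real implementation needs to cover more of the Properties API.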
All started with Oracle Coherence
Native Java distributed peer-to-peer key/value storage
With a lot of extension points
 Distributed data processing
 Listeners for remote events
 Read-through / write-through patterns
 Pluggable serialization
And inability to start two cluster members in one JVM
And we need to test
Behavior of very specific features
Serialization/deserialization defects
Data routing and collocation aspects
Code meant for distributed execution
Threading aspects
Classpath differences between cluster
processes
Cluster configuration tweaks
Multiplexing singletons
Custom classloader
Force to reload classes already loaded by
parent (second copy of class loaded)
Got our first cluster-in-JVM alive
 InheritableThreadLocal to add fancy stuff
– Multiplexing system properties
– Multiplexing console output
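
The "second copy of class" trick can be sketched with a child-first class loader: instead of delegating to the parent, it reads the class bytes from the parent's resources and defines the class again, so each loader gets its own statics. This is a minimal sketch under the assumption that class files are reachable as resources; IsolatingClassLoader and Counter are hypothetical names, not taken from GridKit.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.Method;

public class IsolationSketch {

    /** Singleton-like class with static state; each isolate should get its own copy. */
    public static class Counter {
        private static int value;

        public static int incrementAndGet() {
            return ++value;
        }
    }

    /** Hypothetical child-first loader: redefines selected classes instead of delegating. */
    static class IsolatingClassLoader extends ClassLoader {
        private final String isolatedPrefix;

        IsolatingClassLoader(ClassLoader parent, String isolatedPrefix) {
            super(parent);
            this.isolatedPrefix = isolatedPrefix;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            synchronized (getClassLoadingLock(name)) {
                if (!name.startsWith(isolatedPrefix)) {
                    return super.loadClass(name, resolve); // normal parent-first delegation
                }
                Class<?> loaded = findLoadedClass(name);
                if (loaded == null) {
                    loaded = defineFromParentBytes(name);
                }
                if (resolve) {
                    resolveClass(loaded);
                }
                return loaded;
            }
        }

        /** Reads the class file through the parent and defines a fresh copy in this loader. */
        private Class<?> defineFromParentBytes(String name) throws ClassNotFoundException {
            String path = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(path)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int read;
                while ((read = in.read(chunk)) != -1) {
                    buffer.write(chunk, 0, read);
                }
                byte[] bytes = buffer.toByteArray();
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Calls Counter.incrementAndGet() on the copy of Counter owned by the given loader. */
    static int incrementIn(ClassLoader loader) throws Exception {
        Class<?> counter = loader.loadClass(Counter.class.getName());
        Method increment = counter.getMethod("incrementAndGet");
        return (Integer) increment.invoke(null);
    }

    /** Increments twice in node A and once in node B; the statics do not interfere. */
    static int[] demo() throws Exception {
        ClassLoader parent = IsolationSketch.class.getClassLoader();
        String isolated = Counter.class.getName();
        ClassLoader nodeA = new IsolatingClassLoader(parent, isolated);
        ClassLoader nodeB = new IsolatingClassLoader(parent, isolated);
        return new int[] {incrementIn(nodeA), incrementIn(nodeA), incrementIn(nodeB)};
    }

    public static void main(String[] args) throws Exception {
        for (int value : demo()) {
            System.out.println(value);
        }
    }
}
```

Because the two copies of Counter are distinct classes, objects cannot be passed between isolates by reference; they have to cross the boundary through reflection or serialization.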
Heated competition
Three open source “cluster virtualizing” efforts around
Oracle Coherence

 GridKit (ChTest)
code.google.com/p/gridkit
 Oracle tools
github.com/coherence-community/oracle-tools
 LittleGrid
www.littlegrid.net
“Distributing” test case
How to start The Application on “virtual cluster”?

Old school
 Main classes and command line arguments
 But what if you need to do verification inside of
a vinode?
 A separate main for each test case?
But how is all this relevant to me?
Normal client/server application
 You can use your real main classes instead of mocking server

Hadoop / HBase / Cassandra
 Distributed
 Deployment unfriendly
 Ship with single node – cut down versions
“Distributing” test case
Transparent remoting
client1.exec(new Runnable() {
@Override
public void run() {
NamedCache cache = CacheFactory.getCache(cacheName);
Assert.assertNull(cache.get("A"));
cache.put("A", "aaa");
}
});
client2.exec(new Runnable() {
@Override
public void run() {
NamedCache cache = CacheFactory.getCache(cacheName);
Assert.assertEquals("aaa", cache.get("A"));
}
});
We want to test more!
We are in control of class loading,
so let’s tweak the classpath on the fly
 Inject resources
 Remove server classes from client
 Test different codebase versions
Backward compatibility testing

Master JVM, client regression test pack [trunk version]
Case: Client [version X], Server [trunk version]
Cross version tests

Master JVM, client regression test pack [trunk version]
Case: Client [version X], Client [version Y], Server [version Z]
Managing artifacts
… a bunch of black magic to find local repo
and managing classpath as easy as …
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-dependency-plugin</artifactId>
<version>2.8</version>
<executions>
<execution>
<id>viconcurrent-0.7.15</id>
<phase>test-compile</phase>
<goals>
<goal>get</goal>
</goals>
<configuration>
<artifact>org.gridkit.lab:viconcurrent:0.7.15</artifact>
</configuration>
</execution>
</executions>
</plugin>
Managing artifacts
How to get needed artifact on local disk
- Maven will disallow two versions of same artifact
- but we can trick it …
ViNode node;
…
node.x(MAVEN).replace("org.gridkit.lab", "viconcurrent", "0.7.15");

Transitive dependencies are not included, though.
Transparent remoting
Own implementation of RMI
 Standard Java serialization
 Serializing of anonymous Runnable/Callable
 Auto export of Remote interfaces

Writing distributed code by convention!
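
To make "serializing of anonymous Runnable/Callable" concrete, the sketch below pushes an anonymous Callable through standard Java serialization and runs the deserialized copy. It assumes the task type is declared Serializable; the Task interface and the exec() helper are hypothetical stand-ins, and the real library may handle non-serializable anonymous classes differently.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.Callable;

public class ClosureShippingSketch {

    /** Hypothetical marker: a Callable that can travel through Java serialization. */
    public interface Task<T> extends Callable<T>, Serializable {
    }

    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    static Object deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    /** Stand-in for node.exec(...): the task is copied by serialization, then the copy runs. */
    @SuppressWarnings("unchecked")
    static <T> T exec(Task<T> task) throws Exception {
        Task<T> copy = (Task<T>) deserialize(serialize(task));
        return copy.call();
    }

    /** The anonymous class captures 'name'; the captured value travels with the task. */
    static String greet(final String name) throws Exception {
        return exec(new Task<String>() {
            @Override
            public String call() {
                return "hello, " + name;
            }
        });
    }

    public static void main(String[] args) throws Exception {
        System.out.println(greet("node-1"));
    }
}
```

The anonymous class is created in a static method on purpose: an anonymous class created in an instance method also captures the enclosing object, which would then have to be serializable too.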
Bidirectional communications
public interface RemotePut extends Remote {
    public void put(Object key, Object value);
}

@SuppressWarnings("unused")
@Test
public void bidirectional_remoting() {
    // Preset for typical single node cluster
    cloud.all().presetFastLocalCluster();

    cloud.node("storage.**").localStorage(true);
    cloud.node("client.**").localStorage(false);

    // Simulates DefaultCacheServer based process
    cloud.node("storage.**").autoStartServices();

    // declaring specific nodes to be created
    CohNode storage = cloud.node("storage.1");
    CohNode client1 = cloud.node("client.1");
    CohNode client2 = cloud.node("client.2");

    // now we have 3 specific nodes in cloud
    // all of them will be initialized in parallel
    cloud.all().ensureCluster();

    final String cacheName = "distr-a";

    RemotePut remoteService =
        client1.exec(new Callable<RemotePut>() {
            @Override
            public RemotePut call() {
                final NamedCache cache = CacheFactory.getCache(cacheName);
                return new RemotePut() {
                    @Override
                    public void put(Object key, Object value) {
                        cache.put(key, value);
                    }
                };
            }
        });

    remoteService.put("A", "aaa");

    client2.exec(new Runnable() {
        @Override
        public void run() {
            NamedCache cache = CacheFactory.getCache(cacheName);
            Assert.assertEquals("aaa", cache.get("A"));
        }
    });
}
Bidirectional communications
Extending java.rmi.Remote
will mark interface for auto export
public interface RemotePut extends Remote {
public void put(Object key, Object value);
}

RemotePut remoteService =
client1.exec(new Callable<RemotePut>() {
@Override
public RemotePut call() {
final NamedCache cache = CacheFactory.getCache(cacheName);
return new RemotePut() {
@Override
public void put(Object key, Object value) {
cache.put(key, value);
}
};

}
});

remoteService.put("A", "aaa");

Here we get a remote stub, not a real implementation of the interface
Unlike Java RMI, there is no need to declare RemoteException for every method
Result of callable will be serialized and transferred to caller
Objects implementing remote interfaces are automatically replaced with a remote stub during serialization
Call to a stub will be converted to a “remote” call to the instance we have created in the “virtualized” node a few lines above
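
The stub replacement described above can be approximated with standard JDK hooks: ObjectOutputStream.replaceObject swaps any java.rmi.Remote instance for a small serializable reference, and ObjectInputStream.resolveObject turns that reference into a dynamic proxy. The sketch below runs both sides in one JVM and uses an in-memory export table instead of a network connection; StubRef, ExportingOutput, StubbingInput and EXPORTS are hypothetical names, not the library's own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.rmi.Remote;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class AutoExportSketch {

    public interface RemotePut extends Remote {
        void put(Object key, Object value);
    }

    /** Stands in for the exporting node: real objects, looked up by id. */
    static final Map<Integer, Object> EXPORTS = new ConcurrentHashMap<>();
    static final AtomicInteger NEXT_ID = new AtomicInteger();

    /** Serializable placeholder written to the stream instead of the real object. */
    static class StubRef implements Serializable {
        private static final long serialVersionUID = 1L;
        final int id;
        final Class<?>[] interfaces;

        StubRef(int id, Class<?>[] interfaces) {
            this.id = id;
            this.interfaces = interfaces;
        }
    }

    /** Replaces every Remote object with a StubRef while writing. */
    static class ExportingOutput extends ObjectOutputStream {
        ExportingOutput(OutputStream out) throws IOException {
            super(out);
            enableReplaceObject(true);
        }

        @Override
        protected Object replaceObject(Object obj) {
            if (obj instanceof Remote) {
                int id = NEXT_ID.incrementAndGet();
                EXPORTS.put(id, obj);
                return new StubRef(id, obj.getClass().getInterfaces());
            }
            return obj;
        }
    }

    /** Turns every StubRef into a proxy that forwards calls to the exported object. */
    static class StubbingInput extends ObjectInputStream {
        StubbingInput(InputStream in) throws IOException {
            super(in);
            enableResolveObject(true);
        }

        @Override
        protected Object resolveObject(Object obj) {
            if (obj instanceof StubRef) {
                StubRef ref = (StubRef) obj;
                // A real implementation would send the call over the wire here.
                InvocationHandler handler =
                        (proxy, method, args) -> method.invoke(EXPORTS.get(ref.id), args);
                return Proxy.newProxyInstance(
                        AutoExportSketch.class.getClassLoader(), ref.interfaces, handler);
            }
            return obj;
        }
    }

    /** Returns {is the received object a proxy?, value stored through the stub}. */
    static Object[] demo() throws Exception {
        final Map<Object, Object> remoteSide = new HashMap<>();
        RemotePut real = new RemotePut() {
            @Override
            public void put(Object key, Object value) {
                remoteSide.put(key, value);
            }
        };
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ExportingOutput(buffer)) {
            out.writeObject(real); // 'real' is not Serializable, it is replaced by a StubRef
        }
        RemotePut stub;
        try (ObjectInputStream in =
                new StubbingInput(new ByteArrayInputStream(buffer.toByteArray()))) {
            stub = (RemotePut) in.readObject();
        }
        stub.put("A", "aaa");
        return new Object[] {Proxy.isProxyClass(stub.getClass()), remoteSide.get("A")};
    }

    public static void main(String[] args) throws Exception {
        Object[] result = demo();
        System.out.println(result[0] + " " + result[1]);
    }
}
```

The receiving side only ever holds a proxy, while the state stays with the exporting side, which matches the callout about getting a remote stub rather than a real implementation.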
Sneak peek: Instrumentation
 System.exit() – is still fatal
 Some cases need “virtual time”
 Tweaking monolithic code
 Fault injection
 Mock injection
Sneak peek: Instrumentation
PowerMock
 Recompiles everything (Coherence ~ 5000 classes)

AspectJ
 Static interceptors
ByteMan
 Using agent + weird language
Sneak peek: Instrumentation
ViNode node = ...
ViHookBuilder
.newCallSiteHook()
.onTypes(System.class)
.onMethod("exit")
.doReturn(null)
.apply(node);
node.exec(new Callable<Void>() {
@Override
public Void call() throws Exception {
System.exit(0);
return null;
}
});
LET’S TAKE A BREAK HERE

Questions?
From “virtual” to real cluster

Virtual stuff is so good
Managing virtual nodes in a deterministic way, in
Java, with all the luxury of exception handling and
richness of libraries – it felt so good …
I wished I could roll out JVMs across real servers
Your network is Big JVM
Same API
3 types of nodes: in-process, local, remote
Transparent remoting
SSH to manage remote server
Automatic classpath replication (with caching)
Zero infrastructure
 Any OS for master host
 SSHd + JVM for slave hosts
New opportunities
Performance testing
 deploy system under test
 deploy load generators
 deploy monitoring agents
 gather all results in one place

Deployment (remote execution task for ANT)
Replace your putty with Java IDE
 log scraping
 parallel execution
As easy as …
@Test
public void remote_hello_world() throws InterruptedException
{
ViManager cloud = CloudFactory.createSimpleSshCloud();
cloud.node("myserver.uk.db.com");
cloud.node("**").exec(new Callable<Void>() {
@Override
public Void call() throws Exception
{
String localHost = InetAddress.getLocalHost().toString();
System.out.println("Hi! I'm running on " + localHost);
return null;
}
});
}
Behind the scene
• JSch – SSH client (slightly patched)
• Collect classpath artifacts
• SCP jars to remote target
• Start remote agent (JVM)
 stdOut / stdIn for master – agent communications
• Agent starts slave process
• Slave starts RMI node and connects (TCP) to master
 agent acts as TCP proxy
Death clock is ticking
Master JVM kills slave processes, unless
 SSH session was interrupted
 someone ran kill -9 on the master JVM
 master JVM has crashed (e.g. under debugger)
Death clock is ticking on the slave, though
 if master is not responding
 slave process will terminate itself
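
The slave-side death clock can be pictured as a heartbeat watchdog: every message from the master resets a timestamp, and a timer fires a termination action once the master has been silent for too long. The sketch below is an assumption about the general shape, not the actual NanoCloud code; DeathClock and its methods are hypothetical, and the termination action is a callback so the example does not really exit the JVM.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class DeathClockSketch {

    /** Hypothetical watchdog: runs onExpire once the master has been silent too long. */
    static final class DeathClock implements AutoCloseable {
        private final ScheduledExecutorService timer =
                Executors.newSingleThreadScheduledExecutor(task -> {
                    Thread thread = new Thread(task, "death-clock");
                    thread.setDaemon(true);
                    return thread;
                });
        private final long timeoutNanos;
        private final Runnable onExpire;
        private volatile long lastHeartbeat = System.nanoTime();

        DeathClock(long timeoutMillis, Runnable onExpire) {
            this.timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            this.onExpire = onExpire;
            long period = Math.max(1, timeoutMillis / 4);
            timer.scheduleAtFixedRate(this::check, period, period, TimeUnit.MILLISECONDS);
        }

        /** Called whenever the master shows a sign of life. */
        void heartbeat() {
            lastHeartbeat = System.nanoTime();
        }

        private void check() {
            if (System.nanoTime() - lastHeartbeat > timeoutNanos) {
                timer.shutdown(); // stop further checks, then fire exactly once
                onExpire.run();   // a real slave might call Runtime.getRuntime().halt(...) here
            }
        }

        @Override
        public void close() {
            timer.shutdownNow();
        }
    }

    /** Returns {fired while the master was alive?, fired after the master went silent?}. */
    static boolean[] demo() throws InterruptedException {
        CountDownLatch expired = new CountDownLatch(1);
        try (DeathClock clock = new DeathClock(1000, expired::countDown)) {
            for (int i = 0; i < 20; i++) { // master is alive: regular heartbeats
                clock.heartbeat();
                Thread.sleep(10);
            }
            boolean firedWhileAlive = expired.getCount() == 0;
            // master goes silent: the clock should expire on its own
            boolean firedAfterSilence = expired.await(10, TimeUnit.SECONDS);
            return new boolean[] {firedWhileAlive, firedAfterSilence};
        }
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] result = demo();
        System.out.println(result[0] + " " + result[1]);
    }
}
```

Putting the clock on the slave rather than the master is what covers the three failure cases listed above, since none of them leaves the master able to clean up.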
Performance testing
• Master JVM is running on CI
• JUnit to start test
• Load generator farm is deployed on test servers
• Monitoring agents are deployed on application cluster
• Metrics are buffered locally, then sent to master and processed

Real test
• Four servers – application
• 50 servers – load farm
• Over 200 slave processes
Coding for 200+ processes
Driver - concept
• Driver – Java interface encapsulating a test action
• One-way methods
• Friendly to remoting and parallel invocation
+ some utilities for parallel execution, workflow, etc.
Example:
https://gridkit.googlecode.com/svn/grid-lab/trunk/examples/zk-benchmark-sample
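
Why one-way methods matter for parallel invocation can be shown with a small fan-out proxy: since a driver method returns nothing, a single call can be broadcast to many driver instances at once. The sketch below is a local approximation under that assumption; LoadDriver, CountingDriver and broadcast() are hypothetical names and are not taken from the linked example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class DriverFanOutSketch {

    /** Hypothetical driver: every method is one-way (void), so calls can be broadcast. */
    public interface LoadDriver {
        void warmUp(int requests);

        void fire(int requests);
    }

    /** Trivial local driver that only counts what it was asked to send. */
    static class CountingDriver implements LoadDriver {
        final AtomicInteger sent = new AtomicInteger();

        @Override
        public void warmUp(int requests) {
            // warm-up traffic is not counted
        }

        @Override
        public void fire(int requests) {
            sent.addAndGet(requests);
        }
    }

    /** Builds a proxy that invokes each call on all targets in parallel and waits for all. */
    static <T> T broadcast(Class<T> type, List<? extends T> targets, ExecutorService pool) {
        InvocationHandler handler = (proxy, method, args) -> {
            List<Future<Object>> pending = new ArrayList<>();
            for (T target : targets) {
                Callable<Object> call = () -> method.invoke(target, args);
                pending.add(pool.submit(call));
            }
            for (Future<Object> future : pending) {
                future.get(); // propagate the first failure, if any
            }
            return null; // one-way methods have nothing to return
        };
        Object stub = Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, handler);
        return type.cast(stub);
    }

    /** Fires 10 and then 5 requests on a farm of three drivers; returns per-driver totals. */
    static int[] demo() throws Exception {
        List<CountingDriver> farm =
                Arrays.asList(new CountingDriver(), new CountingDriver(), new CountingDriver());
        ExecutorService pool = Executors.newFixedThreadPool(farm.size());
        try {
            LoadDriver all = broadcast(LoadDriver.class, farm, pool);
            all.warmUp(1);
            all.fire(10);
            all.fire(5);
        } finally {
            pool.shutdown();
        }
        int[] totals = new int[farm.size()];
        for (int i = 0; i < totals.length; i++) {
            totals[i] = farm.get(i).sent.get();
        }
        return totals;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(Arrays.toString(demo()));
    }
}
```

With remote stubs in place of CountingDriver, the same test code would drive hundreds of processes through one interface.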
Monitoring
Sigar - http://www.hyperic.com/products/sigar
• collecting common system metrics
Attach API – built-in JVM monitoring
SJK – spin-off CLI tool for JVM diagnostics
• https://github.com/aragozin/jvm-tools
Links
NanoCloud
• https://code.google.com/p/gridkit/wiki/NanoCloudTutorial
• Maven Central: org.gridkit.lab:telecontrol-ssh:0.7.22
• http://blog.ragozin.info/2013/01/remote-code-execution-in-java-made.html

ANT task
• https://github.com/gridkit/gridant

ChTest (Coherence test tool)
• https://code.google.com/p/gridkit/wiki/ChTest
• Maven Central: org.gridkit.coherence-tools:chtest:0.2.4
Thank you
http://blog.ragozin.info
- my articles
http://code.google.com/p/gridkit
http://github.com/gridkit
- my open source code
http://aragozin.timepad.ru
- tech meetups in Moscow

Alexey Ragozin
alexey.ragozin@gmail.com
