Ran across this article the other day where the author hacks a copy of the Java runtime classes to log data.
In his article, he shows how to capture Java classes that are loaded at runtime from byte buffers. Not being a Java guy, it took some experimenting, but I finally got his code implemented and operational. It's a pretty slick technique.
I had wanted to find a way to instrument the Java runtime for a while so I wouldn't have to deal with trying to decode strings in Java exploits. I think I am going to go through and add logging code to a couple more methods of interest now to make analysis easier.
The fact that Java runs on top of an open source framework is pretty cool; with enough understanding you can control every ounce of its execution. In his article he says to get the source for your platform without giving any more details, and that took some digging to understand. It turns out the JDK (Java Development Kit) installs include a src.zip archive which contains the Java sources that you can hack up and recompile. The compiled main classes are in the rt.jar archive in the /jre/lib folder.
While you can edit the Java files in your editor of choice, I would recommend just using the command line javac.exe to recompile them rather than trying to use an IDE.
In order to test your tweaks, you can extract rt.jar to a local directory, drop in your recompiled class files, and then use something like the following command line to run the classes under analysis with the instrumented runtime:
java -Xbootclasspath:c:/myruntime/ -jar test.jar
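To give an idea of what the added logging code can look like, here is a rough sketch of a dump helper. It is not the code from the article: the class name ClassDumpHook, the dump method and the output directory are all made up for illustration, and I am assuming you would call it from a patched method such as ClassLoader.defineClass with the same name/buffer/offset/length arguments that method receives.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of the JDK or of the original article.
class ClassDumpHook {

    // Writes the slice b[off .. off+len) to <outDir>/<sanitized name>.class.
    // Returns the written path, or null if the write failed.
    static Path dump(Path outDir, String name, byte[] b, int off, int len) {
        // Classes defined without a name get a generated one.
        String base = (name == null) ? "anonymous_" + System.nanoTime() : name;
        // Keep the file name harmless: anything unusual becomes '_'.
        String safe = base.replaceAll("[^A-Za-z0-9_.$]", "_");
        try {
            Files.createDirectories(outDir);
            Path target = outDir.resolve(safe + ".class");
            Files.write(target, Arrays.copyOfRange(b, off, off + len));
            return target;
        } catch (IOException e) {
            // Logging must never break class loading, so swallow the error.
            return null;
        }
    }

    public static void main(String[] args) {
        // Stand-in buffer: just the class file magic number.
        byte[] fake = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        Path out = Paths.get(System.getProperty("java.io.tmpdir"), "dumped-classes");
        System.out.println(dump(out, "evil.Payload", fake, 0, fake.length));
    }
}
```

Inside a real patched runtime the helper class would have to live on the boot class path as well, and I have only run it standalone as shown, not from inside rt.jar.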
Another approach I was considering was using the OpenJDK binaries and source package provided by Sun. (Note: don't grab them from openJDK.net unless you want to compile the whole package on your own, which comes with lots of requirements.)
I was also considering hacking up a copy of IKVM since I am more comfortable in .NET, but it looked like its build requirements were more than I wanted to get into for an offhand experiment.
If you're going to play, you can download sample code which loads a class from a byte buffer here.
Note that there may have been some changes to Java since his article. I had to tweak his code a bit to get it to run, in particular changing invoke(null, to invoke(regeneratedClass.newInstance(),
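For reference, the general shape of loading a class from a byte buffer and invoking a method on it looks something like the sketch below. This is my own minimal version, not the downloadable sample: the names ByteBufferLoaderDemo, BufferLoader, Hello and greet are invented, and to stay self-contained it gets its byte buffer by compiling a tiny class at run time (which needs a JDK, not just a JRE). It also shows why the invoke tweak matters: passing null as the first argument only works for static methods, while an instance method needs a real object.

```java
import java.io.IOException;
import java.lang.reflect.Method;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class ByteBufferLoaderDemo {

    // Class loader whose only job is to expose defineClass for a raw buffer.
    static class BufferLoader extends ClassLoader {
        Class<?> fromBytes(String name, byte[] data) {
            return defineClass(name, data, 0, data.length);
        }
    }

    // Produces class bytes for a throwaway class so the demo has a buffer to load.
    static byte[] compileHello() throws IOException {
        Path dir = Files.createTempDirectory("bufdemo");
        Path src = dir.resolve("Hello.java");
        String code = "public class Hello { public String greet() { return \"hi from buffer\"; } }";
        Files.write(src, code.getBytes(StandardCharsets.UTF_8));
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler(); // null on a JRE-only install
        if (javac == null) {
            throw new IllegalStateException("a JDK is required to run this demo");
        }
        if (javac.run(null, null, null, "-d", dir.toString(), src.toString()) != 0) {
            throw new IllegalStateException("compiling the demo class failed");
        }
        return Files.readAllBytes(dir.resolve("Hello.class"));
    }

    static Object loadAndInvoke() throws Exception {
        byte[] data = compileHello();
        Class<?> regeneratedClass = new BufferLoader().fromBytes("Hello", data);
        Method m = regeneratedClass.getMethod("greet");
        // greet() is an instance method, so invoke(null) would throw a NullPointerException.
        // regeneratedClass.newInstance() also works but is deprecated on newer JDKs.
        Object instance = regeneratedClass.getDeclaredConstructor().newInstance();
        return m.invoke(instance);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(loadAndInvoke());
    }
}
```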
In other news, I also learned a neat Java trick the other day. I had a Java exploit which did a bunch of string decryption. The decryption functions had been modified so they would not decompile. Using IDA I was able to disassemble them to bytecode. The IDA Java disassembly output is the perfect format to feed the code back into the Jasmin Java assembler.
I used this combination to create a compiled class file for the obfuscated code and made the methods public in the process. I was then able to create a separate Java project which used this decryption class to dump the strings. Much more involved than just instrumenting the runtime, but it's a cool technique to have in your quiver anyway.
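The dumper side of that is tiny once the methods are public. Here is a sketch of the idea with a stand-in decryptor, since the real one came out of the exploit: the Decryptor class, its decode(String, int) signature and the XOR scheme are all hypothetical, and your reassembled class will have its own names and arguments.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

class StringDumper {

    // Stand-in for the reassembled exploit class (hypothetical XOR scheme).
    public static class Decryptor {
        public static String decode(String s, int key) {
            char[] out = s.toCharArray();
            for (int i = 0; i < out.length; i++) {
                out[i] ^= key;
            }
            return new String(out);
        }
    }

    // Calls a public static (String, int) -> String method on every blob.
    static List<String> dump(Class<?> cls, String method, int key, String... blobs)
            throws ReflectiveOperationException {
        Method m = cls.getMethod(method, String.class, int.class);
        List<String> plain = new ArrayList<>();
        for (String blob : blobs) {
            plain.add((String) m.invoke(null, blob, key)); // static method, so null receiver
        }
        return plain;
    }

    public static void main(String[] args) throws Exception {
        // XOR is symmetric, so encoding a known string gives us a test blob.
        String blob = Decryptor.decode("cmd.exe", 7);
        System.out.println(dump(Decryptor.class, "decode", 7, blob));
    }
}
```

Going through reflection is just one option; with the reassembled class on the class path you can equally call the public method directly.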
Comments: (2)
On 02.23.12 - 2:54pm Dave wrote:
if you are working in the runtime classes you will want to filter which calls you log so that you don't spam yourself with runtime loading stuff. the following code can help:
Example command line to compile a runtime class: D:\jdk8\bin>javac -g -J-Xmx512m -nowarn -cp "C:\Program Files\Java\jre1.8.0_91\lib\rt.jar";"D:\jdk8\lib\tools.jar" D:\jdk8\src\java\lang\ClassLoader.java
going through the java learning curve while working on system classes is probably not all that recommended
but that's how i roll!
On 03.05.14 - 12:19am jay wrote:
For Java hacking, it's actually far better to use a Java agent. It's a jar that's executed pre-main/applet start, intended for inspection/debugging/utility/metrics. However, it does allow for the possibility of MITMing the SystemClassLoader, which then lets you instrument the code being passed for execution... With the redefine option you can morph the executing code at any time; the only thing you cannot do is change the object hierarchy (add fields or methods). You can of course add new classes and morph the code in existing classes as needed.
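For anyone who hasn't built one, a bare-bones agent looks roughly like the sketch below. It is only an illustration of the mechanism, not the setup described in this comment: the class name LoggingAgent and the prefix list in isRuntimeClass are made up, and all it does is log the names of classes as they are loaded, skipping the runtime's own packages so the log isn't flooded.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

public class LoggingAgent {

    // Hypothetical filter: internal names (slash separated) of packages we don't care about.
    private static final String[] RUNTIME_PREFIXES = {"java/", "javax/", "sun/", "com/sun/", "jdk/"};

    static boolean isRuntimeClass(String internalName) {
        for (String prefix : RUNTIME_PREFIXES) {
            if (internalName.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    // Called by the JVM before main() when started with -javaagent.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> classBeingRedefined,
                                    ProtectionDomain protectionDomain,
                                    byte[] classfileBuffer) {
                // className may be null for classes defined without a name.
                if (className != null && !isRuntimeClass(className)) {
                    System.err.println("[agent] loading " + className
                            + " (" + classfileBuffer.length + " bytes)");
                }
                return null; // null means "leave the class bytes unchanged"
            }
        });
    }
}
```

To try it, the agent jar's manifest needs a Premain-Class: LoggingAgent entry, and the target is started with something like java -javaagent:agent.jar -jar test.jar, where agent.jar is whatever you named the jar.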
Personally, I use Java NIO and completely circumvent the default classloader... I pull the class-path locations, iterate the jar files, and create a new FileSystem on each jar. On loading, the linker mechanism will call findClass, which searches each jar via jar.getPath("/classname.class"); the result can then be checked with Files.exists(path), and if it isn't there, on to the next jar.
Basically, in the end, I use ObjectWeb's ASM to inspect the binary data of classes before they're loaded... This inspection notes and loads core API classes before it attempts ClassLoader.defineClass(name, byte[] data, 0, data.length). During the inspection the loading class is mapped into my Control Flow Graph (CFG)... API calls point to a node which does not get traced, while classes from outside the standard API are mapped in their entirety. It does several large-scale optimizations of the runtime code during the static analysis. It maps integer calculations, replacing all values which can be pre-computed, then tries to reorder the operations using PEMDAS for a more uniform view of the bytecode. During this pass it also does a cost assessment of field accesses, finding any method that reads a field several times before the current control flow could possibly have set it. It then replaces the getters with a single get and a store into a local variable.
Why? Because Java is great, it's the most widely used programming language worldwide. And it's secure, which actually becomes a burden at run time, raising the cost of accessing object data. Each time a member is loaded its security is checked, unless specific conditions are met (such as a field being final), to see if it is accessible from the current scope. Cumulative calls, though optimized, are still more costly than loading from a local variable; this goes double for arrays, which are treated as Objects first and arrays second. Plus, this gives a uniformity to the code, so it's much easier to step into when I'm doing my analysis. Don't forget obfuscators do funny things, like a bitshift-left by 1092341209, which is actually just a shift by 25, as the shift count is masked (& 31) for int values and (& 63) for long.
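The shift trick is easy to check for yourself. Java only uses the low 5 bits of the shift count for int operands and the low 6 bits for long operands, so a huge count collapses to a small one. A quick sketch (the class and method names are just for the demo):

```java
class ShiftMaskDemo {

    // Shift distance actually applied to an int operand.
    static int effectiveIntShift(int count) {
        return count & 31;
    }

    // Shift distance actually applied to a long operand.
    static int effectiveLongShift(int count) {
        return count & 63;
    }

    public static void main(String[] args) {
        int obfuscated = 1 << 1092341209;
        int plain = 1 << 25;
        System.out.println(obfuscated == plain);               // true
        System.out.println(effectiveIntShift(1092341209));     // 25
        System.out.println(effectiveLongShift(1092341209));    // 25
        System.out.println((1L << 1092341209) == (1L << 25));  // true
    }
}
```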
Plus, for string decryption, I don't bother producing a decryption mechanism for each obfuscator case. Instead I use a throwaway classloader where I change the name of the class to _str(name) and load it (as most use the static initializer to decode into an array)... I use reflection to grab the array, then remove all references to it and to the class itself.