Author: David Zimmer
Date: 11.17.15 - 1:10am
Here is a little drop of awesome to brighten your day.
So I was asked to write a decryptor for this application. All of the strings are held internally in encrypted form, and there are a lot of them. I located the decryptor and figured out its arguments and how it's used, but there are 308 sub functions within/below it (it uses std::string, which makes a giant mess).
I spent a couple of hours playing with the variables of the decryption function, trying to replicate what I thought it was doing, but with no results. Finally I decided it would be easier to just utilize the malware's own decryptor function as a library. I've had a couple of posts on doing this before, usually by extracting the function, or mapping it into memory and calling it that way. But this one is just ridiculously complex, and it needs all of the code to be initialized already for it to work. Okay, so what do you do in a situation like this?
Well, since I need the executable to be fully initialized and ready, I'm pretty much going to have to run the decryptor within the executable itself. First I tried modifying the executable to load as a DLL, but in this case that was not possible. So it has to run as an executable.
Okay, so I modified its import table to automatically load my own DLL. Then I went into WinMain and patched it so that once the executable had been initialized, it would call one of the exports in my DLL. (I could not use DllMain for this since the exe wouldn't be initialized yet.)
So now in my DLL, I call CreateWindow and start a message pump. My window procedure is an IPC server listening for WM_COPYDATA messages. (My favorite IPC technique.)
So basically I have turned the virus into a decryption server. Now I can send the server messages from an external app; it will decrypt them and send me back a response, all using the virus's own code at my whim. A basic template for this technique is available here.
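To make the client side of that exchange concrete, here is a minimal sketch of how an external app could hand an encrypted blob to such a server. It is not the template mentioned above. The window title, the `tag` value, and the helper names are all hypothetical, and the reply path (how the server sends the decrypted string back) is left out because it depends on how the server was written.

```python
import ctypes
import sys

WM_COPYDATA = 0x004A  # Win32 message id for WM_COPYDATA


class COPYDATASTRUCT(ctypes.Structure):
    # Same layout as the Win32 COPYDATASTRUCT: ULONG_PTR, DWORD, PVOID.
    _fields_ = [
        ("dwData", ctypes.c_size_t),
        ("cbData", ctypes.c_uint32),
        ("lpData", ctypes.c_void_p),
    ]


def build_copydata(payload, tag=1):
    """Wrap payload bytes in a COPYDATASTRUCT.

    The buffer is returned too, so the caller keeps it alive while
    the message is being processed.
    """
    buf = ctypes.create_string_buffer(payload)  # adds a trailing NUL
    cds = COPYDATASTRUCT(tag, len(payload), ctypes.cast(buf, ctypes.c_void_p))
    return cds, buf


def send_to_server(window_title, payload):
    """Send payload to the window with the given (hypothetical) title."""
    if sys.platform != "win32":
        raise OSError("WM_COPYDATA only exists on Windows")
    user32 = ctypes.windll.user32
    user32.FindWindowW.argtypes = [ctypes.c_wchar_p, ctypes.c_wchar_p]
    user32.FindWindowW.restype = ctypes.c_void_p
    user32.SendMessageW.argtypes = [
        ctypes.c_void_p, ctypes.c_uint, ctypes.c_size_t, ctypes.c_void_p,
    ]
    user32.SendMessageW.restype = ctypes.c_ssize_t
    hwnd = user32.FindWindowW(None, window_title)
    if not hwnd:
        raise OSError("server window not found")
    cds, _buf = build_copydata(payload)
    # SendMessage blocks until the server's window procedure returns.
    return user32.SendMessageW(hwnd, WM_COPYDATA, 0, ctypes.byref(cds))
```

Because SendMessage is synchronous, the server has finished handling the request by the time the call returns, which is part of what makes this such a convenient IPC channel.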
Between analysis, experimentation, and then implementation this took a day's labor, but that is quite a deal compared to the complexity that I am harnessing and how long it would have taken to work it out any other way.
This is a pretty hard-core example of the lengths we sometimes have to go to in order to get the data we need. The sad thing is this is still only a fraction of what I need to accomplish. This just helps me analyze it and see what's there. Painful, painful stepping stones.
Another take on this: a similar scenario, but now dealing with an RWE injection. There is no exe to modify and use, so we must hijack the decoder in an existing process. The approach is to make an injection DLL with a public export.
I should also note that compiling the lookup list was kind of complex. It was not a simple push const, push const, call decoder pattern; we had to scan the disassembly back from each decoder xref to find the pushes, as other instructions were interleaved and sometimes the push was a push reg. Otherwise I could have utilized iDbg to set eip at various places, step over, grab eax, etc. Anyway, in this case...
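The backward scan described here can be sketched roughly as follows. This is not the script that was actually used. It assumes the disassembly has already been dumped into a list of `(address, mnemonic, operand)` tuples, that the decoder takes two pushed arguments, and that a 12-instruction window is enough; the function names are hypothetical.

```python
import re

# Accept 0x1234, IDA-style 1234h (must start with a digit), or decimal.
_CONST = re.compile(r"^(?:0x[0-9a-fA-F]+|[0-9][0-9a-fA-F]*h|\d+)$")


def parse_const(operand):
    """Return the integer value of a constant operand, or None (e.g. a register)."""
    op = operand.strip()
    if not _CONST.match(op):
        return None
    if op.lower().startswith("0x"):
        return int(op, 16)
    if op.lower().endswith("h"):
        return int(op[:-1], 16)
    return int(op)


def find_push_args(insns, call_index, nargs=2, max_back=12):
    """Walk backwards from the call at insns[call_index] collecting pushes.

    Non-push instructions are skipped since they may be interleaved.
    A push of a register is recorded as None so it can be handled by hand.
    The nearest push comes first in the returned list.
    """
    args = []
    lowest = max(0, call_index - max_back)
    for i in range(call_index - 1, lowest - 1, -1):
        _addr, mnem, operand = insns[i]
        if mnem in ("call", "jmp", "ret", "retn"):
            break  # ran into another call's setup or a block boundary
        if mnem == "push":
            args.append(parse_const(operand))
            if len(args) == nargs:
                break
    return args
```

Any result containing `None`, or with fewer than `nargs` entries, marks an xref that needs manual attention.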
So we inject the DLL, then we look at the module list, find our new one, view names, go to our export, set origin there, and set a break on the end. Now we have a file output of the hash/API start address map.
Now we need to take those raw addresses and get the API name for each. We generate a binary file from the addresses and inject that into the target process as data using ApiLogger. Then we go to that alloc, view it as long address, and extract the API address to name map. Now we can recombine the data and make an IDA enum of the values. Yes, it's a royal PITA. This scenario is much simpler, but has more manual steps. The code worked first try.
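The two bookkeeping steps here, packing the addresses into a binary blob and later joining the two maps into an enum, might look something like this. It is only a sketch under assumptions: addresses are taken to be 32-bit little-endian, the output is written as a C-style enum declaration rather than through any particular IDA API, and the function names and the `h_` prefix are made up for illustration.

```python
import struct


def pack_addresses(addrs):
    """Pack addresses as little-endian 32-bit values (the blob to inject)."""
    return struct.pack("<%dI" % len(addrs), *addrs)


def build_enum(hash_to_addr, addr_to_name, enum_name="api_hashes"):
    """Join hash->address and address->name into C-style enum text.

    Addresses with no resolved name get a placeholder member so that
    nothing is silently dropped.
    """
    lines = ["enum %s {" % enum_name]
    for hash_value, addr in sorted(hash_to_addr.items()):
        name = addr_to_name.get(addr)
        if name is None:
            name = "unresolved_%08X" % addr
        lines.append("    h_%s = 0x%08X," % (name, hash_value))
    lines.append("};")
    return "\n".join(lines)
```

If two hashes resolve to the same API, the member names will collide and need to be made unique by hand before the enum is imported.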
In hindsight, I could have simplified several steps by doing a VirtualAlloc in the DLL and writing the return results directly to memory, for Olly to then look up and extract the names. Next time around. While the IPC approach was technically cool, this one is probably a good go-to going forward, unless you need to operate on arbitrary data, in which case the IPC is required. Many ways to skin a cat.
Another technique I have developed to help with tasks such as this is a remote symbol resolution tool. This tool can do bulk API address lookups from a file or even over the network. It also has scripting class support, so it can be utilized directly from IDA scripts (Python or my IDAJScript). It's pretty slick to be able to run a script in IDA to calculate API addresses, then have IDA query the live symbol resolution over the network from a process running on a remote machine! It is a bit crazy that this was a required capability, but that's the world I live in!
More discussion on decoders can be found here. I also have an article on reusing a malware's decoder by using an exe as a DLL.