Extracting malicious code from MSOffice .docs

Hello,

Not sure if this is the right place for this post, but I was wondering if someone could point me in the right direction. I am trying to learn how to reverse different types of files, code, and malware. I have a copy of an MS Office .doc file with exploit code in it (attempting to exploit a vuln). I have never dealt with malicious .doc files or any MS Office files. What is the best way to go about extracting/viewing the shell code or other code in the .doc file? I have tried googling for how-tos or tips with no luck. I've opened the .doc file with a hex viewer and IDA Pro, but nothing really stands out. Any tips, links, or script suggestions would be greatly appreciated.

thanks for the help in advance.

Disassembling the .doc

There are a few methods I use to detect/extract shell code in arbitrary binaries that all involve disassembling the .doc and looking for valid instructions.

You can load the file in IDA and press 'c' around bytes that might be valid instructions. This takes a while and is my least favorite method.

You can write a script for Immunity Debugger in Python that loads the contents of the .doc into memory and then scans for hex bytes that correspond to jmp or call instructions. It would look at the destination operand (location of the jmp or call) and determine if the address is valid (within range of the loaded .doc). If so, you can print the offset and then go inspect the surrounding instructions to see if it's shell code or just a false positive. This is my second favorite method, discussed in our Defcon presentation:

http://mhl-malware-scripts.googlecode.com/files/Defcon2008_MalwareRCE_Ligh_Sinclair.pdf
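
To give a rough idea of the scan itself, here is a minimal sketch in plain Python 3. It is not the Immunity Debugger script from the presentation and does not use the debugger's API. It assumes 32-bit x86 code and only checks the near forms with a 4-byte relative operand (E8 = call rel32, E9 = jmp rel32), treating file offsets as addresses. The function name find_branch_candidates is made up for this example.

```python
import struct

def find_branch_candidates(data):
    """Return (offset, opcode, target) for every E8/E9 rel32 branch
    whose destination lands inside the buffer.

    Assumption: 32-bit x86, near call/jmp with a 4-byte relative operand.
    Short jumps (EB) and indirect calls (FF /2) are not covered.
    """
    hits = []
    # A rel32 branch is 5 bytes long: 1 opcode byte + 4 operand bytes.
    for off in range(len(data) - 4):
        opcode = data[off]
        if opcode not in (0xE8, 0xE9):
            continue
        # The operand is a signed little-endian displacement.
        rel = struct.unpack_from("<i", data, off + 1)[0]
        # The displacement is relative to the end of the instruction.
        target = off + 5 + rel
        if 0 <= target < len(data):
            hits.append((off, opcode, target))
    return hits
```

You would read the file with open(path, "rb").read(), pass the bytes in, and print each hit in hex. Expect plenty of false positives, since E8 and E9 show up in ordinary data too, so every hit still needs a look at the surrounding bytes in a disassembler.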

My favorite method is to use distorm64 by Gil Dabah:

http://www.ragestorm.net/distorm/

There is a sample VC++ project in there that accepts a file on the command line and outputs the disassembled instructions. I grep for "CALL" and then filter out results that call invalid memory addresses. It only takes a few seconds to whittle the list down and find the correct area. distorm64 prints the file offset of each instruction, so just find the start and end and extract the bytes with a hex editor.
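
If you would rather script the final extraction than use a hex editor, something like the following Python 3 sketch would do it. The function name carve is hypothetical, and the start and end offsets are whatever you read off the disassembly (end is exclusive here).

```python
def carve(data, start, end):
    """Return the bytes in [start, end) of data.

    Raises ValueError when the range is empty or falls outside the
    buffer, instead of silently returning a short slice.
    """
    if not 0 <= start < end <= len(data):
        raise ValueError("offsets out of range: %#x-%#x" % (start, end))
    return data[start:end]
```

Read the .doc in binary mode, call carve with the two offsets, and write the result to a new file opened with "wb". You can then load that file on its own in IDA or run it back through the disassembler.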

Great, thanks a lot for the wide range of tips. That will really help me get started!

Thx again