As Chamuco has indicated, I'm working on getting my prototype code into a usable form. For those of you who did not get the chance to see my office, I had about 8 pages filled front-and-back with file offset calculations and other side-effects of a highly disjoint process.
So far I've moved the code from a series of standalone projects into a suite unified by a CGI/Python interface. The next step is integration with OC's systems to provide automatic coverage of malware submissions.
The major problem with the BLAST-type approach, as with the original BLAST algorithm, is filtering the output to get the usable kernels. Part of the motivation for my talk at DefCon is to start identifying various "styles" of code. Discriminating among compilers, obfuscation schemes, and object formats helps not only to classify code but also to weed out the extra results.
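As a rough illustration of that kind of output filtering (not the code in the suite itself), here is a minimal Python sketch. It assumes each match arrives as an offset/length record, and it drops matches that are too short or that cover low-entropy bytes such as zero padding. The `Hit` record, the function names, and both thresholds are hypothetical and chosen only for the example.

```python
import math
from collections import Counter
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical match record: where the match starts in each file and its length."""
    query_offset: int
    target_offset: int
    length: int


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def filter_hits(query: bytes, hits, min_length=16, min_entropy=2.0):
    """Keep hits that are long enough and not low-complexity.

    min_length and min_entropy are illustrative thresholds, not tuned values.
    """
    kept = []
    for hit in hits:
        if hit.length < min_length:
            continue  # too short to be a meaningful shared kernel
        segment = query[hit.query_offset:hit.query_offset + hit.length]
        if shannon_entropy(segment) < min_entropy:
            continue  # e.g. runs of padding or repeated filler bytes
        kept.append(hit)
    return kept


if __name__ == "__main__":
    # 32 bytes of zero padding followed by 32 distinct byte values.
    query = bytes(32) + bytes(range(32))
    hits = [Hit(0, 0, 32), Hit(32, 100, 32), Hit(32, 100, 8)]
    print(filter_hits(query, hits))
```

Classifying the compiler, obfuscation, or object format would add further per-style rules on top of generic checks like these.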