Furia-chan ~~~~~~~~ * Synopsis ~~~~~~~~~~ Furia-chan is a "Google" for binary programs. It can be used to streamline the process of detecting Libre/Open Source software license violations. * Details ~~~~~~~~~ Furia-chan works in a way analogous to a web information retrieval program (Google, Yahoo). We take a binary program, cut it into pieces that we call fragments (for information retrieval programs these are words). In reality these fragments are trees of machine instructions generated from data flow analysis. In the same way Google and Yahoo use stemming to "normalize" words to their roots, we use a tree distance function to be able to match slightly modified trees. By using this function we can normalize a program to its "base fragments". Once the program is normalized, we use standard information retrieval scoring techniques to match our binary programs, just as if we were matching a web page. I will add more details here, in the meantime you can read two papers that talk about Furia-chan. I have also included the presentation slides given at the respective conferences: *-----------*---------------------------*--------------------------------------------* SCAM 2007 | {{{paperSCAM2007.pdf}Paper}} | {{{presentationSCAM2007.pdf}Presentation}} *-----------*---------------------------*--------------------------------------------* Cascon 2006 | {{{cascon2006.pdf}Paper}} | {{{presentationCascon2006.pdf}Presentation}} *-----------*---------------------------*--------------------------------------------* [img099.jpg]