Python for Reverse Engineering

Mike Meyer mwm at mired.org
Fri Nov 5 22:39:10 EST 2004


Brad Tilley <rtilley at vt.edu> writes:

> A friend of mine wrote an algorithm that generates strings. He says
> that it's impossible to figure out exactly how the algorithm
> works. That may be true, but I think if I had enough sample strings
> that I could write a program to identify patterns in the strings and
> then produce strings with similar patterns. He disagrees with me. He
> gave me 100 strings to analyze.
>
> I wrote a script to read each string and build a list of characters
> present. I've discovered that he only uses 20 alpha/numeric chars to
> generate the strings and that each string sums up to a value between
> 1000 and 1200 when I assign each char in the string its ASCII value.
>
> What else might I do to analyze these things? As it stands now, I can
> generate an acceptable string on ever 100th attempt or so, but I'd
> like to do better than this.

Build a markov chain generator. Basically, build tuples of characters
of length N, and use those to index a dictionary containing a list (or
string) of characters that follow that list of characters in your
input strings. So, with N equal 2, and d[('a', 'b')] = ['c', 'd', 'e']
you'd know that the strings abc, abd and abe appeared in the text. To
gennerate a string, pull a random key from the dictionary, then choose
random characters from the list that the last N characters treated as
a tuple index in the dictionary. Larger values of N generate strings
that more closely resemble the input text.

Optionally, make the objects stored in the dictionary keep a count of
how many times each character appears, and choose one randomly
weighted by those counts.

I've got a markov chain generator that works on words if you really
want to see one.

         <mike
-- 
Mike Meyer <mwm at mired.org>			http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.



More information about the Python-list mailing list