Python for Reverse Engineering
mwm at mired.org
Sat Nov 6 04:39:10 CET 2004
Brad Tilley <rtilley at vt.edu> writes:
> A friend of mine wrote an algorithm that generates strings. He says
> that it's impossible to figure out exactly how the algorithm
> works. That may be true, but I think if I had enough sample strings
> that I could write a program to identify patterns in the strings and
> then produce strings with similar patterns. He disagrees with me. He
> gave me 100 strings to analyze.
> I wrote a script to read each string and build a list of characters
> present. I've discovered that he only uses 20 alpha/numeric chars to
> generate the strings and that each string sums up to a value between
> 1000 and 1200 when I assign each char in the string its ASCII value.
> What else might I do to analyze these things? As it stands now, I can
> generate an acceptable string on ever 100th attempt or so, but I'd
> like to do better than this.
Build a markov chain generator. Basically, build tuples of characters
of length N, and use those to index a dictionary containing a list (or
string) of characters that follow that list of characters in your
input strings. So, with N equal 2, and d[('a', 'b')] = ['c', 'd', 'e']
you'd know that the strings abc, abd and abe appeared in the text. To
gennerate a string, pull a random key from the dictionary, then choose
random characters from the list that the last N characters treated as
a tuple index in the dictionary. Larger values of N generate strings
that more closely resemble the input text.
Optionally, make the objects stored in the dictionary keep a count of
how many times each character appears, and choose one randomly
weighted by those counts.
I've got a markov chain generator that works on words if you really
want to see one.
Mike Meyer <mwm at mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
More information about the Python-list