[Tutor] CGI problem.

Kent Johnson kent_johnson at skillsoft.com
Sun Nov 7 21:41:49 CET 2004

At 09:31 PM 11/7/2004 +0200, Mark Kels wrote:
>On Sun, 07 Nov 2004 12:14:45 -0500, Kent Johnson
><kent_johnson at skillsoft.com> wrote:
> > Can you post an example of a clear-text password and the hash you expect to
> > see from it?
> > How are you generating the hashes you are comparing against?
>The password is 124 (as a string), and the hash is
>\xc8\xff\xe9\xa5\x87\xb1&\xf1R\xed=\x89\xa1F\xb4E .
>I generated the hash that is in the file with IDLE (using the md5 module).

Aahhhhh...the light goes on....
You did something like this in IDLE:
 >>> import md5
 >>> p='124'
 >>> md5.new(p).digest()
'\xc8\xff\xe9\xa5\x87\xb1&\xf1R\xed=\x89\xa1F\xb4E'

Then you copied \xc8\xff\xe9\xa5\x87\xb1&\xf1R\xed=\x89\xa1F\xb4E and 
pasted it into a file. Later you read it back from the file and compare 
with the hash you get in the CGI and they don't match.

The problem is that the string you are pasting into the file is the repr() 
of the string - it is the way the string would look if you put it into 
source code in a Python program. The characters in the string with values > 
127 are represented with \x escapes, not as the actual character.
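To make the mismatch concrete, here is a minimal sketch. It is written for Python 3, where hashlib stands in for the md5 module used in this thread, and the variable names are just for illustration. It compares the real digest with the escaped text as it would sit in the file after pasting:

```python
import hashlib

# The real MD5 digest of the password from this thread, as raw bytes.
digest = hashlib.md5(b"124").digest()

# The text that lands in the file after copy and paste. The raw string
# keeps every backslash as a literal character, as it would be in the file.
pasted = r"\xc8\xff\xe9\xa5\x87\xb1&\xf1R\xed=\x89\xa1F\xb4E"

print(len(digest))                       # bytes in the real digest
print(len(pasted))                       # characters in the pasted text
print(digest == pasted.encode("ascii"))  # do the two match?
```

The two values have different lengths, so a direct comparison cannot succeed.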

Whenever you work in the interactive interpreter, results can be printed in 
two different ways. If you let the interpreter print for you, it prints 
repr(value). If you explicitly use print, you get str(value). These are 
often similar, but not always. I think of repr() as the programmer's view 
of the data, and str() as the user's view.

For example, even with a simple string there is a difference:
 >>> s='a'
 >>> s
'a'
 >>> print s
a
 >>> s == repr(s)
False

Note that the first one has quotes - 'a' - while the second one is just an 
a. A small difference, but they are different. More important, for your 
problem, is that s is not equal to its repr(). How about this:
 >>> s='ü'
 >>> s
 >>> print s
 >>> s == repr(s)

(That's a u-umlaut, just in case it gets mangled by mail.) The first output 
looks nothing like the second. repr(s) uses \x escapes to print out the 
value in ASCII; print s prints the actual character.

OK, back to hashes...with your sample data, we have
 >>> d = md5.new(p).digest()
 >>> d
'\xc8\xff\xe9\xa5\x87\xb1&\xf1R\xed=\x89\xa1F\xb4E'
 >>> print d
+ TÑç¦&±Rf=ëíF¦E

Once again, the results are very different. This shows the actual hex 
values in the hash:
 >>> print ' '.join( [ hex(ord(c)) for c in d ] )
0xc8 0xff 0xe9 0xa5 0x87 0xb1 0x26 0xf1 0x52 0xed 0x3d 0x89 0xa1 0x46 0xb4 0x45

You can see that in repr(d) the printable ASCII characters (all less than 
0x80) are shown verbatim, while characters >= 0x80 are escaped.
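The same split can be checked byte by byte. This is a rough sketch for Python 3 (hashlib instead of the md5 module), and the printable-ASCII range test is my own approximation of what repr() does. It ignores the special handling of backslashes and quotes:

```python
import hashlib

digest = hashlib.md5(b"124").digest()

# Label each byte: printable ASCII (0x20 through 0x7e) is shown as-is by
# repr(), everything else appears as a \x escape.
rows = [
    (hex(b), "verbatim" if 0x20 <= b <= 0x7e else "escaped")
    for b in digest
]

for value, kind in rows:
    print(value, kind)
```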

So, what to do? I can think of two solutions:
- In your CGI, when you compute the hash, compare repr(hash) against what 
you find in the file.
- Create the file with the actual hash, instead of its repr(). If there is 
only one password in the file, you could do this from the command line: 
just open the file and write the string to it. If you have more than one 
password, you might want to write a small program to help with this.
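The second option might look something like the sketch below. It is written for Python 3 with hashlib rather than the md5 module. The function names write_hash and check_password and the file name password.md5 are made up for this example, and it assumes a single password per file:

```python
import hashlib
import os
import tempfile


def write_hash(path, password):
    """Store the raw MD5 digest of password, opening the file in binary mode."""
    with open(path, "wb") as f:
        f.write(hashlib.md5(password.encode("utf-8")).digest())


def check_password(path, candidate):
    """Return True if candidate hashes to the digest stored at path."""
    with open(path, "rb") as f:
        stored = f.read()
    return stored == hashlib.md5(candidate.encode("utf-8")).digest()


# Hypothetical location for the password file, used only for this demo.
path = os.path.join(tempfile.mkdtemp(), "password.md5")
write_hash(path, "124")

print(check_password(path, "124"))
print(check_password(path, "125"))
```

Opening the file in binary mode on both sides keeps the bytes from being altered on the way in or out, so the CGI compares like with like.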

