Hash stability

Heiko Wundram modelnine at modelnine.org
Sun Jan 15 11:07:52 EST 2012


On 15.01.2012 13:22, Peter Otten wrote:
> Heiko Wundram wrote:
>> I agree completely with that (I hit the corresponding problem with suds
>> while transitioning from 32-bit Python to 64-bit Python, where hashes
>> aren't stable either), but as stated in my mail: that wasn't the
>> original question. ;-)
>
> I'm curious: did you actually get false cache hits or just slower responses?

It broke the application using suds, not due to false cache hits, but 
due to not getting a cache hit anymore at all.

Long story: to interpret WSDL-files, suds has to get all related DTDs 
for the WSDL file, and Microsoft (as I wrote I was querying Exchange Web 
Services) insists on using http://www.w3.org/2001/xml.dtd for the XML 
spec path. This path is sometimes functional as a GET URL, but mostly 
not (due to overload of the W3-servers), so basically I worked around 
the problem by creating an appropriate cache entry with the appropriate 
name based on hash() using a local copy of xml.dtd I had around. This 
took place on a development machine (32-bit), and when migrating the 
application to a production machine (64-bit), the cache file wasn't used 
anymore (due to the hash not being stable).
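
To illustrate the failure mode: the following is only a minimal sketch, 
not the actual suds code. The function names and the "cache-....xml" 
naming scheme are made up for the example. It contrasts a cache file name 
derived from the built-in hash() with one derived from a digest that does 
not depend on the interpreter build.

```python
import hashlib

# URL taken from the text above; everything else here is hypothetical.
XML_SPEC_URL = "http://www.w3.org/2001/xml.dtd"


def unstable_cache_name(url):
    # Built-in hash() depends on the interpreter: the value can differ
    # between 32-bit and 64-bit builds, and between runs when hash
    # randomization is enabled.
    return "cache-%d.xml" % abs(hash(url))


def stable_cache_name(url):
    # A SHA-1 digest of the UTF-8 bytes is the same on every platform
    # and in every run, so a pre-seeded cache file keeps its name.
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
    return "cache-%s.xml" % digest


if __name__ == "__main__":
    print(unstable_cache_name(XML_SPEC_URL))
    print(stable_cache_name(XML_SPEC_URL))
```

A cache file created by hand under the first name on one machine may 
simply never be looked up on another; under the second name it would be 
found regardless of the machine.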

It's not that this came as a surprise (I quickly found the "workaround": 
simply rehash on the target machine and rename the cache file 
accordingly), and I already said that this is mostly just a plain bad 
design decision on the part of the suds developers, but it's one of 
those cases where a non-stable hash() can break applications, and unless 
you know the internal workings of suds, this will seriously bite the 
developer.

I don't know how widely suds is used, but I guess there are more people 
than just me using it to query SOAP services - all of them will be 
affected if the hash() output is changed. Additionally, if hash() isn't 
stable between runs (the randomized hash() solution, which is the 
preferred one and would also be my preference), suds caching becomes 
completely useless. And for the results, see above.
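
As a rough sketch of what "not stable between runs" means in practice: 
the helper below (a hypothetical name, not part of any library) starts a 
fresh interpreter with a given PYTHONHASHSEED and returns hash() of a 
string. It assumes an interpreter that supports the PYTHONHASHSEED 
environment variable, which came with the hash randomization discussed 
in this thread.

```python
import os
import subprocess
import sys


def hash_in_fresh_interpreter(text, seed):
    """Return hash(text) as computed by a new interpreter process.

    `seed` is passed via PYTHONHASHSEED; an integer fixes the seed,
    while the string "random" asks for a fresh seed on every start.
    """
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
    )
    return int(out.decode().strip())


if __name__ == "__main__":
    url = "http://www.w3.org/2001/xml.dtd"
    # Same seed: same value. Different seeds stand in for different runs
    # and are expected to give different values.
    print(hash_in_fresh_interpreter(url, 1))
    print(hash_in_fresh_interpreter(url, 1))
    print(hash_in_fresh_interpreter(url, 2))
```

Any on-disk name derived from such a value is only valid for processes 
that happen to share the same seed.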

-- 
--- Heiko.


