Possible bug with stability of mimetypes.guess_* function output

Asaf Las roegltd at gmail.com
Fri Feb 7 20:09:19 CET 2014


On Friday, February 7, 2014 8:06:36 PM UTC+2, Johannes Bauer wrote:
> Hi group,
> 
> I'm using Python 3.3.2+ (default, Oct  9 2013, 14:50:09) [GCC 4.8.1] on
> linux and have found what is very peculiar behavior at best and a bug at
> worst. It regards the mimetypes module and in particular the
> guess_all_extensions and guess_extension functions.
> 
> I've found that these do not return stable output. When running the
> following commands, it returns one of:
> 
> $ python3 -c 'import mimetypes;
> print(mimetypes.guess_all_extensions("text/html"),
> mimetypes.guess_extension("text/html"))'
> ['.htm', '.html', '.shtml'] .htm
> 
> $ python3 -c 'import mimetypes;
> print(mimetypes.guess_all_extensions("text/html"),
> mimetypes.guess_extension("text/html"))'
> ['.html', '.htm', '.shtml'] .html
> 
> So guess_extension(x) seems to always return guess_all_extensions(x)[0].
> 
> Curiously, "shtml" is never the first element. The other two are mixed
> with a probability of around 50% which leads me to believe they're
> internally managed as a set and are therefore affected by the
> (relatively new) nondeterministic hashing function initialization.
> 
> 
> I don't know if stable output is guaranteed for these functions, but it
> sure would be nice. Messes up a whole bunch of things otherwise :-/
> 
> Please let me know if this is a bug or expected behavior.
> 
> Best regards,
> 
> Johannes

They are kept in a dictionary rather than a set. I see the same behavior on v3.3.3 as well.

You could try querying with the sequence below:

import mimetypes
mimetypes.init()
mimetypes.guess_extension("text/html")

I got only '.htm' in 5 consecutive attempts.
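
If a stable answer is needed regardless of hash ordering, one option is to sort the candidates yourself. This is only a minimal sketch, not something the mimetypes module provides: the helper names stable_extensions and stable_extension and the preferred parameter are made up here for illustration. The only real library call is mimetypes.guess_all_extensions.

```python
import mimetypes


def stable_extensions(mime_type):
    """Return all known extensions for mime_type in sorted order."""
    # guess_all_extensions() makes no ordering promise, so sort the result.
    return sorted(mimetypes.guess_all_extensions(mime_type))


def stable_extension(mime_type, preferred=None):
    """Return one extension deterministically, or None if none is known.

    'preferred' is a hypothetical knob: if it is given and is among the
    candidates, it wins; otherwise the alphabetically first one is used.
    """
    candidates = stable_extensions(mime_type)
    if preferred in candidates:
        return preferred
    return candidates[0] if candidates else None


if __name__ == "__main__":
    print(stable_extensions("text/html"))
    print(stable_extension("text/html", preferred=".html"))
```

Plain sorting puts '.htm' ahead of '.html', so pass preferred=".html" if that is the one you want.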

/Asaf


