On Fri, Jul 31, 2009 at 15:38, Jacob Rus <jacobolus@gmail.com> wrote:
Brett Cannon wrote:
Jacob Rus wrote:
* It defines __all__: I didn’t even realize __all__ could be used for single-file modules (w/o submodules), but it definitely shouldn’t be here.
__all__ is used to control what a module exports when used in an import *, nothing more. Thus it's use in a module compared to a package is completely legitimate.
This specific __all__ oddly does not include all of the documented variables and functions in the mimetypes class. It’s not clear why someone calling import * here wouldn’t want the bits not included.
If something is documented by not listed in __all__ that is a bug.
In this case, everything in the module is documented, including parts that should be private, but only a small number are in __all__. My recommendation would be to make those private parts be _ variables and remove them from the docs (using them has no legitimate use cases I can see), and rip out __all__.
Well, if the module had stuff that did not lead with an underscore then you can't remove it. You can deprecate it under the old name and rename it with an underscore, but backwards-compatibility says someone out there is using those functions so you can't just batch rename them w/o the proper warning.
* It creates a _default_mime_types() function which declares a bunch of global variables, and then immediately calls _default_mime_types() below the definition. There is literally no difference in result between this and just putting those variables at the top level of the file, so I have no idea why this function exists, except to make the code more confusing.
It could potentially be used for testing, but that's a guess.
Here's an abridged version of this function. I don’t think there’s any reason for this that I can see.
def _default_mime_types(): global suffix_map global encodings_map global types_map global common_types
suffix_map = { '.tgz': '.tar.gz', #... }
encodings_map = { '.gz': 'gzip', #... }
types_map = { '.a' : 'application/octet-stream', #... }
common_types = { '.jpg' : 'image/jpg', #... }
_default_mime_types()
As R. David pointed out, it is being used by regrtest to clean up after running the test suite.
Probably came from someone who is very OO happy. Not everyone comes to Python ready to embrace its procedural or slightly functional facets.
Yes, it seems so to me too.
So the problem of changing fundamentally how the code works, even for a cleanup, is that it will break someone's code out there because they depended on the module's crazy way of doing things. Now if they are cheating and looking at things that are meant to be hidden you might be able to clean things up, but if the semantics are exposed to the user, then there is not much we can do w/o breaking someone's code.
The problem is that the semantics as documented are really ambiguous, and what I would consider the reasonable interpretation is different from what the code actually does. So anyone using this code naively is going to run into trouble, and anyone relying on how the code actually works is going behind the back of the docs, but they sort of have to in order to use much of the functionality of the module. I agree this puts us in a tricky spot.
Well, perhaps the docs can be updated to match the code where cleanup would change the semantics.
Honestly, if the code is as bad as it seems -- including its API --, the best bet would be to come up with a new module for handling MIME types from scratch, put it up on the Cheeseshop/PyPI, and get the community behind it. If the community picks it up as the de-facto replacement for mimetypes and the code has settled we can then talk about adding it to the standard library and begin deprecating mimetypes. And thanks for willing to volunteer to fix this.
Okay. Well I'd still like to hear a bit about what people really need before trying to make a new API. I'm not such an experienced API designer, and I haven’t really plumbed the depths of mimetypes use cases (though it seems to me like quite a simple module of not more than 100 lines of code or so would suffice).
I'm sure you can get help from the community with any of this.
At the very least, I think some changes can be made to this code without altering its basic function, which would clean up the actual mime types it returns, comment the exceptions to Apache and explain why they're there, and make the code flow understandable to someone reading the code.
That all sounds reasonable. -Brett