Patch http://www.python.org/sf/554192 adds a function to mimetypes.py that returns all known extensions for a mimetype, e.g.
>>> import mimetypes
>>> mimetypes.guess_all_extensions("image/jpeg")
['.jpg', '.jpe', '.jpeg']
Martin v. Loewis and I were discussing whether it would make sense to make the helper method add_type (which is used for adding a mapping between one type and one extension) visible on the module level. Any comments?
Bye, Walter Dörwald
"WD" == Walter Dörwald writes:
WD> Martin v. Loewis and I were discussing whether it would make
WD> sense to make the helper method add_type (which is used for
WD> adding a mapping between one type and one extension) visible
WD> on the module level.
WD> Any comments?
+1 on add_type() being public, but it should probably have a strict flag to decide whether to add the new entry to the standard types dict or the common types dict.
-Barry
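As a rough illustration of the strict flag suggested above: the mimetypes module as it ships today exposes add_type(type, ext, strict=True), where strict=False files the mapping under the non-standard types. The sketch below works on a private MimeTypes instance so the module-wide tables stay untouched; the type application/x-example and the extension .xmpl are made up for the example.

```python
import mimetypes

# A private database, so the module-level tables are not modified.
db = mimetypes.MimeTypes()

# strict=False registers the mapping as a non-standard ("common") type.
# The type and the extension are invented for this example.
db.add_type("application/x-example", ".xmpl", strict=False)

# A strict lookup ignores non-standard mappings ...
strict_guess = db.guess_type("report.xmpl", strict=True)
# ... while a non-strict lookup also consults them.
loose_guess = db.guess_type("report.xmpl", strict=False)

print(strict_guess)
print(loose_guess)
```

The same flag is accepted by guess_extension and guess_all_extensions, so a caller can decide per lookup whether non-standard entries should count.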
Barry A. Warsaw wrote:
> "WD" == Walter Dörwald writes:
> WD> Martin v. Loewis and I were discussing whether it would make
> WD> sense to make the helper method add_type (which is used for
> WD> adding a mapping between one type and one extension) visible
> WD> on the module level.
> WD> Any comments?
> +1 on add_type() being public, but it should probably have a strict flag to decide whether to add the new entry to the standard types dict or the common types dict.
OK, so we probably need a reverse mapping for common_types too, but shouldn't we consider common_types to be fixed? Maybe we should add a guess_all_types too, so we can handle duplicate extensions, i.e.
>>> mimetypes.guess_all_types(".cdf")
['application/x-cdf', 'application/x-netcdf']
This would of course require changing the initialization of types_map from a dict constant to many calls to add_type. Even better would be if we could assign priorities to the mappings, so that e.g. for image/jpeg the preferred extension is .jpeg. Then guess_type() and guess_extension() would return the preferred mimetype/extension.
Bye, Walter Dörwald
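To make the proposal concrete, here is a minimal sketch of a two-way mapping that is filled only through add_type calls and can therefore hold several types per extension. Everything in it is hypothetical: the TypeRegistry class and its guess_all_types method are illustration names, not part of mimetypes, and the .nc entry is sample data.

```python
from collections import defaultdict


class TypeRegistry:
    """Hypothetical two-way registry; not the mimetypes implementation."""

    def __init__(self):
        self._types_by_ext = defaultdict(list)  # ".cdf" -> [type, ...]
        self._exts_by_type = defaultdict(list)  # type -> [".ext", ...]

    def add_type(self, mime_type, ext):
        """Record one type <-> extension mapping, keeping insertion order."""
        ext = ext.lower()
        if mime_type not in self._types_by_ext[ext]:
            self._types_by_ext[ext].append(mime_type)
        if ext not in self._exts_by_type[mime_type]:
            self._exts_by_type[mime_type].append(ext)

    def guess_all_types(self, ext):
        """Return every type registered for an extension."""
        return list(self._types_by_ext.get(ext.lower(), []))

    def guess_all_extensions(self, mime_type):
        """Return every extension registered for a type."""
        return list(self._exts_by_type.get(mime_type, []))


registry = TypeRegistry()
# Initialization by repeated add_type calls instead of a dict constant.
for mime_type, ext in [
    ("application/x-cdf", ".cdf"),
    ("application/x-netcdf", ".cdf"),
    ("application/x-netcdf", ".nc"),
]:
    registry.add_type(mime_type, ext)

print(registry.guess_all_types(".cdf"))
```

Because both directions are lists, a duplicate extension no longer overwrites the earlier entry, which is what a plain dict constant would do.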
Walter Dörwald writes:
> OK, so we probably need a reverse mapping for common_types too, but shouldn't we consider common_types to be fixed?
If anything, types_map should be fixed: those are the official IANA-supported types (including the official x- extension mechanism). The common types are those that violate IANA specs, yet are found in real life. If you wanted to support strictness in add_type, then you would require that the type starts with x-, since mimetypes.py should have all registered types incorporated (if it misses some, that's a bug).
> Even better would be if we could assign priorities to the mappings, so that e.g. for image/jpeg the preferred extension is .jpeg. Then guess_type() and guess_extension() would return the preferred mimetype/extension.
Do you have a specific application for that in mind? It sounds like overkill.
Regards, Martin
Martin v. Loewis wrote:
> Walter Dörwald writes:
> > OK, so we probably need a reverse mapping for common_types too, but shouldn't we consider common_types to be fixed?
> If anything, types_map should be fixed: those are the official IANA-supported types (including the official x- extension mechanism).
> The common types are those that violate IANA specs, yet are found in real life.
> If you wanted to support strictness in add_type, then you would require that the type starts with x-, since mimetypes.py should have all registered types incorporated (if it misses some, that's a bug).
OK, but then adding the entries to types_map must be done differently. I'd prefer it if both could be done by add_type, but then we'd need three modes: initialising types_map, adding further mappings to types_map (checking that only x- types/subtypes are used), and adding mappings to common_types.
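The three modes could look roughly like the sketch below. It is only an assumption about how such an add_type might be shaped: the Mode enum, the is_experimental helper and the two module-level dicts are illustration names, and the x- check is a plain prefix test on the type and subtype.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical modes for a single add_type entry point."""
    INIT = "init"      # filling types_map with the registered types
    EXTEND = "extend"  # later additions to types_map, x- names only
    COMMON = "common"  # additions to common_types


types_map = {}
common_types = {}


def is_experimental(mime_type):
    """True if the type or the subtype carries the x- prefix."""
    major, _, minor = mime_type.lower().partition("/")
    return major.startswith("x-") or minor.startswith("x-")


def add_type(mime_type, ext, mode=Mode.EXTEND):
    """Add one mapping, enforcing the x- rule for later additions."""
    if mode is Mode.COMMON:
        common_types[ext] = mime_type
        return
    if mode is Mode.EXTEND and not is_experimental(mime_type):
        raise ValueError(f"{mime_type!r} is not an x- type or subtype")
    types_map[ext] = mime_type


add_type("image/jpeg", ".jpeg", Mode.INIT)   # registered type, at start-up
add_type("application/x-example", ".xmpl")   # made-up experimental type
add_type("image/jpg", ".jpg", Mode.COMMON)   # non-standard but seen in practice
```

With this shape, only the module's own initialisation would use Mode.INIT, while outside callers would get the x- check by default.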
> > Even better would be if we could assign priorities to the mappings, so that e.g. for image/jpeg the preferred extension is .jpeg. Then guess_type() and guess_extension() would return the preferred mimetype/extension.
> Do you have a specific application for that in mind? It sounds like overkill.
I'm using a web mirror script which uses the extensions from guess_extension to save all downloaded resources, and I hate it when the HTML files are named .htm and JPEG images are named .jpe.
Bye, Walter Dörwald
Walter Dörwald writes:
> > Even better would be if we could assign priorities to the mappings, so that e.g. for image/jpeg the preferred extension is .jpeg. Then guess_type() and guess_extension() would return the preferred mimetype/extension.
> > Do you have a specific application for that in mind? It sounds like overkill.
> I'm using a web mirror script which uses the extensions from guess_extension to save all downloaded resources, and I hate it when the HTML files are named .htm and JPEG images are named .jpe.
Then this is your preference - others might prefer jpg, just because their file system can deal better with that. If you can agree that this is your preference, you should put the preference mechanism into the application. Maybe your preference can be expressed algorithmically? It might be that you always want the longest known extension (it is unlikely that you prefer "jpeg" over "jpg" just because that contains a vowel :-).
Regards, Martin
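An application-side "longest known extension" rule, as suggested above, fits in a few lines on top of guess_all_extensions. This is a sketch, not a recommendation: longest_extension and preferred_extension are hypothetical helper names, and which extensions come back depends on the mimetypes tables of the Python version in use.

```python
import mimetypes


def longest_extension(extensions):
    """Return the longest extension, or None for an empty list."""
    return max(extensions, key=len, default=None)


def preferred_extension(mime_type):
    """Pick the application's preferred extension for a MIME type."""
    # A private database keeps the result independent of module-level changes.
    db = mimetypes.MimeTypes()
    return longest_extension(db.guess_all_extensions(mime_type))


print(longest_extension([".jpg", ".jpe", ".jpeg"]))
print(preferred_extension("text/html"))
```

Keeping the rule in the application leaves mimetypes neutral: another program can pass a different key function, for instance one that prefers three-letter extensions.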
Martin v. Loewis wrote:
> Walter Dörwald writes:
> > > Even better would be if we could assign priorities to the mappings, so that e.g. for image/jpeg the preferred extension is .jpeg. Then guess_type() and guess_extension() would return the preferred mimetype/extension.
> > > Do you have a specific application for that in mind? It sounds like overkill.
> > I'm using a web mirror script which uses the extensions from guess_extension to save all downloaded resources, and I hate it when the HTML files are named .htm and JPEG images are named .jpe.
> Then this is your preference - others might prefer jpg, just because their file system can deal better with that. If you can agree that this is your preference, you should put the preference mechanism into the application.
Agreed, other applications might have other priorities.
> Maybe your preference can be expressed algorithmically? It might be that you always want the longest known extension (it is unlikely that you prefer "jpeg" over "jpg" just because that contains a vowel :-).
I guess it's "longest one" or "the one most unencumbered by filesystem limitations". OK, so let's drop the priority idea. What do we do with the patch as it is now?
Bye, Walter Dörwald
participants (3)
- barry@python.org
- martin@v.loewis.de
- Walter Dörwald