[Patches] [ python-Patches-1608805 ] Py_FileSystemDefaultEncoding can be non-canonical

SourceForge.net noreply at sourceforge.net
Mon Dec 4 23:06:35 CET 2006


Patches item #1608805, was opened at 2006-12-04 17:06
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1608805&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Core (C code)
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Stephan R.A. Deibel (sdeibel)
Assigned to: Nobody/Anonymous (nobody)
Summary: Py_FileSystemDefaultEncoding can be non-canonical

Initial Comment:
On Linux/Unix it is possible for Py_FileSystemDefaultEncoding to be set to a non-canonical encoding name such as "UTF-8" instead of "utf-8".  This happens when it is set from codeset in Py_InitializeEx() in pythonrun.c.

This becomes a problem when this value is propagated through to PyUnicode_Decode() or PyUnicode_AsEncodedString() in unicodeobject.c.  One such code path starts in os.listdir() and goes through PyUnicode_FromEncodedObject().

In that case, the common-case optimizations are skipped because the encoding name does not match.  I noticed this in a case where the PyCodec_Decode() call used instead was failing.  Normally I think this just amounts to a missed optimization, but given the likelihood of other such code being added in the future, I feel it's best to fix Py_FileSystemDefaultEncoding so that it is always in canonical form.
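
To illustrate the failure mode, here is a minimal Python-level sketch.  It is not the code in unicodeobject.c and not the attached patch.  The function decode_with_fast_path() is a hypothetical stand-in that assumes the shortcut is taken only on an exact, case-sensitive match with "utf-8"; canonical_name() is likewise a made-up helper that uses codecs.lookup() to obtain the registered codec name.

```python
import codecs


def decode_with_fast_path(data, encoding):
    """Hypothetical stand-in for a decoder with a common-case shortcut."""
    # Assumed behaviour: the shortcut only fires on an exact,
    # case-sensitive match with the canonical spelling.
    if encoding == "utf-8":
        return "fast", data.decode("utf-8")
    # Any other spelling falls through to the generic codec lookup.
    return "generic", codecs.decode(data, encoding)


def canonical_name(encoding):
    """Map a spelling variant such as "UTF-8" to the registered codec name."""
    return codecs.lookup(encoding).name


raw = "caf\u00e9".encode("utf-8")

# The non-canonical spelling still decodes, but misses the shortcut.
print(decode_with_fast_path(raw, "UTF-8")[0])

# Normalizing the name once up front restores the shortcut.
print(decode_with_fast_path(raw, canonical_name("UTF-8"))[0])
```

Under these assumptions the result is the same either way; only the path taken differs, which is why the problem normally shows up as nothing more than a missed optimization.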

One possible way to fix it is attached as a patch.


----------------------------------------------------------------------
