2.3 encoding parsing bug

Jeff Epler jepler at unpythonic.net
Wed Feb 18 23:35:03 CET 2004


Edward,

It's unfortunate that you didn't contribute to the discussion
of PEP 263, which was created in June 2001[1], mentioned on
comp.lang.python.announce/python-announce at python.org as early as August
2001[2], discussed on comp.lang.python/python-list at python.org back in
February 2002[3], available as a patch in March 2002[4], and present
in the Python CVS around August 2002[5].  Alpha releases of Python
(including binary releases for Windows) with the feature were available
on December 31, 2002[6].  Leo, on the other hand, added support for its
own encoding cookie on January 21, 2002[7].  The fatal (for Leo) dot
in the regular expression was added on February 28, 2002[8].  I didn't
find a thread that explains why this was done, but I believe it was to
support encodings like 'japanese.sjis'[9].

Since dotted encodings reflect a namespace hierarchy, ones with trailing
dots are nonsense.  It seems to me that the easiest fix for this problem
would be to ignore a trailing dot, if it is present in the encoding
cookie.  I'm at least +1/2 on that idea.
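
To make the suggestion concrete, here is a rough sketch of what
"ignore a trailing dot" could look like.  It is not the interpreter's
actual code: the pattern is only written in the style of the PEP 263
cookie expression, the function name find_encoding is made up for
illustration, and the sample lines in the usage note are hypothetical.

```python
import re

# Assumption: a pattern in the style of the PEP 263 cookie expression.
# The dot inside the character class is what lets a trailing '.' through.
CODING_RE = re.compile(r"coding[:=]\s*([-\w.]+)")


def find_encoding(line):
    """Return the encoding named in a cookie line, or None.

    Hypothetical helper: trailing dots are dropped before the name is
    returned, while interior dots (as in 'japanese.sjis') are kept.
    """
    match = CODING_RE.search(line)
    if match is None:
        return None
    # rstrip removes every trailing dot, not just one.
    name = match.group(1).rstrip(".")
    # A cookie consisting only of dots names no encoding at all.
    return name or None
```

With this, a line such as "#@+leo-encoding=iso-8859-1." would yield
'iso-8859-1', and "# -*- coding: japanese.sjis -*-" would still yield
'japanese.sjis'.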

References:
[1] http://www.python.org/peps/pep-0263.html
[2] http://groups.google.com/groups?selm=mailman.996828301.19910.clpa-moderators%40python.org
[3] http://groups.google.com/groups?selm=mailman.1014614501.21492.python-list%40python.org
[4] http://python.org/sf/526840 "Date Submitted"
[5] http://python.org/sf/534304 "Date Closed"
[6] http://www.python.org/2.3/NEWS.txt "What's New in Python 2.3 alpha 1?"
[7] http://cvs.sourceforge.net/viewcvs.py/leo/leo/Attic/leoAtFile.py?r1=1.106&r2=1.107
    "Line 540"
[8] http://cvs.sourceforge.net/viewcvs.py/python/python/nondist/peps/pep-0263.txt?r1=1.7&r2=1.8
[9] http://dist.shot.cx/SnapShot/PEP263/pep0263-2.2.1c2-03/sjis_sample.py

Jeff



