[Python-3000] basestring removal, __file__ and co_filename
Christian Heimes
lists at cheimes.de
Thu Oct 11 17:07:59 CEST 2007
Hello Python!
I've written a patch that removes basestring from py3k:
http://bugs.python.org/issue1258
During testing of the patch I hit a problem with __file__ and
codeobject.co_filename. Both __file__ and co_filename are byte strings
rather than unicode, which is causing some trouble. Guido asked me to
provide another patch that decodes the string using the default
filesystem encoding.
Most of the patch was straightforward and easy, but I hit one spot
that's causing some trouble. It's a chicken-and-egg issue.
codeobject.co_filename is a PyString instance. I'd like to perform
    filename = PyString_AsDecodedObject(filename,
        Py_FileSystemDefaultEncoding ? Py_FileSystemDefaultEncoding : "UTF-8",
        NULL);
in order to decode the string with either the fs encoding or UTF-8, but
it's not possible. It's way too early in the bootstrapping process of
Python and the codecs aren't registered yet. In fact, large parts of the
codecs package are implemented in Python ...
Ideas?
I could check whether Py_FileSystemDefaultEncoding is one of the
encodings that are built into the interpreter in C (UTF-8, UTF-16,
UTF-32, latin1, mbcs), but what if the fs default encoding is some
obscure encoding?
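
A minimal sketch of the name check described above, in plain C with no
dependency on Python.h. The function names normalize_encoding and
is_bootstrap_safe_encoding are hypothetical and not part of CPython, and
the list of names is only the one given in this message. A NULL encoding
is treated as UTF-8, mirroring the fallback in the snippet earlier.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a CPython API): copy `name` into `out`,
   lowercased and with '-' and '_' dropped, so that "UTF-8", "utf_8"
   and "utf8" all compare equal. Output is always NUL-terminated. */
static void normalize_encoding(const char *name, char *out, size_t outsize)
{
    size_t j = 0;
    if (outsize == 0)
        return;
    for (size_t i = 0; name[i] != '\0' && j + 1 < outsize; i++) {
        unsigned char c = (unsigned char)name[i];
        if (c == '-' || c == '_')
            continue;
        out[j++] = (char)tolower(c);
    }
    out[j] = '\0';
}

/* Hypothetical check: return 1 if `encoding` is one of the encodings
   listed in the message (UTF-8, UTF-16, UTF-32, latin1, mbcs), else 0.
   NULL is treated as UTF-8, matching the fallback in the snippet. */
int is_bootstrap_safe_encoding(const char *encoding)
{
    static const char *const known[] = {
        "utf8", "utf16", "utf32", "latin1", "mbcs"
    };
    char buf[32];

    if (encoding == NULL)
        return 1;
    normalize_encoding(encoding, buf, sizeof buf);
    for (size_t i = 0; i < sizeof known / sizeof known[0]; i++) {
        if (strcmp(buf, known[i]) == 0)
            return 1;
    }
    return 0;
}
```

A check like this only answers "is the name on the list?". It does not
solve the open question of what to do when the answer is no.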
Christian