[Python-Dev] Fix Unicode-disabled build of Python 2.7

Jim J. Jewett jimjjewett at gmail.com
Tue Jun 24 23:03:27 CEST 2014




On 6/24/2014 4:22 AM, Serhiy Storchaka wrote:
> I submitted a number of patches which fixes currently broken
> Unicode-disabled build of Python 2.7 (built with --disable-unicode
> configure option). I suppose this was broken in 2.7 when C
> implementation of the io module was introduced.

It has frequently been broken.  Without a buildbot, it will continue
to break.  I have given at least a quick look at all your proposed
changes; most are fixes to test code, such as skip decorators.

People checked in tests without the right guards because it did work
on their own builds, and on all stable buildbots.  That will probably
continue to happen unless/until a --disable-unicode buildbot is added.

It would be good to fix the tests (and actual library issues).
Unfortunately, some of the specifically proposed changes (such as
defining and using _unicode instead of unicode within python code)
look to me as though they would trigger problems in the normal build
(where the unicode object *does* exist, but would no longer be used).
Other changes, such as the use of \x escapes, appear correct, but make
the tests harder to read -- and might end up removing a test for
correct unicode funtionality across different spellings.

Even if we assume that the tests are fine, and I'm just an idiot who
misread them, the fact that there is any confusion means that these
particular changes may be tricky enough to be for a bad tradeoff for 2.7.

It *might* work if you could make a more focused change.  For example,
instead of leaving the 'unicode' name unbound, provide an object that
simply returns false for isinstance and raises a UnicodeError for any
other method call.  Even *this* might be too aggressive to 2.7, but the
fact that it would only appear in the --disable-unicode builds, and
would make them more similar to the regular build are points in its
favor.

Before doing that, though, please document what the --disable-unicode
mode is actually *supposed* to do when interacting with byte-streams
that a standard defines as UTF-8.  (For example, are the changes to
_xml_dumps and _xml_loads at
    http://bugs.python.org/file35758/multiprocessing.patch
correct, or do those functions assume they get bytes as input, or
should the functions raise an exception any time they are called?)


-jJ

--

If there are still threading problems with my replies, please
email me with details, so that I can try to resolve them.  -jJ



More information about the Python-Dev mailing list