
Hi, I looked into whether I could make the test suite pass again when Python is compiled with --disable-unicode. One problem is that no Unicode escapes can be used, since compiling the file raises ValueErrors for them. Such strings would have to be produced using unichr(). Is this the right way? Or is disabling Unicode not supported any more? Reinhold -- Mail address is perfectly valid!

Reinhold Birkenfeld wrote:
You mean, in Unicode literals? There are various approaches, depending on context:
- you could encode the literals as UTF-8 and decode them when the module/test case is imported; see test_support.TESTFN_UNICODE for an example
- you could use unichr
- you could use eval; see test_re for an example
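[Editorial sketch, not part of the original message.] The three approaches listed above might look roughly like this. The names have_unicode and build_samples are hypothetical, the euro sign is an arbitrary sample character, and the Python 3 branch exists only so the sketch can be run on a current interpreter. It assumes that a build without Unicode shows up as a missing unicode built-in.

```python
# Illustrative sketch only; not code taken from the test suite.
import sys

if sys.version_info[0] >= 3:
    # Python 3 stand-ins so the sketch also runs on a modern interpreter.
    unichr = chr
    have_unicode = True
else:
    try:
        unicode
        have_unicode = True
    except NameError:
        # Assumption: this is how a --disable-unicode build looks from Python.
        have_unicode = False


def build_samples():
    """Return the euro sign built three ways, or None without Unicode."""
    if not have_unicode:
        return None
    # 1. Keep UTF-8 bytes in the source and decode them at run time.
    #    (The b prefix needs Python 2.6+; drop it on older versions.)
    decoded = b"\xe2\x82\xac".decode("utf-8")
    # 2. Build the character from its code point.
    from_unichr = unichr(0x20AC)
    # 3. Hide the literal inside a plain string and eval() it. The doubled
    #    backslash keeps the escape away from the compiler until eval runs.
    evaluated = eval('u"\\u20ac"')
    return [decoded, from_unichr, evaluated]
```

None of the three variants puts a Unicode escape in a literal that the compiler sees when the file is loaded, which is the property the original question was after.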
Is this the right way? Or is disabling Unicode not supported any more?
There are certainly tests that cannot be executed when Unicode is not available. It would be good if such tests were skipped instead of failing, and it would be good if all tests that do not require Unicode support ran even when Unicode support is missing. Whether "it is supported" is a tricky question: your message indicates that, right now, it is *not* supported (or else you wouldn't have noticed a problem). Whether we think it should be supported depends on who "we" is, as with all these minor features: some think it is a waste of time, some think it should be supported if reasonably possible, and some think this is a conditio sine qua non. It certainly isn't a release-critical feature. Regards, Martin
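[Editorial sketch, not part of the original message.] Skipping rather than failing could be expressed as follows with the skip decorators that unittest gained in later Python versions (2.7 and up). The 2005 test suite predates them, so this illustrates the idea and is not what the suite did. HAVE_UNICODE, StringTests and run_suite are hypothetical names.

```python
import unittest

try:
    unicode  # assumed to be missing on a build without Unicode support
    HAVE_UNICODE = True
except NameError:
    # Also taken on Python 3, where the name no longer exists.
    HAVE_UNICODE = False


class StringTests(unittest.TestCase):
    def test_plain_str(self):
        # Needs no Unicode support, so it should always run.
        self.assertEqual("abc".upper(), "ABC")

    @unittest.skipUnless(HAVE_UNICODE, "requires the unicode type")
    def test_unicode_upper(self):
        # Runs only when the interpreter provides the unicode type.
        self.assertEqual(unicode("abc").upper(), unicode("ABC"))


def run_suite():
    """Run StringTests and return the collected TestResult."""
    suite = unittest.TestLoader().loadTestsFromTestCase(StringTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

With this shape, a build without Unicode reports one skipped test and no failures, while the plain-string test still runs.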

Martin v. Löwis wrote:
Okay. I can fix this, but several library modules must be fixed too (mostly simple fixes), e.g. pickletools, gettext, doctest or encodings.
That's my approach too.
Well, the core builds without Unicode, and any code that doesn't use unicode should run fine too. But the tests fail at the moment.
Correct. I'll see if I have the time. Reinhold -- Mail address is perfectly valid!

Reinhold Birkenfeld wrote:
Is the added complexity needed to support not having Unicode support compiled into Python really worth it? I know that Martin introduced this feature a long time ago, so he will have had a reason for it. Today, I think the situation has changed: computers have more memory and are faster, and the need to integrate (e.g. via XML) is stronger than ever. Maybe we should consider removing the option to get a cleaner code base with fewer #ifdefs and SyntaxErrors from the standard lib. What do you think? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 30 2005)
::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

M.-A. Lemburg wrote:
Is the added complexity needed to support not having Unicode support compiled into Python really worth it?
If there are volunteers willing to maintain it, and the other volunteers are not affected: certainly.
I know that Martin introduced this feature a long time ago, so he will have had a reason for it.
I added it because users requested it. I personally never use it.
-0 for just ripping it out. +0 if PEP 5 is followed, at least in spirit (i.e. give users advance warning to let them protest). I guess users in embedded builds (either in embedded systems, or embedding Python into some other application) might still be interested in the feature. Of course, these users could either recreate the feature if we remove it, or just stay with Python 2.4. Regards, Martin

Martin v. Löwis wrote:
No objections there. I only see that --disable-unicode has already been broken a couple of times in the past and no-one (except those running test suites regularly) really noticed - at least not AFAIK.
If embedded build users rely on it, I'd suggest that these users take over maintenance of the patch set. Let's add a note to the configure switch that the feature will be removed in 2.6 and see what happens. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Sep 30 2005)

Reinhold Birkenfeld <reinhold-birkenfeld-nospam@wolke7.net> writes:
Yeah, I've bumped into this.
Is this the right way? Or is disabling Unicode not supported any more?
I don't know. More particularly, I don't know if anyone actually uses a unicode-disabled build. If no one does, we might as well rip the code out. Cheers, mwh -- Sufficiently advanced political correctness is indistinguishable from irony. -- Erik Naggum
participants (4)
- Martin v. Löwis
- M.-A. Lemburg
- Michael Hudson
- Reinhold Birkenfeld