[Patches] [ python-Patches-445762 ] Support --disable-unicode

noreply@sourceforge.net noreply@sourceforge.net
Tue, 31 Jul 2001 00:48:10 -0700


Patches item #445762, was opened at 2001-07-29 14:13
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=445762&group_id=5470

Category: Build
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Martin v. Löwis (loewis)
Assigned to: M.-A. Lemburg (lemburg)
Summary: Support --disable-unicode

Initial Comment:
This patch implements the option --disable-unicode.
In particular, it:
- does not compile unicodeobject, unicodectype, 
_codecsmodule, and unicodedata if Unicode is disabled
- checks for Py_Unicode in all places that use 
Unicode functions
- disables Unicode literals, the Unicode builtin functions,
and the string encode and decode methods
- avoids Unicode literals in a few places in the 
libraries
- adds the types.StringTypes list

Most of the test suite passes with these changes. A 
number of tests fail, mostly because they use Unicode 
literals.


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2001-07-31 00:48

Message:
Logged In: YES 
user_id=21627

The new version of the patch implements all features that 
have been discussed.


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-07-30 07:39

Message:
Logged In: YES 
user_id=38388

Ok, I see your point about the API references.

About PyString_Encode/Decode: on platforms without Unicode, the encoding should not have a default, so
passing NULL as the encoding should result in an error. I am not even sure whether it should have a default on
Unicode builds... probably not.

Trimming _codecsmodule.c down to register and lookup is OK; there are a few codecs in 2.2 which don't
use Unicode at all.
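
To make the two points above concrete, here is a minimal sketch, not the actual CPython code. It shows a codec registry reduced to register and lookup, plus an encode entry point that rejects a NULL encoding instead of falling back to a default. Every name in it (demo_register, demo_lookup, demo_encode, demo_rot13) is hypothetical, and the byte-to-byte rot13 codec is only there to have something to register.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_CODECS 8

/* Hypothetical codec signature: encode `in` into `out`, 0 on success. */
typedef int (*demo_codec_fn)(const char *in, char *out, size_t outsize);

struct demo_codec {
    const char *name;
    demo_codec_fn encode;
};

static struct demo_codec demo_registry[DEMO_MAX_CODECS];
static size_t demo_registry_len = 0;

/* register: add a named codec; -1 on bad arguments or a full table. */
int demo_register(const char *name, demo_codec_fn encode)
{
    if (name == NULL || encode == NULL || demo_registry_len == DEMO_MAX_CODECS)
        return -1;
    demo_registry[demo_registry_len].name = name;
    demo_registry[demo_registry_len].encode = encode;
    demo_registry_len++;
    return 0;
}

/* lookup: return the codec registered under `name`, or NULL. */
demo_codec_fn demo_lookup(const char *name)
{
    size_t i;
    for (i = 0; i < demo_registry_len; i++)
        if (strcmp(demo_registry[i].name, name) == 0)
            return demo_registry[i].encode;
    return NULL;
}

/* encode: there is deliberately no default encoding, so NULL is an error. */
int demo_encode(const char *in, const char *encoding, char *out, size_t outsize)
{
    demo_codec_fn fn;
    if (encoding == NULL)
        return -1;
    fn = demo_lookup(encoding);
    if (fn == NULL)
        return -1;
    return fn(in, out, outsize);
}

/* Hypothetical codec that maps bytes to bytes and needs no Unicode. */
int demo_rot13(const char *in, char *out, size_t outsize)
{
    size_t i, n;
    assert(in != NULL && out != NULL);
    n = strlen(in);
    if (n + 1 > outsize)
        return -1;
    for (i = 0; i < n; i++) {
        char c = in[i];
        if (c >= 'a' && c <= 'z')
            c = (char)('a' + (c - 'a' + 13) % 26);
        else if (c >= 'A' && c <= 'Z')
            c = (char)('A' + (c - 'A' + 13) % 26);
        out[i] = c;
    }
    out[n] = '\0';
    return 0;
}
```

A caller would first call demo_register("rot13", demo_rot13) and then pass the encoding name explicitly to demo_encode; passing NULL fails in every build configuration, which is the behaviour proposed above.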

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-07-30 07:30

Message:
Logged In: YES 
user_id=21627

This patch already makes use of the assumption that
PyUnicode_Check will always return 0. In all the remaining
cases, the code also calls some function of the Unicode
module, which results in a compile-time error since the
functions are no longer declared. Even if they were declared,
the call would probably result in a linker error, since not
all compilers remove the entire code block. That approach
can be used only where the if-block does not call any
Unicode functions directly.
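
The distinction can be sketched in a small stand-alone C file. This is an illustration under assumptions, not code from the patch: DEMO_HAVE_WIDE, demo_obj, demo_wide_check and demo_wide_length are hypothetical stand-ins for the configure switch, a Python object, PyUnicode_Check and a Unicode API function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the configure switch; leave it undefined to
   mimic a build with the feature disabled. */
/* #define DEMO_HAVE_WIDE 1 */

/* Hypothetical object type, not a real CPython structure. */
typedef struct {
    int is_wide;        /* nonzero for "wide" objects */
    const char *bytes;  /* NUL-terminated payload */
    size_t wide_len;    /* length reported by the wide API */
} demo_obj;

#ifdef DEMO_HAVE_WIDE
static int demo_wide_check(const demo_obj *o) { return o->is_wide; }
static size_t demo_wide_length(const demo_obj *o) { return o->wide_len; }
#else
/* Feature disabled: the check is a constant 0 and demo_wide_length
   is not declared at all. */
#define demo_wide_check(o) ((void)(o), 0)
#endif

/* Case 1: the guarded branch calls no wide function, so a plain
   if-statement compiles in both configurations. */
int demo_kind(const demo_obj *o)
{
    assert(o != NULL);
    if (demo_wide_check(o))
        return 2;
    return 1;
}

/* Case 2: the branch calls a wide function. Without the #ifdef, the
   disabled build would reference an undeclared function, whether or not
   the optimizer later drops the dead branch. */
size_t demo_length(const demo_obj *o)
{
    assert(o != NULL);
#ifdef DEMO_HAVE_WIDE
    if (demo_wide_check(o))
        return demo_wide_length(o);
#endif
    return strlen(o->bytes);
}
```

With DEMO_HAVE_WIDE undefined, demo_kind always takes the byte-string path and demo_length falls back to strlen. Removing the #ifdef in demo_length reproduces the compile-time failure described above.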

I can try to re-enable the _codecs module, although only
register and lookup would remain.

I cannot re-enable PyString_Decode/Encode, since they use 
PyUnicode_GetDefaultEncoding, which is not available since
unicodeobject.c is not compiled.

I will try to have the tokenizer generate more specific
error messages.

Support for "es", "et" is still there; they only work for
strings, though, and they never call any codecs.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-07-30 06:06

Message:
Logged In: YES 
user_id=38388

Nice work, Martin!

Some comments:
- I think that we could save some of the #ifdefs by simply assuming that an optimizing compiler will not generate
code for "if (0)" == "if (PyUnicode_Check(obj))"; this would make the code more readable
- _codecsmodule.c should not be disabled by the configure option... codecs are useful for non-Unicode
applications as well
- the PyString_Encode/Decode() APIs should not be disabled for the same reason
- the tokenizer/compiler should generate errors with an explicit message stating that the Python version was
compiled without Unicode support
- ditto for the Unicode parser markers (I think that open() on Windows will fail without "es"... ?)


----------------------------------------------------------------------
