[XML-SIG] PyExpat encoding (was: XML support in Python 1.6)
Lars Marius Garshol
larsga@garshol.priv.no
02 Jun 2000 12:22:39 +0200
Here is my take on this:
- the entire XML data model is based on Unicode and we should just
accept that rather than try to work against it
- since Python 1.6 supports Unicode directly we should exploit that;
especially since mixing ordinary and Unicode strings seems to be
painless (in other words: the fact that you get Unicode strings
should be more or less invisible to you unless you actively care)
- I can't imagine why anyone would want ordinary strings with UTF-8
encoded text in them; but if someone can come up with a convincing
use case we should support that as well
Conclusion:
- if the Python version is lower than 1.6, we should just do what we do
today: return UTF-8-encoded ordinary strings
- if not, return Unicode objects
- I have no problems with adding a run-time configuration option to
expat that lets users switch back to UTF-8-encoded ordinary strings by
saying 'parser.set_return_unicode(0)'.
- there should probably also be a 'parser.get_return_unicode()' so
that applications can check what is going on
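
To make the conclusion above concrete, here is a rough sketch of how the
switch might behave. This is only an illustration: SketchParser and its
deliver() method are hypothetical names made up for this example, not
part of pyexpat, and only set_return_unicode/get_return_unicode come
from the proposal itself. It is written for a current Python, where the
"ordinary string with UTF-8 in it" is a bytes object.

```python
import sys


class SketchParser:
    """Hypothetical stand-in for a parser object; not the real pyexpat API."""

    def __init__(self):
        # Default to Unicode objects when the interpreter supports them
        # (1.6 or later); otherwise keep today's UTF-8 behaviour.
        self._return_unicode = sys.version_info >= (1, 6)

    def set_return_unicode(self, flag):
        # parser.set_return_unicode(0) switches back to UTF-8 encoded data.
        self._return_unicode = bool(flag)

    def get_return_unicode(self):
        # Lets applications check which kind of string they will receive.
        return self._return_unicode

    def deliver(self, text):
        # Assumed helper: converts already-decoded character data to the
        # type that would be handed to the application's callbacks.
        if self._return_unicode:
            return text
        return text.encode("utf-8")


parser = SketchParser()
as_unicode = parser.deliver("bl\u00e5b\u00e6r")   # Unicode object
parser.set_return_unicode(0)
as_utf8 = parser.deliver("bl\u00e5b\u00e6r")      # UTF-8 encoded bytes
```

The point of keeping the flag on the parser object rather than making it
global is that two applications in the same process can make different
choices without stepping on each other.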
The real question is of course who will do the actual work of adding
this... :-)
--Lars M.