[Python-Dev] Minidom and Unicode

M.-A. Lemburg mal@lemburg.com
Mon, 03 Jul 2000 17:35:18 +0200

Fredrik Lundh wrote:
> martin wrote:
> > >
> > > $ export LANG=posix.utf8
> > [...]
> > > (or to put it another way, I'm not sure the repr/str fix is
> > > the real culprit here...)
> >
> > I think it is. My understanding is that repr always returns something
> > printable - if possible even something that can be passed to eval. I'd
> > certainly expect that a minidom Node can be printed always, no matter
> > what the default encoding is.
> >
> > Consequently, I'd prefer if the conversion uses some fixed, repr-style
> > encoding, eg. unicode-escape (just as repr of a unicode object does).
>
> oh, you're right.  repr should of course use unicode-escape, not
> the default encoding.  my fault.
> I'll update the repository soonish.

I'd rather have some more discussion about this... 

IMHO, all auto-conversions should use the default encoding. The
main point here is not to confuse the user with even more magic
happening under the hood.

If the programmer knows that he'll have to deal with Unicode
then he should make sure that the proper encoding is used
and document it that way, e.g. use unicode-escape for Minidom's
__repr__ methods.
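
To make that concrete, here is a minimal sketch of a __repr__ that
picks its encoding explicitly instead of relying on the default one.
It is written for current Python 3, where str is already Unicode.
The Node class, its attributes and the output format are made up
for illustration and are not minidom's actual code.

```python
class Node:
    """Hypothetical stand-in for a DOM node; not minidom's real class."""

    def __init__(self, tag_name, data):
        self.tag_name = tag_name
        self.data = data

    def __repr__(self):
        # Encode explicitly with unicode-escape so the result is pure
        # ASCII, whatever locale or default encoding is in effect.
        escaped = self.data.encode("unicode_escape").decode("ascii")
        return "<Node %s: '%s'>" % (self.tag_name, escaped)


node = Node("title", "caf\u00e9 \u20ac")
text = repr(node)
print(text)
```

The encoding is named right where the conversion happens, so
nothing depends on what the user's environment is set to.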

BTW, any takers for a __unicode__ method to complement __str__?

> > If it is deemed unacceptable to put this into the interpreter proper,
> > I'd prefer if minidom is changed to allow representation of all Nodes
> > on all systems.
>
> the reason for this patch was to avoid forcing everyone to deal with
> this in their own code, by providing some kind of fallback behaviour.

That's what your patch does; I don't see a reason to change it :-)

Marc-Andre Lemburg
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/