[XML-SIG] unicode problems in elementtree
David Stanek
dstanek at dstanek.com
Sat May 27 00:34:29 CEST 2006
On Fri, May 26, 2006 at 09:22:41PM +0100, Bryan Lawrence wrote:
>
> Does elementtree and/or expat need to know the encoding to get this right?
> (which may be a problem coz this could be from anyone's document in any
> encoding ...)
>
I think you will have to tell elementtree what encoding your XML is
in, unless the document declares it itself. Otherwise how would it
know? I am sure there is a better way, but I have seen people try to
guess encodings like:
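    # untested and from my bad memory :-)
    # s is the raw byte string read from the document.
    # iso-8859-1 accepts any byte sequence, so keep it last.
    encodings = ['utf-8', 'utf-16', 'iso-8859-1']
    for encoding in encodings:
        try:
            text = s.decode(encoding)
        except UnicodeError:
            pass
        else:
            # text is decoded; encoding is the one that worked
            break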
The encodings list would hold the common encodings that you expect to
see, in the order you want them tried. Again there must be a better
way to do this... I would suggest that you try to standardize on one
encoding for the documents you accept.
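
Here is a rough sketch of the same guessing idea wrapped in a function,
written for Python 3. The name guess_decode and the candidate list are
made up for illustration. I left utf-16 out of the candidates on the
assumption that a UTF-16 decoder accepts most even-length byte strings
and so tends to "succeed" on data that is not UTF-16 at all. The last
lines show the case where the document carries an encoding declaration:
as far as I know, handing the raw bytes to ElementTree lets the parser
use that declaration, so no guessing is needed.

```python
import xml.etree.ElementTree as ET

# Assumed candidate list; adjust to the encodings you actually expect.
# iso-8859-1 maps every byte to a character, so it must come last.
CANDIDATES = ("utf-8", "iso-8859-1")


def guess_decode(data, candidates=CANDIDATES):
    """Return (text, encoding) for the first candidate that decodes data.

    Hypothetical helper: a successful decode only means the bytes are
    valid in that encoding, not that the encoding is the right one.
    """
    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except UnicodeError:
            continue
    raise ValueError("none of the candidate encodings fit the data")


# A document that declares its own encoding: pass the bytes, not a
# decoded string, and let the parser read the declaration.
raw = '<?xml version="1.0" encoding="iso-8859-1"?><p>caf\u00e9</p>'.encode(
    "iso-8859-1"
)
root = ET.fromstring(raw)
```

With this, guess_decode(b"caf\xe9") falls through utf-8 and lands on
iso-8859-1, while root.text comes back as a proper unicode string
without any guessing on your side.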
David Stanek
--
http://www.traceback.org
GPG keyID #6272EDAF on http://pgp.mit.edu
Key fingerprint = 8BAA 7E11 8856 E148 6833 655A 92E2 3E00 6272 EDAF