cElementTree encoding woes
Diez B. Roggisch
deets at nospam.web.de
Mon Feb 20 06:52:44 EST 2006
> Both my python2.3 and python2.4 interpreters seem to know "Windows-1252":
>
> >>> import codecs
> >>> codecs.open("windows.xml", encoding="windows-1252")
> <open file 'windows.xml', mode 'rb' at 0x403737e0>
>
> Maybe the problem lies in the python installation rather than
> cElementTree? Just guessing, though.
Hm. No idea why I was under the impression they weren't there - but still,
it doesn't work. This code
inf = file(sys.argv[1])
#inf = codecs.StreamRecoder(inf, encoder, decoder, reader, writer)
for event, elem in cElementTree.iterparse(inf):
    pass
pukes on me with
Traceback (most recent call last):
  File "./splitter.py", line 31, in ?
    for event, elem in cElementTree.iterparse(inf):
  File "<string>", line 61, in __iter__
SyntaxError: not well-formed (invalid token): line 35, column 34
That is where the first French (accented) character occurs:
"""<title>Introduction aux Probabilités</title>"""
So the problem is not that the codec is being ignored; it simply is not
working.
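
One possible explanation, which is only an assumption since the thread
does not show the first line of the input file: an XML parser takes the
encoding from the XML declaration, and falls back to UTF-8 when there is
none. A Windows-1252 "é" is the single byte 0xE9, which is not valid UTF-8
on its own, so the parser would report exactly this kind of "invalid token".
The sketch below reproduces that with xml.etree.ElementTree, the standard
library successor of cElementTree, on Python 3. The helper parse_titles and
the sample bytes are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Sample element modelled on the failing line; \u00e9 is "é".
TITLE = u"<title>Introduction aux Probabilit\u00e9s</title>"


def parse_titles(data):
    """Hypothetical helper: run iterparse over bytes, collect element text."""
    return [elem.text for _event, elem in ET.iterparse(io.BytesIO(data))]


# Windows-1252 bytes with no XML declaration: the parser assumes UTF-8
# and rejects the lone 0xE9 byte.
undeclared = TITLE.encode("windows-1252")
try:
    parse_titles(undeclared)
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False

# The same bytes with a declaration naming the real encoding.
declared = b'<?xml version="1.0" encoding="windows-1252"?>\n' + undeclared
titles = parse_titles(declared)
```

If this is the cause, checking the declaration in the first line of the
file is the first thing to try, before wrapping the stream in a recoder.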
Regards,
Diez