SAX-Parser entity
fabi.kreutz at gmx.de
Fri Mar 1 12:20:10 EST 2002
Hi, Harvey!
Ahh, utf-16 sounds good.
Thanks, I have at least one solution now:
read the XML file into a buffer and convert it to UTF-16.
minidom is then able to parse the whole thing and stores the strings as
Unicode, which is fine again.
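
A minimal sketch of that workaround, written for a current Python 3 rather
than the Python 2.0 in the traceback. It assumes the input is Latin-1 and
carries no XML declaration naming another encoding; the function name and
the sample bytes are made up for illustration.

```python
from xml.dom import minidom


def parse_latin1_via_utf16(raw_bytes):
    """Decode Latin-1 bytes, re-encode them as UTF-16, and parse with minidom."""
    # Assumption: the file really is ISO 8859/1 (Latin-1).
    text = raw_bytes.decode("iso-8859-1")
    # encode("utf-16") prepends a byte order mark, which the parser can use
    # to detect the encoding without an XML declaration.
    utf16_bytes = text.encode("utf-16")
    return minidom.parseString(utf16_bytes)


# Hypothetical sample: Latin-1 bytes for an element containing umlauts.
sample = b"<doc>Gr\xfc\xdfe</doc>"
doc = parse_latin1_via_utf16(sample)
greeting = doc.documentElement.firstChild.data
```

Here greeting ends up as an ordinary Unicode string, so the umlauts survive
the round trip.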
I do not understand the part about "If your parser supports...". It seems
to me that the default minidom parser does not support ISO 8859/1, and even
Unicode only causes problems.
I didn't know you could reprogram the parser so easily.
Anyway, thanks a lot. Bye
Fabian
Harvey Thomas <hst at empolis.co.uk> wrote:
> I would guess that your document is in ISO 8859/1 (otherwise known as
> latin-1). XML parsers must be able to parse utf-8 and utf-16 and may
> support other encodings. If your parser supports latin-1 then modify the
> XML declaration. Otherwise use the codecs module.
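
The "modify the XML declaration" route might look roughly like this in a
current Python 3, where the expat parser behind minidom accepts an
ISO-8859-1 declaration. This is only a sketch: the helper name and the
sample bytes are hypothetical, and it assumes the file does not already
begin with a declaration of its own.

```python
from xml.dom import minidom

# Declaration telling the parser that the bytes that follow are Latin-1.
LATIN1_DECLARATION = b'<?xml version="1.0" encoding="iso-8859-1"?>\n'


def parse_with_latin1_declaration(raw_bytes):
    """Prepend a Latin-1 XML declaration and parse the result with minidom."""
    # Assumption: raw_bytes does not already start with an XML declaration.
    return minidom.parseString(LATIN1_DECLARATION + raw_bytes)


# Hypothetical sample: Latin-1 bytes with umlauts and no declaration.
declared_sample = b"<doc>Gr\xfc\xdfe</doc>"
declared_doc = parse_with_latin1_declaration(declared_sample)
declared_text = declared_doc.documentElement.firstChild.data
```

Without the declaration the parser assumes UTF-8, and the lone Latin-1
byte for ü is then rejected as not well-formed.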
>> Problem:
>> I try to use the minidom XML-Parser to parse my little file
>> in order to generate HTML Code.
>> Being German, I really like to use umlauts, but minidom does not.
>> ...
>> Traceback (most recent call last):
>> "/usr/lib/python2.0/site-packages/_xmlplus/sax/handler.py",
>> line 38, in fatalError
>> raise exception
>> xml.sax._exceptions.SAXParseException: <unknown>:29:19: not well-formed
>>
>> where character 19 in row 29 is the occurrence of an ü.