SAX-Parser entity

fabi.kreutz at gmx.de
Fri Mar 1 12:20:10 EST 2002


Hi, Harvey!

Ahh, utf-16 sounds good.
Thanks, I have at least one solution now:
read the XML file into a buffer and convert it to utf-16.
minidom is then able to parse the whole thing and stores the strings as
unicode, which is fine again.
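
For the record, a minimal sketch of that workaround. It is written for a
current Python 3, not the Python 2.0 setup from the traceback below, and it
assumes the file is latin-1 and carries no encoding in its XML declaration.
The function name and the sample document are made up for illustration.

```python
from xml.dom import minidom


def parse_latin1_xml(raw_bytes):
    """Parse latin-1 encoded XML by converting it to utf-16 first."""
    # Decode the raw buffer using the encoding the file is assumed to be in.
    text = raw_bytes.decode("iso-8859-1")
    # Python's "utf-16" codec prepends a byte order mark, which lets the
    # parser detect the encoding without an explicit declaration.
    utf16_bytes = text.encode("utf-16")
    return minidom.parseString(utf16_bytes)


# Hypothetical sample standing in for the real file contents.
sample = '<?xml version="1.0"?>\n<page><title>Grüße</title></page>'.encode(
    "iso-8859-1"
)

doc = parse_latin1_xml(sample)
title = doc.getElementsByTagName("title")[0].firstChild.data
print(title)
```

The text nodes come back as ordinary unicode strings, so the umlauts survive
for the HTML generation step.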

I do not understand the part with "If your parser supports...". As it seems
to me, the minidom default parser does not support ISO 8859/1, and even
unicode only causes problems.
I didn't know you could reprogram the parser so easily.
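
As far as I can tell, "modify the XML declaration" means naming the encoding
in the first line of the document, not changing the parser. A small sketch of
the difference, assuming a current Python 3 whose minidom sits on top of
expat (which accepts ISO-8859-1); the sample document is made up.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical document body, stored as latin-1 bytes.
body = "<page><title>Grüße</title></page>".encode("iso-8859-1")

# Without an encoding the parser assumes utf-8 and chokes on the umlaut.
without_decl = b'<?xml version="1.0"?>\n' + body
# With the encoding named in the declaration the same bytes are accepted.
with_decl = b'<?xml version="1.0" encoding="iso-8859-1"?>\n' + body

try:
    minidom.parseString(without_decl)
    parsed_without_decl = True
except ExpatError:
    # Current Python reports the "not well-formed" error as ExpatError.
    parsed_without_decl = False

doc = minidom.parseString(with_decl)
title = doc.getElementsByTagName("title")[0].firstChild.data
print(parsed_without_decl, title)
```

Whether the older _xmlplus parser from the traceback accepts the same
declaration is something I have not checked.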

Anyway, thanks a lot. Bye
	Fabian

Harvey Thomas <hst at empolis.co.uk> wrote:
> I would guess that your document is in ISO 8859/1 (otherwise known as
> latin-1). XML parsers must be able to parse utf-8 and utf-16 and may
> support other encodings. If your parser supports latin-1 then modify the
> XML declaration. Otherwise use the codecs module.

>> Problem:
>> I try to use the minidom XML-Parser to parse my little file 
>> in order to generate HTML Code.
>> Being German, I really like to use umlauts, but minidom does not.
>> ...
>> Traceback (most recent call last):
>> "/usr/lib/python2.0/site-packages/_xmlplus/sax/handler.py", 
>> line 38, in fatalError
>>     raise exception
>> xml.sax._exceptions.SAXParseException: <unknown>:29:19: not  well-formed
>> 
>> where character 19 in row 29 is the occurrence of an ü.


