SAX: `raw XML'
Martin v. Löwis
martin at v.loewis.de
Tue May 6 14:18:13 EDT 2003
jhefferon at smcvt.edu (Jim Hefferon) writes:
> I'm writing a routine to parse XML using SAX. I'm worried about the kind
> of strings that the SAX parser is giving me. In my startElement
> the documentation describes the parameter `name' as a "raw XML 1.0 string".
I'm not quite sure what the author of this text meant when he wrote
"raw XML 1.0 string".
> Does that mean that to process it (in the Right Way) I should use a Python
> unicode string?
Whatever the documentation means to say: It is a Unicode object,
representing the element name.
> Should I say something like
>
> def startElement(self,name,attrs):
>     if ((self._state==u"start")
>         and (name==u"FirstName")):
>         self._state="FirstName"
That would be correct, yes. However, you can also compare with "FirstName"
(unless somebody has messed up the system encoding), since comparing
a Unicode string with the byte string "FirstName" first converts
"FirstName" to u"FirstName".
You could also consider assigning name directly to self._state, or you
could consider invoking name.encode("ascii"), to get a byte string.
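
The options above can be sketched roughly as follows. This is only an
illustration, and it assumes Python 3, where every str is already a Unicode
string, so the u"" prefix is optional and the comparison involves no implicit
conversion. The class name NameHandler, the helper parse_names, and the
attributes seen and raw_name are made up for this example.

```python
import xml.sax


class NameHandler(xml.sax.ContentHandler):
    """Hypothetical handler that tracks state based on element names."""

    def __init__(self):
        super().__init__()
        self._state = "start"
        self.seen = []        # (element name, type name) pairs, for inspection
        self.raw_name = None  # byte-string form of the matched name

    def startElement(self, name, attrs):
        # The parser passes the element name as a Unicode string.
        self.seen.append((name, type(name).__name__))
        if self._state == "start" and name == "FirstName":
            # Assign the name directly instead of repeating the literal.
            self._state = name
            # Encode explicitly only if a byte string is really needed.
            self.raw_name = name.encode("ascii")


def parse_names(xml_bytes):
    """Parse a small document and return the handler for inspection."""
    handler = NameHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler


handler = parse_names(b"<FirstName>Jim</FirstName>")
print(handler._state, handler.seen, handler.raw_name)
```

Note that name.encode("ascii") raises UnicodeEncodeError for element names
containing non-ASCII characters, so keeping the Unicode string is usually
the safer choice.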
Regards,
Martin