sax barfs on unicode filenames
Diez B. Roggisch
deets at nospam.web.de
Wed Oct 4 10:23:52 EDT 2006
Edward K. Ream wrote:
> Hi. Presumably this is an easy question, but anyone who understands the
> sax docs thinks completely differently than I do :-)
>
>
>
> Following the usual cookbook examples, my app parses an open file as
> follows::
>
>
>
> parser = xml.sax.make_parser()
>
> parser.setFeature(xml.sax.handler.feature_external_ges,1)
>
> # Hopefully the content handler can figure out the encoding from the
> # <?xml> element.
>
> handler = saxContentHandler(c,inputFileName,silent)
>
> parser.setContentHandler(handler)
>
> parser.parse(theFile)
>
>
>
> Here 'theFile' is an open file. Usually this works just fine, but when
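Filenames are expected to be byte strings. So what happens is that the
unicode string you pass as the filename gets implicitly converted using the
default encoding (ASCII unless you changed it), and that conversion fails
as soon as the name contains non-ASCII characters.

You have to encode the unicode string according to your filesystem's
encoding beforehand.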
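
A minimal sketch of that encoding step, assuming sys.getfilesystemencoding()
reports the right encoding for your platform. The helper name
encode_filename and the UTF-8 fallback are my own choices for illustration,
not something from the sax docs:

```python
import sys

def encode_filename(name, encoding=None):
    """Return name as a byte string suitable for passing as a filename."""
    # Byte strings are assumed to be encoded already; pass them through.
    if isinstance(name, bytes):
        return name
    # Assumption: fall back to UTF-8 if the platform reports no encoding.
    if encoding is None:
        encoding = sys.getfilesystemencoding() or "utf-8"
    return name.encode(encoding)
```

You would then pass encode_filename(inputFileName) wherever a filename is
expected, instead of the raw unicode string. If the name contains characters
the filesystem encoding cannot represent, encode() raises UnicodeEncodeError,
which is at least an explicit failure at a place you control.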
Diez