HTMLParser can't read Japanese
John Nagle
nagle at animats.com
Tue Apr 13 14:51:07 EDT 2010
Yes. Try "cmd /u" to get a Unicode console.
HTMLParser should already have converted from Shift-JIS
to Unicode, so the "print" is outputting Unicode.
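
A minimal sketch of that point, assuming the page bytes are decoded
explicitly before being fed to the parser. The names TagCollector and
can_encode are made up for illustration and are not from the original
script. Once the text is a str, whether it can be printed depends only
on the encoding of the output stream:

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical parser that stores start tags instead of printing them."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, attrs))


def can_encode(text, encoding):
    """Return True if text can be written to a stream using this encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Stand-in for bytes downloaded from a Shift-JIS page.
raw = '<a title="日本語">'.encode("shift_jis")

# Decoding gives a Unicode str; the parser never sees Shift-JIS bytes.
collector = TagCollector()
collector.feed(raw.decode("shift_jis"))
collector.close()

title = collector.tags[0][1][0][1]
print(can_encode(title, "utf-8"))   # a UTF-8 stream can take the text
print(can_encode(title, "cp437"))   # a legacy console code page cannot
```

So the parse succeeds either way; only the final write to the console
can fail.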
John Nagle
Stefan Behnel wrote:
> Dodo, 13.04.2010 13:40:
>> Here's a small script that reproduces the error,
>> running on Windows 7 with Python 3.1
>>
>> FILE : parseShift.py
>>
>> import urllib.request as url
>> from html.parser import HTMLParser
>>
>> class myParser(HTMLParser):
>>     def handle_starttag(self, tag, attrs):
>>         print("Start of %s tag : %s" % (tag, attrs))
>
> Your problem is the last line. Your terminal does not support printing
> the text, so you get an exception here.
>
> Either change your terminal encoding to a suitable encoding, or write
> the text to an encoded file instead (see the 'encoding' option of the
> open() function for that).
>
> Stefan
>
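
Stefan's two suggestions might look roughly like this. It is only a
sketch: the output file name and the console_safe helper are
assumptions for illustration, not part of the original script.

```python
import os
import sys
import tempfile

# The kind of line the original handle_starttag would print.
line = "Start of %s tag : %s" % ("a", [("title", "日本語")])

# Option 1: write to a file with an explicit encoding rather than the console.
# The file name is a hypothetical choice.
path = os.path.join(tempfile.gettempdir(), "parse_shift_output.txt")
with open(path, "w", encoding="utf-8") as out:
    out.write(line + "\n")


# Option 2: keep printing, but escape whatever the console cannot display.
def console_safe(text, encoding=None):
    """Replace characters the target encoding lacks with backslash escapes."""
    encoding = encoding or sys.stdout.encoding or "ascii"
    return text.encode(encoding, errors="backslashreplace").decode(encoding)


print(console_safe(line))
```

With the second option the script no longer raises, at the cost of
showing escapes such as \u65e5 on consoles that lack the characters.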