HTMLParser can't read japanese
nagle at animats.com
Tue Apr 13 14:51:07 EDT 2010
Yes. Try "cmd /u" to get a Unicode console.
HTMLParser should already have converted from Shift-JIS
to Unicode, so the "print" is outputting Unicode.
Stefan Behnel wrote:
> Dodo, 13.04.2010 13:40:
>> Here's a small script to generate again the error
>> running windows 7 with python 3.1
>> FILE : parseShift.py
>> import urllib.request as url
>> from html.parser import HTMLParser
>> class myParser(HTMLParser):
>>     def handle_starttag(self, tag, attrs):
>>         print("Start of %s tag : %s" % (tag, attrs))
> Your problem is the last line. Your terminal does not support printing
> the text, so you get an exception here.
> Either change your terminal encoding to a suitable encoding, or write
> the text to an encoded file instead (see the 'encoding' option of the
> open() function for that).
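
For what it's worth, here is a minimal sketch of the two workarounds suggested above: writing the tag output to a file opened with an explicit encoding, and escaping whatever the console cannot display. The names TagLogger, log_tags and safe_console_text are made up for illustration and are not part of the original script, and UTF-8 is only an assumed choice of output encoding.

```python
import sys
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Hypothetical parser that writes each start tag to a text stream."""

    def __init__(self, out):
        super().__init__()
        self.out = out

    def handle_starttag(self, tag, attrs):
        # Same message as the original script, but sent to self.out
        # instead of the console.
        print("Start of %s tag : %s" % (tag, attrs), file=self.out)


def log_tags(html_text, path, encoding="utf-8"):
    """Parse already-decoded HTML and write the tag log to an encoded file."""
    with open(path, "w", encoding=encoding) as out:
        parser = TagLogger(out)
        parser.feed(html_text)
        parser.close()


def safe_console_text(text, encoding=None):
    """Replace characters the target encoding cannot represent with
    backslash escapes, so that printing the result does not raise
    UnicodeEncodeError."""
    encoding = encoding or sys.stdout.encoding or "ascii"
    return text.encode(encoding, errors="backslashreplace").decode(encoding)
```

Something like log_tags(page_text, "tags.txt") then keeps the Japanese text intact in the file, while print(safe_console_text(some_text)) trades readability for not crashing on a console with a limited code page.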