HTMLParser can't read Japanese

John Nagle nagle at
Tue Apr 13 14:51:07 EDT 2010

    Yes.  Try "cmd /u" to get a Unicode console.

    HTMLParser should already have converted from Shift-JIS
to Unicode, so the "print" is outputting Unicode, and the
console has to be able to encode it.
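
A rough sketch of what that means in practice, in case it helps. The
text is already a Unicode str by the time "print" sees it, so the
failure comes from encoding it for the console. The helper name
safe_print and the cp437 stand-in console below are assumptions made
for illustration; they are not from the original script.

```python
import io


def safe_print(text, stream):
    """Write text to stream without raising UnicodeEncodeError.

    Characters the stream's encoding cannot represent are written as
    backslash escapes instead. (Hypothetical helper, for illustration.)
    """
    encoding = getattr(stream, "encoding", None) or "ascii"
    printable = text.encode(encoding, errors="backslashreplace").decode(encoding)
    stream.write(printable + "\n")


# Stand-in for a console limited to code page 437 (an assumption; the
# real console encoding depends on the Windows setup).
console = io.TextIOWrapper(io.BytesIO(), encoding="cp437", newline="\n")

# A plain console.write() of this text would raise UnicodeEncodeError.
safe_print("Start of a tag : [('title', '日本語')]", console)
console.flush()
```

This only hides the symptom: the Japanese text shows up as escapes
rather than readable characters, so a console or file that can
really encode it is still the better fix.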

				John Nagle

Stefan Behnel wrote:
> Dodo, 13.04.2010 13:40:
>> Here's a small script that reproduces the error,
>> running Windows 7 with Python 3.1.
>> FILE:
>> import urllib.request as url
>> from html.parser import HTMLParser
>> class myParser(HTMLParser):
>>   def handle_starttag(self, tag, attrs):
>>     print("Start of %s tag : %s" % (tag, attrs))
> Your problem is the last line. Your terminal does not support printing
> the text, so you get an exception here.
> Either change your terminal encoding to a suitable encoding, or write 
> the text to an encoded file instead (see the 'encoding' option of the 
> open() function for that).
> Stefan
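
Stefan's second suggestion, writing to an encoded file, might look
roughly like the sketch below. The names FileLoggingParser and
log_tags, and the choice of UTF-8, are assumptions for illustration
and are not part of the original script.

```python
from html.parser import HTMLParser


class FileLoggingParser(HTMLParser):
    """Same handler as the quoted script, but writing to a file object."""

    def __init__(self, out):
        super().__init__()
        self.out = out

    def handle_starttag(self, tag, attrs):
        # Write to the file instead of print(), so the console
        # encoding is never involved.
        self.out.write("Start of %s tag : %s\n" % (tag, attrs))


def log_tags(html_text, path):
    """Parse html_text and log every start tag to a UTF-8 file at path."""
    with open(path, "w", encoding="utf-8") as out:
        parser = FileLoggingParser(out)
        parser.feed(html_text)
        parser.close()
```

Calling log_tags(page_text, "tags.txt") on an already decoded page
would then leave the Japanese attribute values intact in the file,
which can be opened in any editor that understands UTF-8.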
