getattr/setattr still ASCII-only, not Unicode - blows up SGMLlib from BeautifulSoup
tjreedy at udel.edu
Thu Mar 13 22:30:50 CET 2008
"John Nagle" <nagle at animats.com> wrote in message
news:47d97288$0$36363$742ec2ed at news.sonic.net...
| Just noticed, again, that getattr/setattr are ASCII-only, and don't
| support Unicode.
| SGMLlib blows up because of this when faced with a Unicode end tag:
| File "/usr/local/lib/python2.5/sgmllib.py", line 353, in finish_endtag
| method = getattr(self, 'end_' + tag)
| UnicodeEncodeError: 'ascii' codec can't encode character u'\xae'
| in position 46: ordinal not in range(128)
| Should attributes be restricted to ASCII, or is this a bug?
Except for comments and string literals preceded by an encoding
declaration, Python code is ASCII-only:
"Python uses the 7-bit ASCII character set for program text."
Ref manual, 2. Lexical analysis
This changes in 3.0.
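
To illustrate what the 3.0 change means for the lookup in the traceback, here is a minimal sketch, assuming Python 3. The Handler class, its methods, and the tag value are hypothetical stand-ins made up for the example; this is not sgmllib's actual code.

```python
# Python 3 sketch: attribute names are Unicode strings, so looking up a
# non-ASCII name with getattr/setattr does not raise UnicodeEncodeError.


class Handler(object):
    """Hypothetical stand-in for an sgmllib-style parser (not the real class)."""

    def unknown_endtag(self, tag):
        return "unknown: " + tag

    def finish_endtag(self, tag):
        # Same lookup pattern as in the traceback: 'end_' + tag.
        # The default of None avoids an AttributeError for unknown tags.
        method = getattr(self, "end_" + tag, None)
        if method is None:
            return self.unknown_endtag(tag)
        return method()


handler = Handler()
tag = "caf\u00e9"  # a non-ASCII tag name, chosen only for illustration

# No end_ handler exists yet, so the lookup falls through cleanly.
first = handler.finish_endtag(tag)

# setattr accepts the non-ASCII attribute name as well.
setattr(handler, "end_" + tag, lambda: "closed " + tag)
second = handler.finish_endtag(tag)

print(first)
print(second)
```

On 2.x the same getattr call, given a unicode object with non-ASCII characters, is what produces the UnicodeEncodeError quoted above.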